Extract Endpoint
Extract (Structured Data)
Extract structured data from one or more web pages using AI. Scrapes the provided URLs, combines content, and uses an LLM to return JSON matching your schema.
POST /v1/extract
Scrape URLs and extract structured data using an LLM with schema-constrained output.
Request Body
| Parameter | Type | Description |
|---|---|---|
| urls (required) | string[] | URLs to extract data from (1-10) |
| schema (required) | Record<string, any> | JSON Schema describing the desired output structure |
| prompt (required) | string | Natural language extraction instruction (e.g., "Extract product name, price, and rating") |
| type | "html" \| "markdown" | Content format fed to the LLM. Defaults to "markdown" |
| proxy | { country: string } | Proxy for geo-targeted requests. See supported countries |
Example Request
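The original request sample is not present in this section, so the following is only a minimal sketch built from the Request Body table. The base URL `https://api.example.com`, the target URL, the schema contents, and the helper names `buildExtractRequest`, `extract`, and `PostFn` are all assumptions for illustration. Authentication is not described here, so any required headers are passed in by the caller.

```typescript
// Request body for POST /v1/extract, mirroring the Request Body table.
interface ExtractRequest {
  urls: string[];                  // 1-10 URLs
  schema: Record<string, unknown>; // JSON Schema for the desired output
  prompt: string;                  // natural language extraction instruction
  type?: "html" | "markdown";      // documented default is "markdown"
  proxy?: { country: string };     // optional geo-targeting
}

// Minimal fetch-like signature so the sketch is not tied to one HTTP client.
type PostFn = (
  url: string,
  init: { method: "POST"; headers: Record<string, string>; body: string },
) => Promise<{ json(): Promise<unknown> }>;

// Client-side check of the documented 1-10 URL limit before sending.
function buildExtractRequest(req: ExtractRequest): ExtractRequest {
  if (req.urls.length < 1 || req.urls.length > 10) {
    throw new RangeError(`urls must contain 1-10 entries, got ${req.urls.length}`);
  }
  return req;
}

// Sends the request. baseUrl and headers are supplied by the caller (assumed values).
async function extract(
  post: PostFn,
  baseUrl: string,
  headers: Record<string, string>,
  req: ExtractRequest,
): Promise<unknown> {
  const res = await post(`${baseUrl}/v1/extract`, {
    method: "POST",
    headers: { "Content-Type": "application/json", ...headers },
    body: JSON.stringify(buildExtractRequest(req)),
  });
  return res.json();
}

// Illustrative body; the URL and schema are made up for this sketch.
const exampleRequest: ExtractRequest = {
  urls: ["https://example.com/product/1"],
  prompt: "Extract product name, price, and rating",
  schema: {
    type: "object",
    properties: {
      name: { type: "string" },
      price: { type: "number" },
      rating: { type: "number" },
    },
    required: ["name", "price"],
  },
  type: "markdown",
};
```

In a runtime with a global `fetch`, it should be possible to pass `fetch` as the `post` argument; injecting it keeps the sketch easy to exercise without network access.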
Example Response
Response Structure
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether extraction succeeded |
| data | any | Extracted structured data matching your schema. null on failure. |
| sources | string[] | URLs that were successfully scraped and used for extraction |
| cost | number | Cost in USD ($0.002 per successfully scraped URL + $0.005 per extraction) |
| error | string | Error message if extraction failed |
| timestamp | string | ISO timestamp of the response |
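The fields above can be modeled as a TypeScript type, with a small helper that turns a failed response into an exception. This is a sketch under assumptions: the names `ExtractResponse` and `unwrapExtract` are hypothetical, and whether `error` is omitted or set to null on success is not stated here, so the type allows both.

```typescript
// Response shape, following the Response Structure table.
interface ExtractResponse<T = unknown> {
  success: boolean;
  data: T | null;        // null on failure
  sources: string[];     // URLs actually scraped and used
  cost: number;          // USD
  error?: string | null; // assumed to be absent or null on success
  timestamp: string;     // ISO timestamp
}

// Returns the extracted data, or throws with the server's error message.
function unwrapExtract<T>(res: ExtractResponse<T>): T {
  if (!res.success || res.data === null) {
    throw new Error(res.error ?? "extraction failed");
  }
  return res.data;
}

// Illustrative response with made-up values, not real API output.
const sample: ExtractResponse<{ name: string; price: number }> = {
  success: true,
  data: { name: "Example Widget", price: 19.99 },
  sources: ["https://example.com/product/1"],
  cost: 0.007,
  timestamp: "2024-01-01T00:00:00.000Z",
};

const product = unwrapExtract(sample);
console.log(product.name); // "Example Widget"
```

Typing `data` with a generic parameter lets callers state the shape their schema describes, while the `null` branch keeps failure handling explicit.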
Using Zod Schemas — SDK Only
The Node.js SDK supports Zod schemas for type-safe extraction. Pass a Zod schema instead of JSON Schema and get fully typed results.
Pricing
Extraction is charged in two parts:
| Component | Cost (USD) | Description |
|---|---|---|
| Per URL scraped | $0.002 | Charged per successfully scraped URL (failed URLs are free) |
| Per extraction | $0.005 | One-time LLM extraction fee per request |
Example: extracting from 2 URLs costs 2 × $0.002 + $0.005 = $0.009.
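The pricing rule above can be sketched as a small estimator. The function name `estimateCostUsd` is hypothetical, and the sketch works in thousandths of a dollar to avoid floating-point drift. The case where no URL is scraped successfully is not covered in this section, so the function only accepts 1-10 scraped URLs.

```typescript
// Prices expressed in thousandths of a dollar so the arithmetic stays in integers.
const MILLS_PER_URL = 2;        // $0.002 per successfully scraped URL
const MILLS_PER_EXTRACTION = 5; // $0.005 per request

// Estimated cost in USD for one request, given how many URLs were scraped successfully.
function estimateCostUsd(scrapedUrls: number): number {
  if (!Number.isInteger(scrapedUrls) || scrapedUrls < 1 || scrapedUrls > 10) {
    throw new RangeError("scrapedUrls must be an integer from 1 to 10");
  }
  return (scrapedUrls * MILLS_PER_URL + MILLS_PER_EXTRACTION) / 1000;
}

console.log(estimateCostUsd(2)); // 0.009
```

Because failed URLs are free, the argument should be the number of entries in `sources`, not the number of URLs submitted.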