ScrapingSync
Fetch
Extract content from any webpage in markdown or HTML format. Handles JavaScript rendering, anti-bot protection, and dynamic content automatically.
POST /v1/scrape/single
Fetch a single URL and extract its content.
Request Body
| Parameter | Type | Description |
|---|---|---|
| url (required) | string | The URL to fetch |
| type | "markdown" \| "html" | Output format. Defaults to "markdown" |
| onlyMainContent | boolean | Extract only main content (removes nav, ads, footers). Defaults to true |
| extractMetadata | boolean | Extract page metadata (title, description, etc.) |
| summary | { query: string } | LLM summarization: provide a query or instruction for content summarization |
| pdfStrategy | "ocr" \| "local" \| "auto" | PDF extraction strategy. ocr: vision-based OCR. local: pdf-parse only. auto: pdf-parse first, OCR if needed. Defaults to "ocr" |
| actions | Action[] | Browser actions to execute after page load (click, wait, type, press, scroll, waitForSelector, navigate, goBack). Skips lightweight HTTP when present. |
| proxy | { country: string } | Proxy for geo-targeted requests (e.g., { country: "us" }). See supported countries |
Example Request
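The following is a minimal Python sketch of a single-fetch request, built only from the parameters in the table above. The host `https://api.example.com`, the bearer-token `Authorization` header, and both helper names are assumptions: this page does not state the base URL or the authentication scheme, so substitute the values for your account.

```python
import json
import urllib.request

# Assumption: placeholder host; this page does not state the real base URL.
BASE_URL = "https://api.example.com"


def build_single_fetch_body(url, output_type="markdown", only_main_content=True,
                            extract_metadata=None, proxy_country=None):
    """Build the JSON body for POST /v1/scrape/single (hypothetical helper)."""
    if output_type not in ("markdown", "html"):
        raise ValueError('type must be "markdown" or "html"')
    body = {
        "url": url,
        "type": output_type,
        "onlyMainContent": only_main_content,
    }
    # Optional parameters are sent only when the caller sets them.
    if extract_metadata is not None:
        body["extractMetadata"] = extract_metadata
    if proxy_country is not None:
        body["proxy"] = {"country": proxy_country}
    return body


def build_single_fetch_request(body, api_key):
    """Wrap the body in a Request object; nothing is sent until urlopen is called."""
    return urllib.request.Request(
        BASE_URL + "/v1/scrape/single",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; confirm the scheme for your account.
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen` and decode the JSON reply, for example `json.load(urllib.request.urlopen(req))`.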
Example Response
POST /v1/scrape/batch
Fetch multiple URLs in a single request for better performance.
Request Body
| Parameter | Type | Description |
|---|---|---|
| urls (required) | string[] \| object[] | Array of URLs or fetch request objects |
| type | "markdown" \| "html" | Output format for all URLs. Defaults to "markdown" |
| onlyMainContent | boolean | Extract only main content. Defaults to true |
| summary | { query: string } | LLM summarization for all content |
| pdfStrategy | "ocr" \| "local" \| "auto" | PDF extraction strategy (applied to all). Defaults to "ocr" |
| proxy | { country: string } | Proxy for geo-targeted requests. Applies to all URLs. See supported countries |
Example Request
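As a rough sketch, a batch request body could be assembled as shown here. The helper name is hypothetical, and the assumption that each object entry in `urls` carries its own `url` key is inferred from the phrase "fetch request objects" rather than stated on this page.

```python
import json


def build_batch_fetch_body(urls, output_type="markdown", only_main_content=True,
                           summary_query=None, pdf_strategy=None, proxy_country=None):
    """Build the JSON body for POST /v1/scrape/batch (hypothetical helper)."""
    entries = list(urls)
    if not entries:
        raise ValueError("urls must contain at least one entry")
    for entry in entries:
        # Assumption: object entries look like single-fetch bodies with a "url" key.
        is_url = isinstance(entry, str)
        is_object = isinstance(entry, dict) and "url" in entry
        if not (is_url or is_object):
            raise ValueError(f"unsupported urls entry: {entry!r}")
    if pdf_strategy not in (None, "ocr", "local", "auto"):
        raise ValueError('pdfStrategy must be "ocr", "local", or "auto"')

    body = {
        "urls": entries,
        "type": output_type,
        "onlyMainContent": only_main_content,
    }
    # Top-level options apply to every URL in the batch.
    if summary_query is not None:
        body["summary"] = {"query": summary_query}
    if pdf_strategy is not None:
        body["pdfStrategy"] = pdf_strategy
    if proxy_country is not None:
        body["proxy"] = {"country": proxy_country}
    return body


if __name__ == "__main__":
    body = build_batch_fetch_body(
        ["https://example.com/a", "https://example.com/b"],
        pdf_strategy="auto",
    )
    print(json.dumps(body, indent=2))
```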
Example Response
Response Fields
| Field | Type | Description |
|---|---|---|
| url | string | The fetched URL |
| success | boolean | Whether the fetch was successful |
| markdown | string | Extracted content as markdown (if type="markdown") |
| html | string | Raw HTML content (if type="html") |
| statusCode | number | HTTP status code of the response |
| metadata | object | Page metadata (title, description, canonicalUrl, finalUrl, contentType, contentLength) |
| content | string | Summarized content (when summary query provided) |
| timestamp | string | ISO timestamp when fetching completed |
| cost | number | Cost of the request in USD |
| error | string | Error message if fetching failed |
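One way to consume these fields is sketched below in Python. It assumes a single result has already been decoded into a dict with the field names listed above; the `FetchError` class and the `extract_content` helper are hypothetical, and the preference for the summary over the raw page text is a design choice of this sketch, not documented behavior.

```python
class FetchError(Exception):
    """Raised when a result reports success = false (hypothetical helper type)."""


def extract_content(result):
    """Return the most specific text a fetch result carries.

    Order of preference: summarized content, then markdown, then html.
    """
    if not result.get("success"):
        # The error field holds the failure message when fetching failed.
        raise FetchError(result.get("error") or "fetch failed without an error message")
    for field in ("content", "markdown", "html"):
        value = result.get(field)
        if value:
            return value
    return ""
```

Checking `success` before reading any content field keeps a failed fetch from being mistaken for an empty page.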
LLM Summarization
You can ask the API to summarize the fetched content using an LLM. This is useful when you need specific information extracted from a page.
Example with Summary
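A minimal sketch of a request body that asks for a summary, written in Python. The helper name and the sample query are illustrative only; the `summary` object with its `query` key follows the parameter table for the single-fetch endpoint.

```python
import json


def build_summary_body(url, query):
    """Build a single-fetch body that also requests an LLM summary (hypothetical helper)."""
    if not query.strip():
        raise ValueError("summary query must not be empty")
    return {
        "url": url,
        # The query is the instruction the LLM applies to the fetched content.
        "summary": {"query": query},
    }


if __name__ == "__main__":
    body = build_summary_body(
        "https://example.com/pricing",
        "List each plan name with its monthly price.",
    )
    print(json.dumps(body, indent=2))
```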
The response will include a content field with the summarized output.
Pricing
Flat per-URL pricing. The API automatically selects the best fetching method, and the cost is returned in each response.
| Operation | Cost (USD) | Description |
|---|---|---|
| Fetch (any method) | $0.002/URL | Flat rate per URL (~$2.00 per 1000) |
| Fetch + Summary | $0.001/URL | Additional flat fee when summary is requested (~$1.00 per 1000) |
| PDF OCR | $0.003/page | OCR for scanned PDFs (~$3.00 per 1000 pages) |
| Cached | $0.002/URL | Reduced rate for cached content (~$2.00 per 1000) |
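The rates above can be turned into a rough cost estimate, sketched here in Python. It assumes the charges are additive: the summary fee is described as additional, while adding the OCR per-page fee on top of the per-URL fee is an assumption of this sketch. The cached row is ignored because the table lists it at the same per-URL rate. `Decimal` avoids floating-point drift on small currency amounts.

```python
from decimal import Decimal

# Rates copied from the pricing table (USD).
FETCH_PER_URL = Decimal("0.002")
SUMMARY_PER_URL = Decimal("0.001")
OCR_PER_PAGE = Decimal("0.003")


def estimate_cost(num_urls, with_summary=False, ocr_pages=0):
    """Estimate the USD cost of a job (hypothetical helper, additive pricing assumed)."""
    if num_urls < 0 or ocr_pages < 0:
        raise ValueError("counts must be non-negative")
    total = FETCH_PER_URL * num_urls
    if with_summary:
        # Summary is an additional flat fee per URL.
        total += SUMMARY_PER_URL * num_urls
    # Assumption: OCR pages are billed on top of the per-URL fetch fee.
    total += OCR_PER_PAGE * ocr_pages
    return total
```

The authoritative figure is still the cost field returned with each response; an estimate like this is only useful for budgeting before a job runs.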