Async Fetch
Submit fetch jobs asynchronously for large batches, sequences, or long-running scrapes. Jobs are queued and processed in the background — poll for status and retrieve results when ready.
All async endpoints return a job ID immediately (HTTP 202). Use the job management endpoints below to poll for status and retrieve results.
POST /v1/scrape/async/batch
Submit up to 30 URLs as an async batch job. Each URL is scraped independently and results are collected.
Request Body
| Parameter | Type | Description |
|---|---|---|
| urls (required) | string[] \| object[] | Array of URLs or detailed request objects. Max 30. |
| type | "markdown" \| "html" | Output format for all URLs. Defaults to "markdown" |
| onlyMainContent | boolean | Extract only main content. Defaults to true |
| summary | { query: string } | LLM summarization for all content |
| pdfStrategy | "ocr" \| "local" \| "auto" | PDF extraction strategy (applied to all). Defaults to "ocr" |
| proxy | { country: string } | Proxy for geo-targeted requests. See supported countries |
Example Request
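A request body along these lines submits two URLs as one batch (the URLs and values are illustrative):

```json
{
  "urls": ["https://example.com/page-1", "https://example.com/page-2"],
  "type": "markdown",
  "onlyMainContent": true
}
```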
Response (HTTP 202)
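Per the common job-created shape documented below, the 202 response looks like this (the job ID is illustrative):

```json
{
  "success": true,
  "id": "job_abc123",
  "url": "/v1/scrape/async/job/job_abc123"
}
```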
POST /v1/scrape/async/single
Submit a single URL for async scraping. Useful for long-running pages or when you want non-blocking behavior.
Request Body
| Parameter | Type | Description |
|---|---|---|
| url (required) | string | The URL to fetch |
| type | "markdown" \| "html" | Output format. Defaults to "markdown" |
| onlyMainContent | boolean | Extract only main content. Defaults to true |
| extractMetadata | boolean | Extract page metadata |
| summary | { query: string } | LLM summarization query |
| pdfStrategy | "ocr" \| "local" \| "auto" | PDF extraction strategy. Defaults to "ocr" |
| actions | Action[] | Browser actions to execute after page load (click, wait, type, press, scroll, waitForSelector, navigate, goBack) |
| proxy | { country: string } | Proxy for geo-targeted requests. See supported countries |
Example Request
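A minimal request body, here with an optional summarization query (URL and query are illustrative):

```json
{
  "url": "https://example.com/article",
  "type": "markdown",
  "summary": { "query": "Key points of the article" }
}
```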
POST /v1/scrape/async/sequence
Execute a multi-step browser sequence asynchronously. Each step can be an action (click, type, wait, etc.) or a scrape breakpoint that captures page content.
Request Body
| Parameter | Type | Description |
|---|---|---|
| startUrl (required) | string | Starting URL to navigate to |
| steps (required) | SequenceStep[] | Ordered list of steps: actions and scrape breakpoints |
| type | "markdown" \| "html" | Output format. Defaults to "markdown" |
| summary | { query: string } | Summary query applied to each breakpoint |
| proxy | { country: string } | Proxy for geo-targeted requests. See supported countries |
| extractMetadata | boolean | Extract page metadata at each breakpoint |
Example Request
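Using the step types documented below, a sequence that fills a search box, submits it, waits for results, and captures the page might look like this (selectors and URL are illustrative):

```json
{
  "startUrl": "https://example.com/search",
  "steps": [
    { "type": "type", "selector": "#q", "text": "async scraping" },
    { "type": "press", "key": "Enter" },
    { "type": "waitForSelector", "selector": ".results", "timeout": 10000 },
    { "type": "scrape" }
  ],
  "type": "markdown"
}
```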
Step Types
| Step | Shape | Description |
|---|---|---|
| scrape | { type: "scrape" } | Capture page content at this point (breakpoint) |
| click | { type: "click", selector: string } | Click an element matching the CSS selector |
| type | { type: "type", selector: string, text: string } | Type text into an input element |
| wait | { type: "wait", milliseconds: number } | Wait for a specified duration |
| press | { type: "press", key: string } | Press a keyboard key (e.g., "Enter", "Tab") |
| scroll | { type: "scroll", direction?: "up" \| "down", selector?: string } | Scroll the page or a specific element. Direction defaults to "down" |
| waitForSelector | { type: "waitForSelector", selector: string, timeout?: number } | Wait for an element to appear on the page. Timeout in ms (max 30s) |
| navigate | { type: "navigate", url: string } | Navigate to a different URL |
| goBack | { type: "goBack" } | Navigate back to the previous page |
POST /v1/scrape/async/sequence/batch
Fan out multiple independent sequences as a single batch job. Each sequence runs in its own browser.
Request Body
| Parameter | Type | Description |
|---|---|---|
| sequences (required) | AsyncSequenceRequest[] | Array of sequence requests to execute in parallel |
| type | "markdown" \| "html" | Shared output format for sequences that omit their own |
| proxy | { country: string } | Shared proxy for sequences that omit their own. See supported countries |
Example Request
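A batch of two independent sequences, each ending in a scrape breakpoint, with a shared output format (URLs are illustrative):

```json
{
  "sequences": [
    {
      "startUrl": "https://example.com/a",
      "steps": [{ "type": "scrape" }]
    },
    {
      "startUrl": "https://example.com/b",
      "steps": [
        { "type": "scroll", "direction": "down" },
        { "type": "scrape" }
      ]
    }
  ],
  "type": "markdown"
}
```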
POST /v1/scrape/async/search/batch
Submit up to 20 search queries as an async batch job.
Request Body
| Parameter | Type | Description |
|---|---|---|
| queries (required) | string[] | Search queries. Min: 1, Max: 20 |
| limit | number | Results per query. Range: 1-50. Defaults to 10 |
| location | string | ISO country code for geo-targeting. See supported countries |
| includeDomains | string[] | Only include results from these domains |
| excludeDomains | string[] | Exclude results from these domains |
| engine | "google" \| "bing" \| "chatgpt" \| "perplexity" | Search engine to use. Defaults to "google" |
Example Request
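A request body submitting two queries with five results each (queries are illustrative):

```json
{
  "queries": ["async web scraping", "headless browser automation"],
  "limit": 5,
  "engine": "google"
}
```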
Job Management
These endpoints work for all async job types — fetch, sequence, and search.
GET /v1/scrape/async/job/:id
Get the status and results of an async job. Results are paginated.
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| offset | number | Results offset for pagination. Defaults to 0 |
| limit | number | Number of results per page. Defaults to 10 |
Example Request
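A polling request for the first page of results (the job ID is illustrative):

```
GET /v1/scrape/async/job/job_abc123?offset=0&limit=10
```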
Response Fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique job ID |
| status | string | One of "pending" \| "processing" \| "completed" \| "failed" \| "cancelled" |
| type | string | One of "scrape-single" \| "scrape-batch" \| "sequence" \| "sequence-batch" \| "search-batch" |
| total | number | Total number of URLs/items |
| completed | number | Number of completed items |
| expiresAt | string | ISO timestamp when results expire |
| data | array | Array of results (paginated) |
| next | string | URL to next page of results, if more exist |
Example Response
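A status response for a batch job still in the queue might look like this (values are illustrative; completed results appear in data as items finish, and next is present only when more pages exist):

```json
{
  "id": "job_abc123",
  "status": "pending",
  "type": "scrape-batch",
  "total": 2,
  "completed": 0,
  "expiresAt": "2025-06-02T12:00:00Z",
  "data": []
}
```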
GET /v1/scrape/async/jobs
List all async jobs for the authenticated user. Supports pagination and filtering by job type.
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| page | number | Page number. Defaults to 1 |
| limit | number | Items per page. Defaults to 20 |
| type | string | Filter by job type: "scrape-single", "scrape-batch", "sequence", "sequence-batch", "search-batch" |
Example Request
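A listing request filtered to batch scrape jobs:

```
GET /v1/scrape/async/jobs?page=1&limit=20&type=scrape-batch
```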
Response Fields
| Field | Type | Description |
|---|---|---|
| jobs | array | Array of job summary objects |
| total | number | Total number of jobs |
| page | number | Current page |
| limit | number | Items per page |
Job Summary Object
| Field | Type | Description |
|---|---|---|
| id | string | Job ID |
| type | string | Job type |
| status | string | Job status |
| total | number | Total items |
| completed | number | Completed items |
| createdAt | string | Creation timestamp |
| expiresAt | string | Expiry timestamp |
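Combining the fields above, a list response might look like this (values are illustrative):

```json
{
  "jobs": [
    {
      "id": "job_abc123",
      "type": "scrape-batch",
      "status": "completed",
      "total": 2,
      "completed": 2,
      "createdAt": "2025-06-01T12:00:00Z",
      "expiresAt": "2025-06-02T12:00:00Z"
    }
  ],
  "total": 1,
  "page": 1,
  "limit": 20
}
```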
DELETE /v1/scrape/async/job/:id
Cancel a pending or in-progress async job.
Example Request
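A cancellation request (the job ID is illustrative):

```
DELETE /v1/scrape/async/job/job_abc123
```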
Example Response
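The cancellation response shape is not specified above; assuming it mirrors the success flag of other responses and reports the terminal "cancelled" status, it might look like:

```json
{ "success": true, "status": "cancelled" }
```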
Common Response: Job Created
All async start endpoints return the same response shape:
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether the job was created successfully |
| id | string | Unique job ID for polling |
| url | string | Relative URL to poll for job status |
waitForJob() — SDK Only
The Node.js SDK provides a convenience method that polls a job until it reaches a terminal state (completed, failed, or cancelled).
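The SDK's actual signature isn't shown here, but the polling pattern it implements can be sketched as follows. The `getJob` callback (a stand-in for a wrapper around GET /v1/scrape/async/job/:id) and the parameter names are illustrative assumptions, not the SDK's real API:

```typescript
// The job statuses documented above; the last three are terminal.
type JobStatus = "pending" | "processing" | "completed" | "failed" | "cancelled";

interface Job {
  id: string;
  status: JobStatus;
}

const TERMINAL: JobStatus[] = ["completed", "failed", "cancelled"];

// Poll the provided status function until the job reaches a terminal state.
async function waitForJob(
  getJob: (id: string) => Promise<Job>, // e.g. wraps GET /v1/scrape/async/job/:id
  id: string,
  pollIntervalMs = 2000,
): Promise<Job> {
  for (;;) {
    const job = await getJob(id);
    if (TERMINAL.includes(job.status)) return job;
    // Sleep between polls to avoid hammering the API.
    await new Promise((resolve) => setTimeout(resolve, pollIntervalMs));
  }
}
```

A real wrapper would also handle network errors and apply an overall timeout; this sketch only shows the status loop.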
Pricing
Async jobs use the same per-URL pricing as sync endpoints. There is no additional charge for async processing.
| Operation | Cost (USD) | Description |
|---|---|---|
| Fetch (per URL) | $0.002 | Same flat rate as sync fetch |
| Fetch + Summary | +$0.001 | Additional fee when a summary is requested |
| Search (per query) | $0.007 | Same flat rate as sync search |
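For example (illustrative arithmetic from the rates above): a 30-URL async batch with summaries costs 30 × ($0.002 + $0.001) = $0.09, and a 20-query search batch costs 20 × $0.007 = $0.14.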