# ScrapingCrawl: Crawl Endpoint
Recursively crawl a website following links within the same domain. Uses BFS (breadth-first search) with configurable depth and path filtering. Returns scraped content for each page.
**POST** `/v1/crawl`

Recursively crawl a website and return content for each discovered page.
## Request Body

| Parameter | Type | Description |
|---|---|---|
| `url` (required) | string | Starting URL to crawl from |
| `limit` | number | Max pages to crawl (1-100). Defaults to 10 |
| `maxDepth` | number | Max link depth from the start URL (1-10). Defaults to 3 |
| `includePaths` | string[] | Regex patterns; only crawl paths matching at least one pattern |
| `excludePaths` | string[] | Regex patterns; skip paths matching any pattern |
| `allowSubdomains` | boolean | Follow links to subdomains of the base domain. Defaults to false |
| `ignoreQueryParameters` | boolean | Strip query parameters when deduplicating URLs. Defaults to false |
| `type` | `"html"` \| `"markdown"` | Output content format. Defaults to `"markdown"` |
| `onlyMainContent` | boolean | Extract only main content (strips nav, ads, footers). Defaults to true |
| `proxy` | `{ country: string }` | Proxy for geo-targeted requests. See supported countries |
### Example Request
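A minimal sketch of submitting a crawl job with the Python standard library. The API host and the `Authorization: Bearer` header scheme are assumptions for illustration; substitute the actual base URL and authentication method from your account.

```python
import json
import urllib.request

API_BASE = "https://api.scrapingcrawl.com"  # hypothetical host; check your dashboard


def build_payload(start_url, **options):
    """Assemble the /v1/crawl request body from the start URL and options."""
    payload = {"url": start_url}
    payload.update(options)
    return payload


def crawl(start_url, api_key, **options):
    """POST a crawl job and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{API_BASE}/v1/crawl",
        data=json.dumps(build_payload(start_url, **options)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `crawl("https://example.com/blog", api_key, limit=20, maxDepth=2, includePaths=["^/blog/"], type="markdown")` would crawl up to 20 pages under `/blog/`, two link levels deep.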
### Example Response
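An illustrative response for a one-page crawl with `type: "markdown"`; the field values here are made up, but the shape follows the tables below.

```json
{
  "success": true,
  "data": [
    {
      "url": "https://example.com/blog",
      "html": null,
      "markdown": "# Blog\n\nWelcome to the blog...",
      "statusCode": 200,
      "success": true,
      "metadata": { "title": "Blog" },
      "depth": 0
    }
  ],
  "total": 1,
  "completed": 1,
  "failed": 0,
  "cost": 0.002,
  "timestamp": "2024-01-01T00:00:00.000Z"
}
```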
## Response Structure

### Top-level Fields

| Field | Type | Description |
|---|---|---|
| `success` | boolean | Whether the crawl completed successfully |
| `data` | CrawlPageResult[] | Array of results, one per crawled page |
| `total` | number | Total pages attempted |
| `completed` | number | Successfully scraped pages |
| `failed` | number | Failed pages |
| `cost` | number | Total cost in USD ($0.002 per successfully scraped page; failed pages are not charged) |
| `error` | string | Error message if the crawl failed entirely |
| `timestamp` | string | ISO timestamp of the response |
### CrawlPageResult Object

Each item in the `data` array contains:

| Field | Type | Description |
|---|---|---|
| `url` | string | Page URL |
| `html` | string \| null | HTML content (when `type` is `"html"`; null otherwise) |
| `markdown` | string \| null | Markdown content (when `type` is `"markdown"`; null otherwise) |
| `statusCode` | number \| null | HTTP status code |
| `success` | boolean | Whether scraping this page succeeded |
| `error` | string | Error message if the page failed (optional) |
| `metadata` | object | Page metadata: title, description, etc. (optional) |
| `depth` | number | Link depth from the start URL (0 = start page) |
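A sketch of consuming the `data` array from a parsed response, using the fields documented above. The `sample` dict is illustrative, not real API output.

```python
def summarize(result):
    """Split crawled pages into successes and failures, grouping successes by depth."""
    pages = result.get("data", [])
    ok = [p for p in pages if p["success"]]
    failed = [p for p in pages if not p["success"]]
    by_depth = {}
    for page in ok:
        by_depth.setdefault(page["depth"], []).append(page["url"])
    return ok, failed, by_depth


# Illustrative response fragment
sample = {
    "success": True,
    "data": [
        {"url": "https://example.com/", "markdown": "# Home",
         "statusCode": 200, "success": True, "depth": 0},
        {"url": "https://example.com/404", "markdown": None,
         "statusCode": 404, "success": False, "error": "Not Found", "depth": 1},
    ],
}

ok, failed, by_depth = summarize(sample)
```

Checking each page's `success` flag (rather than only the top-level one) matters because a crawl can complete overall while individual pages fail.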
## Pricing

Charged per page at $0.002/page. Only successfully scraped pages are charged; failed pages incur no cost.

Example: crawling 20 pages where 18 succeed costs 18 × $0.002 = $0.036.
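The billing rule above can be expressed as a small helper (the function name is ours, not part of the API):

```python
PRICE_PER_PAGE = 0.002  # USD per successfully scraped page


def crawl_cost(completed_pages: int) -> float:
    """Cost in USD; failed pages are not charged, so pass only the completed count."""
    return round(completed_pages * PRICE_PER_PAGE, 6)
```

For the example above, `crawl_cost(18)` gives 0.036 even though 20 pages were attempted.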