Try any of 919 endpoints — live.
Pick an endpoint, load a working example, tweak the params, and send — no signup to try. Results render the way the data deserves; raw JSON, headers & code are one tab away.
SYNC-bounded crawl from a seed URL (max 25 pages): BFS over same-domain internal links with depth + include/exclude regex, scraping each page. Large crawls are a phase-2 async job (not promised here). Firecrawl /crawl parity (bounded).
The page URL to extract. Full URL or bare domain (https:// assumed). Only http/https; private/internal/metadata targets are SSRF-blocked.
Max pages to crawl in this SYNC call (1-25). Larger crawls are a phase-2 async job. (1–25)
Max link depth from the seed URL (0-5, default 2). (0–5)
Only follow links on the seed's registrable domain (default true).
Regex patterns; only URLs whose path matches are crawled.
Regex patterns; URLs whose path matches are skipped.
Which outputs to return (array or comma-string). Any of: markdown, text, html (cleaned main-content), rawHtml, metadata, links, images, jsonld. Default: markdown+metadata. Unknown values are ignored. For screenshots use the web-capture engine.
Per-request timeout in seconds (3-60, default 25). (3–60)
curl -X POST https://api.reefapi.com/web-extract/v1/crawl \
-H "x-api-key: $REEF_KEY" \
-H "content-type: application/json" \
-d '{"url":"https://example.com","max_pages":"2","max_depth":"1"}'Hit Send to run this endpoint live.