Asynchronous site crawling

DeepCrawl API for turning websites into knowledge bases

Start a background crawl, track task status, and collect site content for RAG systems and agent memory workflows.

What this API helps you get

Whole-site knowledge capture

Collect content across a large website in the background, so your product can build a knowledge base without blocking the user.

Whole-site content collection

Crawl docs, blogs, and help centers.

Background job progress

Track long jobs in the background.

Knowledge-base material

Pages ready for RAG and search.

Useful for

Docs knowledge bases / Search indexes / AI workflows

Asynchronous crawling for larger sites

DeepCrawl starts a task and returns a task ID, so your application can poll the task's status while the crawl runs. Use the "sitemap" mode for controlled crawling or the "all" mode for broader link discovery.

Endpoint

POST /deepcrawl
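As a minimal sketch of how a request to this endpoint might be assembled: the path and the two mode names come from this page, while the host (`api.example.com`), the JSON field names `url` and `type`, and the bearer-token header are assumptions made for illustration and should be checked against the API docs.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host; replace with the real API host
VALID_MODES = ("sitemap", "all")      # the two discovery modes named on this page


def build_deepcrawl_request(site_url: str, mode: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /deepcrawl request."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # Field names "url" and "type" are assumed, not confirmed by this page.
    body = json.dumps({"url": site_url, "type": mode}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/deepcrawl",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )


req = build_deepcrawl_request("https://docs.example.com", "sitemap", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would send it. The response is expected to carry the task ID, though this page does not state the field name.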

Quickly Turn Websites into Documents

Provide a starting link to crawl the linked content within a site and save it locally.

Markdown Output

Results are delivered as clean Markdown files that can be used directly for LLM knowledge-base retrieval.
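One common way to prepare Markdown for retrieval is to split each page at its headings. The sketch below illustrates that idea only. It is not part of the DeepCrawl API, and the function name and chunk shape are hypothetical.

```python
import re


def split_markdown_by_heading(markdown: str) -> list[dict]:
    """Split a Markdown document into {"heading", "text"} chunks at ATX headings."""
    chunks = []
    heading, lines = "", []

    def flush() -> None:
        # Emit the current chunk unless it has neither a heading nor any text.
        if heading or any(l.strip() for l in lines):
            chunks.append({"heading": heading, "text": "\n".join(lines).strip()})

    for line in markdown.splitlines():
        if re.match(r"#{1,6}\s", line):
            flush()
            heading, lines = line.lstrip("#").strip(), []
        else:
            lines.append(line)
    flush()
    return chunks


page = "# Install\nRun the installer.\n\n## Configure\nEdit config.yaml.\n"
for chunk in split_markdown_by_heading(page):
    print(chunk)
```

This simple splitter treats any line starting with `#` as a heading, so a `#` comment inside a fenced code block would be split too. A production pipeline would need to track fences.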

Asynchronous Processing

Tasks run in the background, and you can check the status at any time.

Recursive Link Following

Choose between crawling from the site's sitemap and following all linked content within the site.

Implementation path

Start a DeepCrawl task with a URL and discovery mode.

Store the returned task ID and poll the status endpoint.

Process completed results into a knowledge base or document store.
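The store-and-poll step above can be sketched as a small loop. The status call is injected as a function so the sketch runs offline. The status strings (`pending`, `processing`, `completed`, `failed`) and the `results` field are assumptions, since this page only says that a task has a status.

```python
import time
from typing import Callable

# Status strings are assumptions; check the API docs for the real values.
DONE_STATES = {"completed"}
FAILED_STATES = {"failed"}


def wait_for_crawl(
    task_id: str,
    fetch_status: Callable[[str], dict],
    interval_s: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task finishes, fails, or the poll budget runs out."""
    for _ in range(max_polls):
        status = fetch_status(task_id)
        state = status.get("status")
        if state in DONE_STATES:
            return status
        if state in FAILED_STATES:
            raise RuntimeError(f"crawl {task_id} failed: {status}")
        sleep(interval_s)
    raise TimeoutError(f"crawl {task_id} still running after {max_polls} polls")


# Stand-in for the real status call so the sketch runs without network access.
_fake = iter([
    {"status": "pending"},
    {"status": "processing"},
    {"status": "completed", "results": ["page-1.md"]},
])
result = wait_for_crawl("task-123", lambda _id: next(_fake), sleep=lambda s: None)
print(result["results"])
```

Capping the number of polls keeps a stuck task from blocking a worker forever. In a real integration, `fetch_status` would wrap the HTTP call to the status endpoint.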

Best for

Use deep crawling to turn websites into documents for tasks such as:

Building RAG knowledge bases from docs, help centers, and blogs.

Refreshing internal search indexes from a full website.

Packaging large site content without blocking a user request.

FAQ

Why is DeepCrawl asynchronous?

Full-site crawling can take longer than a normal request. DeepCrawl returns a task ID so the job can run in the background while your app checks status.

How many credits does the DeepCrawl API use?

Starting a DeepCrawl task costs 20 credits.
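For rough planning, the 20-credit figure can be turned into a quick budget check. This sketch assumes one task per site and no further per-page charges. This page states neither, and the function name is hypothetical.

```python
CREDITS_PER_TASK = 20  # from the FAQ: starting a DeepCrawl task costs 20 credits


def crawl_budget(credit_balance: int, sites: int) -> dict:
    """Return the cost of starting one task per site and what the balance allows."""
    cost = sites * CREDITS_PER_TASK
    return {
        "cost": cost,
        "affordable_tasks": credit_balance // CREDITS_PER_TASK,
        "remaining": credit_balance - cost,  # negative means the balance is short
    }


print(crawl_budget(credit_balance=500, sites=12))
# {'cost': 240, 'affordable_tasks': 25, 'remaining': 260}
```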
