Workflow · Agentic capability · Free
GPT Crawler
Crawl a website from a single URL and generate one consolidated knowledge file to power a custom GPT or RAG assistant.
GPT Crawler (by Builder.io) crawls a website starting from one or more seed URLs and packages the extracted content into a single knowledge file you can upload to a custom GPT, assistant, or RAG index. You give it a URL match pattern and a CSS selector for the main content. It then walks the site with a headless browser, respecting the page limit and concurrency settings, and emits an output.json file ready for ingestion.
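As a rough illustration of those inputs, a configuration and its URL match check might look like the sketch below. The `CrawlerConfig` type, its field names, the example.com URLs, and the `globToRegExp` helper are assumptions made for this example rather than the tool's confirmed API. Check the upstream repository for the actual schema.

```typescript
// Hypothetical config shape for illustration only; field names are assumptions.
interface CrawlerConfig {
  url: string; // seed URL where the crawl starts
  match: string; // glob pattern a link must match to be followed
  selector: string; // CSS selector for the main content of each page
  maxPagesToCrawl: number; // upper bound on pages visited
  outputFileName: string; // where the consolidated file is written
}

const config: CrawlerConfig = {
  url: "https://example.com/docs/",
  match: "https://example.com/docs/**",
  selector: "main",
  maxPagesToCrawl: 50,
  outputFileName: "output.json",
};

// Simplified glob matcher (an assumption, not the tool's own matcher):
// "**" matches any characters, "*" matches within a single path segment.
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters except "*", which is handled next.
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped.replace(/\*\*|\*/g, (m) =>
    m === "**" ? ".*" : "[^/]*",
  );
  return new RegExp(`^${pattern}$`);
}

const matcher = globToRegExp(config.match);
const inScope = matcher.test("https://example.com/docs/guides/intro");
const outOfScope = matcher.test("https://example.com/blog/post");
```

Here `inScope` is true and `outOfScope` is false, so only links under the docs path would be queued for crawling.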
Key features
- Point at a docs site or blog and get one clean knowledge file
- Configurable URL match globs, content selector, max pages, and depth
- Headless-browser rendering for JavaScript-heavy pages
- Run as a CLI, a dependency, or inside a container
- Output tailored for OpenAI custom GPTs and assistant uploads
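The crawl behaviour summarized in the list above (follow in-scope links, stop at a page limit, emit one consolidated file) can be sketched over an in-memory site. This is an illustrative toy, not the tool's implementation: the `Page` and `CrawledRecord` types, the `crawl` function, and the prefix-based scope check are all assumptions, and a real crawl would fetch and render each page in a headless browser.

```typescript
// Toy model of a site: each URL maps to a title, extracted text, and outgoing links.
interface Page {
  title: string;
  text: string;
  links: string[];
}

// Hypothetical record shape for one crawled page in the consolidated file.
interface CrawledRecord {
  title: string;
  url: string;
  text: string;
}

const site: Record<string, Page> = {
  "https://example.com/docs/": {
    title: "Docs",
    text: "Welcome",
    links: ["https://example.com/docs/a", "https://example.com/blog/x"],
  },
  "https://example.com/docs/a": {
    title: "A",
    text: "Page A",
    links: ["https://example.com/docs/b", "https://example.com/docs/"],
  },
  "https://example.com/docs/b": { title: "B", text: "Page B", links: [] },
  "https://example.com/blog/x": { title: "Blog", text: "Off-topic", links: [] },
};

// Breadth-first walk from the seed, following only in-scope links,
// skipping already-seen URLs, and stopping once maxPages records exist.
function crawl(seed: string, scopePrefix: string, maxPages: number): CrawledRecord[] {
  const records: CrawledRecord[] = [];
  const seen = new Set<string>([seed]);
  const queue: string[] = [seed];
  while (queue.length > 0 && records.length < maxPages) {
    const url = queue.shift() as string;
    const page = site[url];
    if (!page) continue; // unknown URL: nothing to extract
    records.push({ title: page.title, url, text: page.text });
    for (const link of page.links) {
      if (link.startsWith(scopePrefix) && !seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return records;
}

const all = crawl("https://example.com/docs/", "https://example.com/docs/", 50);
const capped = crawl("https://example.com/docs/", "https://example.com/docs/", 2);

// One consolidated JSON document, analogous to a single knowledge file.
const knowledgeFile = JSON.stringify(all, null, 2);
```

With this toy site, the uncapped crawl collects the three docs pages and skips the blog page, while the capped crawl stops after two pages.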
A focused, practical tool for turning existing documentation into an LLM-ready corpus without building a full crawling stack from scratch.
Curated mirror of the open-source GPT Crawler (ISC). Get it from the source.