From claude-knowledge-sdk
Parse and crawl llms.txt documentation indexes. Use this when you need to crawl docs, parse llms.txt, index documentation, fetch doc URLs, scrape docs, build a knowledge index, find what pages are in the docs, or work with llms.txt files from any site.
npx claudepluginhub jadecli/claude-knowledge-sdk-typescript
Slash command: /claude-knowledge-sdk:llms-txt-crawler
Parses llms.txt files (the LLM-friendly documentation standard from llmstxt.org) and crawls the discovered documentation URLs to build a local knowledge index.
  https://code.claude.com/docs/llms.txt    Claude Code docs (updates daily)
  https://platform.claude.com/llms.txt     Platform/API docs
  {domain}/llms.txt
Use WebFetch to retrieve the llms.txt URL.
Example: WebFetch https://code.claude.com/docs/llms.txt
The llms.txt format is markdown with:
  # Site Name                     top-level heading
  > Description                   site description
  ## Section                      doc sections
  - [Title](url): Description     doc page links
Parse using the bundled parser:
npx tsx skills/llms-txt-crawler/scripts/parse-llms-txt.ts https://code.claude.com/docs/llms.txt
This outputs structured JSON with all sections and links.
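The format described above is simple enough to parse line by line. The following is a minimal Python sketch of that idea, not the bundled parse-llms-txt.ts script; the function name `parse_llms_txt` and the output shape are assumptions made for illustration, and the JSON emitted by the bundled parser may differ.

```python
import re

# Matches "- [Title](url)" with an optional ": Description" tail.
LINK_RE = re.compile(
    r"^[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Parse llms.txt markdown into a dict (hypothetical shape, for illustration)."""
    doc = {"title": None, "description": None, "sections": []}
    current = None  # section currently being filled
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = {"name": line[3:].strip(), "links": []}
            doc["sections"].append(current)
        elif line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith(">") and doc["description"] is None:
            doc["description"] = line[1:].strip()
        else:
            match = LINK_RE.match(line)
            # Links that appear before any "## Section" are ignored in this sketch.
            if match and current is not None:
                current["links"].append({
                    "title": match["title"],
                    "url": match["url"],
                    "description": (match["desc"] or "").strip(),
                })
    return doc


# Made-up sample input, not a real llms.txt file.
SAMPLE = """\
# Example Docs
> Documentation for an example project.

## Guides
- [Quickstart](https://example.com/docs/quickstart.md): Install and run
- [Configuration](https://example.com/docs/config.md)
"""

parsed = parse_llms_txt(SAMPLE)
```

Here `parsed["sections"][0]["links"]` holds two entries, the second with an empty description because the sample line has no text after the link.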
Select which sections/URLs to crawl based on the user's request. For targeted crawling, pick specific sections. For full indexing, crawl all.
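Section selection can be sketched as a small filter over the parsed index. This assumes a hypothetical structure of `{"sections": [{"name": ..., "links": [{"url": ...}]}]}`; the helper name `select_urls` and that shape are illustrative and not taken from the SDK.

```python
from typing import Optional


def select_urls(doc: dict, wanted_sections: Optional[list] = None) -> list:
    """Return URLs from the named sections, or from every section if none are named."""
    wanted = {name.lower() for name in wanted_sections} if wanted_sections else None
    urls = []
    for section in doc["sections"]:
        if wanted is None or section["name"].lower() in wanted:
            urls.extend(link["url"] for link in section["links"])
    # Preserve order while dropping duplicate URLs.
    return list(dict.fromkeys(urls))


# Made-up index for demonstration.
index = {
    "sections": [
        {"name": "Guides", "links": [{"url": "https://example.com/a"},
                                     {"url": "https://example.com/b"}]},
        {"name": "Reference", "links": [{"url": "https://example.com/c"},
                                        {"url": "https://example.com/a"}]},
    ]
}

targeted = select_urls(index, ["guides"])  # targeted crawl: one section
everything = select_urls(index)            # full indexing: all sections
```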
For quick single-page fetches (inside agent loop): Use WebFetch on each URL. Good for up to ~20 pages.
For bulk multi-page crawling (outside agent loop): Generate a Scrapy spider project:
python3 skills/llms-txt-crawler/scripts/generate-spider.py ./scrapy-output urls.json
The spider uses ClaudeBot user-agent, respects robots.txt, and rate-limits to 2s between requests.
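In Scrapy, the behaviour described above maps onto a handful of standard settings. The fragment below is a sketch of what a project's settings.py could contain under that description; the file actually produced by generate-spider.py is not shown in this document and may differ.

```python
# settings.py (sketch): standard Scrapy settings matching the description above.
# The values are assumptions based on the prose, not the generated file itself.

USER_AGENT = "ClaudeBot/1.0 (+https://claude.ai/bot; Anthropic)"

# Fetch and honour each site's robots.txt before crawling.
ROBOTSTXT_OBEY = True

# Wait 2 seconds between consecutive requests to the same site.
DOWNLOAD_DELAY = 2

# Scrapy randomizes the delay (0.5x to 1.5x) by default; disable that
# to keep the interval fixed at 2 seconds.
RANDOMIZE_DOWNLOAD_DELAY = False

# One request at a time per domain.
CONCURRENT_REQUESTS_PER_DOMAIN = 1
```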
Save crawled content to ~/.claude/knowledge/ using the SDK's knowledge index format.
The ck fetch-docs CLI command handles this automatically for known Anthropic doc sources.
For deeper crawling beyond what WebFetch handles, generate a full Scrapy project:
cd scrapy-output && uv pip install -e . && scrapy crawl docs
data/crawled.jsonl
See references/scrapy-config.md for ClaudeBot web scraping best practices.
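Assuming data/crawled.jsonl holds the crawl output as JSON Lines (one JSON object per line, as the .jsonl extension suggests), it can be read back with the standard library. The helper name `read_jsonl` and the `url` field in the demo are hypothetical; the spider's actual item fields are not documented here.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


# Demo against a temporary file with made-up records.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "crawled.jsonl"
    sample.write_text(
        '{"url": "https://example.com/a"}\n\n{"url": "https://example.com/b"}\n',
        encoding="utf-8",
    )
    records = list(read_jsonl(sample))
```

Reading lazily with a generator keeps memory use flat even when the crawl output is large.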
ClaudeBot/1.0 (+https://claude.ai/bot; Anthropic)
This skill includes an evaluation suite in evals/evals.json following the agentskills.io format.
Results are aggregated into benchmark.json with pass_rate and token deltas. Workspace layout:
  llms-txt-crawler-workspace/iteration-N/
    eval-name/{with_skill,without_skill}/
      outputs/        files produced by the run
      timing.json     {total_tokens, duration_ms}
      grading.json    assertion results
    benchmark.json    aggregated comparison
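The token and timing comparison can be sketched from the timing.json fields named above. This assumes each of with_skill/ and without_skill/ contains its own timing.json; the function names and the returned keys are hypothetical, and the real benchmark.json schema is not shown in this document.

```python
import json
import tempfile
from pathlib import Path


def load_timing(run_dir: Path) -> dict:
    """Read timing.json ({total_tokens, duration_ms}) from one run directory."""
    return json.loads((run_dir / "timing.json").read_text(encoding="utf-8"))


def timing_deltas(eval_dir: Path) -> dict:
    """Compare with_skill against without_skill (positive means the skill costs more)."""
    with_skill = load_timing(eval_dir / "with_skill")
    without_skill = load_timing(eval_dir / "without_skill")
    return {
        "token_delta": with_skill["total_tokens"] - without_skill["total_tokens"],
        "duration_delta_ms": with_skill["duration_ms"] - without_skill["duration_ms"],
    }


# Demo with made-up numbers in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    eval_dir = Path(tmp) / "eval-name"
    fake_runs = {
        "with_skill": {"total_tokens": 1200, "duration_ms": 9000},
        "without_skill": {"total_tokens": 1500, "duration_ms": 12000},
    }
    for variant, timing in fake_runs.items():
        (eval_dir / variant).mkdir(parents=True)
        (eval_dir / variant / "timing.json").write_text(
            json.dumps(timing), encoding="utf-8"
        )
    deltas = timing_deltas(eval_dir)
```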
Use the skill-creator skill to automate evaluation runs. See evals/README.md for details.