From claude-knowledge-sdk
Parse and crawl llms.txt documentation indexes. Use this when you need to crawl docs, parse llms.txt, index documentation, fetch doc URLs, scrape docs, build a knowledge index, find what pages are in the docs, or work with llms.txt files from any site.
npx claudepluginhub jadecli/claude-knowledge-sdk-typescript

This skill uses the workspace's default tool permissions.
Parses llms.txt files (the LLM-friendly documentation standard from llmstxt.org) and crawls the discovered documentation URLs to build a local knowledge index.
https://code.claude.com/docs/llms.txt — Claude Code docs (updates daily)
https://platform.claude.com/llms.txt — Platform/API docs
{domain}/llms.txt

Use WebFetch to retrieve the llms.txt URL.
Example: WebFetch https://code.claude.com/docs/llms.txt
The llms.txt format is markdown with:
# Site Name — top-level heading
> Description — site description
## Section — doc sections
- [Title](url): Description — doc page links

Parse using the bundled parser:
npx tsx skills/llms-txt-crawler/scripts/parse-llms-txt.ts https://code.claude.com/docs/llms.txt
This outputs structured JSON with all sections and links.
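The bundled parser is TypeScript; as an illustration of the format above, here is a minimal Python sketch of the same parsing logic (function and field names are illustrative, not the script's actual output schema):

```python
import re

def parse_llms_txt(text: str) -> dict:
    """Parse llms.txt markdown into {title, description, sections}."""
    doc = {"title": None, "description": None, "sections": []}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith("> ") and doc["description"] is None:
            doc["description"] = line[2:].strip()
        elif line.startswith("## "):
            current = {"name": line[3:].strip(), "links": []}
            doc["sections"].append(current)
        elif line.startswith("- ") and current is not None:
            # Matches: - [Title](url): Description
            m = re.match(r"- \[(.+?)\]\((.+?)\)(?::\s*(.*))?", line)
            if m:
                current["links"].append({
                    "title": m.group(1),
                    "url": m.group(2),
                    "description": m.group(3) or "",
                })
    return doc
```

The per-link regex assumes the `- [Title](url): Description` shape shown above; links without a trailing description still parse, with an empty description.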
Select which sections/URLs to crawl based on the user's request. For targeted crawling, pick specific sections. For full indexing, crawl all.
For quick single-page fetches (inside agent loop): Use WebFetch on each URL. Good for up to ~20 pages.
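Outside the agent loop, the same sequential per-page pattern can be sketched in plain Python; the `fetch` callable here stands in for WebFetch, and the delay mirrors the spider's rate limit (this is an illustrative loop, not part of the skill):

```python
import time
from typing import Callable

def fetch_pages(urls: list[str], fetch: Callable[[str], str],
                delay_s: float = 2.0) -> dict[str, str]:
    """Fetch each URL sequentially, pausing between requests."""
    pages = {}
    for i, url in enumerate(urls):
        pages[url] = fetch(url)
        if i < len(urls) - 1:   # no pause needed after the last URL
            time.sleep(delay_s)
    return pages
```

Past roughly 20 pages this loop's fixed per-page latency adds up, which is when the bulk Scrapy path below becomes the better fit.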
For bulk multi-page crawling (outside agent loop): Generate a Scrapy spider project:
python3 skills/llms-txt-crawler/scripts/generate-spider.py ./scrapy-output urls.json
The spider uses ClaudeBot user-agent, respects robots.txt, and rate-limits to 2s between requests.
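The generated project's settings would look roughly like this — a sketch of standard Scrapy settings matching the behavior just described, not the generator's exact output:

```python
# settings.py — polite-crawl settings the generated spider relies on
USER_AGENT = "ClaudeBot/1.0 (+https://claude.ai/bot; Anthropic)"
ROBOTSTXT_OBEY = True                # honor each site's robots.txt
DOWNLOAD_DELAY = 2.0                 # seconds between requests
CONCURRENT_REQUESTS_PER_DOMAIN = 1   # keep requests strictly sequential
FEEDS = {"data/crawled.jsonl": {"format": "jsonlines"}}
```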
Save crawled content to ~/.claude/knowledge/ using the SDK's knowledge index format.
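The SDK's exact index format is not documented here; as a stand-in, a minimal sketch that writes one JSON record per crawled page under a knowledge directory (the layout and field names are assumptions, not the SDK's real schema):

```python
import json
from pathlib import Path

def save_page(base: Path, source: str, url: str,
              title: str, content: str) -> Path:
    """Write one crawled page as a JSON record under <base>/<source>/.

    Hypothetical layout — the real SDK format may differ.
    """
    out_dir = base / source
    out_dir.mkdir(parents=True, exist_ok=True)
    # Derive a filesystem-safe name from the last URL path segment
    slug = url.rstrip("/").rsplit("/", 1)[-1] or "index"
    out_path = out_dir / f"{slug}.json"
    out_path.write_text(json.dumps(
        {"url": url, "title": title, "content": content}))
    return out_path
```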
The ck fetch-docs CLI command handles this automatically for known Anthropic doc sources.
For deeper crawling beyond what WebFetch handles, generate a full Scrapy project:
cd scrapy-output && uv pip install -e . && scrapy crawl docs

Crawled output is written to data/crawled.jsonl. See references/scrapy-config.md for ClaudeBot web scraping best practices.
The spider identifies itself with the user-agent string:

ClaudeBot/1.0 (+https://claude.ai/bot; Anthropic)

This skill includes an evaluation suite in evals/evals.json following the agentskills.io format.
Each run produces benchmark.json with pass_rate and token deltas. Workspace layout:

llms-txt-crawler-workspace/iteration-N/
  eval-name/{with_skill,without_skill}/
    outputs/ — files produced by the run
    timing.json — {total_tokens, duration_ms}
    grading.json — assertion results
  benchmark.json — aggregated comparison
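The aggregation step can be sketched as follows; how benchmark.json is actually computed is not specified, so the pass_rate and token-delta fields below are assumptions based on the file names above:

```python
def aggregate(results: list[dict]) -> dict:
    """Combine per-eval grading/timing pairs into a benchmark summary.

    Each result (hypothetical shape):
      {"passed": bool, "with_tokens": int, "without_tokens": int}
    """
    n = len(results)
    passed = sum(1 for r in results if r["passed"])
    return {
        "pass_rate": passed / n if n else 0.0,
        # Negative delta means the skill saved tokens overall
        "token_delta": sum(r["with_tokens"] - r["without_tokens"]
                           for r in results),
    }
```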
Use the skill-creator skill to automate evaluation runs. See evals/README.md for details.