agent-knowledge

Crawls web pages using natural language instructions for link selection and content extraction, indexing the results into a knowledge store. Supports Playwright for JavaScript-heavy sites, or a fast axios-only mode.
Install via:

```shell
npx claudepluginhub chris-xperimntl/agent-knowledge
```
**⚠️ IMPORTANT: Store name is a POSITIONAL argument, NOT an option!**

```shell
# WRONG: crawl https://example.com --store=my-store
# RIGHT: crawl https://example.com my-store
```
Crawling and indexing: $ARGUMENTS

```shell
node ${CLAUDE_PLUGIN_ROOT}/dist/index.js crawl $ARGUMENTS
```
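Since `$ARGUMENTS` is passed through verbatim, an invocation such as `/agent-knowledge:crawl https://example.com my-store` expands to a command of the following shape (a sketch; the URL and store name are placeholders):

```shell
node ${CLAUDE_PLUGIN_ROOT}/dist/index.js crawl https://example.com my-store
```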
The pages are crawled with Claude-driven intelligent link selection and optional natural-language extraction, then indexed for search. Requires Claude Code to be installed.
Note: The web store is auto-created if it doesn't exist. No need to create the store first.
Intelligent crawl strategy:

```shell
/agent-knowledge:crawl https://code.claude.com/docs/en/ claude-docs --crawl "all Getting Started pages"
```

With extraction:

```shell
/agent-knowledge:crawl https://example.com/pricing pricing-store --extract "extract pricing and features"
```

Both strategy and extraction:

```shell
/agent-knowledge:crawl https://docs.example.com my-docs --crawl "API reference pages" --extract "API endpoints and parameters"
```

Fast mode (axios-only, no JavaScript rendering):

```shell
/agent-knowledge:crawl https://example.com/docs docs-store --fast --max-pages 20
```
- `--crawl <instruction>` - Natural language instruction for which pages to crawl (e.g., "all Getting Started pages")
- `--extract <instruction>` - Natural language instruction for what content to extract (e.g., "extract API references")
- `--max-pages <number>` - Maximum number of pages to crawl (default: 50)
- `--fast` - Use fast axios-only mode instead of headless browser
Use `--fast` when the target site doesn't rely on client-side rendering.
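The flags above can be combined in a single invocation. As a sketch, assuming they compose as documented (the URL, store name, and instruction strings are placeholders):

```shell
/agent-knowledge:crawl https://docs.example.com/api api-docs \
  --crawl "only the API reference pages" \
  --extract "endpoint paths, methods, and parameters" \
  --max-pages 30
```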