From tavily
Extracts clean markdown or text content from up to 20 URLs using Tavily CLI. Handles JavaScript-rendered pages, LLM-optimized output, and query-focused chunking for targeted extraction.
```shell
npx claudepluginhub tavily-ai/skills --plugin tavily
```

This skill is limited to using the following tools:
Extract clean markdown or text content from one or more URLs.
If tvly is not found on PATH, install it first:
```shell
curl -fsSL https://cli.tavily.com/install.sh | bash && tvly login
```
Do not skip this step or fall back to other tools.
See tavily-cli for alternative install methods and auth options.
```shell
# Single URL
tvly extract "https://example.com/article" --json

# Multiple URLs
tvly extract "https://example.com/page1" "https://example.com/page2" --json

# Query-focused extraction (returns relevant chunks only)
tvly extract "https://example.com/docs" --query "authentication API" --chunks-per-source 3 --json

# JS-heavy pages
tvly extract "https://app.example.com" --extract-depth advanced --json

# Save to file
tvly extract "https://example.com/article" -o article.md
```
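When you need just the extracted text out of the `--json` envelope, a small helper can parse it with Python. This is a sketch: `extract_json_text` is a hypothetical name, and the `results` and `raw_content` field names are assumptions about the JSON shape, so check the actual `--json` output on your own URLs first.

```shell
# Sketch: print only the extracted text from tvly's --json output.
# Assumption: the JSON has a "results" array whose items carry "raw_content".
extract_json_text() {
  tvly extract "$@" --json | python3 -c '
import json, sys
for r in json.load(sys.stdin).get("results", []):
    print(r.get("raw_content", ""))'
}
# usage: extract_json_text "https://example.com/article" > article.md
```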
| Option | Description |
|---|---|
| `--query` | Rerank chunks by relevance to this query |
| `--chunks-per-source` | Chunks per URL (1-5, requires `--query`) |
| `--extract-depth` | `basic` (default) or `advanced` (for JS pages) |
| `--format` | `markdown` (default) or `text` |
| `--include-images` | Include image URLs |
| `--timeout` | Max wait time (1-60 seconds) |
| `-o, --output` | Save output to file |
| `--json` | Structured JSON output |
| Depth | When to use |
|---|---|
| `basic` | Simple pages, fast; try this first |
| `advanced` | JS-rendered SPAs, dynamic content, tables |
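The depth guidance above can be automated with a small helper that retries at `advanced` depth when `basic` comes back empty. This is a sketch under assumptions: `extract_md` is a hypothetical name, and treating empty output as "content is missing" is an assumption about how a failed basic extraction presents.

```shell
# Sketch: try basic depth first, retry with advanced only if nothing came back.
# Assumption: a basic extraction that misses content produces empty output.
extract_md() {
  local url="$1" out
  out=$(tvly extract "$url" --extract-depth basic 2>/dev/null)
  [ -n "$out" ] || out=$(tvly extract "$url" --extract-depth advanced 2>/dev/null)
  printf '%s\n' "$out"
}
# usage: extract_md "https://app.example.com" > page.md
```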
- Use `--query` with `--chunks-per-source` to get only the relevant chunks instead of full pages.
- Try `basic` first; fall back to `advanced` if content is missing.
- Raise `--timeout` for slow pages (up to 60s).
- If you already have raw page content from a search (`--include-raw-content`), skip the extract step.
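For bulk runs, the 20-URL cap noted in the description can be respected by slicing a URL file before a single call. This is a sketch: `extract_batch`, the `urls.txt` name, and the one-URL-per-line file layout are all assumptions, not part of the CLI itself.

```shell
# Sketch: pass up to 20 URLs (one per line in a file) to a single extract call.
extract_batch() {
  # word splitting on the command substitution is intended; URLs have no spaces
  tvly extract $(head -n 20 "$1") --json
}
# usage: extract_batch urls.txt > batch.json
```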