Fetches a live URL and converts it to Markdown, with options for a custom User-Agent, preprocessing of noisy pages, and --json output with metadata and tables.
Slash command: /html-to-markdown:fetching-and-converting-urls
Use this when the user gives a URL instead of HTML and wants the page as
Markdown (or its metadata/tables). The CLI fetches the page over HTTP and
converts it in one step via --url. --url conflicts with a positional FILE.
For crawling many pages or following links, use kreuzcrawl instead — this
skill is for a single URL.
# Fetch a URL, print Markdown to stdout
html-to-markdown --url https://example.com
# Save to a file
html-to-markdown --url https://example.com -o page.md
# Custom User-Agent (default mimics a real browser)
html-to-markdown --url https://example.com --user-agent "MyBot/1.0"
--user-agent requires --url.
Real web pages carry navigation, ads, cookie banners, and forms. Preprocess before converting:
html-to-markdown --url https://example.com/article --preprocess --preset aggressive
# Keep the nav or forms if the page content lives there
html-to-markdown --url https://example.com --preprocess --keep-navigation
Presets: minimal, standard (default), aggressive. --preset and the
--keep-* flags require --preprocess.
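The flag dependencies described so far (--user-agent requires --url; --preset and the --keep-* flags require --preprocess) can be checked before the CLI is invoked. The following is a minimal sketch, not part of html-to-markdown: `check_flags` is a hypothetical helper, the `--flag=value` spelling is an assumption about how the CLI parses arguments, and the return value of 2 is chosen only to mirror the CLI's "invalid arguments" exit code.

```shell
#!/usr/bin/env bash
# check_flags is a hypothetical helper, not part of html-to-markdown.
# It returns 0 when the documented flag dependencies hold, 2 otherwise.
check_flags() {
  local has_url=0 has_preprocess=0 needs_url=0 needs_preprocess=0 arg
  for arg in "$@"; do
    case "$arg" in
      --url|--url=*)                has_url=1 ;;
      --preprocess)                 has_preprocess=1 ;;
      # Assumption: the CLI also accepts the --flag=value spelling.
      --user-agent|--user-agent=*)  needs_url=1 ;;
      --preset|--preset=*|--keep-*) needs_preprocess=1 ;;
    esac
  done
  if [ "$needs_url" -eq 1 ] && [ "$has_url" -eq 0 ]; then
    echo "--user-agent requires --url" >&2
    return 2
  fi
  if [ "$needs_preprocess" -eq 1 ] && [ "$has_preprocess" -eq 0 ]; then
    echo "--preset and --keep-* require --preprocess" >&2
    return 2
  fi
  return 0
}
```

A call such as `check_flags --url https://example.com --preset aggressive` fails in this sketch because --preprocess is missing; adding --preprocess makes it pass.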
Add --json to get the full structured result instead of plain Markdown:
html-to-markdown --url https://example.com --json
{
"content": "# Title\n\nContent\n",
"metadata": {
"document": { "title": "...", "language": "en" },
"headers": [],
"links": [],
"images": [],
"structured_data": []
},
"tables": [],
"images": [],
"warnings": []
}
Useful combinations (all require --json):
# Page title + outline, no Markdown body
html-to-markdown --url https://example.com --json --no-content \
| jq '{title: .metadata.document.title, headings: [.metadata.headers[].text]}'
# Include the document-structure tree
html-to-markdown --url https://example.com --json --include-structure | jq '.document'
# Extract inline image data
html-to-markdown --url https://example.com --json --extract-inline-images | jq '.images | length'
# Non-fatal warnings (truncation, malformed markup) go to stderr
html-to-markdown --url https://example.com --show-warnings > page.md
| Exit code | Meaning |
|---|---|
| 0 | Success |
| 1 | Conversion or I/O error (including a failed fetch) |
| 2 | Invalid arguments |
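A script that calls the CLI can branch on the codes in the table above. The following is a minimal sketch under stated assumptions: `explain_exit` and `fetch_page` are hypothetical helper names, not part of html-to-markdown, and the messages simply restate the table.

```shell
#!/usr/bin/env bash
# explain_exit is a hypothetical helper that maps an exit code from the
# table above to a short message.
explain_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "conversion or I/O error (including a failed fetch)" ;;
    2) echo "invalid arguments" ;;
    *) echo "unexpected exit code $1" ;;
  esac
}

# fetch_page is also hypothetical: it runs the CLI with --url and -o,
# reports the outcome on stderr, and passes the exit code through.
fetch_page() {
  local url="$1" out="$2" code=0
  html-to-markdown --url "$url" -o "$out" || code=$?
  echo "html-to-markdown: $(explain_exit "$code")" >&2
  return "$code"
}
```

With these definitions, `fetch_page https://example.com page.md` writes the Markdown to page.md and prints a one-line status to stderr, so a caller can still test the return value in an `if` or `||` chain.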
See ../html-to-markdown/references/cli-reference.md for the full flag set and
JSON shape.
npx claudepluginhub xberg-io/plugins --plugin html-to-markdown