Skill

fetching-and-converting-urls

Fetches a live URL and converts it to Markdown, with options for a custom User-Agent, preprocessing of noisy pages, and --json output that includes metadata and tables.

cli-tools

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/html-to-markdown:fetching-and-converting-urls

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Use this when the user gives a URL instead of HTML and wants the page as

SKILL.md

100 lines · ~685 tokens

Stats

Language: Shell

Parent stars: 5

Maintenance: Excellent

Last Commit: Jun 24, 2026

Fetching and converting URLs

Use this when the user gives a URL instead of HTML and wants the page as Markdown (or just its metadata or tables). The CLI fetches the page over HTTP and converts it in one step via --url. --url cannot be combined with a positional FILE argument.

This skill handles a single URL. For crawling many pages or following links, use kreuzcrawl instead.

Fetch and convert

# Fetch a URL, print Markdown to stdout
html-to-markdown --url https://example.com

# Save to a file
html-to-markdown --url https://example.com -o page.md

# Custom User-Agent (default mimics a real browser)
html-to-markdown --url https://example.com --user-agent "MyBot/1.0"

--user-agent requires --url.

Clean noisy pages

Real web pages carry navigation, ads, cookie banners, and forms. Preprocess before converting:

html-to-markdown --url https://example.com/article --preprocess --preset aggressive

# Keep the nav or forms if the page content lives there
html-to-markdown --url https://example.com --preprocess --keep-navigation

Presets: minimal, standard (default), aggressive. --preset and the --keep-* flags require --preprocess.
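When it is unclear which preset suits a page, one option is to compare how much Markdown each preset leaves behind. The following is a minimal sketch under that assumption: `compare_presets` is a hypothetical helper name (not part of the CLI), and it uses only the flags shown above. Note that it fetches the page once per preset.

```shell
# compare_presets URL
# Hypothetical helper: prints each preset name followed by the byte count
# of the Markdown produced with that preset. Fetches the page three times.
compare_presets() {
  local url preset
  url=$1
  for preset in minimal standard aggressive; do
    printf '%s\t' "$preset"
    html-to-markdown --url "$url" --preprocess --preset "$preset" | wc -c
  done
}

# Usage: compare_presets https://example.com/article
```

A large drop in size between presets suggests the stricter one is removing real boilerplate, but it is worth checking the output by eye before relying on it, since byte counts alone do not show whether article content was dropped.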

JSON output (ConversionResult)

Add --json to get the full structured result instead of plain Markdown:

html-to-markdown --url https://example.com --json

{
  "content": "# Title\n\nContent\n",
  "metadata": {
    "document": { "title": "...", "language": "en" },
    "headers": [],
    "links": [],
    "images": [],
    "structured_data": []
  },
  "tables": [],
  "images": [],
  "warnings": []
}

Useful combinations (all require --json):

# Page title + outline, no Markdown body
html-to-markdown --url https://example.com --json --no-content \
  | jq '{title: .metadata.document.title, headings: [.metadata.headers[].text]}'

# Include the document-structure tree
html-to-markdown --url https://example.com --json --include-structure | jq '.document'

# Extract inline image data
html-to-markdown --url https://example.com --json --extract-inline-images | jq '.images | length'
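For a quick overview of a page, the JSON result can be condensed into a handful of counts. The sketch below assumes the field names shown in the ConversionResult sample above and requires jq; `page_summary` is a hypothetical helper name, not something the CLI provides.

```shell
# page_summary URL
# Hypothetical helper: reduces the --json result to a title plus counts.
# Field names are taken from the ConversionResult sample; requires jq.
page_summary() {
  html-to-markdown --url "$1" --json | jq '{
    title:    .metadata.document.title,
    headings: (.metadata.headers | length),
    links:    (.metadata.links | length),
    tables:   (.tables | length),
    warnings: (.warnings | length)
  }'
}

# Usage: page_summary https://example.com
```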

Surface warnings

html-to-markdown --url https://example.com --show-warnings > page.md
# non-fatal warnings (truncation, malformed markup) go to stderr
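Because warnings go to stderr while the Markdown goes to stdout, the two streams can be captured separately. A minimal sketch, assuming only the behavior described above; `convert_with_warning_log` and its arguments are hypothetical names chosen for illustration.

```shell
# convert_with_warning_log URL OUT LOG
# Hypothetical helper: writes Markdown (stdout) to OUT and warnings
# (stderr) to LOG, then reports whether anything was logged.
convert_with_warning_log() {
  local url out log
  url=$1
  out=$2
  log=$3
  # Propagate the CLI's exit code unchanged if the conversion fails.
  html-to-markdown --url "$url" --show-warnings >"$out" 2>"$log" || return
  if [ -s "$log" ]; then
    echo "warnings recorded in $log" >&2
  fi
}

# Usage: convert_with_warning_log https://example.com page.md warnings.log
```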

Exit codes

Code	Meaning
0	Success
1	Conversion or I/O error (including a failed fetch)
2	Invalid arguments
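In a script, the exit codes in the table can be used to tell a failed fetch or conversion apart from a usage mistake. The following is a sketch under that reading of the table; `fetch_page` is a hypothetical wrapper name, and the messages are illustrative.

```shell
# fetch_page URL OUT
# Hypothetical wrapper: saves the page as Markdown and reports the outcome
# according to the documented exit codes (0, 1, 2).
fetch_page() {
  local url out status
  url=$1
  out=$2
  status=0
  html-to-markdown --url "$url" -o "$out" || status=$?
  case $status in
    0) echo "saved $out" ;;
    1) echo "conversion or I/O error (including a failed fetch): $url" >&2 ;;
    2) echo "invalid arguments" >&2 ;;
    *) echo "unexpected exit code $status" >&2 ;;
  esac
  return "$status"
}

# Usage: fetch_page https://example.com page.md || echo "giving up" >&2
```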

See ../html-to-markdown/references/cli-reference.md for the full flag set and JSON shape.
