Skill

defuddle

Strips ads, navigation, headers, footers, and boilerplate from web pages, returning clean markdown. Cuts token usage by 40-60% on typical articles.

documentation

Popularity

Stars

9,026

Forks

1,033

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/claude-obsidian:defuddle

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

ReadBash

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Defuddle extracts the meaningful content from a web page and drops everything else: ads, cookie banners, nav bars, related articles, footers, social sharing buttons. What remains is the article body as clean markdown.

SKILL.md

106 lines · ~1.1k tokens

Stats

LanguagePython

Stars9,026

Forks1,033

MaintenanceExcellent

Last CommitMay 28, 2026

Actions

View Source View Plugin View on GitHub View README

#	Principle	Application here
1	OBSERVE (ext)	Which URL? What's actually on the page? Don't assume the title matches the content.
2	OBSERVE (int)	Am I assuming the page has the content the user expects? Verify before extracting.
3	LISTEN	Did the user say "the article" (main content only) or "the link" (everything visible)?
4	THINK	Strip boilerplate, preserve structure, capture metadata. Quote URLs in shell to avoid injection.
5	CONNECT (lat)	How does this domain typically render? Some sites mangle defuddle's heuristics; track those.
6	CONNECT (sys)	Shells out to defuddle-cli (kepano); output lands in `.raw/` for wiki-ingest pickup.
7	FEEL	Clean markdown that reads like the original, not boilerplate residue.
8	ACCEPT	Some pages don't extract well. Flag and move on; don't force when the heuristic loses.
9	CREATE	Markdown to stdout, redirected to `.raw/articles/<slug>-<date>.md`.
10	GROW	Extraction failures suggest defuddle-cli upgrade or alternative extractor — track them as backlog.

#	Principle	Application here
1	OBSERVE (ext)	Which URL? What's actually on the page? Don't assume the title matches the content.
2	OBSERVE (int)	Am I assuming the page has the content the user expects? Verify before extracting.
3	LISTEN	Did the user say "the article" (main content only) or "the link" (everything visible)?
4	THINK	Strip boilerplate, preserve structure, capture metadata. Quote URLs in shell to avoid injection.
5	CONNECT (lat)	How does this domain typically render? Some sites mangle defuddle's heuristics; track those.
6	CONNECT (sys)	Shells out to defuddle-cli (kepano); output lands in `.raw/` for wiki-ingest pickup.
7	FEEL	Clean markdown that reads like the original, not boilerplate residue.
8	ACCEPT	Some pages don't extract well. Flag and move on; don't force when the heuristic loses.
9	CREATE	Markdown to stdout, redirected to `.raw/articles/<slug>-<date>.md`.
10	GROW	Extraction failures suggest defuddle-cli upgrade or alternative extractor — track them as backlog.

defuddle

Popularity

Invocation

Tool Access

Context Preview

SKILL.md

defuddle

Popularity

Invocation

Tool Access

Context Preview

SKILL.md

defuddle: Web Page Cleaner

Install

Usage

Clean a URL directly

Save to .raw/

Add frontmatter header after saving

Clean a local HTML file

When to Use

Fallback

Integration with /wiki-ingest

How to think (10-principle mapping)

Similar Skills

defuddle: Web Page Cleaner

Install

Usage

Clean a URL directly

Save to .raw/

Add frontmatter header after saving

Clean a local HTML file

When to Use

Fallback

Integration with /wiki-ingest

How to think (10-principle mapping)

Similar Skills