Skill

parsing

Ingests, parses, normalizes to UTF-8, and previews supported document types for Retriever with per-file failure isolation. Use for extraction-dependent tasks.

developer-tools

npx claudepluginhub sdemyanov/retriever

Tool Access

This skill uses the workspace's default tool permissions.

Preview

> Operates under `retriever:routing`. If the user's intent actually fits a different tier (another `retriever:*` skill, a Tier 2 slash command, a Tier 3 `tools.py` subcommand, or, as a last resort, direct DB access), stop and re-route against the ladder before continuing.

Supporting Assets

parsing.md

SKILL.md

Similar Skills

normalize-format

Normalizes single inbox files (plain markdown, Claude.ai JSON/JSONL exports, Readwise MD/CSV highlights, timestamped transcripts, link captures) to clean markdown body plus frontmatter (id, title, source, word_count) for substacker Librarian ingestion.

thinking-frameworks-skills

docling

Parses PDFs, DOCX, PPTX, HTML, images (20+ formats) to Markdown/HTML/JSON/text with layout/tables/OCR. Chunks for RAG pipelines; batch converts via DocumentConverter.

4 files

beagle-core

distill

Converts PDF, DOCX, PPTX, XLSX and 10+ formats to token-efficient Markdown/CSV digests with structural compression. Use for feeding documents to Claude without excessive context costs.

4 files

crucible

Stats

Stars: 10

Forks: 0

Last Commit: Apr 28, 2026

parsing | retriever | ClaudePluginHub

Retriever Parsing

Use this skill whenever a task changes or depends on document extraction behavior.

Required reference

Read `parsing.md` before changing ingest logic or parser dependencies. For spreadsheet parser redesign work, also read `SPREADSHEET_PARSING_PLAN.md`. For ingest execution-model redesign, also read `parallel-ingest-plan.md`.

Rules

- Keep ingest transactional per file so one bad document never blocks the rest.
- Normalize decoded text to UTF-8 before storing it.
- Prefer preserving the original file and writing previews under `.retriever/previews/`.
- Treat unsupported or corrupt files as structured failures, not fatal batch errors.
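
The rules above can be illustrated with a minimal Python sketch. It is not Retriever's actual ingest code: the `documents` table, the `IngestFailure` record, the `ingest_batch` and `decode_to_utf8` helpers, the Latin-1 fallback, and the NUL-byte corruption check are all assumptions made for illustration. Only the per-file transaction, the UTF-8 normalization, the untouched original file, and the preview directory come from the rules themselves.

```python
import sqlite3
from dataclasses import dataclass
from pathlib import Path


@dataclass
class IngestFailure:
    """Structured record of one file that could not be ingested (hypothetical shape)."""
    path: str
    reason: str


def decode_to_utf8(raw: bytes) -> str:
    """Decode bytes to text: try UTF-8 (BOM-aware), then fall back to Latin-1.

    The fallback policy is an assumption, not Retriever's documented behavior.
    """
    try:
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


def ingest_batch(paths, conn, preview_root):
    """Ingest each file in its own transaction; return (ok_paths, failures)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents (path TEXT PRIMARY KEY, body TEXT)"
    )
    preview_root.mkdir(parents=True, exist_ok=True)
    ok, failures = [], []
    for path in paths:
        try:
            raw = path.read_bytes()  # the original file is only read, never modified
            if b"\x00" in raw:
                # Stand-in check for "unsupported or corrupt"; real parsers differ.
                raise ValueError("binary or corrupt content")
            text = decode_to_utf8(raw)
            # `with conn` commits on success and rolls back if the body raises,
            # so a failed preview write also undoes this file's row.
            with conn:
                conn.execute(
                    "INSERT OR REPLACE INTO documents VALUES (?, ?)",
                    (str(path), text),
                )
                preview = preview_root / (path.name + ".txt")
                preview.write_text(text[:500], encoding="utf-8")
            ok.append(str(path))
        except (OSError, ValueError, sqlite3.Error) as exc:
            # Record the failure and move on; the rest of the batch still runs.
            failures.append(IngestFailure(str(path), f"{type(exc).__name__}: {exc}"))
    return ok, failures
```

Calling `ingest_batch(files, sqlite3.connect("index.db"), Path(".retriever/previews"))` on a mixed batch would return the successfully stored paths alongside a list of `IngestFailure` records, so a caller can report bad files without aborting the run.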
