Ingests, parses, normalizes to UTF-8, and previews supported document types for Retriever with per-file failure isolation. Use for extraction-dependent tasks.
npx claudepluginhub sdemyanov/retriever
This skill uses the workspace's default tool permissions.
> Operates under `retriever:routing`. If the user's intent actually fits a different tier — another `retriever:*` skill, a Tier 2 slash, a Tier 3 `tools.py` subcommand, or (last resort) direct DB access — stop and re-route against the ladder before continuing.
Use this skill whenever a task changes or depends on document extraction behavior.
Read `parsing.md` before changing ingest logic or parser dependencies. For spreadsheet parser redesign work, also read `SPREADSHEET_PARSING_PLAN.md`. For ingest execution-model redesign, also read `parallel-ingest-plan.md`.
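As a rough illustration of what "normalizes to UTF-8" with "per-file failure isolation" can mean in practice, here is a minimal Python sketch. It is not Retriever's actual ingest code: the function names `normalize_to_utf8` and `ingest_all`, the Latin-1 fallback, and the shape of the returned dictionaries are all assumptions made for the example.

```python
from pathlib import Path


def normalize_to_utf8(raw: bytes) -> str:
    """Decode bytes to text: try UTF-8 (tolerating a BOM), else fall back to Latin-1.

    The Latin-1 fallback is an assumption for this sketch; a real parser
    might use encoding detection instead.
    """
    try:
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


def ingest_all(paths):
    """Hypothetical ingest loop in which one bad file never aborts the batch.

    Returns (results, failures): results maps path -> decoded text,
    failures maps path -> a short error description.
    """
    results, failures = {}, {}
    for path in paths:
        try:
            results[str(path)] = normalize_to_utf8(Path(path).read_bytes())
        except Exception as exc:  # isolate the failure to this one file
            failures[str(path)] = f"{type(exc).__name__}: {exc}"
    return results, failures
```

The point of the design is that errors are recorded per file and the loop continues, so a single unreadable document shows up in `failures` rather than stopping the whole run.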
.retriever/previews/.