From document-to-markdown
Use when you need to fully process a PDF into Markdown, chunks, and table exports in one workflow.
npx claudepluginhub danielrosehill/claude-code-plugins --plugin document-to-markdown
This skill uses the workspace's default tool permissions.
Orchestrate a complete document extraction pipeline. Converts PDF to Markdown, chunks the output, extracts tables, applies OCR if needed, and produces a manifest summarizing the entire workflow.
Options:
- --ocr to force OCR on all documents (optional)
- --workspace (optional; defaults to the directory containing the PDF)

Steps:
1. Create the workspace directory <workspace>/<stem>/, where <stem> is the PDF filename without extension. If --workspace is not specified, use the directory containing the PDF.
2. Copy the PDF into the workspace as source.pdf.
3. If the PDF appears to need OCR and --ocr was not passed, ask the user; if --ocr is passed or confirmed, run ocr-scanned-pdf on the PDF and work from the OCR'd version.
4. Run pdf-to-markdown on the (possibly OCR'd) PDF. Save output as <workspace>/<stem>/full.md.
5. Run chunk-markdown on full.md. Output chunks go to <workspace>/<stem>/chunks/.
6. Run extract-tables on the (possibly OCR'd) PDF. Output tables go to <workspace>/<stem>/tables/.
7. Write <workspace>/<stem>/manifest.toon with fields:
   - source: relative path to source.pdf
   - extractor: which extractor was used for PDF-to-markdown (marker, docling, etc.)
   - ocr_applied: boolean, true if OCR was run
   - chunk_count: number of chunks created
   - table_count: number of tables extracted
   - generated_at: ISO 8601 timestamp
   - tool_versions: dict with versions of key tools (ocrmypdf, marker, etc.)

Outputs:
- <workspace>/<stem>/source.pdf copied into workspace
- full.md with complete document text
- chunks/ directory with chunked Markdown and index
- tables/ directory with extracted CSVs and index
- manifest.toon with extraction metadata
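
The workspace layout and manifest fields described above can be sketched in Python. This is only an illustration under stated assumptions: the function names workspace_dir, workspace_layout, and build_manifest are hypothetical, the source value is assumed to be relative to the workspace directory, and the sketch stops at a plain dict because the source does not specify how the manifest is serialized to .toon.

```python
from datetime import datetime, timezone
from pathlib import Path


def workspace_dir(pdf_path, workspace=None):
    """Return <workspace>/<stem>/; workspace defaults to the PDF's own directory."""
    pdf = Path(pdf_path)
    base = Path(workspace) if workspace is not None else pdf.parent
    return base / pdf.stem


def workspace_layout(root):
    """Map each documented output to its path under the workspace directory."""
    root = Path(root)
    return {
        "source": root / "source.pdf",
        "full_md": root / "full.md",
        "chunks": root / "chunks",
        "tables": root / "tables",
        "manifest": root / "manifest.toon",
    }


def build_manifest(*, extractor, ocr_applied, chunk_count, table_count,
                   tool_versions, now=None):
    """Assemble the manifest fields as a dict (hypothetical helper).

    Assumption: 'source' is recorded relative to the workspace directory,
    so it is simply 'source.pdf'.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "source": "source.pdf",
        "extractor": extractor,
        "ocr_applied": bool(ocr_applied),
        "chunk_count": int(chunk_count),
        "table_count": int(table_count),
        # ISO 8601 timestamp with an explicit UTC offset
        "generated_at": now.isoformat(timespec="seconds"),
        "tool_versions": dict(tool_versions),
    }


if __name__ == "__main__":
    root = workspace_dir("/data/reports/q3.pdf")
    print(workspace_layout(root)["manifest"])
    print(build_manifest(
        extractor="marker",
        ocr_applied=False,
        chunk_count=12,          # illustrative values only
        table_count=3,
        tool_versions={"marker": "unknown"},
    ))
```

Passing the counts in as arguments, rather than counting files in chunks/ and tables/, avoids guessing how the index files in those directories are named.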