From pdf-to-markdown
Converts PDFs to structured Markdown, preserving headings, tables, lists, and reading order. Use it for text extraction, batch processing, RAG ingestion, LLM context, or PDF analysis tasks.
npx claudepluginhub pspdfkit-labs/nutrient-skills --plugin pdf-to-markdown
This skill uses the workspace's default tool permissions.
Convert PDFs into structured, semantic Markdown that preserves the document's logical structure — headings, tables, lists, and reading order — rather than producing flat text. This is significantly higher quality than reading a PDF directly with the `read` tool, which only extracts raw text without structure.
Before running any commands, set SKILL_DIR to the absolute path of the directory containing this SKILL.md file. Use $SKILL_DIR/bin/pdf-to-markdown in all commands below.
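One way to set SKILL_DIR from a shell is sketched below. The helper name `resolve_skill_dir` and the `SKILL_MD` variable are hypothetical, introduced only for illustration, and the default `./SKILL.md` is a placeholder for wherever this file actually lives.

```shell
# Print the absolute path of the directory containing the file given as $1.
# resolve_skill_dir is a hypothetical helper, not part of the skill itself.
resolve_skill_dir() {
  # Change into the file's directory in a subshell and print its physical path.
  # Clearing CDPATH keeps cd from echoing anything to stdout.
  (CDPATH= cd -- "$(dirname -- "$1")" && pwd -P)
}

# SKILL_MD is an assumed variable pointing at this SKILL.md; adjust as needed.
SKILL_MD="${SKILL_MD:-./SKILL.md}"
if [ -f "$SKILL_MD" ]; then
  SKILL_DIR="$(resolve_skill_dir "$SKILL_MD")"
  export SKILL_DIR
  echo "SKILL_DIR=$SKILL_DIR"
fi
```

Exporting the variable makes it visible to any child process started from the same shell session.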
The $SKILL_DIR/bin/pdf-to-markdown wrapper automatically installs the platform-specific binary into ~/.local/share/nutrient/cli/ from the CDN. It caches the binary and only checks for updates every 6 hours, so subsequent runs are fast.
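To see whether the wrapper has already installed a binary, the cache directory named above can simply be listed. This is a small sketch: `show_cli_cache` is a hypothetical helper, and because the file names inside the cache are not documented here, it only lists whatever is present.

```shell
# List the wrapper's cache directory, or report that nothing is cached yet.
# show_cli_cache is a hypothetical helper; the optional $1 overrides the path.
show_cli_cache() {
  cache_dir="${1:-$HOME/.local/share/nutrient/cli}"
  if [ -d "$cache_dir" ]; then
    ls -l "$cache_dir"
  else
    echo "no cached binary yet: $cache_dir"
  fi
}

show_cli_cache
```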
$SKILL_DIR/bin/pdf-to-markdown INPUT.pdf OUTPUT.md
If OUTPUT.md is omitted, the converter writes the Markdown to stdout instead.
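The two invocation forms, with and without an output file, can be wrapped in a small helper. This is a sketch under assumptions: `convert_pdf` and the `PDF_TO_MD` variable are hypothetical names, the file names in the usage comments are placeholders, and the early check for a missing input is an addition of this example, not documented behavior of the converter.

```shell
# Path to the wrapper described above; override PDF_TO_MD to point elsewhere.
PDF_TO_MD="${PDF_TO_MD:-${SKILL_DIR:-.}/bin/pdf-to-markdown}"

# convert_pdf INPUT.pdf [OUTPUT.md]
# Hypothetical helper: fails early on a missing input, then calls the wrapper
# with one or two arguments. With no OUTPUT, the Markdown goes to stdout.
convert_pdf() {
  if [ ! -f "${1:-}" ]; then
    echo "input not found: ${1:-}" >&2
    return 1
  fi
  if [ -n "${2:-}" ]; then
    "$PDF_TO_MD" "$1" "$2"
  else
    "$PDF_TO_MD" "$1"
  fi
}

# Usage (file names are placeholders):
#   convert_pdf report.pdf report.md       # write the Markdown to a file
#   convert_pdf report.pdf | head -n 20    # preview the first lines on stdout
```

Writing to stdout is convenient when the Markdown feeds straight into another tool and no intermediate file is needed.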
For multiple files, pass directories instead of individual files. The converter processes all PDFs in the input directory in parallel, which is much faster than converting one at a time.
$SKILL_DIR/bin/pdf-to-markdown INPUT_DIR/ OUTPUT_DIR/
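A batch run can be guarded with a quick sanity check before handing both directories to the converter. The sketch below rests on assumptions: `convert_dir` and `PDF_TO_MD` are hypothetical names, the check only looks at lowercase `.pdf` files at the top level of the input directory, and creating the output directory up front is a precaution of this example (whether the converter creates it itself is not stated here).

```shell
# Path to the wrapper described above; override PDF_TO_MD to point elsewhere.
PDF_TO_MD="${PDF_TO_MD:-${SKILL_DIR:-.}/bin/pdf-to-markdown}"

# convert_dir INPUT_DIR OUTPUT_DIR
# Hypothetical helper: refuses to run on a directory with no PDFs, then makes
# a single directory-to-directory call so the converter can parallelize.
convert_dir() {
  in_dir="${1:-}"
  out_dir="${2:-}"
  # Count *.pdf files directly inside the input directory (no recursion).
  pdf_count="$(find "$in_dir" -maxdepth 1 -type f -name '*.pdf' | wc -l | tr -d ' ')"
  if [ "$pdf_count" -eq 0 ]; then
    echo "no PDFs found in $in_dir" >&2
    return 1
  fi
  # Assumption: pre-creating the output directory is harmless.
  mkdir -p "$out_dir"
  echo "converting $pdf_count PDF(s) from $in_dir" >&2
  "$PDF_TO_MD" "$in_dir" "$out_dir"
}

# Usage (directory names are placeholders):
#   convert_dir ./pdfs ./markdown
```

Making one call with directories, instead of looping over files in the shell, leaves the parallelism to the converter as described above.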
$SKILL_DIR/bin/pdf-to-markdown INPUT [OUTPUT]
Free for processing up to 1,000 documents per calendar month.
Commercial license required for:
Contact sales@nutrient.io for commercial licensing.