Converts PDF files to structured Markdown with automatic mode selection. Supports simple text extraction (fast) and complex layouts with code/tables (vision). Triggers when the user wants to convert a PDF document, extract PDF content, or transform PDF to Markdown format.
npx claudepluginhub talent-factory/claude-plugins --plugin core
Slash command: /core:pdf-to-markdown
Convert PDF documents to structured Markdown with automatic mode selection.
When the user requests PDF conversion:
Ask the user which mode to use:
| Mode | Best For | Speed |
|---|---|---|
| fast | Simple text documents, reports | Very fast |
| vision | Complex layouts, code, tables, scans | Medium |
For fast mode, run the converter script:
python ${CLAUDE_PLUGIN_ROOT}/skills/pdf-to-markdown/scripts/pdf_converter.py "<pdf_path>" --mode fast
The script outputs:
- <name>.md - Markdown file with extracted text
- <name>_images/ - Extracted images (if any)

Report the result to the user.
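The output names listed above can be derived from the PDF path. The sketch below is illustrative only: `expected_fast_outputs` is a hypothetical helper, and it assumes the outputs are written next to the input PDF, as the example paths elsewhere in this document suggest.

```python
from pathlib import PurePosixPath


def expected_fast_outputs(pdf_path: str) -> tuple[PurePosixPath, PurePosixPath]:
    """Return the (<name>.md, <name>_images) paths for a given PDF.

    Assumption: outputs land in the same directory as the input PDF.
    """
    pdf = PurePosixPath(pdf_path)
    markdown_file = pdf.with_suffix(".md")
    images_dir = pdf.with_name(f"{pdf.stem}_images")
    return markdown_file, images_dir


md, images = expected_fast_outputs("/Users/me/report.pdf")
print(md)      # /Users/me/report.md
print(images)  # /Users/me/report_images
```

Checking these paths after the script finishes is one way to confirm what to report back to the user.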
For vision mode:
Step 1: Convert the PDF to page images:
python ${CLAUDE_PLUGIN_ROOT}/skills/pdf-to-markdown/scripts/pdf_converter.py "<pdf_path>" --mode vision [--pages START-END] [--dpi 150]
This creates <name>_pages/ with PNG images.
Step 2: Read each page image using the Read tool and analyze the content.
Step 3: For each page, generate Markdown following these rules:
- Code blocks tagged with their language (java, python, etc.)
- Images described as [Image: description]

Step 4: Combine all pages into a single Markdown file:
# Document Title
<!-- Page 1 -->
[page 1 content]
---
<!-- Page 2 -->
[page 2 content]
Step 5: Write the final Markdown file using the Write tool.
Step 6: Clean up temporary page images (ask user first).
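The page layout from Step 4 can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill: `combine_pages` is a hypothetical helper, and the blank lines between parts are an assumption about spacing.

```python
def combine_pages(title: str, pages: list[str], first_page: int = 1) -> str:
    """Join per-page Markdown into one document.

    Each page gets a <!-- Page N --> comment, and pages are separated
    by a horizontal rule (---), following the Step 4 template.
    """
    parts = [f"# {title}", ""]
    for offset, content in enumerate(pages):
        if offset > 0:
            # Separator goes between pages, not before the first one.
            parts += ["---", ""]
        parts += [f"<!-- Page {first_page + offset} -->", content.strip(), ""]
    return "\n".join(parts)


document = combine_pages("Document Title", ["[page 1 content]", "[page 2 content]"])
print(document)
```

Passing `first_page=10` keeps the page comments aligned with the original numbering when only a page range was converted.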
User: Convert /Users/me/report.pdf to Markdown
Claude: I will convert the PDF. Which mode do you prefer?
- **fast**: Quick text extraction (for simple documents)
- **vision**: Image analysis (for complex layouts, code, tables)
User: fast
Claude: [Runs script, reports result]
Done! Created:
- /Users/me/report.md
- /Users/me/report_images/ (3 images)
User: Convert java-book.pdf to Markdown, contains a lot of code
Claude: Since the document contains code, I recommend vision mode.
Should I convert the entire PDF or only specific pages?
User: Pages 10-30
Claude: [Converts pages to images]
[Reads each page image]
[Generates structured Markdown with code blocks]
[Writes output file]
Done! Created: java-book.md (21 pages)
Should I delete the temporary page images?
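The page count in the example above follows from an inclusive range: pages 10 through 30 are 30 - 10 + 1 = 21 pages. A small sketch of that arithmetic, where `parse_page_range` is a hypothetical helper and 1-based, inclusive START-END values are an assumption consistent with the example:

```python
def parse_page_range(spec: str) -> tuple[int, int]:
    """Parse a 'START-END' string into a (start, end) pair.

    Assumption: pages are 1-based and the range is inclusive.
    """
    start_text, separator, end_text = spec.partition("-")
    if not separator:
        raise ValueError(f"expected START-END, got {spec!r}")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid page range: {spec!r}")
    return start, end


start, end = parse_page_range("10-30")
print(end - start + 1)  # 21
```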
User: /pdf-to-markdown /path/to/document.pdf
Claude: [Asks for mode preference]
[Executes conversion]
[Reports result]
Ensure dependencies are installed before first use:
# Fast mode
pip install PyMuPDF Pillow --break-system-packages
# Vision mode (additional)
pip install pdf2image --break-system-packages
brew install poppler # macOS
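Before the first conversion, a quick check can show which of the packages above are still missing. The sketch below is one possible approach, not part of the skill: `missing_modules` is a hypothetical helper, the import names (`fitz` for PyMuPDF, `PIL` for Pillow) are the usual ones for these packages, and `pdftoppm` is one of the command-line tools that poppler installs.

```python
import importlib.util
import shutil


def missing_modules(module_names: list[str]) -> list[str]:
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


FAST_MODE_MODULES = ["fitz", "PIL"]    # PyMuPDF and Pillow import names
VISION_MODE_MODULES = ["pdf2image"]    # additional package for vision mode

print("Missing for fast mode:", missing_modules(FAST_MODE_MODULES))
print("Missing for vision mode:", missing_modules(VISION_MODE_MODULES))
# pdf2image relies on poppler's command-line tools being on PATH.
print("poppler found:", shutil.which("pdftoppm") is not None)
```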
| Option | Description | Default |
|---|---|---|
| --mode fast | PyMuPDF text extraction | Default |
| --mode vision | Prepare images for analysis | - |
| --pages START-END | Process specific pages | All |
| --dpi N | Image resolution | 150 |
| --no-images | Skip image extraction (fast) | - |
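The options in the table can be assembled into a command line programmatically. The following is a sketch under stated assumptions: `build_command` is a hypothetical helper, it only uses the flags listed above, and it falls back to the current directory if `CLAUDE_PLUGIN_ROOT` is not set.

```python
import os
import sys


def build_command(pdf_path, mode="fast", pages=None, dpi=None, no_images=False):
    """Build the argument list for pdf_converter.py from the documented options."""
    if mode not in ("fast", "vision"):
        raise ValueError(f"unknown mode: {mode!r}")
    script = os.path.join(
        os.environ.get("CLAUDE_PLUGIN_ROOT", "."),  # fallback is an assumption
        "skills", "pdf-to-markdown", "scripts", "pdf_converter.py",
    )
    command = [sys.executable, script, pdf_path, "--mode", mode]
    if pages is not None:
        command += ["--pages", pages]      # e.g. "10-30"
    if dpi is not None:
        command += ["--dpi", str(dpi)]
    if no_images:
        command.append("--no-images")      # documented for fast mode
    return command


print(build_command("java-book.pdf", mode="vision", pages="10-30", dpi=150))
```

The resulting list can be passed to `subprocess.run` directly, which avoids shell quoting problems with paths that contain spaces.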