From claude-mods
Converts local PDF, DOCX, XLSX, PPTX, images via OCR, and audio files to clean Markdown using Microsoft's markitdown CLI. Best for text extraction from local documents.
npx claudepluginhub 0xdarkmatter/claude-mods --plugin claude-mods
How this skill is triggered (by the user, by Claude, or both): Slash command
/claude-mods:markitdown
The summary Claude sees in its skill listing (used to decide when to auto-load this skill):
Convert local documents to clean Markdown. One tool for PDF, Word, Excel, PowerPoint, images, and more.
| Use Case | Recommendation |
|---|---|
| Local files (PDF, Word, Excel) | ✅ Use markitdown - unique capability |
| Web pages | ❌ Use Jina (r.jina.ai/) - 5x faster |
| Blocked/anti-bot sites | ❌ Use Firecrawl |
| OCR on images | ✅ Use markitdown |
| Audio transcription | ✅ Use markitdown |
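
The recommendations in the table can be sketched as a small dispatcher. This is a minimal illustration, not part of markitdown: `pick_tool` and `to_markdown` are hypothetical helper names, and the Jina call assumes the `r.jina.ai/` prefix form referenced in the table. Blocked or anti-bot sites cannot be detected from the URL alone, so the Firecrawl case is left to the user.

```bash
# pick_tool: hypothetical helper that prints the tool the table recommends.
pick_tool() {
  case "$1" in
    http://*|https://*) echo "jina" ;;        # web pages: Jina is faster
    *)                  echo "markitdown" ;;  # local files, OCR, audio
  esac
}

# to_markdown: hypothetical wrapper that dispatches to the chosen tool
# and writes the resulting Markdown to stdout.
to_markdown() {
  case "$(pick_tool "$1")" in
    jina)       curl -sL "https://r.jina.ai/$1" ;;  # assumed prefix form
    markitdown) markitdown "$1" ;;
  esac
}
```

With these defined, `to_markdown report.pdf > report.md` would go through markitdown, while `to_markdown https://example.com` would go through Jina.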
```bash
# Local files (primary use case)
markitdown document.pdf
markitdown report.docx
markitdown data.xlsx
markitdown slides.pptx
markitdown screenshot.png  # OCR

# URLs (works, but Jina is faster)
markitdown https://example.com

# Save output
markitdown document.pdf > document.md
```
| Format | Extensions | Notes |
|---|---|---|
| PDF | .pdf | Text extraction, tables |
| Word | .docx | Formatting preserved |
| Excel | .xlsx | Tables to markdown |
| PowerPoint | .pptx | Slides as sections |
| Images | .jpg, .png | OCR text extraction |
| HTML | .html | Clean conversion |
| Audio | .mp3, .wav | Speech-to-text |
| Text | .txt, .csv, .json, .xml | Pass-through/structure |
| URLs | https://... | Works but slower than Jina |
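
A script that feeds files to markitdown may want to skip inputs outside this table. The sketch below is one way to do that; `is_supported` is a hypothetical helper, and its extension list is copied from the table above, so it may not match everything markitdown itself accepts.

```bash
# is_supported: hypothetical helper. Succeeds (exit status 0) when the file's
# extension appears in the format table above, fails (exit status 1) otherwise.
is_supported() {
  # Lowercase the extension so REPORT.PDF and report.pdf are treated alike.
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    pdf|docx|xlsx|pptx|jpg|png|html|mp3|wav|txt|csv|json|xml) return 0 ;;
    *) return 1 ;;
  esac
}
```

It can guard a conversion, for example `is_supported slides.pptx && markitdown slides.pptx > slides.md`.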
| Tool | Avg Speed | Success Rate |
|---|---|---|
| Jina | 0.5s | 10/10 |
| markitdown | 2.5s | 9/10 |
| Firecrawl | 4.5s | 10/10 |
Verdict: For URLs, use Jina. For local files, markitdown is the only one of these three tools that applies.
```bash
# PDF to markdown (primary use case)
markitdown report.pdf > report.md

# Excel spreadsheet
markitdown financials.xlsx

# Image with text (OCR)
markitdown screenshot.png

# PowerPoint deck
markitdown presentation.pptx > slides.md

# Audio transcription
markitdown meeting.mp3 > transcript.md
```
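
The single-file commands above extend naturally to a whole folder. The following is a minimal sketch that uses only the `markitdown FILE > OUT` form shown above; `convert_dir` is a hypothetical helper name, and it handles PDFs only to keep the example short.

```bash
# convert_dir: hypothetical helper that converts every PDF in a directory
# (default: the current one) to a sibling .md file.
convert_dir() {
  dir=${1:-.}
  for src in "$dir"/*.pdf; do
    [ -e "$src" ] || continue        # glob matched nothing
    out="${src%.pdf}.md"
    if markitdown "$src" > "$out"; then
      echo "converted: $src -> $out"
    else
      echo "failed: $src" >&2
      rm -f "$out"                   # do not leave an empty or partial file
    fi
  done
}
```

Running `convert_dir ./reports` would write `./reports/NAME.md` next to each `./reports/NAME.pdf` and report failures on stderr without stopping the loop.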
| Task | markitdown | Alternative |
|---|---|---|
| PDF text | markitdown file.pdf | PyMuPDF, pdfplumber |
| Word docs | markitdown file.docx | python-docx |
| Excel | markitdown file.xlsx | pandas, openpyxl |
| OCR | markitdown image.png | Tesseract |
| Web pages | Use Jina instead | r.jina.ai/URL (5x faster) |
markitdown's advantage: One CLI for all local document formats. No code needed.