pdf-to-markdown

Install

Install the plugin

npx claudepluginhub psd401/psd-claude-plugins --plugin psd-productivity

Description

Convert a PDF to clean Markdown, with image content described as text. Use when the user wants to convert a PDF to Markdown, extract content from a PDF, or prepare PDF content for AI tools.

Tool Access

This skill is limited to using the following tools:

Read, Bash

Supporting Assets

scripts/__pycache__/convert_to_markdown.cpython-313.pyc

scripts/convert_to_markdown.py

Skill Content

PDF to Markdown Converter

Convert PDF files to clean, well-structured Markdown. Tables become markdown tables. Images and graphics are described as text (no image files generated).

Quick Start

uv run skills/pdf-to-markdown/scripts/convert_to_markdown.py input.pdf

Output: ~/Desktop/{filename}.md
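
As a sketch of the default naming rule above, the following Python snippet derives the output location from an input path. `default_output_path` is a hypothetical name, not a function from the script, and it assumes `{filename}` means the input file name without its `.pdf` extension.

```python
from pathlib import Path


def default_output_path(pdf_path: str) -> Path:
    """Hypothetical helper mirroring the documented default ~/Desktop/{filename}.md."""
    # Assumption: {filename} is the input name with its extension removed.
    stem = Path(pdf_path).stem
    return Path.home() / "Desktop" / f"{stem}.md"


if __name__ == "__main__":
    print(default_output_path("input.pdf"))
```

The real script may resolve the name differently, so check the generated file on the Desktop after the first conversion.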

Options

| Flag | Description |
| --- | --- |
| `--no-llm` | Skip LLM processing (faster; images become `[Image]` placeholders) |
| `--force-ocr` | Force OCR on all pages (for scanned PDFs) |
| `--page-range "0,5-10"` | Process specific pages only |
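
The `--page-range` value mixes single pages and ranges. To illustrate how a spec such as "0,5-10" might expand, here is a minimal Python sketch. `parse_page_range` is a hypothetical helper, not part of the script, and it assumes that ranges are inclusive and that pages are numbered from 0, as the examples in this document suggest.

```python
def parse_page_range(spec: str) -> list[int]:
    """Expand a spec like "0,5-10" into a sorted list of page indices.

    Assumptions (not confirmed by the script): ranges are inclusive
    and page numbering starts at 0.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if end < start:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


if __name__ == "__main__":
    print(parse_page_range("0,5-10"))  # [0, 5, 6, 7, 8, 9, 10]
```

Under these assumptions, "0-5" would select six pages, which is worth keeping in mind when trimming a large report.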

Common Use Cases

Convert a PDF with default settings

uv run skills/pdf-to-markdown/scripts/convert_to_markdown.py ~/Documents/report.pdf

Specify output location

uv run skills/pdf-to-markdown/scripts/convert_to_markdown.py report.pdf ~/Documents/report.md

Fast conversion (no image descriptions)

uv run skills/pdf-to-markdown/scripts/convert_to_markdown.py --no-llm report.pdf

Scanned PDF (force OCR)

uv run skills/pdf-to-markdown/scripts/convert_to_markdown.py --force-ocr scanned_doc.pdf

Extract specific pages

uv run skills/pdf-to-markdown/scripts/convert_to_markdown.py --page-range "0-5" large_report.pdf

Output

Pure Markdown text (no embedded images)
Tables converted to Markdown table format
Images/charts described as text using an LLM
Clean formatting suitable for AI processing

Requirements

GEMINI_API_KEY: Required for LLM image descriptions (see SECRETS-SETUP.md)
Use the --no-llm flag if you don't have Gemini API access
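
The fallback described above can be automated by a small wrapper. The sketch below builds the command line and adds `--no-llm` whenever `GEMINI_API_KEY` is missing. `build_command` and its `env` parameter are hypothetical names introduced for illustration. The script path and flag are taken from the examples in this document.

```python
import os
from collections.abc import Mapping
from typing import Optional

SCRIPT = "skills/pdf-to-markdown/scripts/convert_to_markdown.py"


def build_command(
    pdf_path: str,
    output_path: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Hypothetical wrapper: assemble the uv command, skipping the LLM without a key."""
    env = os.environ if env is None else env
    cmd = ["uv", "run", SCRIPT]
    if not env.get("GEMINI_API_KEY"):
        # No key available: fall back to [Image] placeholders.
        cmd.append("--no-llm")
    cmd.append(pdf_path)
    if output_path:
        cmd.append(output_path)
    return cmd


if __name__ == "__main__":
    print(" ".join(build_command("report.pdf")))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to run the conversion.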

First Run Note

The first run downloads ML models (~1-2 GB), which are cached at ~/.cache/marker/. Subsequent runs reuse the cache and are faster.
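
To tell in advance whether a run is likely to trigger the download, one option is to check whether the cache directory already holds files. This is a rough sketch: `models_cached` is a hypothetical helper, and it treats any non-empty cache directory as evidence that the models are present, which may not hold for a partial download.

```python
from pathlib import Path

# Cache location as stated in this document.
DEFAULT_CACHE = Path.home() / ".cache" / "marker"


def models_cached(cache_dir: Path = DEFAULT_CACHE) -> bool:
    """Return True if the cache directory exists and contains at least one entry."""
    return cache_dir.is_dir() and any(cache_dir.iterdir())


if __name__ == "__main__":
    if models_cached():
        print("Model cache found; no large download expected.")
    else:
        print("No model cache found; the first run may download ~1-2 GB.")
```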

Technical Details

Uses the Marker library:

31k+ GitHub stars
Best-in-class PDF conversion accuracy
Surya OCR for 90+ languages
Gemini LLM integration for image understanding

Stats

Stars: 0

Forks: 2

Last commit: Mar 13, 2026
