From partme-ai-full-stack-skills
Optimizes OCRmyPDF PDFs using compression levels, PDF/A output, JBIG2 encoding, PNG options, and Ghostscript tweaks. Use to reduce file size, create archival PDFs, or refine OCR results.
npx claudepluginhub partme-ai/full-stack-skills --plugin t2ui-skills
This skill uses the workspace's default tool permissions.
[OCRmyPDF](https://github.com/ocrmypdf/OCRmyPDF) provides extensive optimization options to reduce file size, create PDF/A archival documents, and configure output quality.
For core OCR functionality, see the ocrmypdf skill. For image processing (deskew, rotate, clean), see ocrmypdf-image. For batch/Docker/scripting, see ocrmypdf-batch.
```bash
# Level 0: no optimization (fastest)
ocrmypdf --optimize 0 input.pdf output.pdf
# Level 1: lossless (default)
ocrmypdf --optimize 1 input.pdf output.pdf
# Level 2: lossy
ocrmypdf --optimize 2 input.pdf output.pdf
# Level 3: lossy, most aggressive (smallest files)
ocrmypdf --optimize 3 input.pdf output.pdf
```
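To see what each level actually saves on a given document, it can help to run all four and compare sizes. The following is a minimal sketch, not part of OCRmyPDF: `file_size` and `compare_levels` are hypothetical helper names, and it assumes `ocrmypdf` is on the PATH when `compare_levels` is called.

```bash
#!/usr/bin/env bash
# Sketch: run each optimization level and report the resulting file sizes.
# file_size and compare_levels are hypothetical helpers for illustration.

# Print the size of a file in bytes.
file_size() {
  wc -c < "$1" | tr -d ' '
}

# Run ocrmypdf once per level and print "level<TAB>bytes" (or "failed").
compare_levels() {
  local input="$1" outdir="$2" level out
  mkdir -p "$outdir"
  for level in 0 1 2 3; do
    out="$outdir/opt${level}.pdf"
    if ocrmypdf --optimize "$level" "$input" "$out" >/dev/null 2>&1; then
      printf '%s\t%s\n' "$level" "$(file_size "$out")"
    else
      printf '%s\tfailed\n' "$level"
    fi
  done
}

# Usage (not run automatically):
#   compare_levels input.pdf ./size-test
```

A level that prints `failed` usually points to a missing helper program or to an input that needs an extra flag, so rerun that level without the output redirection to read the error message.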
PDF/A is an archival format with embedded fonts and colorspaces:
```bash
# PDF/A (default; produces PDF/A-2b)
ocrmypdf --output-type pdfa input.pdf output.pdf
# PDF/A-1b (older standard, no transparency)
ocrmypdf --output-type pdfa-1 input.pdf output.pdf
# PDF/A-2b, requested explicitly (supports transparency)
ocrmypdf --output-type pdfa-2 input.pdf output.pdf
# Standard PDF (no archival conformance)
ocrmypdf --output-type pdf input.pdf output.pdf
```
JBIG2 provides excellent compression for monochrome (1-bit) images:
```bash
# Enable JBIG2 (requires jbig2enc)
ocrmypdf --jbig2-lossy input.pdf output.pdf     # Lossy
ocrmypdf --jbig2-lossless input.pdf output.pdf  # Lossless (v17+)
```
Requirements:
```bash
# Debian/Ubuntu
apt install jbig2enc
# macOS
brew install jbig2enc
```
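Before relying on JBIG2 or lossy PNG optimization, it is worth confirming that the helper programs are installed. The sketch below assumes the jbig2enc package provides its encoder as `jbig2` and that pngquant is installed under its own name; `have_tool` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: check for the optional encoders before enabling related options.

# Return success if the named executable is on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Assumption: jbig2enc installs a binary named `jbig2`.
for tool in jbig2 pngquant; do
  if have_tool "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```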
Optimize embedded PNG images:
```bash
# Lossy PNG quantization with pngquant (takes effect at --optimize 2 or higher)
ocrmypdf --optimize 2 --png-quality 50 input.pdf output.pdf
# Lossless optimization only
ocrmypdf --optimize 1 input.pdf output.pdf
```
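Fine-tune text-layer rendering and Ghostscript's PDF/A image handling:
```bash
# Choose the text-layer renderer (auto, hocr, or sandwich)
ocrmypdf --pdf-renderer sandwich input.pdf output.pdf
# Control how Ghostscript compresses images during PDF/A conversion (auto, jpeg, or lossless)
ocrmypdf --pdfa-image-compression lossless input.pdf output.pdf
```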
Generate a text file alongside the PDF, or without writing a PDF at all:
```bash
# Generate sidecar only (no output PDF; pass - as the output file)
ocrmypdf --output-type none --sidecar text.txt input.pdf -
# Typical sidecar workflow
ocrmypdf --sidecar text.txt --force-ocr input.pdf output.pdf
```
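A quick sanity check on the sidecar file can reveal runs where OCR recognized nothing. This is a minimal sketch; `sidecar_words` and `check_sidecar` are hypothetical helpers, and a word count is only a rough proxy for OCR quality.

```bash
#!/usr/bin/env bash
# Sketch: report whether a sidecar text file contains any recognized words.

# Count whitespace-separated words in a file.
sidecar_words() {
  wc -w < "$1" | tr -d ' '
}

# Print a short report; return non-zero if the file is empty or has no words.
check_sidecar() {
  if [ ! -s "$1" ] || [ "$(sidecar_words "$1")" -eq 0 ]; then
    echo "no text recognized in $1"
    return 1
  fi
  echo "$(sidecar_words "$1") words in $1"
}

# Usage (not run automatically):
#   check_sidecar text.txt
```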
```bash
# Smallest file size
ocrmypdf --optimize 3 --jbig2-lossy input.pdf small.pdf
# Archival PDF/A
ocrmypdf --output-type pdfa --optimize 2 input.pdf archival.pdf
# Lossless optimization
ocrmypdf --output-type pdf --optimize 1 input.pdf lossless.pdf
```
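Recipes like these can be wrapped in a small function so that scripts name a goal instead of repeating flags. The sketch below is one possible arrangement, not an OCRmyPDF feature: the goal names (`small`, `archival`, `lossless`), the function names, and the `DRY_RUN` variable are all hypothetical, and only the `--optimize` and `--output-type` flags are used.

```bash
#!/usr/bin/env bash
# Sketch: map a goal name to ocrmypdf flags, with a dry-run mode.

# Print the flags for a goal; fail on an unknown goal.
flags_for_goal() {
  case "$1" in
    small)    echo "--optimize 3" ;;
    archival) echo "--output-type pdfa --optimize 2" ;;
    lossless) echo "--output-type pdf --optimize 1" ;;
    *)        echo "unknown goal: $1" >&2; return 1 ;;
  esac
}

# Run ocrmypdf for a goal, or only print the command when DRY_RUN=1.
optimize_pdf() {
  local goal="$1" input="$2" output="$3" flags
  flags="$(flags_for_goal "$goal")" || return 1
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "ocrmypdf $flags $input $output"
  else
    # $flags is intentionally unquoted so it splits into separate arguments.
    ocrmypdf $flags "$input" "$output"
  fi
}

# Usage (not run automatically):
#   DRY_RUN=1 optimize_pdf archival input.pdf archival.pdf
```

Running with `DRY_RUN=1` first shows the exact command line before any file is written.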
| Task | Command |
|---|---|
| No optimization | --optimize 0 |
| Lossless default | --optimize 1 |
| Lossy | --optimize 2 |
| Most aggressive lossy | --optimize 3 |
| PDF/A (default, PDF/A-2b) | --output-type pdfa |
| PDF/A-1b | --output-type pdfa-1 |
| JBIG2 lossy | --jbig2-lossy |
| PNG quality (lossy) | --png-quality N |
| Sidecar text | --sidecar text.txt |
To reduce file size further, use --optimize 2 or a lower --png-quality.
Use --output-type pdfa-2 for better compatibility.