From partme-ai-full-stack-skills
Provides Python API for OCRmyPDF to add searchable text layers to PDFs programmatically, supporting deskew, optimization, page selection, and plugins like EasyOCR, PaddleOCR. Use for OCR pipelines in Python apps.
npx claudepluginhub partme-ai/full-stack-skills --plugin t2ui-skills
OCRmyPDF provides a Python API for programmatic use and a plugin interface for extending or replacing OCR engines. This skill covers the Python API, integration patterns, and the plugin ecosystem.
For CLI usage, see the ocrmypdf skill. For batch scripting, see ocrmypdf-batch.
import ocrmypdf
# Basic OCR
exit_code = ocrmypdf.ocr('input.pdf', 'output.pdf')
# With options
exit_code = ocrmypdf.ocr(
    'input.pdf',
    'output.pdf',
    language='eng+fra',
    deskew=True,
    rotate_pages=True,
    skip_text=True,
    optimize=2,
    jobs=4,
)
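import ocrmypdf
# ocr() returns an ExitCode when it completes. When called from Python, many
# failures (for example ocrmypdf.PriorOcrFoundError) may be raised as
# exceptions instead, so wrap the call in try/except if you need to handle them.
result = ocrmypdf.ocr('input.pdf', 'output.pdf')
if result == ocrmypdf.ExitCode.ok:
    print("OCR completed successfully")
elif result == ocrmypdf.ExitCode.already_done_ocr:
    print("PDF already has OCR text")
elif result == ocrmypdf.ExitCode.input_file:
    print("Input file issue")
else:
    print(f"Error: {result}")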
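Because error conditions can surface as exceptions when OCRmyPDF is driven from Python, a thin wrapper that converts them into a status tuple may be easier to use in application code. The following is a minimal sketch, assuming the exception classes `PriorOcrFoundError` and `ExitCodeException` from `ocrmypdf.exceptions`; the helper name `ocr_with_status` is hypothetical and not part of OCRmyPDF.

```python
def ocr_with_status(input_pdf, output_pdf, **options):
    """Run OCR and return (succeeded, message) instead of raising.

    Hypothetical helper for illustration; not part of OCRmyPDF.
    """
    # Imported lazily so this module loads even where OCRmyPDF is not installed.
    import ocrmypdf
    from ocrmypdf.exceptions import ExitCodeException, PriorOcrFoundError

    try:
        code = ocrmypdf.ocr(input_pdf, output_pdf, **options)
    except PriorOcrFoundError:
        # Assumed to be raised when the input already contains text and none of
        # skip_text / force_ocr / redo_ocr was requested.
        return False, "PDF already has OCR text"
    except ExitCodeException as exc:
        # Assumed base class for OCRmyPDF errors that map to a CLI exit code.
        return False, f"{type(exc).__name__}: {exc}"
    return code == ocrmypdf.ExitCode.ok, code.name
```

A caller can then branch on the first element of the tuple and log the second, without needing to know which exception types a given OCRmyPDF version raises.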
| Parameter | Type | Description |
|---|---|---|
| language | str | Tesseract language(s), e.g. 'eng+fra' |
| deskew | bool | Straighten crooked pages |
| rotate_pages | bool | Auto-rotate pages |
| skip_text | bool | Skip pages that already have text |
| force_ocr | bool | Force OCR on all pages |
| redo_ocr | bool | Replace existing OCR |
| optimize | int | Optimization level (0-3) |
| output_type | str | 'pdfa', 'pdf', 'auto', 'none' |
| jobs | int | Number of parallel workers |
| sidecar | str | Path for sidecar text file |
| image_dpi | int | DPI for image inputs |
| clean | bool | Clean pages with unpaper (OCR only) |
| clean_final | bool | Clean pages and use in output |
| remove_background | bool | Remove noisy backgrounds |
| oversample | int | Oversample DPI for low-res images |
| pages | str | Page range, e.g. '1,3,5-10' |
| title | str | Output PDF title |
| author | str | Output PDF author |
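Some of these parameters interact: as far as I am aware, `skip_text`, `force_ocr`, and `redo_ocr` are mutually exclusive, and `optimize` is limited to the 0-3 range listed in the table. A pre-flight check can catch such conflicts before a long OCR job starts. This is a minimal sketch; `check_ocr_options` and `EXCLUSIVE_MODES` are hypothetical names, and the check covers only these two rules.

```python
# Options assumed to be mutually exclusive in OCRmyPDF.
EXCLUSIVE_MODES = ("skip_text", "force_ocr", "redo_ocr")


def check_ocr_options(**options):
    """Return the options unchanged, or raise ValueError on an obvious conflict."""
    chosen = [name for name in EXCLUSIVE_MODES if options.get(name)]
    if len(chosen) > 1:
        raise ValueError(f"Choose only one of: {', '.join(chosen)}")
    # Only validate optimize when the caller actually supplied it.
    if "optimize" in options and options["optimize"] not in range(4):
        raise ValueError(f"optimize must be 0-3, got {options['optimize']!r}")
    return options
```

The validated dictionary can be unpacked straight into the API call, for example `ocrmypdf.ocr('in.pdf', 'out.pdf', **check_ocr_options(skip_text=True, optimize=2))`.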
from flask import Flask, request, send_file
import ocrmypdf
import tempfile
import os
app = Flask(__name__)
@app.route('/ocr', methods=['POST'])
def ocr_endpoint():
    """OCR a PDF via HTTP POST."""
    if 'file' not in request.files:
        return {'error': 'No file uploaded'}, 400
    uploaded = request.files['file']
    # delete=False keeps the temp file on disk after the with-block closes it
    with tempfile.NamedTemporaryFile(suffix='.pdf', delete=False) as inp:
        uploaded.save(inp)
    # Build the output path from the file stem instead of str.replace on the whole path
    out_path = os.path.splitext(inp.name)[0] + '_ocr.pdf'
    try:
        result = ocrmypdf.ocr(
            inp.name, out_path,
            language='eng',
            skip_text=True,
            optimize=2,
        )
        if result == ocrmypdf.ExitCode.ok:
            return send_file(out_path, as_attachment=True,
                             download_name='ocr_output.pdf')
        return {'error': f'OCR failed: {result}'}, 500
    finally:
        # Always remove both temporary files
        os.unlink(inp.name)
        if os.path.exists(out_path):
            os.unlink(out_path)
if __name__ == '__main__':
    app.run(port=5000)
OCRmyPDF provides an optional Streamlit-based web UI:
pip install "ocrmypdf[webservice]"
# See OCRmyPDF docs for launching the web service
OCRmyPDF's plugin interface allows replacing the OCR engine. Available plugins:
Replaces Tesseract with EasyOCR (PyTorch-based). GPU strongly recommended.
pip install ocrmypdf-easyocr
# Usage
ocrmypdf --plugin ocrmypdf_easyocr -l en input.pdf output.pdf
Replaces Tesseract with PaddleOCR. Powerful GPU-accelerated engine.
pip install ocrmypdf-paddleocr
# Usage
ocrmypdf --plugin ocrmypdf_paddleocr input.pdf output.pdf
Replaces Tesseract with Apple Vision Framework. macOS only.
pip install ocrmypdf-appleocr
# Usage
ocrmypdf --plugin ocrmypdf_appleocr input.pdf output.pdf
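The commands above use the CLI. From Python, `ocrmypdf.ocr()` accepts a `plugins` argument that, to my knowledge, is the counterpart of `--plugin`. The following is a minimal sketch, assuming the chosen plugin package (here `ocrmypdf_easyocr`) is installed; the helper name `ocr_with_plugin` is hypothetical.

```python
def ocr_with_plugin(input_pdf, output_pdf, plugin="ocrmypdf_easyocr", **options):
    """Run ocrmypdf.ocr() with a single plugin loaded (hypothetical helper)."""
    # Lazy import: needs OCRmyPDF and the plugin package at call time only.
    import ocrmypdf

    # `plugins` is assumed to take an iterable of module names or plugin file paths.
    return ocrmypdf.ocr(input_pdf, output_pdf, plugins=[plugin], **options)
```

Any other keyword arguments, such as `deskew=True`, are passed through unchanged, so the helper can stand in for a plain `ocrmypdf.ocr()` call.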
paperless-ngx uses OCRmyPDF internally for searchable document management. See paperless-ngx docs for configuration.
Create a custom OCR plugin by implementing the OCRmyPDF plugin interface:
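# my_ocr_plugin.py
# Skeleton only: method names follow ocrmypdf.pluginspec.OcrEngine as I
# understand it; check the plugin documentation for your OCRmyPDF version.
from ocrmypdf import OcrEngine, hookimpl
from ocrmypdf.pluginspec import OrientationConfidence
class MyOcrEngine(OcrEngine):
    """Custom OCR engine implementation."""
    @staticmethod
    def version():
        return "1.0.0"
    @staticmethod
    def creator_tag(options):
        return "MyOCR"
    def __str__(self):
        return "MyOCR 1.0.0"
    @staticmethod
    def languages(options):
        return {"eng"}  # languages this engine supports
    @staticmethod
    def get_orientation(input_file, options):
        return OrientationConfidence(angle=0, confidence=0.0)
    @staticmethod
    def get_deskew(input_file, options):
        return 0.0
    @staticmethod
    def generate_hocr(input_file, output_hocr, output_text, options):
        raise NotImplementedError  # write hOCR and plain text here
    @staticmethod
    def generate_pdf(input_file, output_pdf, output_text, options):
        raise NotImplementedError  # write a text-only PDF and plain text here
@hookimpl
def get_ocr_engine():
    return MyOcrEngine()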
# Use custom plugin
ocrmypdf --plugin my_ocr_plugin input.pdf output.pdf
| Task | Code / Command |
|---|---|
| Python API basic | ocrmypdf.ocr('in.pdf', 'out.pdf') |
| With options | ocrmypdf.ocr('in.pdf', 'out.pdf', language='eng', deskew=True) |
| Check result | if result == ocrmypdf.ExitCode.ok: ... |
| EasyOCR plugin | ocrmypdf --plugin ocrmypdf_easyocr in.pdf out.pdf |
| PaddleOCR plugin | ocrmypdf --plugin ocrmypdf_paddleocr in.pdf out.pdf |
| AppleOCR plugin | ocrmypdf --plugin ocrmypdf_appleocr in.pdf out.pdf |
Run pip install ocrmypdf in your Python environment.
Install the plugin package (pip install ocrmypdf-easyocr).
Use jobs=1 for large files; process in batches.