From reducto-cli
This skill should be used when the user asks to "parse a document", "extract data from PDF", "process invoices", "convert PDF to markdown", "extract structured data", "edit a document", mentions "reducto", or discusses document processing, OCR, PDF parsing, invoice extraction, form filling, or data extraction from files.
npx claudepluginhub reductoai/claude-plugins --plugin reducto-cli
Slash command
/reducto-cli:reducto-document-parsing
This skill provides guidance for using the Reducto CLI to parse, extract, and edit documents.
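Reducto CLI is a document processing tool that uses AI to parse documents into Markdown, extract structured data into JSON, and edit documents from natural language instructions.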
Before using reducto commands, ensure the user is authenticated:
uvx --from reducto-cli reducto login
This opens a browser for device code authentication.
Supported file formats:
PDF: .pdf
Images: .png, .jpg, .jpeg
Office documents: .doc, .docx, .ppt, .pptx
Spreadsheets: .xls, .xlsx
Convert documents to Markdown with YAML front matter containing metadata.
Basic usage:
uvx --from reducto-cli reducto parse path/to/document.pdf
Parse entire directory:
uvx --from reducto-cli reducto parse ./documents/
Output: Creates <filename>.parse.md files with parsed content.
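After a directory run, it can help to confirm which outputs were produced. The following is a minimal sketch that lists files by the `.parse.md` suffix described above. It assumes the outputs are written somewhere under the input directory, which the text does not state, and `list_parsed` is a hypothetical helper name, not part of the Reducto CLI.

```shell
#!/usr/bin/env bash
# list_parsed is a hypothetical helper, not a Reducto command.
# It prints every *.parse.md file found under the given directory
# (default: the current directory), sorted by path.
list_parsed() {
  local dir="${1:-.}"
  find "$dir" -type f -name '*.parse.md' | sort
}

# Example, after running: uvx --from reducto-cli reducto parse ./documents/
# list_parsed ./documents
```

If the list is shorter than expected, compare it against the input files to see which documents are missing a parse result.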
| Flag | Description |
|---|---|
| --agentic | Enables all agentic options for tables, text, and figures. Increases accuracy but also increases latency. Use for complex layouts or when maximum accuracy is needed. |
| --change-tracking | Returns <s> tags around strikethrough text, <u> tags around underlined text, and <change> tags around colored adjacent strikethrough and underlined text. Useful for documents with revision history. |
| --highlights | Include highlighted text in output |
| --hyperlinks | Include embedded hyperlinks |
| --comments | Include document comments |
Examples:
# Maximum accuracy (slower)
uvx --from reducto-cli reducto parse document.pdf --agentic
# Contract with change tracking
uvx --from reducto-cli reducto parse contract.pdf --change-tracking
# All metadata
uvx --from reducto-cli reducto parse document.pdf --hyperlinks --comments --highlights
# Combined flags
uvx --from reducto-cli reducto parse legal_doc.pdf --agentic --change-tracking --comments
Extract specific fields from documents into JSON using a schema.
Basic usage:
uvx --from reducto-cli reducto extract document.pdf --schema schema.json
With inline schema:
uvx --from reducto-cli reducto extract invoice.pdf --schema '{"type": "object", "properties": {"total": {"type": "number"}}}'
Output: Creates <filename>.extract.json files.
The schema is a JSON Schema object ({"type": "object", ...}), for example:
{
  "type": "object",
  "properties": {
    "invoice_number": {"type": "string"},
    "date": {"type": "string"},
    "vendor": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "address": {"type": "string"}
      }
    },
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": {"type": "string"},
          "quantity": {"type": "number"},
          "unit_price": {"type": "number"},
          "total": {"type": "number"}
        },
        "required": ["description", "quantity", "unit_price", "total"]
      }
    },
    "subtotal": {"type": "number"},
    "tax": {"type": "number"},
    "total": {"type": "number"}
  },
  "required": ["invoice_number", "items", "total"]
}
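A malformed schema file is an easy mistake to make, so it may be worth checking that the file is well-formed JSON before passing it to `--schema`. The sketch below writes a reduced version of the schema above to disk and checks it. The file name `invoice_schema.json`, the `SCHEMA_FILE` variable, and the `validate_schema` helper are illustrative assumptions, and the check relies on `python3` being installed. It verifies JSON syntax only, not that the content is a valid JSON Schema.

```shell
#!/usr/bin/env bash
# File name is an example; override with SCHEMA_FILE if desired.
schema_file="${SCHEMA_FILE:-invoice_schema.json}"

# Write a reduced version of the invoice schema to disk.
cat > "$schema_file" <<'EOF'
{
  "type": "object",
  "properties": {
    "invoice_number": {"type": "string"},
    "total": {"type": "number"}
  },
  "required": ["invoice_number", "total"]
}
EOF

# validate_schema is a hypothetical helper: it only checks JSON syntax.
# If python3 is not available, the check is skipped with a warning.
validate_schema() {
  if ! command -v python3 > /dev/null 2>&1; then
    echo "python3 not found; skipping JSON check" >&2
    return 0
  fi
  python3 -m json.tool "$1" > /dev/null
}

if validate_schema "$schema_file"; then
  echo "schema OK: $schema_file"
  # Then run the extraction shown earlier, for example:
  # uvx --from reducto-cli reducto extract invoice.pdf --schema "$schema_file"
fi
```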
Modify documents using natural language instructions.
Basic usage:
uvx --from reducto-cli reducto edit document.pdf --instructions "Fill in the client name as 'Acme Corp'"
Output: Creates <filename>.edited.<extension> files.
Examples:
# Fill out a form
uvx --from reducto-cli reducto edit application.pdf -i "Fill out: Name: John Doe, Email: john@example.com"
# Modify contract details
uvx --from reducto-cli reducto edit contract.pdf -i "Set the contract date to January 15, 2024 and fill in the client name as 'Acme Corporation'"
# Process directory of forms
uvx --from reducto-cli reducto edit ./forms/ -i "Check 'Approved' box and add today's date"
Processing invoices from a folder:
uvx --from reducto-cli reducto parse ./invoices/
uvx --from reducto-cli reducto extract ./invoices/ --schema invoice_schema.json
Results are written to *.extract.json files.
Reuse .parse.md files for extraction.
Use --agentic only when needed (complex layouts, tables, figures).
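The invoice workflow above can be wrapped in a small function so both steps run in order and stop on the first failure. This is a sketch only: `process_invoices` is a hypothetical wrapper, it uses just the `parse` and `extract --schema` commands shown in this document, and the final count assumes the `*.extract.json` results land under the input directory.

```shell
#!/usr/bin/env bash
# process_invoices is a hypothetical wrapper, not part of the Reducto CLI.
# Usage: process_invoices <directory> <schema.json>
process_invoices() {
  local dir="$1" schema="$2"

  # Fail early on bad arguments instead of starting a long run.
  [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
  [ -f "$schema" ] || { echo "schema not found: $schema" >&2; return 1; }

  # Step 1: parse every document in the directory.
  uvx --from reducto-cli reducto parse "$dir" || return 1

  # Step 2: extract structured fields using the schema.
  uvx --from reducto-cli reducto extract "$dir" --schema "$schema" || return 1

  # Report how many result files exist (assumes outputs land under $dir).
  local count
  count=$(find "$dir" -type f -name '*.extract.json' | wc -l | tr -d ' ')
  echo "extract results: $count"
}

# Example:
# process_invoices ./invoices invoice_schema.json
```

Checking the arguments first avoids spending time on parsing when the schema path is wrong.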