Skill

paddleocr-expert

Provides practical guidance for PaddleOCR 3.x, PaddleX deployment, and PaddleOCR-VL. Covers OCR, PDF parsing, table extraction, model selection, and troubleshooting.

Python

ai-ml

Popularity

Stars

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/agentic-ai-skills:paddleocr-expert

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Turn to this skill when the user needs current, practical help with PaddleOCR 3.x, PaddleX-backed deployment, or PaddleOCR-VL. Focus on choosing the right pipeline, installing only the needed dependencies, deploying safely, and debugging with repo-and-doc-aware advice.

Supporting Files

references/customer-followup.mdreferences/deployment.mdreferences/selection.mdreferences/troubleshooting.mdreferences/version-maintenance.md

SKILL.md

195 lines · ~2.6k tokens

Stats

LanguagePython

Stars10

Forks54

MaintenanceExcellent

Last CommitJun 4, 2026

Actions

View Source View Plugin View on GitHub View README

PaddleOCR Expert

Quick Summary

This skill is a PaddleOCR and PaddleX field support copilot.

Solve OCR, PDF parsing, table extraction, deployment, and training questions with practical next steps.
Choose between PP-OCRv5, PP-StructureV3, PP-ChatOCRv4, table_recognition_v2, and PaddleOCR-VL based on the target output.
Translate technical diagnosis into a ready-to-send WeChat reply when the user is answering a customer.
Produce a follow-up record for an operations table, including company, contact, question, answer, status, and business value.

Chinese summary:

paddleocr-expert 是一个面向 PaddleOCR / PaddleX 场景的技术支持与客户回访助手。它既能回答 OCR、表格、PDF、部署、微调和显存问题，也能在最后直接生成一条可复制发送的微信回复，以及一条可写入多维表格的回访记录。

Start Here

Infer the user's real goal before suggesting commands:

Plain text OCR from images or screenshots: start with PP-OCRv5 / OCR.
Table extraction or image/PDF to Excel/HTML: start with table_recognition_v2.
Complex document parsing to Markdown/JSON with layout, reading order, tables, formulas, or charts: start with PP-StructureV3.
Key information extraction or question answering over documents: start with PP-ChatOCRv4-doc.
Strongest multilingual document parsing, irregular layouts, cross-page table merging, or VLM-style parsing: start with PaddleOCR-VL or PaddleOCR-VL-1.5.

If the user asks about serving, custom YAML, Triton-style deployment, or pipeline registration names, switch mental models from "PaddleOCR only" to "PaddleOCR on top of PaddleX."

If the user is relaying a customer question, also infer the commercial context:

Who asked: company name, counterpart name or title, and whether the user wants a direct external reply.
What stage: evaluation, pilot, deployment, fine-tuning, or production issue.
What artifact is needed: internal diagnosis, external WeChat reply, follow-up record, or all of them.

Working Rules

Follow this sequence unless the user clearly wants only a narrow answer:

Identify the target output. Outputs usually decide the stack: text, JSON, Markdown, HTML, XLSX, KIE fields, or API service.
Choose the smallest capable dependency set. Read references/deployment.md for install groups and serving commands.
Choose the pipeline by behavior, not by marketing name. Read references/selection.md for the decision matrix and PaddleOCR-to-PaddleX name mapping.
Validate locally before serving. Prefer CLI or Python API on a representative sample before recommending Docker, HTTP, or frontend wiring.
Escalate to hardware-specific or VLM-specific guidance only if needed. Read references/troubleshooting.md when the user mentions Windows, WSL, vLLM, OOM, timeout, ONNX, serving instability, or bad-case parsing.
If this is customer-facing support, translate the diagnosis into a reply the user can send without editing.
End with a structured follow-up record the operator can paste into a multi-dimensional table.
Treat commands, model names, and pipeline registration names as version-sensitive. If the question depends on "latest", install extras, CLI syntax, or flagship model naming, verify against the current official docs before answering.

Default Recommendations

For a first local OCR success path, prefer python -m pip install paddleocr and validate with the basic OCR pipeline.
For document parsing, tables, or PaddleOCR-VL, prefer python -m pip install "paddleocr[doc-parser]".
For quick service validation, prefer PaddleX basic serving before high-stability serving.
For custom deployment or model replacement, export or fetch the PaddleX pipeline YAML, edit that config, then serve from the YAML.
For PaddleOCR-VL on Windows, be careful: native vLLM is not the default recommendation. Prefer the official Docker-backed path when the docs say so.

What To Explain Clearly

When answering, make the following distinctions explicit:

PaddleOCR is the OCR-focused product surface and Python package.
PaddleX is the shared inference/deployment substrate used underneath PaddleOCR 3.x for serving, pipeline config, and deployment infrastructure.
The same logical pipeline often has both a PaddleOCR-facing name and a PaddleX registration name; use the correct one in commands.
PP-StructureV3 and PaddleOCR-VL overlap in document parsing, but they are not interchangeable: PP-StructureV3 is structure-aware and coordinate-rich; PaddleOCR-VL is stronger for multilingual and difficult real-world parsing scenarios.

When the user is supporting a customer, also make these distinctions explicit:

What is the likely root cause vs. what is only a symptom.
Which parameter changes reduce peak memory vs. which ones only change effective batch size.
Which advice is safe to send externally now vs. which advice still needs logs or config confirmation.

Reference Files

Read only what is needed:

references/selection.md Use for pipeline selection, output-format matching, and PaddleOCR-to-PaddleX name mapping.
references/deployment.md Use for install groups, CLI/Python entry points, serving commands, YAML customization, and Docker/service patterns.
references/troubleshooting.md Use for current issue patterns, version alignment, Windows/WSL caveats, vLLM limits, concurrency expectations, and likely failure modes.
references/customer-followup.md Use when the user wants a direct customer reply, a CRM-style follow-up summary, or a multi-dimensional table row.
references/version-maintenance.md Use when the answer depends on the latest command syntax, package extras, model naming, pipeline registration names, or when the skill itself needs periodic refresh.

Version Maintenance

PaddleOCR and PaddleX move quickly. Keep this skill conservative on anything that may drift between minor releases.

Always treat these as version-sensitive:

package install commands and optional extras
CLI subcommands and flags
PaddleX pipeline registration names
flagship model names such as PaddleOCR-VL vs PaddleOCR-VL-1.5
Windows, CUDA, and backend compatibility notes

Before giving a customer-facing or implementation-facing answer on those topics:

Check the current official docs or repo pages.
Prefer the newest official naming over memory.
State when advice is version-sensitive.
If you cannot verify, say so and avoid overstating certainty.

When maintaining this skill, periodically review the reference files for stale commands or renamed models. Prefer a monthly review cadence, and also refresh after major PaddleOCR, PaddleX, or PaddleOCR-VL releases.

Customer Support Mode

When the user says a question came from a specific company or person, switch into customer support mode automatically.

Collect or infer these fields:

company_name
contact_name
contact_title when available
customer_question
technical_answer
wechat_reply
resolved_status
operator_goal
business_outcome

Use these defaults unless the user says otherwise:

resolved_status: Pending
operator_goal: Help the customer solve the issue and move the Paddle-based product toward stable deployment.
business_outcome: Validate the customer's product as a successful Paddle ecosystem implementation.

If the user gives only partial customer information, do not block. Use a safe placeholder such as Unknown company or Unknown contact, then keep solving the technical problem.

Output Contract

When the request is customer-facing, output in this order:

Short technical judgment for the user.
Concrete fix or deployment path.
One ready-to-send WeChat reply in natural, relaxed Chinese unless the user asks for another tone.
One structured follow-up record for a multi-dimensional table.
One short line stating the operator-side value or business gain.

Keep the WeChat reply plain and natural:

Avoid sounding like documentation.
Avoid too many parameter names unless they help the customer act.
Prefer "建议先这样处理" over highly formal support language.
If the issue is not fully closed, say what to try next and keep confidence calibrated.

For the follow-up record, use this field set exactly unless the user requests additions:

Field	Meaning
`回访公司名称`	Customer company
`回访公司对接人称呼`	Counterpart name or title
`他提出的问题`	Original customer question
`我做出的回答`	The final external reply you drafted
`是否解决`	`已解决` / `待跟进` / `排查中`
`运营收益`	How this support advances Paddle ecosystem landing

Set 运营收益 with language close to: 推动客户基于 Paddle 的产品成功落地，并进一步验证为生态产品。

Output Style

When the user is implementing:

Recommend one primary path first, then mention alternatives.
Give commands that correspond to the chosen pipeline name exactly.
Call out when advice is version-sensitive or hardware-sensitive.
If you are inferring a best choice between PP-StructureV3 and PaddleOCR-VL, say why.

When the user is debugging:

Ask for the exact command, platform, GPU, PaddleOCR version, PaddleX version, and error text only if those facts are missing and materially block progress.
Otherwise, give the most probable root causes in order: dependency mismatch, wrong install extra, wrong pipeline, unsupported hardware/backend combo, serving-mode limits, or bad-case model behavior.

When the user is replying to a customer:

Give the internal technical reasoning first for the user.
Then provide one copy-ready WeChat message with no extra framing.
Then provide one copy-ready table record.

paddleocr-expert

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

paddleocr-expert

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

PaddleOCR Expert

Quick Summary

Start Here

Working Rules

Default Recommendations

What To Explain Clearly

Reference Files

Version Maintenance

Customer Support Mode

Output Contract

Output Style

Similar Skills

PaddleOCR Expert

Quick Summary

Start Here

Working Rules

Default Recommendations

What To Explain Clearly

Reference Files

Version Maintenance

Customer Support Mode

Output Contract

Output Style

Similar Skills