Help us improve
Share bugs, ideas, or general feedback.
From paideia
Converts hand-written/scanned answer PDFs to markdown via OCR, then grades against reference solutions using strategy-based comparison.
npx claudepluginhub taewooopark/paideia --plugin paideiaHow this skill is triggered — by the user, by Claude, or both
Slash command
/paideia:answer-processingThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
- User uploads an answer PDF and asks to grade it
Transcribes hand-written or scanned answer PDFs to markdown. Supports three OCR tiers: Claude native vision, local Ollama Qwen3-VL, and pytesseract fallback. Engine selectable via .course-meta or per-call override.
Processes academic PDFs into structured Obsidian literature notes and 9-node critical-thinking canvases. Useful for researchers who want to deeply read, summarize, or critique papers and save results in Obsidian.
Grades skill test outputs against expectations using transcripts and files, extracts implicit claims, and provides evaluation feedback.
Share bugs, ideas, or general feedback.
/grade is invokedanswers/<quiz-name>.pdf ← user uploads hand-written scan
↓ (pdf skill, OCR)
answers/converted/<quiz-name>.md
↓ (this skill)
grade report → stdout (compact) + errors/log.md (append)
If /grade was called with an argument, use it as a hint. Otherwise find the most recently modified file in answers/ (not answers/converted/).
Use the vision-ocr skill — delegates to a local VLM (Qwen3-VL 8B via ollama) for clean prose + LaTeX transcription (the script reads INTERFACE_LANG from .course-meta so the VLM keeps the handwriting in its original language), with pytesseract as automatic fallback.
python3 "${CLAUDE_PLUGIN_ROOT}/scripts/vision_ocr.py" answers/<name>.pdf answers/converted/<name>.md
The script handles model warmup, page-by-page inference, and tier fallback. See .claude/skills/vision-ocr/SKILL.md. The output header tells the grader which tier produced the text:
<!-- SOURCE: ..., qwen3-vl:8b @ 300dpi, N pages --> → high-confidence<!-- TIER: tesseract fallback --> → degraded; treat results conservativelyHand-written math OCR will be imperfect. Expect:
Do not grade on algebraic correctness of OCR output. Instead, apply strategy-based grading:
Read the converted MD file. For each problem, identify:
Which pattern(s) did the user invoke? Look for:
Did the reasoning reach the correct end form? Even if algebra is wrong, the user's final expression structure (Does it have a log? A sqrt? A series? Correct variables?) tells you if the approach worked.
Where did they stop? Incomplete work is common; note which step is the last recognizable one.
Open the reference (converted/solutions/<hw>.md for HW, quizzes/<name>_answers.md for quizzes, twins/<id>_<ts>_sol.md for twins, chain/<ts>_sol.md for chain).
For each problem/part, produce a verdict:
## P<n>
- Pattern match: ✅ / ⚠️ / ❌ [user invoked <Pk>, solution uses <Pk>]
- Variable choice: ✅ / ⚠️ / ❌ [user held <x> fixed, should be <y>]
- End form: ✅ / ⚠️ / ❌ [user's final: <form>, expected: <form>]
- Completeness: <last step user reached>
- Overall: <PASS | PARTIAL | FAIL>
- Note: <one line — what to study, which Pk to re-drill>
Do not report line-by-line algebra mistakes unless they are specifically about sign errors or notation bugs that matter on the exam (e.g., missing $-$ on $\kappa$ definition, conjugate vs. transpose confusion).
Canonical errors/log.md schema — single source of truth. Every command that appends here (/grade, /blind, future drills) MUST use exactly these keys. Downstream readers (statusline.py, weakmap, session_start.py) pattern-match on pattern: and problem_id: lines; any drift silently hides entries.
For each non-✅ entry, append to errors/log.md:
- problem_id: <id>
pattern: <Pk>
error_type: pattern-missed | wrong-variable | wrong-end-form | algebraic | sign | definition
summary: "<1 line>"
source: answers/converted/<name>.md
date: <ISO>
Compact table, no verbose explanations:
| Problem | Pattern | Vars | End form | Overall |
|---|---|---|---|---|
| P1 | ✅ | ✅ | ⚠️ | PARTIAL |
| P2 | ❌ | — | — | FAIL |
| P3 | ✅ | ✅ | ✅ | PASS |
Dominant issue: pattern-missed on P2 (used brute-force integration; should use residue theorem, P7).
Drill next: /blind <problem testing P7>, or /pattern P7 for quick review.
Keep this under 15 lines of output.
If OCR yields <100 chars total, ask the user (in INTERFACE_LANG from .course-meta, default en):
"OCR returned too little. PDF quality may be low or the handwriting too small. Options:
(a) re-scan brighter/larger and re-upload
(b) type the answer into .md and save it to answers/converted/<name>.md, then /grade again"
Skip PDF conversion. Read answers/<name>.md directly. Everything else is the same.
Hand-written work often has margin notes, arrows, struck-through attempts. OCR will render them chaotically. Note in the grade (in $INTERFACE_LANG): "Answer ordering ambiguous. My interpretation: . Let me know if different."
If the user pastes their work directly into chat (not as PDF), grade it from context. Still apply strategy-based grading.
❌ Demand pixel-perfect algebra from OCR output ❌ Mark something wrong because OCR mangled a Greek letter ❌ Require the user to retype their solution in LaTeX ❌ Produce 3-page grade reports (stay compact) ❌ Reveal the reference solution before grading (user might be asking "did I get it right" as a first pass)
/gradepdf skill for OCRcourse-index/patterns.md (pattern IDs) and converted/solutions/ or equivalenterrors/log.md and answers/converted/