General extraction/ingestion skill that routes to specific workflows based on input type. Extracts structured information from documents, emails, reviews, feedback, and other sources.
From aops-coworknpx claudepluginhub nicsuzor/academicops --plugin aops-coworkThis skill is limited to using the following tools:
procedures/review-inline-comments.mdDesigns and optimizes AI agent action spaces, tool definitions, observation formats, error recovery, and context for higher task completion rates.
Enables AI agents to execute x402 payments with per-task budgets, spending controls, and non-custodial wallets via MCP tools. Use when agents pay for APIs, services, or other agents.
Compares coding agents like Claude Code and Aider on custom YAML-defined codebase tasks using git worktrees, measuring pass rate, cost, time, and consistency.
Taxonomy note: This skill provides domain expertise (HOW) for extracting structured information from documents and sources. See [[TAXONOMY.md]] for the skill/workflow distinction.
General-purpose extraction skill that intelligently routes to specialized workflows based on input type. Extracts structured information from various sources and stores it appropriately (public framework vs. private data).
[[AXIOMS.md]] — including [[AXIOMS.md#P52]] (Read-Then-Write Memory)
MANDATORY before creating any new extracted content: search PKB for existing knowledge on the same subject.
mcp__pkb__search(query="[topic/person/document subject]")
This prevents duplicate memories and grounds extraction in accumulated knowledge. See [[remember]] skill's "search first" step as the model.
Provide a unified entry point for all extraction tasks:
When invoked, analyze the input and route to the appropriate workflow:
Signals:
Route to: workflows/training-data.md
Storage:
$ACA_DATA/processed/review_training/Signals:
Route to: Existing archived/skills/extractor/SKILL.md logic
Storage: Use Skill(skill="remember") for PKB storage
Signals:
Route to: Existing aops-core/skills/decision-extract/SKILL.md
Storage: Daily note with decision formatting
Signals:
Route to: workflows/document-knowledge.md (to be created)
Storage: Depends on content - PKB or framework docs
procedures/review-inline-comments.mdreview.txt + source.pdf + metadata.jsonarchived/skills/review-training/SKILL.mdworkflows/revision-history.md (to be created)See procedures/review-inline-comments.md for detailed procedure.
Quick summary:
CRITICAL: Training data often contains sensitive information (author names, unpublished work, specific critiques).
Sensitive data → $ACA_DATA/processed/review_training/{collection_name}/:
extracted_examples.json (full text/feedback pairs)training_pairs.jsonl (machine-readable format)collection_summary.md (with identifying information)Generalized patterns → Framework (public repo):
aops-core/skills/hydrator/workflows/peer-review.md (update with principles)aops-core/skills/*/references/ (depersonalized examples)High-quality extraction:
Generalization quality:
Delegate to archived/skills/extractor/SKILL.md.
Key principle: Most archival documents have NO long-term value. Be highly selective.
Extract: Concrete outcomes, significant relationships, financial records Skip: Newsletters, invitations, administrative routine, mass communications
Storage: Use Skill(skill="remember") with proper tags and canonical identifiers.
Delegate to aops-core/skills/decision-extract/SKILL.md.
Key principle: Extract tasks requiring approval/choice that are blocking other work.
Storage: Daily note with formatted decision list for batch processing.
$ACA_DATA/processed/Directory structure:
$ACA_DATA/processed/
├── review_training/
│ ├── {collection_name}/
│ │ ├── extracted_examples.json
│ │ ├── training_pairs.jsonl
│ │ ├── collection_summary.md
│ │ └── source_documents/
│ └── ...
├── email_archive/
│ └── ...
└── ...
Access: This directory is:
/home/nic/brain)When adding examples to public framework docs:
/extract vs. specialized skillsUse /extract:
Use specialized skill directly:
/remember - When you know you want to add to knowledge base/decision-extract - When specifically extracting decisions/review-training - When processing matched review/source pairs (legacy)/extract → analyze input → route to:
- /remember (for archival preservation)
- /decision-extract (for pending decisions)
- training-data workflow (for LLM training data)
- document-knowledge workflow (for general extraction)
| Scenario | Behavior |
|---|---|
| Unclear input type | Ask user to clarify extraction goal |
| Cannot convert document format | Try alternative conversion, document failure |
| Ambiguous feedback | Flag with "quality": "ambiguous", include with caveats |
| No clear extraction value | Ask user if they want to skip or force extraction |
| Storage location unclear | Default to $ACA_DATA/processed/, confirm with user |
Input: DOCX file with inline comments from peer review
User: /extract /path/to/review.docx --type peer-review
Agent:
pandoc --track-changes=all$ACA_DATA/processed/review_training/aoir2026/aops-core/skills/hydrator/workflows/peer-review.md with depersonalized principlesInput: Directory of email MSG files
User: /extract emails/2025-Q1/ --type archive
Agent:
/remember to store in PKBInput: Mixed documents without type specified
User: /extract documents/
Agent:
Before completing extraction:
Completeness:
Quality:
Sensitivity:
$ACA_DATA/processed/Documentation: