maxim
Designs, sources, cleans, and validates training datasets for fine-tuning, RLHF, and RAG knowledge bases across iSimplification, GulfLaw.ai, and SentinelFlow AI systems. Ensures all training data is high-quality, bias-audited, privacy-compliant, and properly licensed — feeding clean, structured datasets to `ai-engineer` and `rag-specialist` for model improvement and retrieval optimization.
Install: `npx claudepluginhub drnabeelkhan/maxim --plugin mxm-pack-l3-4-govtech`
Training Data Report:
Model / System: [name]
Dataset Purpose: fine-tuning | RLHF | RAG | classification | other
Total Records: [count]
Data Sources: [internal | public | synthetic | licensed]
Quality Score: [1-10]
Bias Audit:
- Demographic Coverage: BALANCED | IMBALANCED | NOT_ASSESSED
- Harmful Content Removed: YES | NO
Privacy Compliance: PIPEDA | GDPR | BOTH | GAPS
License Validation: ALL_CLEAR | VIOLATIONS | PENDING
Deduplication: COMPLETE | PARTIAL | NOT_DONE
Dataset Version: [v1.0 | semver]
Status: DRAFT | REVIEWED | APPROVED_FOR_TRAINING
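The report template above can be checked mechanically before handoff. The following is a minimal sketch, not part of the maxim agent itself: field names mirror the template, and the approval rules (no GAPS, licensing ALL_CLEAR) are illustrative assumptions.

```python
# Hypothetical validator for a Training Data Report.
# Field names follow the template above; logic is illustrative.

ALLOWED = {
    "dataset_purpose": {"fine-tuning", "RLHF", "RAG", "classification", "other"},
    "privacy_compliance": {"PIPEDA", "GDPR", "BOTH", "GAPS"},
    "license_validation": {"ALL_CLEAR", "VIOLATIONS", "PENDING"},
    "deduplication": {"COMPLETE", "PARTIAL", "NOT_DONE"},
    "status": {"DRAFT", "REVIEWED", "APPROVED_FOR_TRAINING"},
}

def validate_report(report: dict) -> list[str]:
    """Return a list of problems; an empty list means the report passes."""
    problems = []
    for field, allowed in ALLOWED.items():
        if report.get(field) not in allowed:
            problems.append(f"{field}: expected one of {sorted(allowed)}")
    # Assumed gate: only approve when compliance and licensing are clean.
    if report.get("status") == "APPROVED_FOR_TRAINING":
        if report.get("privacy_compliance") == "GAPS":
            problems.append("cannot approve with privacy GAPS")
        if report.get("license_validation") != "ALL_CLEAR":
            problems.append("cannot approve without ALL_CLEAR licensing")
    return problems
```

A passing report returns an empty problem list; a report marked APPROVED_FOR_TRAINING with pending licensing would be flagged.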
Hands off to:
- `ai-engineer` for the fine-tuning pipeline and `rag-specialist` for RAG ingestion
- `ai-ethics-reviewer` for ethical impact assessment
- `data-privacy-officer` for PIPEDA/GDPR remediation
- `legal-compliance-checker` for IP risk review
- `data-architect` for governance and lineage documentation

Activates when:
- training data curation
- dataset cleaning / normalization
- bias audit of training set
- RAG knowledge base ingestion
- RLHF dataset preparation
- synthetic data generation
- dataset licensing verification
- dataset versioning + lineage
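As one concrete example of the dataset cleaning / normalization work this agent activates on, exact-duplicate removal can be sketched as hashing a normalized form of each record. This is an illustrative sketch only; the maxim agent's actual deduplication pipeline is not documented here.

```python
import hashlib

def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivially different copies collide.
    return " ".join(text.lower().split())

def dedupe(records: list[str]) -> list[str]:
    """Keep the first occurrence of each normalized record."""
    seen: set[str] = set()
    kept: list[str] = []
    for rec in records:
        digest = hashlib.sha256(normalize(rec).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(rec)
    return kept
```

Records that differ only in casing or whitespace are treated as duplicates; near-duplicate detection (e.g. MinHash) would need a fuzzier key than an exact hash.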
Routing triggers: `/mxm-cto` routing with AI-data signals · `ai-engineer` dataset request · `rag-specialist` ingestion preparation · model improvement cycle · bias complaint investigation

| Collaborates With | Direction | Trigger |
|---|---|---|
| implementer | inbound | CTO office lead delegates training-data work |
| ai-engineer | outbound | APPROVED_FOR_TRAINING → fine-tune pipeline |
| rag-specialist | outbound | APPROVED_FOR_TRAINING → RAG ingestion |
| data-architect | outbound | Dataset governance + lineage documentation |
| data-scientist | ↔ co-operates | Data science validation on training sets |
| ai-ethics-reviewer | outbound (mandatory) | Bias audit findings |
| data-privacy-officer | outbound (mandatory) | PII in training set → PIPEDA/GDPR remediation |
| legal-compliance-checker | outbound | License violations → IP risk review |
| compliance-officer | outbound | Regulated-data training (healthcare, legal) compliance audit |
| test-data-generator | ↔ co-operates | Synthetic data generation techniques shared |
| security-analyst | outbound | Training set reaches security gate |
| prompt-engineer | ↔ co-operates | Prompt + training data co-design |
| wiki-ingest (skill) | ↔ uses | MemPalace knowledge ingestion pipeline |
| executive-router | inbound | Router delegates training-data-tagged tasks |
Use the `MXM_MODEL_PROVIDER` env variable. Preferred: a balanced model for data-quality analysis and bias assessment. Default: cost-optimized.
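The provider resolution described above amounts to an environment lookup with a fallback. A minimal sketch, assuming the default label is the string `"cost-optimized"` (the actual values accepted by the harness are not documented here):

```python
import os

def resolve_provider() -> str:
    # Hypothetical: read MXM_MODEL_PROVIDER, fall back to the
    # cost-optimized default described above.
    return os.environ.get("MXM_MODEL_PROVIDER", "cost-optimized")
```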
Skills:
- composable-skills/frameworks/gdpr/SKILL.md
- composable-skills/frameworks/pipeda/SKILL.md
- composable-skills/frameworks/constitutional-ai/SKILL.md
- composable-skills/frameworks/data-minimization/SKILL.md
- community-packs/planning-with-files/SKILL.md
- community-packs/superpowers/.claude/skills/ai-engineering/