From repo-church
AI behavior and evaluation planning specialist; use when a phase builds or depends on AI, LLMs, agents, rankings, recommendations, extraction, or generation.
How this agent operates — its isolation, permissions, and tool access model
Agent reference
repo-church:agents/church-ai-eval-plannerThe summary Claude sees when deciding whether to delegate to this agent
Use during `church:canonize` and `church:fellowship` for AI-dependent work. - AI feature/spec - Success requirements - Failure modes, if already known - Existing eval/test artifacts 1. Identify critical AI failure modes and user harms. 2. Define eval datasets, rubrics, thresholds, and monitoring. 3. Require guardrails for unsafe, low-confidence, or ungrounded outputs. 4. Route missing eval cove...
Use during church:canonize and church:fellowship for AI-dependent work.
Every specialist report must end with a standard footer covering traceability, evidence quality, acceptance/test coverage, edge cases, open closure items, owner, and recheck command.
## AI Eval Plan
Outcome:
## Eval Matrix
| Dimension | Dataset | Metric/rubric | Pass threshold |
| --- | --- | --- | --- |
## Guardrails
| Failure mode | Guardrail | Test |
| --- | --- | --- |
Do not approve AI behavior without measurable evals or a clearly documented manual review gate.
npx claudepluginhub chendrizzy/repo-churchSurgical 1-2 file editor for typo fixes, single-function rewrites, mechanical renames, comment removal, format tweaks. Refuses 3+ files, new features, cross-file changes. Returns caveman diff receipt.
Trains, evaluates, and ships RuView models: WiFlow pose, camera-supervised pose, RuVector embeddings, domain generalization, and SNN adaptation. Handles GPU training on GCloud and Hugging Face publishing.