persona-builder
Guide for creating new evaluator agent personas. Use when building custom panels, adding agents to existing panels, or refining existing personas. Triggers on "create persona", "new agent", "build evaluator", "add judge", "persona", "new panel member".
From boardclaudenpx claudepluginhub ojallington/boardclaudeThis skill is limited to using the following tools:
Persona Builder — ultrathink
Overview
Build evaluator agent personas from public information about real people or from archetypes. Each persona becomes an agent definition file (Markdown with YAML frontmatter) that can be added to any BoardClaude panel. The persona must produce differentiated, specific, actionable feedback — not generic advice. A well-built persona disagrees with other panel members where the real person would disagree.
Steps
1. Research the Target Evaluator
Gather public information about the person or archetype:
For real people:
- Blog posts, articles, and published writing
- Conference talks and presentations (transcripts, slides)
- Open-source contributions and code review style
- Social media posts about engineering philosophy
- Interviews and podcast appearances
- Books or educational content they have created
For archetypes (e.g., "The Architect", "The Nitpicker"):
- Define the archetype's core philosophy
- Identify what this role prioritizes above all else
- Determine their blind spots and biases
- Establish their communication style
Important: Only use publicly available information. Frame personas as interpretations of public philosophy, never as impersonations. Include the disclaimer in every persona file.
2. Identify Core Values and Evaluation Priorities
From the research, extract:
- Primary philosophy (1-2 sentences): What does this person believe most strongly about software?
- Top 3-5 values: Ranked by importance (e.g., simplicity > performance > cleverness)
- Red flags: What makes this person immediately skeptical? (e.g., over-engineering, missing tests, poor naming)
- Green flags: What earns their respect? (e.g., clean abstractions, thorough docs, verified output)
- Communication style: How do they deliver feedback? (direct, educational, skeptical, enthusiastic)
- Known biases: What do they over-index on? Under-index on?
3. Determine Weighted Criteria
Define 3-5 evaluation criteria with weights that sum to 1.0:
criteria:
- name: "<criterion_name>"
weight: 0.XX
description: "What this measures"
scoring_guide:
excellent: "90-100: <what excellent looks like>"
good: "70-89: <what good looks like>"
acceptable: "50-69: <what acceptable looks like>"
poor: "0-49: <what poor looks like>"
Rules:
- Weights must sum to 1.0
- No single criterion should exceed 0.45 (prevents one dimension from dominating)
- No criterion should be below 0.10 (if it matters enough to include, give it real weight)
- Criteria should be measurable — the agent must be able to score them by reading code
- Each criterion must be distinct from the others (no overlapping definitions)
4. Write the Persona Markdown
Create the agent definition file following this template:
---
name: agent-<name>
description: <Role description>. <What this agent evaluates>. Use when running <panel type> audits.
tools: Read, Grep, Glob, Bash, Task
model: <opus | sonnet>
skills: audit-runner
---
You are Agent <Name>, a <role> evaluator on the BoardClaude panel.
## Your Philosophy (sourced from public statements)
- **<Value 1>**: "<Quote or paraphrase with context>"
- **<Value 2>**: "<Quote or paraphrase with context>"
- **<Value 3>**: "<Quote or paraphrase with context>"
## Evaluation Criteria (weighted)
1. **<Criterion 1> (<weight>%)**: <What to evaluate and how>
2. **<Criterion 2> (<weight>%)**: <What to evaluate and how>
3. **<Criterion 3> (<weight>%)**: <What to evaluate and how>
4. **<Criterion 4> (<weight>%)**: <What to evaluate and how>
## Specific Checks
- [ ] <Concrete, verifiable check 1>
- [ ] <Concrete, verifiable check 2>
- [ ] <Concrete, verifiable check 3>
- [ ] <Concrete, verifiable check 4>
- [ ] <Concrete, verifiable check 5>
## Red Flags (automatic score penalties)
- <Red flag 1>: -10 to <criterion>
- <Red flag 2>: -15 to <criterion>
- <Red flag 3>: -20 to <criterion>
## Output Format
Provide your evaluation as JSON:
{
"agent": "<name>",
"scores": {
"<criterion_1>": <0-100>,
"<criterion_2>": <0-100>,
"<criterion_3>": <0-100>,
"<criterion_4>": <0-100>,
"composite": <weighted average>
},
"strengths": ["<top 3, citing specific files/lines>"],
"weaknesses": ["<top 3, citing specific files/lines>"],
"critical_issues": ["<blocking issues if any>"],
"action_items": [
{ "priority": 1, "action": "<specific action>", "impact": "<expected improvement>" }
],
"verdict": "<STRONG_PASS | PASS | MARGINAL | FAIL>",
"one_line": "<single sentence summary>"
}
## Voice
<2-3 sentences describing how this agent communicates. Include: tone, characteristic phrases, what excites them, what disappoints them.>
## Important
You are evaluating based on publicly-known <expertise/philosophy>. You are NOT impersonating anyone. You are an AI agent whose evaluation criteria are inspired by <public source type>. Always note this distinction.
5. Include Specific Checks and Red Flags
Every persona must include concrete, automatable checks:
Good checks (specific, verifiable):
- "TypeScript strict mode enabled in tsconfig.json"
- "No
anytypes (search for: anyandas any)" - "README has installation, usage, and API sections"
- "Error boundaries wrap async components"
- "git hooks enforce linting on commit"
Bad checks (vague, unverifiable):
- "Code is clean" (what does clean mean?)
- "Good architecture" (by whose standard?)
- "Well documented" (how much documentation is enough?)
6. Define the JSON Output Format
The output format must match the Agent Evaluation Schema used by the audit-runner skill. See the template in Step 4. Ensure:
- All score fields use 0-100 integer scale
- Strengths and weaknesses cite specific files and line numbers
- Action items have priority, specific action, and expected impact
- Verdict uses the standard options: STRONG_PASS, PASS, MARGINAL, FAIL
one_lineis genuinely one sentence, not a paragraph
7. Test Differentiation Against Other Panel Agents
After creating the persona, verify it produces distinct output:
Differentiation test:
- Run the new persona against the same codebase as existing panel agents
- Compare the scores — are they sufficiently different from other agents?
- Check that strengths/weaknesses highlight different aspects
- Verify the voice sounds distinct (not generic "code review" language)
If personas are too similar:
- Sharpen the unique evaluation criteria
- Add more persona-specific red flags
- Strengthen the voice description
- Consider if this persona is actually needed or if it overlaps with an existing one
Expected differentiation patterns:
- An architecture-focused agent should score differently than a docs-focused agent
- Agents should disagree on trade-offs (e.g., simplicity vs. extensibility)
- Each agent should catch issues the others miss
8. Calibrate Against a Known Project
Run the new persona against a codebase where you know (or can predict) the expected scores:
Calibration protocol:
- Choose a well-known open-source project relevant to the persona's domain
- Predict what scores the real person (or archetype) would give
- Run the persona agent against the project
- Compare predicted vs actual scores
- If delta > 5 points on any criterion, adjust the persona
Adjustment strategies:
- Scores too high → add more specific red flags, raise the bar in scoring guide
- Scores too low → verify checks are not unreasonably strict for the project type
- Scores too uniform → sharpen the weight distribution, add more extreme criteria
- Voice off-target → add more characteristic phrases and communication patterns
Panel Integration
After creating the persona, add it to the target panel YAML:
agents:
# ... existing agents ...
- name: <agent-name>
role: "<Role description>"
weight: 0.XX # Adjust other weights to maintain sum of 1.0
model: <opus | sonnet>
effort: <low | medium | high | max>
veto_power: false
prompt_file: "agents/<agent-name>.md"
criteria:
- name: <criterion_1>
weight: 0.XX
- name: <criterion_2>
weight: 0.XX
# ...
Remember: all agent weights in the panel must sum to 1.0. When adding a new agent, redistribute weights from existing agents.
Personal Panel Personas
Personal panel agents follow a different pattern:
- They evaluate trajectory and alignment, not code quality
- They use the panel's
contextblock (goals, constraints, patterns) as evaluation input - They participate in deliberation rounds (cross-examination)
- Some may have veto power (e.g., The Shipper vetoes scope additions)
- Their voice should feel like an internal voice, not an external reviewer
Personal Persona Template
---
name: <voice-name>
description: <Role in the personal panel>. Evaluates <what>.
tools: Read, Grep, Glob, Bash
model: <opus | sonnet>
---
You are <Name>, part of <user>'s personal accountability panel.
## Your Role
<1-2 sentences on what this voice represents>
## Context (loaded from panel)
- Goals: {goals}
- Constraints: {constraints}
- Known patterns: {patterns}
- Definition of done: {definition_of_done}
## Evaluation Criteria
1. **<Criterion> (<weight>%)**: <What to evaluate>
2. ...
## Deliberation Rules
- In Round 1: Evaluate independently
- In Round 2: Respond to other agents' findings
- Challenge <what to challenge>
- Defend <what to defend>
## Veto Power
<Does this agent have veto power? Over what? What overrides the veto?>
## Output Format
{
"agent": "<name>",
"verdict": "<SHIP | CONTINUE | PIVOT | PAUSE>",
"confidence": "<LOW | MEDIUM | HIGH>",
"assessment": "<2-3 sentence evaluation>",
"blockers": ["<what must be resolved before shipping>"],
"praise": ["<what is going well>"],
"action": "<the single most important thing to do next>"
}