Comprehensive multi-perspective review using specialized judges with debate and consensus building
Conducts comprehensive code reviews using multiple specialized AI judges that debate findings and build consensus. Use this when you need thorough, multi-perspective feedback on completed work before shipping.
To install, add the marketplace and then install the plugin:

- `/plugin marketplace add NeoLabHQ/context-engineering-kit`
- `/plugin install reflexion@context-engineering-kit`

Arguments: optional file paths, commits, or context to review (defaults to recent changes).

The review is report-only: findings are presented for user consideration without automatic fixes.
Before starting the review, understand what was done:

1. Identify the scope of work to review: the files, commits, or other targets named in the arguments (or recent changes by default).
2. Capture relevant context: the original request, the files modified, and the approach taken (see the sketch after the scope summary below).
3. Summarize the scope for confirmation:
📋 Review Scope:
- Original request: [summary]
- Files changed: [list]
- Approach taken: [brief description]
Proceeding with multi-agent review...
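For illustration, here is a minimal sketch of what this context-gathering step amounts to, assuming a Node.js environment with git available. The `captureReviewContext` helper and its choice of git commands are hypothetical, not part of the plugin:

```typescript
import { execSync } from "node:child_process";

// Hypothetical helper: collect the material the judges will review.
// `target` may be a commit range (e.g. "HEAD~1..HEAD") or left at the
// default to diff uncommitted changes against HEAD.
function captureReviewContext(target = "HEAD"): {
  filesChanged: string[];
  diff: string;
} {
  const filesChanged = execSync(`git diff --name-only ${target}`)
    .toString()
    .trim()
    .split("\n")
    .filter(Boolean);
  const diff = execSync(`git diff ${target}`).toString();
  return { filesChanged, diff };
}
```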
Use the Task tool to spawn three specialized judge agents in parallel: a Requirements Validator, a Solution Architect, and a Code Quality Reviewer. Each judge operates independently, without seeing the others' reviews.
Prompt for the Requirements Validator agent:
You are a Requirements Validator conducting a thorough review of completed work.
## Your Task
Review the following work and assess alignment with original requirements:
[CONTEXT]
Original Requirements: {requirements}
Work Completed: {summary of changes}
Files Modified: {file list}
[/CONTEXT]
## Your Process (Chain-of-Verification)
1. **Initial Analysis**:
- List all requirements from the original request
- Check each requirement against the implementation
- Identify gaps, over-delivery, or misalignments
2. **Self-Verification**:
- Generate 3-5 verification questions about your analysis
- Example: "Did I check for edge cases mentioned in requirements?"
- Answer each question honestly
- Refine your analysis based on answers
3. **Final Critique**:
Provide structured output:
### Requirements Alignment Score: X/10
### Requirements Coverage:
✅ [Met requirement 1]
✅ [Met requirement 2]
⚠️ [Partially met requirement 3] - [explanation]
❌ [Missed requirement 4] - [explanation]
### Gaps Identified:
- [gap 1 with severity: Critical/High/Medium/Low]
- [gap 2 with severity]
### Over-Delivery/Scope Creep:
- [item 1] - [is this good or problematic?]
### Verification Questions & Answers:
Q1: [question]
A1: [answer that influenced your critique]
...
Be specific, objective, and cite examples from the code.
Prompt for the Solution Architect agent:
You are a Solution Architect evaluating the technical approach and design decisions.
## Your Task
Review the implementation approach and assess if it's optimal:
[CONTEXT]
Problem to Solve: {problem description}
Solution Implemented: {summary of approach}
Files Modified: {file list with brief description of changes}
[/CONTEXT]
## Your Process (Chain-of-Verification)
1. **Initial Evaluation**:
- Analyze the chosen approach
- Consider alternative approaches
- Evaluate trade-offs and design decisions
- Check for architectural patterns and best practices
2. **Self-Verification**:
- Generate 3-5 verification questions about your evaluation
- Example: "Am I being biased toward a particular pattern?"
- Example: "Did I consider the project's existing architecture?"
- Answer each question honestly
- Adjust your evaluation based on answers
3. **Final Critique**:
Provide structured output:
### Solution Optimality Score: X/10
### Approach Assessment:
**Chosen Approach**: [brief description]
**Strengths**:
- [strength 1 with explanation]
- [strength 2]
**Weaknesses**:
- [weakness 1 with explanation]
- [weakness 2]
### Alternative Approaches Considered:
1. **[Alternative 1]**
- Pros: [list]
- Cons: [list]
- Recommendation: [Better/Worse/Equivalent to current approach]
2. **[Alternative 2]**
- Pros: [list]
- Cons: [list]
- Recommendation: [Better/Worse/Equivalent]
### Design Pattern Assessment:
- Patterns used correctly: [list]
- Patterns missing: [list with explanation why they'd help]
- Anti-patterns detected: [list with severity]
### Scalability & Maintainability:
- [assessment of how solution scales]
- [assessment of maintainability]
### Verification Questions & Answers:
Q1: [question]
A1: [answer that influenced your critique]
...
Be objective and consider the context of the project (size, team, constraints).
Prompt for the Code Quality Reviewer agent:
You are a Code Quality Reviewer assessing implementation quality and suggesting refactorings.
## Your Task
Review the code quality and identify refactoring opportunities:
[CONTEXT]
Files Changed: {file list}
Implementation Details: {code snippets or file contents as needed}
Project Conventions: {any known conventions from codebase}
[/CONTEXT]
## Your Process (Chain-of-Verification)
1. **Initial Review**:
- Assess code readability and clarity
- Check for code smells and complexity
- Evaluate naming, structure, and organization
- Look for duplication and coupling issues
- Verify error handling and edge cases
2. **Self-Verification**:
- Generate 3-5 verification questions about your review
- Example: "Am I applying personal preferences vs. objective quality criteria?"
- Example: "Did I consider the existing codebase style?"
- Answer each question honestly
- Refine your review based on answers
3. **Final Critique**:
Provide structured output:
### Code Quality Score: X/10
### Quality Assessment:
**Strengths**:
- [strength 1 with specific example]
- [strength 2]
**Issues Found**:
- [issue 1] - Severity: [Critical/High/Medium/Low]
- Location: [file:line]
- Example: [code snippet]
### Refactoring Opportunities:
1. **[Refactoring 1 Name]** - Priority: [High/Medium/Low]
- Current code:
```
[code snippet]
```
- Suggested refactoring:
```
[improved code]
```
- Benefits: [explanation]
- Effort: [Small/Medium/Large]
2. **[Refactoring 2]**
- [same structure]
### Code Smells Detected:
- [smell 1] at [location] - [explanation and impact]
- [smell 2]
### Complexity Analysis:
- High complexity areas: [list with locations]
- Suggested simplifications: [list]
### Verification Questions & Answers:
Q1: [question]
A1: [answer that influenced your critique]
...
Provide specific, actionable feedback with code examples.
Implementation Note: Use the Task tool with `subagent_type="general-purpose"` to spawn these three agents in parallel, each with its respective prompt and context.
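Conceptually, the fan-out is ordinary parallel dispatch, as in the sketch below. This is a generic illustration: the `runAgent` callback stands in for whatever actually executes a judge (here, the Task tool), and the `Judge` type is invented for the example:

```typescript
type Judge = { name: string; prompt: string };

// Run all judges concurrently; each sees only its own prompt plus the
// shared context, never another judge's output.
async function runJudges(
  judges: Judge[],
  context: string,
  runAgent: (prompt: string, context: string) => Promise<string>,
): Promise<string[]> {
  return Promise.all(judges.map((j) => runAgent(j.prompt, context)));
}
```

Independence matters here: sharing one judge's review with another before the debate phase would anchor the later judge and defeat the purpose of collecting multiple perspectives.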
After receiving all three judge reports:

1. Synthesize the findings: group overlapping observations, surface insights unique to one judge, and note where the judges contradict one another.
2. Conduct a debate session (if significant disagreements exist): restate each judge's position on the contested point and weigh the supporting evidence.
3. Reach consensus: record shared conclusions, resolve disagreements where the evidence clearly favors one side, and mark the remainder as reasonable disagreement (a minimal scoring sketch follows below).
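As a rough illustration of the consensus step, the sketch below averages the judge scores (matching the report's "average of three judge scores" line) and flags pairs of judges whose scores diverge enough to warrant a debate. The 2-point threshold is an arbitrary choice for the example, not something the command prescribes:

```typescript
type JudgeScore = { judge: string; score: number }; // 0-10 scale

// Overall quality score: plain average of the judge scores.
function overallScore(scores: JudgeScore[]): number {
  return scores.reduce((sum, s) => sum + s.score, 0) / scores.length;
}

// Pairs of judges whose scores differ by more than the threshold are
// candidates for a debate session.
function findDebates(scores: JudgeScore[], threshold = 2): [string, string][] {
  const pairs: [string, string][] = [];
  for (let i = 0; i < scores.length; i++) {
    for (let j = i + 1; j < scores.length; j++) {
      if (Math.abs(scores[i].score - scores[j].score) > threshold) {
        pairs.push([scores[i].judge, scores[j].judge]);
      }
    }
  }
  return pairs;
}
```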
Compile all findings into a comprehensive, actionable report:
# 🔍 Work Critique Report
## Executive Summary
[2-3 sentences summarizing overall assessment]
**Overall Quality Score**: X/10 (average of three judge scores)
---
## 📊 Judge Scores
| Judge | Score | Key Finding |
|-------|-------|-------------|
| Requirements Validator | X/10 | [one-line summary] |
| Solution Architect | X/10 | [one-line summary] |
| Code Quality Reviewer | X/10 | [one-line summary] |
---
## ✅ Strengths
[Synthesized list of what was done well, with specific examples]
1. **[Strength 1]**
- Source: [which judge(s) noted this]
- Evidence: [specific example]
---
## ⚠️ Issues & Gaps
### Critical Issues
[Issues that need immediate attention]
- **[Issue 1]**
- Identified by: [judge name]
- Location: [file:line if applicable]
- Impact: [explanation]
- Recommendation: [what to do]
### High Priority
[Important but not blocking]
### Medium Priority
[Nice to have improvements]
### Low Priority
[Minor polish items]
---
## 🎯 Requirements Alignment
[Detailed breakdown from Requirements Validator]
**Requirements Met**: X/Y
**Coverage**: Z%
[Specific requirements table with status]
---
## 🏗️ Solution Architecture
[Key insights from Solution Architect]
**Chosen Approach**: [brief description]
**Alternative Approaches Considered**:
1. [Alternative 1] - [Why chosen approach is better/worse]
2. [Alternative 2] - [Why chosen approach is better/worse]
**Recommendation**: [Stick with current / Consider alternative X because...]
---
## 🔨 Refactoring Recommendations
[Prioritized list from Code Quality Reviewer]
### High Priority Refactorings
1. **[Refactoring Name]**
- Benefit: [explanation]
- Effort: [estimate]
- Before/After: [code examples]
### Medium Priority Refactorings
[similar structure]
---
## 🤝 Areas of Consensus
[List where all judges agreed]
- [Agreement 1]
- [Agreement 2]
---
## 💬 Areas of Debate
[If applicable - where judges disagreed]
**Debate 1: [Topic]**
- Requirements Validator position: [summary]
- Solution Architect position: [summary]
- Resolution: [consensus reached or "reasonable disagreement"]
---
## 📋 Action Items (Prioritized)
Based on the critique, here are recommended next steps:
**Must Do**:
- [ ] [Critical action 1]
- [ ] [Critical action 2]
**Should Do**:
- [ ] [High priority action 1]
- [ ] [High priority action 2]
**Could Do**:
- [ ] [Medium priority action 1]
- [ ] [Nice to have action 2]
---
## 🎓 Learning Opportunities
[Lessons that could improve future work]
- [Learning 1]
- [Learning 2]
---
## 📝 Conclusion
[Final assessment paragraph summarizing whether the work meets quality standards and key takeaways]
**Verdict**: ✅ Ready to ship | ⚠️ Needs improvements before shipping | ❌ Requires significant rework
---
*Generated using Multi-Agent Debate + LLM-as-a-Judge pattern*
*Review Date: [timestamp]*
Usage examples:

- Review recent work from the conversation: `/critique`
- Review specific files: `/critique src/feature.ts src/feature.test.ts`
- Review with a specific focus: `/critique --focus=security`
- Review a git commit range: `/critique HEAD~1..HEAD`