Orchestrates parallel LLM judges to evaluate implementation plans, code artifacts, and PRDs against quality criteria such as SOLID principles, DRY, KISS, accuracy, best practices, and testability, aggregating scored CaseScore JSON results for automated reviews.
```bash
npx claudepluginhub closedloop-ai/claude-plugins --plugin judges
```

- Evaluates how accurately an implementation plan accounts for existing code: correctly identifying what to modify vs. what to create, avoiding reimplementation, and finding the right integration points.
- Evaluates file and folder structure organization proposed in implementation plans.
- Evaluates whether an implementation plan is grounded in codebase reality by comparing plan claims against the investigation log; detects hallucinated file paths, nonexistent modules, and fabricated APIs.
- Orchestrates context compression for judge evaluation by determining artifact lists per type, allocating token budgets (see the sketch after this list), and delegating compression.
- Evaluates whether an implementation plan follows the conventions, patterns, and style found in the actual codebase, as documented in the investigation log.
- Evaluates code implementation adherence to custom best-practices documents.
- Evaluates implementation plans for DRY (Don't Repeat Yourself) violations.
- Evaluates whether an implementation plan addresses the core business and functional goals expressed in the PRD.
- Evaluates implementation plans for KISS (Keep It Simple) violations.
- Audits draft PRDs for structural completeness.
- Evaluates PRD dependency completeness and integration risk.
- Evaluates PRD scope discipline and hypothesis traceability.
- Evaluates PRD acceptance criteria for testability and language precision.
- Evaluates implementation plan readability, focusing on clarity, structure, and template adherence.
- Evaluates code implementation adherence to the SOLID Interface Segregation Principle (ISP) and Dependency Inversion Principle (DIP).
- Evaluates code implementation adherence to the SOLID Liskov Substitution Principle (LSP).
- Evaluates code implementation adherence to the SOLID Open/Closed Principle (OCP).
- Evaluates implementation plans for SSOT (Single Source of Truth) violations.
- Evaluates the technical accuracy of AI assistant responses, including API usage, language features, and algorithmic concepts.
- Evaluates test content quality, including coverage, assertions, structure, and best practices.
- Evaluates whether an implementation plan's verbosity is appropriately calibrated to problem complexity.
- Compresses artifacts for judge evaluation: reads a single raw artifact, applies tiered summarization within a token budget, and returns compacted content with metadata. Isolation via a forked context prevents pollution of the agent's context.
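As a rough illustration of the budget-allocation step above, an even split across artifacts might look like the following. The budget number and file names are hypothetical, not the plugin's actual defaults; the real skill determines artifact lists and budgets per artifact type.

```bash
# Hypothetical even split of a token budget across artifacts (illustrative values).
TOTAL_BUDGET=24000
ARTIFACTS=(plan.md investigation-log.md prd.md)
PER_ARTIFACT=$(( TOTAL_BUDGET / ${#ARTIFACTS[@]} ))
echo "compressing ${#ARTIFACTS[@]} artifacts at ~${PER_ARTIFACT} tokens each"
```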
Check for a cached plan-evaluation.json result before launching the plan-evaluator agent. This skill should be used in Phase 1.3 (Simple Mode Evaluation) of the orchestrator prompt. Triggers on: entering Phase 1.3, checking simple mode, evaluating plan complexity. Returns EVAL_CACHE_HIT with cached values or EVAL_CACHE_MISS signaling re-evaluation is needed.
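A minimal shell sketch of that check, assuming the cache lives at `.artifacts/plan-evaluation.json` (the path is an assumption; the signal strings come from the description above):

```bash
# Sketch only: the cache file location is assumed, not documented.
if [ -f .artifacts/plan-evaluation.json ]; then
  echo "EVAL_CACHE_HIT"                # reuse the cached evaluation values
  cat .artifacts/plan-evaluation.json
else
  echo "EVAL_CACHE_MISS"               # signal that re-evaluation is needed
fi
```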
Orchestrate parallel judge agent execution, aggregate CaseScore results, write plan-judges.json, code-judges.json, or prd-judges.json, and validate the output. Supports evaluating implementation plans (16 judges), code artifacts (11 judges), or PRD artifacts (4 judges) via the `--artifact-type` parameter.
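For concreteness, here is a minimal sketch of the aggregation step, assuming each judge emits a CaseScore JSON file with `judge`, `score`, and `rationale` fields. The field names and the `.judges/` directory are illustrative assumptions, not the plugin's documented schema.

```bash
# Hypothetical CaseScore file, for illustration only.
mkdir -p .judges
cat > .judges/dry.json <<'EOF'
{ "judge": "dry", "score": 4, "rationale": "No duplicated logic introduced by the plan." }
EOF

# Slurp every judge result and compute a mean score (requires jq).
jq -s '{ cases: ., mean_score: (map(.score) | add / length) }' \
  .judges/*.json > plan-judges.json
```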
ClosedLoop is an AI platform that brings the speed of individual AI-driven development to the full software development team. We're offering our agents as open-source Claude Code plugins because we just couldn't keep them to ourselves: check out our agents for planning, code review, quality judging, and more, which outperform Opus 4.6 and Sonnet 4.5 out of the box.
Bootstrap. Plan. Code. Ship. It's that simple.
LLMs are great at non-deterministic content generation — horrible at being repeatably correct.
That's why we took Claude Code and extended it with a lightweight multi-agent orchestration workflow that works for us, modeling how we collaborate as a team.
It's optimized for efficiency and correctness to produce code that lands without the churn: grounded in your codebase, it outperforms Opus 4.6 out of the box at half the cost.
What's more impactful is that it allowed our team of engineers to shift left: reviewing and approving sprints' worth of scoped work in documented implementation plans, then generating the code while we slept.
Tickets become Tasks. Epics become Features. Sections of your quarterly roadmap land in a few PRs.
Multi-repository support, adaptive self-learning, and artifact-bound phased workflow gates that loop until correct.
Close the Loop on your SDLC today with the same tools that made us 400% faster.
| Plugin | Description |
|---|---|
| bootstrap | Project bootstrapping and initial setup |
| code | Code generation, implementation planning, and iterative development loop |
| code-review | Automated code review with inline GitHub PR comments |
| judges | LLM-as-judge evaluators for plan and code quality |
| platform | Claude Code expert guidance, prompt engineering, and artifact management |
| self-learning | Pattern capture and organizational knowledge sharing |
```bash
# Install a plugin from the marketplace
claude /plugin marketplace install closedloop

# Or install from source for development
git clone git@github.com:closedloop-ai/claude-plugins.git
cd claude-plugins
git config core.hooksPath .githooks
```
```bash
# Bootstrap.
claude /bootstrap:start

# Plan. Code.
claude /code:start --prd requirements.md
```
See CONTRIBUTING.md for development setup, workflow, and code style guidelines.
Our Claude Code plugins are a low-key engineering preview of the agents that run the larger ClosedLoop platform. These agents should be used for testing in trusted environments.