Orchestrates parallel LLM judges to evaluate implementation plans, code artifacts, and PRDs against quality criteria such as SOLID principles, DRY, KISS, accuracy, best practices, and testability, aggregating scored CaseScore JSON results for automated reviews.
```bash
npx claudepluginhub closedloop-ai/claude-plugins --plugin judges
```

- Evaluates how accurately an implementation plan accounts for existing code: correctly identifying what to modify vs. what to create, avoiding reimplementation, and finding the right integration points.
- Evaluates file and folder structure organization proposed in implementation plans.
- Evaluates whether an implementation plan is grounded in codebase reality by comparing plan claims against the investigation log; detects hallucinated file paths, nonexistent modules, and fabricated APIs.
- Orchestrates context compression for judge evaluation by determining artifact lists per type, allocating token budgets (see the sketch after this list), and delegating compression.
- Evaluates whether an implementation plan follows the conventions, patterns, and style found in the actual codebase, as documented in the investigation log.
- Evaluates code implementation adherence to custom best-practices documents.
- Evaluates implementation plans for DRY (Don't Repeat Yourself) violations.
- Evaluates whether an implementation plan addresses the core business and functional goals expressed in the PRD.
- Evaluates implementation plans for KISS (Keep It Simple) violations.
- Audits draft PRDs for structural completeness.
- Evaluates PRD dependency completeness and integration risk.
- Evaluates PRD scope discipline and hypothesis traceability.
- Evaluates PRD acceptance criteria for testability and language precision.
- Evaluates implementation plan readability, focusing on clarity, structure, and template adherence.
- Evaluates code implementation adherence to the SOLID Interface Segregation Principle (ISP) and Dependency Inversion Principle (DIP).
- Evaluates code implementation adherence to the SOLID Liskov Substitution Principle (LSP).
- Evaluates code implementation adherence to the SOLID Open/Closed Principle (OCP).
- Evaluates implementation plans for SSOT (Single Source of Truth) violations.
- Evaluates the technical accuracy of AI assistant responses, including API usage, language features, and algorithmic concepts.
- Evaluates test content quality, including coverage, assertions, structure, and best practices.
- Evaluates whether an implementation plan's verbosity is appropriately calibrated to problem complexity.
- Compresses artifacts for judge evaluation: reads a single raw artifact, applies tiered summarization within a token budget, and returns compacted content with metadata. Isolation via a forked context prevents pollution of the agent's context.
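As a rough illustration of the budget-allocation step above, an even split across artifacts might look like the following. The budget number and file names are hypothetical, not the plugin's actual defaults; the real skill determines artifact lists and budgets per artifact type.

```bash
# Hypothetical even split of a token budget across artifacts (illustrative values).
TOTAL_BUDGET=24000
ARTIFACTS=(plan.md investigation-log.md prd.md)
PER_ARTIFACT=$(( TOTAL_BUDGET / ${#ARTIFACTS[@]} ))
echo "compressing ${#ARTIFACTS[@]} artifacts at ~${PER_ARTIFACT} tokens each"
```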
Check for a cached plan-evaluation.json result before launching the plan-evaluator agent. This skill should be used in Phase 1.3 (Simple Mode Evaluation) of the orchestrator prompt. Triggers on: entering Phase 1.3, checking simple mode, evaluating plan complexity. Returns EVAL_CACHE_HIT with cached values or EVAL_CACHE_MISS signaling re-evaluation is needed.
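A minimal shell sketch of that check, assuming the cache lives at `.artifacts/plan-evaluation.json` (the path is an assumption; the signal strings come from the description above):

```bash
# Sketch only: the cache file location is assumed, not documented.
if [ -f .artifacts/plan-evaluation.json ]; then
  echo "EVAL_CACHE_HIT"                # reuse the cached evaluation values
  cat .artifacts/plan-evaluation.json
else
  echo "EVAL_CACHE_MISS"               # signal that re-evaluation is needed
fi
```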
Orchestrate parallel judge agent execution, aggregate CaseScore results, write plan-judges.json, code-judges.json, or prd-judges.json, and validate the output. Supports evaluating implementation plans (16 judges), code artifacts (11 judges), or PRD artifacts (4 judges) via the `--artifact-type` parameter.
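For concreteness, here is a minimal sketch of the aggregation step, assuming each judge emits a CaseScore JSON file with `judge`, `score`, and `rationale` fields. The field names and the `.judges/` directory are illustrative assumptions, not the plugin's documented schema.

```bash
# Hypothetical CaseScore file, for illustration only.
mkdir -p .judges
cat > .judges/dry.json <<'EOF'
{ "judge": "dry", "score": 4, "rationale": "No duplicated logic introduced by the plan." }
EOF

# Slurp every judge result and compute a mean score (requires jq).
jq -s '{ cases: ., mean_score: (map(.score) | add / length) }' \
  .judges/*.json > plan-judges.json
```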
ClosedLoop is an AI platform that brings the speed of individual AI-driven development to the full software development team. We're offering our agents as open-source Claude Code plugins because we just couldn't keep them to ourselves: check out our agents for planning, code review, quality judging, and more, which outperform Opus 4.6 and Sonnet 4.5 out of the box.
Bootstrap. Plan. Code. Ship. It's that simple.
LLMs are great at non-deterministic content generation — horrible at being repeatably correct.
That's why we took Claude Code and extended it with a lightweight multi-agent orchestration workflow that works for us, modeling how we collaborate as a team.
It's optimized for efficiency and correctness to produce code that lands without the churn: grounded in your codebase, it outperforms Opus 4.6 out of the box at half the cost.
What's more impactful is that it allowed our team of engineers to shift left: reviewing and approving sprints' worth of scoped work in documented implementation plans, then generating the code while we slept.
Tickets become Tasks. Epics become Features. Sections of your quarterly roadmap land in a few PRs.
Multi-repository support, adaptive self-learning, and artifact-bound phased workflow gates that loop until correct.
Close the Loop on your SDLC today with the same tools that made us 400% faster.
| Plugin | Description |
|---|---|
| bootstrap | Project bootstrapping and initial setup |
| code | Code generation, implementation planning, and iterative development loop |
| code-review | Automated code review with inline GitHub PR comments |
| judges | LLM-as-judge evaluators for plan and code quality |
| platform | Claude Code expert guidance, prompt engineering, and artifact management |
| self-learning | Pattern capture and organizational knowledge sharing |
```bash
# Install a plugin from the marketplace
claude /plugin marketplace install closedloop

# Or install from source for development
git clone git@github.com:closedloop-ai/claude-plugins.git
cd claude-plugins
git config core.hooksPath .githooks
```
```bash
# Bootstrap.
claude /bootstrap:start

# Plan. Code.
claude /code:start --prd requirements.md
```
See CONTRIBUTING.md for development setup, workflow, and code style guidelines.
Our Claude Code plugins are a low-key engineering preview of the agents that run the larger ClosedLoop platform. These agents should be used for testing in trusted environments.