Automated quality assurance for Claude Code agents using LLM-as-judge evaluation
/plugin marketplace add BrandCast-Signage/agent-benchmark-kit
/plugin install agent-benchmark-kit@BrandCast-Signage/agent-benchmark-kit
Run automated benchmark tests on Claude Code agents and track performance over time
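As a rough illustration of the idea behind the kit (not its actual implementation), the sketch below scores an agent's output with a judge function and flags a drop against earlier runs. Every name here (`TestCase`, `keyword_judge`, `run_benchmark`, `regressed`) is hypothetical, and the keyword-matching judge is only a stand-in for a real LLM call.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass
class TestCase:
    name: str
    prompt: str
    expected_points: list[str]  # things a good answer should mention


# Hypothetical judge signature: (agent_output, test_case) -> score in [0, 100].
Judge = Callable[[str, TestCase], float]


def keyword_judge(output: str, case: TestCase) -> float:
    """Stand-in for an LLM judge: share of expected points found in the output."""
    if not case.expected_points:
        return 100.0
    lowered = output.lower()
    hits = sum(1 for point in case.expected_points if point.lower() in lowered)
    return 100.0 * hits / len(case.expected_points)


def run_benchmark(agent: Callable[[str], str], cases: list[TestCase], judge: Judge) -> float:
    """Run every test case through the agent and return the mean judge score."""
    return mean(judge(agent(case.prompt), case) for case in cases)


def regressed(history: list[float], latest: float, tolerance: float = 5.0) -> bool:
    """Flag a run that falls more than `tolerance` points below the best earlier run."""
    return bool(history) and latest < max(history) - tolerance
```

In practice the judge would be replaced by a model call that grades the output against a rubric, and the score history would be persisted between runs so that regressions show up over time.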
Comprehensive PR review agents specializing in comments, tests, error handling, type design, code quality, and code simplification
Comprehensive feature development workflow with specialized agents for codebase exploration, architecture design, and quality review
End-to-end feature orchestration with testing, security, performance, and deployment
Schema validation, data quality monitoring, streaming validation pipelines, and input validation for backend APIs