Plugin

evaluate-plugin

Name: evaluate-plugin
Author: laurigates

Evaluate skill effectiveness with behavioral test cases, grade results against assertions, and track quality improvements across benchmark runs. Analyze patterns, compare outputs blindly, and get prioritized suggestions for skill enhancements.

Tags: testing, code-quality

What's Inside

Agents (3)

eval-analyzer

/eval-analyzer

Analyze evaluation results to identify patterns, weaknesses, and improvement opportunities. Operates in comparison mode (with-skill vs baseline) or benchmark mode (trends across runs). Use after grading to generate suggestions.

eval-comparator

/eval-comparator

Blind comparison of two outputs without knowing their origin. Rates content quality and structure quality to objectively determine which output is better. Use to compare with-skill vs baseline runs without bias.

eval-grader

/eval-grader

Grade evaluation runs against predefined assertions. Examines execution transcripts and outputs to determine pass/fail with cited evidence. Use as a subagent from evaluation orchestration skills.
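
As a rough illustration of the two ideas described for the agents above (grading against assertions with cited evidence, and hiding the origin of two outputs before comparison), here is a minimal Python sketch. It is not the plugin's implementation: the names AssertionResult, grade, and blind_pair are hypothetical, and the phrase-matching rule is a deliberately simple stand-in for whatever judgment the eval-grader agent actually applies.

```python
import random
from dataclasses import dataclass


@dataclass
class AssertionResult:
    assertion: str
    passed: bool
    evidence: str  # transcript line cited as evidence, "" when none was found


def grade(transcript: str, assertions: dict[str, str]) -> list[AssertionResult]:
    """Hypothetical grader: an assertion passes if its phrase appears in the transcript."""
    lines = transcript.splitlines()
    results = []
    for name, phrase in assertions.items():
        # Cite the first transcript line that contains the required phrase.
        hit = next((ln.strip() for ln in lines if phrase.lower() in ln.lower()), "")
        results.append(AssertionResult(name, bool(hit), hit))
    return results


def blind_pair(
    with_skill: str, baseline: str, rng: random.Random
) -> tuple[dict[str, str], dict[str, str]]:
    """Label two outputs A/B in random order; the key stays hidden from the judge."""
    entries = [("with-skill", with_skill), ("baseline", baseline)]
    rng.shuffle(entries)
    labeled = {"A": entries[0][1], "B": entries[1][1]}
    key = {"A": entries[0][0], "B": entries[1][0]}
    return labeled, key


transcript = "Ran pytest -q\nAll 12 tests passed\n"
results = grade(transcript, {"runs tests": "pytest", "reports coverage": "coverage"})
labeled, key = blind_pair("output with skill", "baseline output", random.Random(0))
```

In this sketch the judge would see only `labeled`; `key` is consulted after the ratings are recorded, which is what keeps the comparison blind.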

Skills (4)

evaluate-improve

/evaluate-improve

Analyze evaluation results and suggest concrete skill improvements. Use after running evaluations to get actionable recommendations for improving skill quality, descriptions, or instructions.

evaluate-plugin-batch

/evaluate-plugin-batch

Batch evaluate all skills in a plugin. Runs /evaluate:skill for each skill that has eval cases, then produces a plugin-level report. Use when auditing an entire plugin's quality or before a release.

evaluate-report

/evaluate-report

View evaluation results and benchmark reports. Use when you want to see past eval results, compare benchmark runs, or review quality trends for a skill or plugin.

evaluate-skill

/evaluate-skill

Evaluate a skill's effectiveness by running test cases and grading results. Use when you want to test whether a skill produces correct guidance, validate skill improvements, or benchmark a skill before release.
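
The batch workflow described above (evaluate each skill that has eval cases, then roll the results up into a plugin-level report) could be summarized along these lines. This is a sketch under stated assumptions, not the plugin's report format: SkillRun and plugin_report are hypothetical names, and the pass-rate metric is an assumed choice.

```python
from dataclasses import dataclass


@dataclass
class SkillRun:
    skill: str
    passed: int  # assertions that passed
    total: int   # assertions graded; 0 means the skill has no eval cases


def plugin_report(runs: list[SkillRun]) -> dict:
    """Hypothetical roll-up: per-skill pass rates, skipping skills without eval cases."""
    evaluated = [r for r in runs if r.total > 0]
    skipped = [r.skill for r in runs if r.total == 0]
    per_skill = {r.skill: r.passed / r.total for r in evaluated}
    graded = sum(r.total for r in evaluated)
    # Overall rate is weighted by assertion count; None when nothing was graded.
    overall = sum(r.passed for r in evaluated) / graded if graded else None
    return {"per_skill": per_skill, "overall": overall, "skipped": skipped}


report = plugin_report(
    [
        SkillRun("evaluate-skill", 3, 4),
        SkillRun("evaluate-report", 1, 1),
        SkillRun("evaluate-improve", 0, 0),
    ]
)
```

Listing skipped skills separately, rather than counting them as zero, keeps a missing test suite from being mistaken for a failing one.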

Stats

Version: 1.3.1

Released: Apr 3, 2026

Language: Python

Stars: 34

Forks: 5

Maintenance: Good

License: MIT

Last Commit: Mar 25, 2026

Added: Mar 9, 2026

Available In

laurigates-claude-plugins (46)

Safety Signals

Caution

Uses power tools

Uses Bash, Write, or Edit tools

Claude Plugins

A curated collection of 38 Claude Code plugins providing 300+ skills and 14 agents for development workflows.

Install the Marketplace

Install the full plugin collection as a marketplace:

claude plugin install laurigates/claude-plugins

This registers all 38 plugins. You can then enable individual plugins as needed.

Install Individual Plugins

If you prefer to install plugins one at a time:

claude plugin install laurigates-claude-plugins/<plugin-name>

For example:

claude plugin install laurigates-claude-plugins/git-plugin
claude plugin install laurigates-claude-plugins/python-plugin
claude plugin install laurigates-claude-plugins/testing-plugin

Getting Started

1. Install the marketplace using the command above.
2. Run a health check: run /health:check, then /health:audit, to diagnose your setup and get plugin recommendations for your stack.
3. Follow the tiered setup: the Plugin Map provides a recommended install order (Tier 0 foundation through Tier 3+ stack-specific), decision trees, and project presets.

MCP Server Setup

Use the included justfile for quick MCP server configuration:

# Set up all MCP servers and cclsp
just claude-setup

# Or install individual servers
just mcp-github
just mcp-playwright
just mcp-context7

Alternatively, use the /configure:mcp skill for interactive configuration.

Prerequisites

Bash 5+: required for the shell scripts. macOS ships Bash 3.2, so install a newer version via brew install bash.

Plugins by Category

AI & Agents

Plugin	Skills	Description
agent-patterns-plugin	16	Multi-agent coordination and orchestration patterns
agents-plugin	1 + 10 agents	Task-focused agents for test, review, debug, docs, and CI workflows
langchain-plugin	4	LangChain JS/TS development - agents, chains, LangGraph, Deep Agents
prompt-engineering-plugin	1	Prompt engineering for accurate, grounded responses - anti-hallucination workflow

Development

Plugin	Skills	Description
api-plugin	2	API integration and testing - REST endpoints, client generation
blueprint-plugin	30	Blueprint Development methodology - PRD/PRP workflow with version tracking
home-assistant-plugin	4	Home Assistant configuration - automations, scripts, scenes, entities
obsidian-plugin	6	Obsidian CLI operations - vault management, search, properties, tasks
project-plugin	6	Project initialization, management, maintenance, and continuous development

Languages

Plugin	Skills	Description
css-plugin	2	CSS tooling - Lightning CSS transpilation, UnoCSS atomic utilities
python-plugin	17	Python ecosystem - uv, ruff, pytest, basedpyright, packaging
rust-plugin	5	Rust development - cargo, clippy, nextest, memory safety
typescript-plugin	17	TypeScript development - Bun, Biome, ESLint, strict types

Quality & Testing

Plugin	Skills	Description
code-quality-plugin	13	Code review, refactoring, linting, static analysis, debugging methodology
evaluate-plugin	4 + 3 agents	Skill evaluation and benchmarking - test effectiveness, grade results
codebase-attributes-plugin	3	Structured codebase health attributes with severity-based agent routing
feedback-plugin	1	Session feedback analysis - capture skill bugs and enhancements as issues
testing-plugin	15	Test execution, TDD workflow, Vitest, Playwright, mutation testing

Version Control

Plugin	Skills	Description
git-plugin	27 + 1 agent	Git workflows - commits, branches, PRs, worktrees, release-please

CI/CD

Plugin	Skills	Description
finops-plugin	7	GitHub Actions FinOps - billing, cache usage, workflow efficiency
github-actions-plugin	8	GitHub Actions CI/CD - workflows, authentication, inspection

Infrastructure

Plugin	Skills	Description
configure-plugin	42	Project infrastructure standards - pre-commit, CI/CD, Docker, testing
container-plugin	9 + 1 agent	Container development - Docker, registry, Skaffold, OrbStack
kubernetes-plugin	8 + 1 agent	Kubernetes and Helm - deployments, charts, releases, ArgoCD
migration-patterns-plugin	2	Safe database and system migration - dual write, shadow mode
networking-plugin	6	Network diagnostics, discovery, monitoring, HTTP load testing
terraform-plugin	6 + 1 agent	Terraform and Terraform Cloud - infrastructure as code

Similar Plugins

plugin-eval

prompt-engineering-plugin

rashomon

skill-compass

More by laurigates

git-plugin

hooks-plugin

tools-plugin

comfyui-plugin

configure-plugin