Evaluate and optimize AI agents
Evaluate AI agents from the terminal. No server. No signup.
npm install -g agentv
agentv init
agentv eval evals/example.yaml
That's it. Results in seconds, not minutes.
AgentV runs evaluation cases against your AI agents and scores them with deterministic code graders and customizable LLM graders. Everything lives in Git: YAML eval files, markdown judge prompts, JSONL results.
# evals/math.yaml
description: Math problem solving
tests:
  - id: addition
    input: What is 15 + 27?
    expected_output: "42"
    assertions:
      - type: contains
        value: "42"
agentv eval evals/math.yaml
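A deterministic code grader is just a script that checks the agent's output programmatically. As a minimal sketch, a grader for the eval above might test for the expected token; the function shape here is illustrative, since AgentV's actual grader contract (how the output is passed in and how the verdict is reported) may differ:

```typescript
// Hypothetical deterministic grader: passes only if the agent's
// answer contains the exact token "42". The signature is an
// assumption for illustration, not AgentV's documented contract.
function grade(output: string): boolean {
  // \b word boundaries avoid false positives like "420" or "142".
  return /\b42\b/.test(output);
}

console.log(grade('15 + 27 = 42')); // true
console.log(grade('15 + 27 = 43')); // false
```

Because the check is plain code, it runs instantly and gives the same verdict every time, unlike an LLM grader.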
1. Install and initialize:
npm install -g agentv
agentv init
2. Configure targets in .agentv/targets.yaml — point to your agent or LLM provider.
3. Create an eval in evals/:
description: Code generation quality
tests:
  - id: fizzbuzz
    criteria: Write a correct FizzBuzz implementation
    input: Write FizzBuzz in Python
    assertions:
      - type: contains
        value: "fizz"
      - type: code-grader
        command: ./validators/check_syntax.py
      - type: llm-grader
        prompt: ./graders/correctness.md
4. Run it:
agentv eval evals/my-eval.yaml
5. Compare results across targets:
agentv compare .agentv/results/runs/<timestamp>/index.jsonl
agentv eval evals/my-eval.yaml # JSONL (default)
agentv eval evals/my-eval.yaml -o report.html # HTML dashboard
agentv eval evals/my-eval.yaml -o results.xml # JUnit XML for CI
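Because the default output is plain JSONL, results are easy to post-process with a few lines of code. A minimal sketch, assuming each line is a JSON record with `id` and `passed` fields (the actual schema may differ; inspect a real run's `index.jsonl` first):

```typescript
// Parse a JSONL results file line by line. The record shape
// (id, passed) is an assumption made for illustration.
const sample = [
  '{"id":"addition","passed":true}',
  '{"id":"fizzbuzz","passed":false}',
].join('\n');

const records = sample
  .split('\n')
  .filter((line) => line.trim().length > 0) // skip blank lines
  .map((line) => JSON.parse(line));

const passed = records.filter((r) => r.passed).length;
console.log(`${passed}/${records.length} passed`); // 1/2 passed
```

The same loop works for real files: read the file, split on newlines, and parse each line independently.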
Use AgentV programmatically:
import { evaluate } from '@agentv/core';
const { results, summary } = await evaluate({
  tests: [
    {
      id: 'greeting',
      input: 'Say hello',
      assertions: [{ type: 'contains', value: 'Hello' }],
    },
  ],
});
console.log(`${summary.passed}/${summary.total} passed`);
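The `summary` object makes it straightforward to gate a CI pipeline. A small sketch, reusing the `passed`/`total` shape from the snippet above (treating any shortfall as a build failure is the only assumption added here):

```typescript
// Sketch: convert an evaluation summary into a CI exit code.
// The { passed, total } shape mirrors the evaluate() example.
interface Summary {
  passed: number;
  total: number;
}

function ciVerdict(summary: Summary): number {
  // Exit code 0 only when every test passed; 1 otherwise.
  return summary.passed === summary.total ? 0 : 1;
}

console.log(ciVerdict({ passed: 3, total: 3 })); // 0
console.log(ciVerdict({ passed: 2, total: 3 })); // 1
```

In a real pipeline you would call `process.exit(ciVerdict(summary))` after `evaluate()` resolves.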
Full docs at agentv.dev/docs.
git clone https://github.com/EntityProcess/agentv.git
cd agentv
bun install && bun run build
bun test
See AGENTS.md for development guidelines.
MIT