Skill

eval-review

Interactively reviews skill evaluation results: presents judge scores and outputs for human feedback, proposes targeted SKILL.md improvements from user input. Use for qualitative review after evals.

Python

Bash

Markdown

testing

developer-tools

npx claudepluginhub opendatahub-io/agent-eval-harness --plugin agent-eval-harness

Tool Access

This skill is limited to using the following tools:

ReadWriteEditBashGlobGrepAgentAskUserQuestionSkill

Preview

You are an interactive reviewer. You present evaluation results to the user, collect their qualitative feedback, analyze patterns in what judges missed vs what humans noticed, and propose targeted SKILL.md improvements. You work alongside `/eval-optimize` (automated fixes) by catching things that judges can't — tone, intent, user experience.

Supporting Assets

prompts/review-results.mdscripts/agent_eval

SKILL.md

Similar Skills

strategic-compact

179.0k

Suggests manual /compact at logical task boundaries in long Claude Code sessions and multi-phase tasks to avoid arbitrary auto-compaction losses.

1 file

ecc

Stats

Stars7

Forks8

Last CommitApr 21, 2026

Actions

View Source View Plugin View on GitHub View README

Help us improve

Share bugs, ideas, or general feedback.

Argument	Required	Default	Description
`--run-id <id>`	yes	—	Which eval run to review
`--config <path>`	no	`eval.yaml`	Path to eval config
`--case <filter>`	no	all	Substring match to select specific cases

Argument

Required

Default

Description

--run-id <id>

yes

—

Which eval run to review

--config <path>

eval.yaml

Path to eval config

--case <filter>

all

Substring match to select specific cases

run_id: "<id>" reviewed_cases: <count> feedback_cases: <count_with_feedback> reviewer: "human" feedback: case-001-name: "User's comment about this case" case-002-name: "Another comment" case-003-name: "" # empty = acceptable

Argument	Required	Default	Description
`--run-id <id>`	yes	—	Which eval run to review
`--config <path>`	no	`eval.yaml`	Path to eval config
`--case <filter>`	no	all	Substring match to select specific cases

Argument

Required

Default

Description

--run-id <id>

yes

—

Which eval run to review

--config <path>

eval.yaml

Path to eval config

--case <filter>

all

Substring match to select specific cases

eval-review

Tool Access

Preview

Supporting Assets

SKILL.md

Similar Skills

Help us improve

Help us improve

eval-review

Tool Access

Preview

Supporting Assets

SKILL.md

Step 0: Parse Arguments

Step 1: Load Results

Step 2: Present Overview

Step 3: Walk Through Cases

Step 4: Check Transcripts (if available)

Step 5: Save Feedback

Step 6: Analyze Patterns

Step 7: Propose Changes

Step 8: Next Steps

Rules

Similar Skills

Help us improve

Step 0: Parse Arguments

Step 1: Load Results

Step 2: Present Overview

Step 3: Walk Through Cases

Step 4: Check Transcripts (if available)

Step 5: Save Feedback

Step 6: Analyze Patterns

Step 7: Propose Changes

Step 8: Next Steps

Rules