From harness-eval-lab
Full qualitative review of the agent setup. Reads every file, applies per-component rubrics, runs 21 cross-type optimization checks, and produces KEEP/REVIEW/REMOVE verdicts. Use when the user wants a deep review, redundancy check, or quality assessment of their setup.
npx claudepluginhub redhat-community-ai-tools/harness-eval-lab
Slash command: /harness-eval-lab:eval-setup-review
Full qualitative review of the user's agent setup. Claude reads every file and evaluates quality, redundancy, coherence, and optimization opportunities.
Before doing anything else, ask the user:
Where should I present the results?
- Terminal - print the report here in the conversation
- File - write a markdown report to a file (you'll choose the path)
Wait for their answer before proceeding.
Determine the setup path. If the user doesn't specify one, use the current working directory.
uv run python skills/eval-setup-lint/scripts/run_assessment.py <setup-path> recommended
Read the JSON output. This gives you per-component diagnostics, token budget, context utilization, trigger overlaps, and dependency findings.
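A minimal sketch of this step, assuming the assessment script prints a single JSON object to stdout (the source does not document the schema, so nothing below reads specific keys). The helper names `build_command`, `parse_report`, and `run_assessment` are hypothetical; the script path and the trailing `recommended` argument are copied verbatim from the command above.

```python
import json
import os
import subprocess
from typing import Optional

# Path copied from the command shown above.
SCRIPT = "skills/eval-setup-lint/scripts/run_assessment.py"


def build_command(setup_path: Optional[str] = None) -> list[str]:
    """Build the Layer 1 command, defaulting to the current working directory."""
    path = setup_path or os.getcwd()
    # The trailing "recommended" argument is kept exactly as written in the source.
    return ["uv", "run", "python", SCRIPT, path, "recommended"]


def parse_report(stdout: str) -> dict:
    """Parse the script output. Assumption: stdout holds one JSON object."""
    report = json.loads(stdout)
    if not isinstance(report, dict):
        raise ValueError("expected a JSON object at the top level")
    return report


def run_assessment(setup_path: Optional[str] = None) -> dict:
    """Run the assessment and return the parsed report for later steps."""
    result = subprocess.run(
        build_command(setup_path), capture_output=True, text=True, check=True
    )
    return parse_report(result.stdout)
```

Calling `run_assessment()` with no argument would assess the current directory; the returned dictionary is kept as context for the qualitative review rather than shown to the user.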
Do NOT present the Layer 1 report separately. Use it as context for the qualitative review.
Read the actual content of every component: SKILL.md files (including reference files in subdirectories), command files, agent files, CLAUDE.md, and settings.json for hooks.
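As a rough illustration of "read every component", the sketch below gathers the files named above into one inventory. The directory layout (`skills/<name>/SKILL.md`, `commands/*.md`, `agents/*.md`, and `CLAUDE.md` and `settings.json` at the setup root) is an assumption, not something the source specifies, and `collect_components` is a hypothetical helper.

```python
from pathlib import Path


def collect_components(root: Path) -> dict[str, list[Path]]:
    """Collect component files under an ASSUMED layout; adjust the globs as needed."""
    skills: list[Path] = []
    for skill_md in sorted(root.glob("skills/*/SKILL.md")):
        # Include SKILL.md plus every reference file in the skill's subdirectories.
        skills.extend(sorted(p for p in skill_md.parent.rglob("*") if p.is_file()))

    def existing(name: str) -> list[Path]:
        """Return a one-item list if the root-level file exists, else an empty list."""
        candidate = root / name
        return [candidate] if candidate.is_file() else []

    return {
        "skills": skills,
        "commands": sorted(root.glob("commands/*.md")),
        "agents": sorted(root.glob("agents/*.md")),
        "claude_md": existing("CLAUDE.md"),
        "settings": existing("settings.json"),  # hooks are read from here
    }
```

An empty list for a component type simply means that type is absent from the setup, which is itself worth noting in the review.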
For each component, provide:
Use the per-component rubric files for detailed criteria:
- rubric/skills-rubric.md
- rubric/claude-md-rubric.md
- rubric/commands-rubric.md
- rubric/agents-rubric.md
- rubric/hooks-rubric.md
Read rubric/cross-type-checks.md and answer all 21 checks with YES/NO and a one-line explanation. These check whether components should be transformed (skill to hook?), merged, or removed.
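One way to make sure all 21 checks are answered in the required shape is to record each answer and validate the set before writing the report. This is a sketch only: the `CheckAnswer` type, the `validate_answers` helper, and the assumption that checks are numbered 1 through 21 are hypothetical and not taken from rubric/cross-type-checks.md.

```python
from dataclasses import dataclass

EXPECTED_CHECKS = 21  # count stated in the instructions above


@dataclass(frozen=True)
class CheckAnswer:
    number: int        # assumed 1-based check number
    answer: str        # "YES" or "NO"
    explanation: str   # one line


def validate_answers(answers: list[CheckAnswer]) -> list[str]:
    """Return a list of problems; an empty list means the answers are complete."""
    problems: list[str] = []
    seen = {a.number for a in answers}
    missing = sorted(set(range(1, EXPECTED_CHECKS + 1)) - seen)
    if missing:
        problems.append(f"missing checks: {missing}")
    for a in answers:
        if a.answer not in ("YES", "NO"):
            problems.append(f"check {a.number}: answer must be YES or NO")
        text = a.explanation.strip()
        if not text or "\n" in text:
            problems.append(f"check {a.number}: explanation must be one line")
    return problems
```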
Based on everything from Steps 2-5, summarize findings in 5 areas. Do not assign numeric scores. Count issues and cite specifics.
Structure: Count Layer 1 structural/frontmatter errors. List by name. "N errors (list)" or "Clean. No issues found."
Security: Count Layer 1 security findings + qualitative concerns. List by name. "N issues (list)" or "Clean. No issues found."
Coherence: Count duplicates, conflicts, trigger overlaps, broken dependencies, cross-type issues. List specifics.
Efficiency: Report always-loaded vs on-demand token ratio, heaviest component with token counts, and context utilization highlights (any models where peak > 20%).
Redundancy: Count components containing content Claude already knows by default. List which ones and why.
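The count-and-cite format used by these areas can be sketched as two small helpers. The function names are hypothetical, and the example assumes peak context utilization is expressed as a percentage between 0 and 100.

```python
def summarize(items: list[str], noun: str) -> str:
    """Format an area as 'N <noun> (list)' or the clean message, with no score."""
    if not items:
        return "Clean. No issues found."
    # The caller supplies the noun ("errors", "issues"); no pluralization is attempted.
    return f"{len(items)} {noun} ({', '.join(items)})"


def high_utilization(peaks: dict[str, float], threshold: float = 20.0) -> dict[str, float]:
    """Keep only the models whose peak utilization exceeds the threshold (percent)."""
    return {model: peak for model, peak in peaks.items() if peak > threshold}
```

For instance, `summarize(["missing-name", "bad-yaml"], "errors")` yields `2 errors (missing-name, bad-yaml)`, where the item names are invented placeholders.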
Read report-format.md for the full report structure. The report must include:
If the user chose terminal: print the report in the conversation.
If the user chose file: write the report as markdown to the path they specified (or suggest eval-setup-review-report.md in the current directory). Tell them the file path when done.
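The two delivery options could be handled along these lines. This is a sketch under assumptions: `deliver` is a hypothetical helper, and the strings "terminal" and "file" stand in for however the user's earlier answer is recorded.

```python
from pathlib import Path
from typing import Optional

# Default file name suggested in the instructions above.
DEFAULT_REPORT = "eval-setup-review-report.md"


def deliver(report: str, destination: str, path: Optional[str] = None) -> Optional[Path]:
    """Print the report, or write it as markdown and return the path that was written."""
    if destination == "terminal":
        print(report)
        return None
    if destination == "file":
        # Fall back to the suggested name in the current directory.
        target = Path(path) if path else Path.cwd() / DEFAULT_REPORT
        target.write_text(report, encoding="utf-8")
        return target
    raise ValueError(f"unknown destination: {destination!r}")
```

Returning the written path makes it straightforward to tell the user where the file ended up.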