From harness-eval-lab
Deep-evaluate a single skill with static analysis and qualitative issue detection, both individually and in context of the full setup. Use when the user wants to check if a specific skill is worth keeping, well-built, or redundant.
npx claudepluginhub redhat-community-ai-tools/harness-eval-lab
How this skill is triggered: by the user, by Claude, or both
Slash command
/harness-eval-lab:eval-skill
This skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Deep-evaluate a single skill using two layers: static analysis (Layer 1) and qualitative issue detection (Layer 2), both individually and in context of the full setup.
Deep-evaluate a single skill using two layers: static analysis (Layer 1) and qualitative issue detection (Layer 2), both individually and in context of the full setup.
Before doing anything else, ask the user:
Where should I present the results?
- Terminal - print the report here in the conversation
- File - write a markdown report to a file (you'll choose the path)
Wait for their answer before proceeding.
Determine the skill path. If the user says a skill name, find it under skills/<name>/SKILL.md.
Determine the setup context path (usually the current working directory).
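The skill-name lookup described above can be sketched as follows. This is a minimal illustration, not the skill's actual code: the function name `resolve_skill_path` and the `workspace` parameter are hypothetical, and the sketch does not check that the file exists.

```python
from pathlib import Path


def resolve_skill_path(name_or_path: str, workspace: Path) -> Path:
    """Return the SKILL.md path for a skill name or an explicit path.

    Hypothetical helper: only the skills/<name>/SKILL.md layout comes
    from the instructions above.
    """
    candidate = Path(name_or_path)
    # An explicit path to a SKILL.md file is used as given.
    if candidate.name == "SKILL.md":
        return candidate
    # Otherwise treat the argument as a skill name under skills/<name>/.
    return workspace / "skills" / name_or_path / "SKILL.md"
```

Here the setup context path would simply be the `workspace` argument, which is usually the current working directory.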
uv run python skills/eval-skill/scripts/run_skill_eval.py <skill-path> <context-path> recommended
If no context path, pass - as the second argument.
Read the JSON output. It contains diagnostics, token count, and contextual findings from Layer 1.
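As a rough sketch, the Layer 1 invocation and JSON read could be wrapped like this. The helper names `build_eval_command` and `run_layer1` are hypothetical, the trailing `recommended` argument is copied verbatim from the command above, and it is an assumption that the script prints a single JSON document to stdout.

```python
import json
import subprocess

SCRIPT = "skills/eval-skill/scripts/run_skill_eval.py"


def build_eval_command(skill_path, context_path=None):
    """Build the argument list for the Layer 1 script (hypothetical helper)."""
    # "-" stands in for a missing context path, as described above.
    context_arg = context_path if context_path else "-"
    # The trailing "recommended" argument is copied from the documented command.
    return ["uv", "run", "python", SCRIPT, skill_path, context_arg, "recommended"]


def run_layer1(skill_path, context_path=None):
    """Run the script and parse its output.

    Assumption: the script writes one JSON document to stdout.
    """
    result = subprocess.run(
        build_eval_command(skill_path, context_path),
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with a non-zero status
    )
    return json.loads(result.stdout)
```

The keys of the returned dictionary are not assumed here; inspect the parsed output for the diagnostics, token count, and contextual findings.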
Read the skill's actual content:
Also read for context (don't check these, they're context for evaluating the target skill):
4. All OTHER skill SKILL.md files in the workspace
5. CLAUDE.md
6. Hooks in .claude/settings.json
Read rubric/skills-rubric.md for the issue categories and what to flag.
Check the skill against all 5 categories. For each issue found, cite specific evidence from the content.
Verdict: KEEP (no issues or minor only), REVIEW (multiple issues), REMOVE (fundamentally broken/redundant)
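One possible reading of the verdict rule, sketched in code. The `Issue` type and its severity labels (`minor`, `major`, `fundamental`) are hypothetical; the rubric file defines the real categories. Treating a single non-minor issue as REVIEW is also an assumption, since the rule above only names "multiple issues".

```python
from dataclasses import dataclass


@dataclass
class Issue:
    category: str
    evidence: str  # specific evidence cited from the skill content
    severity: str = "minor"  # hypothetical labels: "minor", "major", "fundamental"


def verdict(issues):
    """Map a list of issues to KEEP, REVIEW, or REMOVE."""
    # Fundamentally broken or redundant skills are removed.
    if any(issue.severity == "fundamental" for issue in issues):
        return "REMOVE"
    # No issues, or minor issues only, means the skill is kept.
    if all(issue.severity == "minor" for issue in issues):
        return "KEEP"
    # Assumption: any remaining non-minor issue sends the skill to review.
    return "REVIEW"
```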
Read rubric/contextual-analysis.md and evaluate all 5 contextual dimensions.
Check redundancy against three sources:
Read report-format.md for the full report structure.
The report must include:
If the user chose terminal: print the report in the conversation.
If the user chose file: write the report as markdown to the path they specified (or suggest eval-skill-report.md in the current directory). Tell them the file path when done.
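The two output modes above could be handled along these lines. This is a sketch only: `deliver_report` and its parameters are hypothetical names, and the default filename is the one suggested above.

```python
import sys
from pathlib import Path

DEFAULT_REPORT_NAME = "eval-skill-report.md"


def deliver_report(report, destination, path=None, out=sys.stdout):
    """Print the report or write it to a file, depending on the user's choice.

    Returns the file path when a file was written, otherwise None.
    """
    if destination == "terminal":
        out.write(report + "\n")
        return None
    if destination == "file":
        # Fall back to the suggested default in the current directory.
        target = Path(path) if path else Path.cwd() / DEFAULT_REPORT_NAME
        target.write_text(report, encoding="utf-8")
        return target
    raise ValueError(f"unknown destination: {destination!r}")
```

Returning the path makes it straightforward to tell the user where the file was written.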