From harness-eval
Compares harness evaluation history: shows score trends, per-tier deltas, diminishing returns detection, grade projections, bilingual reports, and ASCII charts. Useful after 2+ evaluations.
npx claudepluginhub whchoi98/harness-eval --plugin harness-eval
This skill uses the workspace's default tool permissions.
You are performing a harness evaluation comparison. This analyzes evaluation history to show trends and improvements.
Get evaluation history: Run:
HARNESS_EVAL_ROOT="${CLAUDE_PLUGIN_ROOT}" bash "${CLAUDE_PLUGIN_ROOT}/scripts/history.sh" "$(pwd)" list
This returns a JSON array of past evaluations.
Check minimum history: If fewer than 2 evaluations exist, inform the user:
"Not enough evaluation history to compare. Run /harness-eval quick or /harness-eval standard at least twice to enable comparison."
Get comparison data: Run:
HARNESS_EVAL_ROOT="${CLAUDE_PLUGIN_ROOT}" bash "${CLAUDE_PLUGIN_ROOT}/scripts/history.sh" "$(pwd)" compare
This returns the delta between the current and previous evaluations.
Present bilingual comparison report (English first, then ---, then Korean):
# Harness Evaluation Comparison
## Current vs Previous
| Metric | Previous | Current | Delta |
|--------|----------|---------|-------|
| Score | {prev_score}/10 | {curr_score}/10 | {delta} |
| Grade | {prev_grade} | {curr_grade} | {changed?} |
## Per-Tier Changes
| Tier | Previous | Current | Delta |
|------|----------|---------|-------|
| Basic | X/Y | X/Y | ↑/↓/→ |
| Functional | X/Y | X/Y | ↑/↓/→ |
| Robust | X/Y | X/Y | ↑/↓/→ |
| Production | X/Y | X/Y | ↑/↓/→ |
---
# 하네스 평가 비교
## 현재 vs 이전
| 지표 | 이전 | 현재 | 변화 |
|------|------|------|------|
| 점수 | {prev_score}/10 | {curr_score}/10 | {delta} |
| 등급 | {prev_grade} | {curr_grade} | {changed?} |
## 단계별 변화
| 단계 | 이전 | 현재 | 변화 |
|------|------|------|------|
| 기본 | X/Y | X/Y | ↑/↓/→ |
| 기능적 | X/Y | X/Y | ↑/↓/→ |
| 견고 | X/Y | X/Y | ↑/↓/→ |
| 프로덕션 | X/Y | X/Y | ↑/↓/→ |
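The Delta columns in the tables above could be filled in with a small helper like the one below. This is only a sketch: `score_delta` and `delta_arrow` are hypothetical names, not part of harness-eval, and it assumes the previous and current values have already been extracted from the `compare` output as plain numbers.

```shell
# Hypothetical helpers (not provided by harness-eval).

# score_delta PREV CURR: print the signed change with one decimal, e.g. +0.7
score_delta() {
  awk -v p="$1" -v c="$2" 'BEGIN { printf "%+.1f\n", (c + 0) - (p + 0) }'
}

# delta_arrow PREV CURR: print ↑ if the value rose, ↓ if it fell, → if unchanged
delta_arrow() {
  awk -v p="$1" -v c="$2" 'BEGIN {
    if ((c + 0) > (p + 0)) print "↑"
    else if ((c + 0) < (p + 0)) print "↓"
    else print "→"
  }'
}
```

For a per-tier row, the passed-check counts (the X in X/Y) would be the natural inputs, for example `delta_arrow 3 5`.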
Score history chart: If 3+ evaluations exist, show an ASCII bar chart:
## Score History
eval-04-06-001 ████████░░ 7.2 B
eval-04-06-002 █████████░ 7.9 B
eval-04-06-003 █████████░ 8.5 A-
Use █ for filled cells and ░ for empty cells, with a total width of 10 characters.
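One way to build such a bar is sketched below. The function name `render_bar` is hypothetical, and the rounding rule (nearest whole cell, halves rounded up) is an assumption: the source does not say how fractional scores map to filled cells, and the sample rows may follow a different rule.

```shell
# render_bar SCORE: print a 10-character bar for a score on a 0-10 scale.
# Hypothetical helper; assumes round-to-nearest cell with halves rounded up.
render_bar() {
  score=$1
  width=10
  # Round to a whole number of cells and clamp to the range 0..width.
  filled=$(awk -v s="$score" -v w="$width" 'BEGIN {
    f = int(s + 0.5)
    if (f < 0) f = 0
    if (f > w) f = w
    print f
  }')
  bar=""
  i=0
  while [ "$i" -lt "$width" ]; do
    if [ "$i" -lt "$filled" ]; then
      bar="${bar}█"
    else
      bar="${bar}░"
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$bar"
}
```

A full chart row could then be assembled with something like `printf '%s %s %s %s\n' "$id" "$(render_bar "$score")" "$score" "$grade"`, where `id`, `score`, and `grade` are placeholders for values read from the history output.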
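Trend analysis: Summarize the score trend across evaluations, flag diminishing returns, and project the grade trajectory.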
Recommendations: Based on the comparison, suggest the highest-impact actions to continue improving.
Save reports to files: Save the English and Korean comparison reports as separate files:
mkdir -p .harness-eval/reports
.harness-eval/reports/eval-{YYYY-MM-DD}-{NNN}-compare-en.md
.harness-eval/reports/eval-{YYYY-MM-DD}-{NNN}-compare-ko.md
Use the Write tool to create each file. Inform the user of the saved file paths.
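The naming pattern above can be expanded as in the following sketch. `compare_report_path` is a hypothetical helper, and how `{NNN}` is chosen is not stated in the source; this version simply assumes it is a sequence number zero-padded to three digits and passed in as a plain integer (no leading zeros).

```shell
# compare_report_path DATE SEQ LANG: print the report path for one language.
# Hypothetical helper; DATE is YYYY-MM-DD, SEQ a plain integer, LANG "en" or "ko".
compare_report_path() {
  printf '.harness-eval/reports/eval-%s-%03d-compare-%s.md\n' "$1" "$2" "$3"
}
```

For example, `compare_report_path "$(date +%F)" 1 en` yields the English report path for today's first comparison under these assumptions.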
Be analytical and forward-looking. Focus on trajectory and momentum, not just current state.
Always produce the report in both English and Korean. English section first, then a horizontal rule (---), then the Korean section. Tables, scores, and charts are identical in both sections — only the prose text (analysis, recommendations, warnings) differs.