From harness-claude
Enforces tiered performance gates blocking commits/merges on cyclomatic complexity, coupling, structural issues, and runtime regressions. Use after changes, on PRs, for audits.
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude
This skill uses the workspace's default tool permissions.
> Performance enforcement and benchmark management. Tier-based gates block commits and merges based on complexity, coupling, and runtime regression severity.
No merge with Tier 1 performance violations. No commit with cyclomatic complexity exceeding the error threshold.
Tier 1 violations are non-negotiable blockers. If a Tier 1 violation is detected, execution halts and the violation must be resolved before any further progress. Do not attempt workarounds.
Assess current performance posture. Run check_performance to evaluate current performance metrics against defined budgets. Run get_critical_paths to identify performance-critical functions and hot paths.
Run structural checks. Execute harness check-perf --structural to compute complexity metrics for all changed files:
Run coupling checks. Execute harness check-perf --coupling to compute coupling metrics:
Classify violations by tier:
If Tier 1 violations found, report them immediately and STOP. Do not proceed to benchmarks. The violations must be fixed first.
If no violations found, proceed to Phase 2.
Hotspot scoring and coupling analysis benefit from the knowledge graph but work without it.
Staleness sensitivity: Medium -- auto-refresh if >10 commits stale. Hotspot scoring uses churn data which does not change rapidly.
| Feature | With Graph | Without Graph |
|---|---|---|
| Hotspot scoring (churn x complexity) | GraphComplexityAdapter computes from graph nodes | git log --format="%H" -- <file> for per-file commit count; complexity from check-perf --structural output; multiply manually |
| Coupling ratio | GraphCouplingAdapter computes from graph edges | Parse import statements, count fan-out/fan-in per file |
| Critical path resolution | Graph inference (high fan-in) + @perf-critical annotations | @perf-critical annotations only; grep for decorator/comment |
| Transitive dependency depth | Graph BFS depth | Follow import chains, 2 levels deep |
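The "Without Graph" coupling fallback in the table above (parse import statements, count fan-out and fan-in per file) could look roughly like the sketch below. It is an illustration rather than the harness implementation: `couplingCounts` and `resolveSpecifier` are hypothetical names, sources are assumed to be supplied as a path-to-text record, and only relative, static `import` statements are counted.

```typescript
// Sketch only: fan-out / fan-in from static import statements.
// Assumptions (not taken from the harness tooling): sources arrive as a
// path -> text record, only relative specifiers count, and ".ts" is
// appended when a specifier does not already end in it.

type CouplingCounts = { fanOut: number; fanIn: number };

// Resolve "./x" or "../x" against the importing file's directory.
function resolveSpecifier(fromFile: string, spec: string): string | null {
  if (spec.charAt(0) !== ".") return null; // package import, ignored here
  const parts = fromFile.split("/").slice(0, -1);
  for (const seg of spec.split("/")) {
    if (seg === "." || seg === "") continue;
    if (seg === "..") parts.pop();
    else parts.push(seg);
  }
  const joined = parts.join("/");
  return /\.ts$/.test(joined) ? joined : joined + ".ts";
}

function couplingCounts(
  sources: Record<string, string>
): Record<string, CouplingCounts> {
  const counts: Record<string, CouplingCounts> = {};
  const files = Object.keys(sources);
  for (const file of files) counts[file] = { fanOut: 0, fanIn: 0 };

  for (const file of files) {
    // Matches `import ... from "x"` and bare `import "x"`.
    const importRe = /^\s*import\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]/gm;
    const targets: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = importRe.exec(sources[file])) !== null) {
      const target = resolveSpecifier(file, match[1]);
      // Count each in-project target once per importing file.
      if (
        target !== null &&
        target !== file &&
        Object.prototype.hasOwnProperty.call(counts, target) &&
        targets.indexOf(target) === -1
      ) {
        targets.push(target);
      }
    }
    counts[file].fanOut = targets.length;
    for (const target of targets) counts[target].fanIn += 1;
  }
  return counts;
}
```

Dynamic `import()` calls, `require`, and path aliases are deliberately out of scope here, so the counts are a lower bound compared with what a graph-based adapter would report.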
Notice when running without graph: "Running without graph (run harness scan to enable hotspot scoring and coupling analysis)"
Impact on tiers: Without graph, Tier 1 hotspot detection is degraded. Hotspot scoring falls back to churn-only (no complexity multiplication). This limitation is documented in the performance report output.
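As a rough illustration of hotspot scoring (churn x complexity), the sketch below multiplies a per-file commit count by a complexity value and ranks the results. The names `FileStats` and `scoreHotspots` are hypothetical, and reading "top 5%" as "the highest-scoring 5% of files, at least one" is an assumption. Passing a complexity of 1 for every file reduces the score to churn only.

```typescript
// Sketch only: hotspot score = commit count (churn) x complexity.
// Assumption: "top 5%" means the top 5% of files by score, minimum one file.

type FileStats = { path: string; commits: number; complexity: number };
type Hotspot = FileStats & { score: number; inTopFivePercent: boolean };

function scoreHotspots(files: FileStats[]): Hotspot[] {
  const ranked = files
    .map((f) => ({ ...f, score: f.commits * f.complexity }))
    .sort((a, b) => b.score - a.score); // highest score first

  const cutoff = Math.max(1, Math.ceil(ranked.length * 0.05));
  return ranked.map((f, i) => ({ ...f, inTopFivePercent: i < cutoff }));
}
```

The commit count would come from something like `git log --format="%H" -- <file>` piped through a line count, as the fallback table describes.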
This phase runs only when .bench.ts files exist in the project. If none are found, skip to Phase 3.
Check baseline lock-in. Before running benchmarks, verify baselines are kept in sync:
- List the .bench.ts files changed in this PR: git diff --name-only | grep '.bench.ts'
- If any .bench.ts files are new or modified:
  - Verify that .harness/perf/baselines.json is also modified in this PR.
  - If it is not, flag the mismatch and advise running harness perf baselines update and committing the result.
- If no .bench.ts files changed: skip this check.
- The check can also be run standalone with the --check-baselines flag.

Check for benchmark files. Scan the project for *.bench.ts files. If none exist, skip this phase entirely.
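The baseline lock-in check described above reduces to a small decision over the list of changed files. The sketch below assumes that list has already been obtained (for example from `git diff --name-only`); `checkBaselineLockIn` and the status names are hypothetical, not part of the harness CLI.

```typescript
// Sketch only: baseline lock-in decision over a list of changed file paths.
// Assumption: the caller supplies the changed files for the PR.

const BASELINES_PATH = ".harness/perf/baselines.json";

type LockInResult =
  | { status: "skipped" } // no benchmark files changed
  | { status: "ok"; benchFiles: string[] } // baselines updated alongside
  | { status: "stale-baselines"; benchFiles: string[] }; // baselines untouched

function checkBaselineLockIn(changedFiles: string[]): LockInResult {
  const benchFiles = changedFiles.filter((f) => f.endsWith(".bench.ts"));
  if (benchFiles.length === 0) return { status: "skipped" };

  const baselinesChanged = changedFiles.some((f) => f === BASELINES_PATH);
  return baselinesChanged
    ? { status: "ok", benchFiles }
    : { status: "stale-baselines", benchFiles };
}
```

A `stale-baselines` result is the case where the developer would be told to run harness perf baselines update and commit the result.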
Verify clean working tree. Run git status --porcelain. If there are uncommitted changes, STOP. Benchmarks on dirty trees produce unreliable results.
Run benchmarks. Execute harness perf bench to run all benchmark suites.
Load baselines. Load existing baselines via get_perf_baselines before running new benchmarks. Read .harness/perf/baselines.json for previous benchmark results. If no baselines exist, treat this as a baseline-capture run.
Compare results against baselines using the RegressionDetector:
Resolve critical paths via CriticalPathResolver:
- @perf-critical annotations in source files

Flag regressions by tier:
If this is a baseline-capture run, report results without regression comparison. Recommend running harness perf baselines update to persist. After benchmarks pass, update baselines via update_perf_baselines to record new performance targets.
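To make the comparison step concrete, here is a minimal sketch of tiering a single benchmark result against its baseline. It uses the thresholds from this document's tier table (>5% on a critical path is Tier 1, >10% elsewhere is Tier 2, >5% elsewhere is Tier 3) and the 3% default noise margin. How the real RegressionDetector applies the noise margin is not specified here, so treating it as a simple "ignore anything at or below the margin" filter is an assumption, and all names are hypothetical.

```typescript
// Sketch only: classify one benchmark regression by tier.
// Assumption: changes at or below the noise margin are ignored, and the
// raw percentage is then compared against the tier thresholds.

type Tier = 1 | 2 | 3;

interface BenchResult {
  name: string;
  baselineMs: number;
  currentMs: number;
}

function regressionPct(r: BenchResult): number {
  return ((r.currentMs - r.baselineMs) / r.baselineMs) * 100;
}

function classifyRegression(
  r: BenchResult,
  criticalPaths: string[],
  noiseMarginPct: number = 3
): Tier | null {
  const pct = regressionPct(r);
  if (pct <= noiseMarginPct) return null; // within noise, or an improvement

  const isCritical = criticalPaths.indexOf(r.name) !== -1;
  if (isCritical) return pct > 5 ? 1 : null;
  if (pct > 10) return 2;
  if (pct > 5) return 3;
  return null;
}
```

For instance, a critical-path benchmark moving from 4.2ms to 4.8ms is a regression of roughly 14.3% and lands in Tier 1 under these rules.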
Format violations by tier. Present Tier 1 violations first (most severe), then Tier 2, then Tier 3. Each violation entry includes:
Show hotspot scores for top functions if knowledge graph data is available:
Show benchmark regression summary if benchmarks ran:
Recommend specific actions for each Tier 1 and Tier 2 violation:
Output the report in structured markdown format suitable for PR comments or CI output.
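One possible shape for the report step is sketched below: group violations by tier, most severe first, and emit one line per violation with an optional recommendation. The `Violation` fields and the exact output layout are assumptions modeled on the examples in this document, not the tool's actual report format.

```typescript
// Sketch only: format violations by tier, Tier 1 first.
// Assumption: each violation carries a location, metric, value, and threshold.

type Tier = 1 | 2 | 3;

interface Violation {
  tier: Tier;
  location: string;
  metric: string;
  value: number;
  threshold: number;
  recommendation?: string;
}

function formatReport(violations: Violation[]): string {
  const labels: Record<Tier, string> = {
    1: "TIER 1 VIOLATIONS",
    2: "TIER 2 VIOLATIONS",
    3: "TIER 3 INFO",
  };
  const tiers: Tier[] = [1, 2, 3];
  const lines: string[] = [];

  for (const tier of tiers) {
    const group = violations.filter((v) => v.tier === tier);
    if (group.length === 0) continue;
    lines.push(`${labels[tier]} (${group.length}):`);
    for (const v of group) {
      lines.push(`- ${v.location}: ${v.metric} ${v.value} > ${v.threshold}`);
      if (v.recommendation) lines.push(`  Recommendation: ${v.recommendation}`);
    }
  }

  if (lines.length === 0) lines.push("No violations.");
  return lines.join("\n");
}
```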
Tier 1 violations present — FAIL. Block commit and merge. List all Tier 1 violations with their locations and values. The developer must fix these before proceeding.
Tier 2 violations present, no Tier 1 — WARN. Allow commit but block merge until addressed. List all Tier 2 violations. These must be resolved before the PR can be merged.
Only Tier 3 or no violations — PASS. Proceed normally. Log Tier 3 violations as informational notes.
Record gate decision in .harness/state.json under a perfGate key:
{
  "perfGate": {
    "result": "pass|warn|fail",
    "tier1Count": 0,
    "tier2Count": 0,
    "tier3Count": 0,
    "timestamp": "ISO-8601"
  }
}
Exit with the appropriate code: 0 for pass, 0 for warn (with warning output), 1 for fail.
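The gate rules above map directly onto the tier counts. The following sketch derives the result, the perfGate record, and the exit code from those counts; `decideGate` is a hypothetical helper, and writing the record into .harness/state.json is left to the caller.

```typescript
// Sketch only: derive the gate decision and exit code from tier counts.
// Assumption: persisting the returned state to disk happens elsewhere.

type GateResult = "pass" | "warn" | "fail";

interface TierCounts {
  tier1Count: number;
  tier2Count: number;
  tier3Count: number;
}

interface PerfGateState {
  perfGate: TierCounts & { result: GateResult; timestamp: string };
}

function decideGate(
  counts: TierCounts,
  now: Date = new Date()
): { state: PerfGateState; exitCode: number } {
  const result: GateResult =
    counts.tier1Count > 0 ? "fail" : counts.tier2Count > 0 ? "warn" : "pass";

  return {
    state: { perfGate: { result, ...counts, timestamp: now.toISOString() } },
    // Only a Tier 1 failure produces a non-zero exit code; warn exits 0.
    exitCode: result === "fail" ? 1 : 0,
  };
}
```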
- harness check-perf: Primary command for all performance checks. Runs structural and coupling analysis.
- harness check-perf --structural: Run only structural complexity checks.
- harness check-perf --coupling: Run only coupling analysis.
- harness perf bench: Run benchmarks only. Requires a clean working tree.
- harness perf baselines show: View current benchmark baselines.
- harness perf baselines update: Persist current benchmark results as new baselines.
- harness perf --check-baselines: Verify that the baseline file is updated when benchmarks change. Runs the baseline lock-in check standalone.
- harness perf critical-paths: View the current critical path set and how it was determined.
- harness validate: Run after enforcement to verify overall project health.
- harness graph scan: Refresh the knowledge graph for accurate hotspot scoring.

| Tier | Severity | Gate | Examples |
|---|---|---|---|
| 1 | error | Block commit | Cyclomatic complexity > 15, >5% regression on critical path, hotspot in top 5%, circular dependency |
| 2 | warning | Block merge | Complexity > 10, nesting > 4, >10% regression elsewhere, fan-out > 10, size budget exceeded |
| 3 | info | None | File length > 300 lines, fan-in > 20, transitive depth > 30, >5% non-critical regression |
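The structural and coupling thresholds in the table can be expressed as a small rule list. The sketch below covers only the numeric examples from the table (complexity, nesting, fan-out, fan-in, file length, transitive depth). The metric field names are hypothetical, and reporting only the most severe matching rule per metric is a design assumption.

```typescript
// Sketch only: map structural metrics to tiers using the table's example thresholds.
// Assumption: a metric that exceeds several thresholds is reported once, at the
// most severe tier.

type Tier = 1 | 2 | 3;

interface StructuralMetrics {
  cyclomaticComplexity?: number;
  nestingDepth?: number;
  fanOut?: number;
  fanIn?: number;
  fileLength?: number;
  transitiveDepth?: number;
}

interface Finding {
  tier: Tier;
  metric: keyof StructuralMetrics;
  value: number;
  threshold: number;
}

// Ordered so that the more severe rule for a metric comes first.
const RULES: { metric: keyof StructuralMetrics; threshold: number; tier: Tier }[] = [
  { metric: "cyclomaticComplexity", threshold: 15, tier: 1 },
  { metric: "cyclomaticComplexity", threshold: 10, tier: 2 },
  { metric: "nestingDepth", threshold: 4, tier: 2 },
  { metric: "fanOut", threshold: 10, tier: 2 },
  { metric: "fileLength", threshold: 300, tier: 3 },
  { metric: "fanIn", threshold: 20, tier: 3 },
  { metric: "transitiveDepth", threshold: 30, tier: 3 },
];

function classifyStructural(m: StructuralMetrics): Finding[] {
  const findings: Finding[] = [];
  const seen: string[] = [];
  for (const rule of RULES) {
    const value = m[rule.metric];
    if (value === undefined || value <= rule.threshold) continue;
    if (seen.indexOf(rule.metric) !== -1) continue; // already reported at a higher tier
    seen.push(rule.metric);
    findings.push({ tier: rule.tier, metric: rule.metric, value, threshold: rule.threshold });
  }
  return findings.sort((a, b) => a.tier - b.tier);
}
```

Regression percentages, hotspot rank, circular dependencies, and size budgets from the table need other inputs and are not modeled here.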
harness validate passes after enforcement

These are common rationalizations that sound reasonable but lead to incorrect results. When you catch yourself thinking any of these, stop and follow the documented process instead.
| Rationalization | Why It Is Wrong |
|---|---|
| "The cyclomatic complexity is 16 but the function is straightforward, so I can override the Tier 1 threshold" | Tier 1 violations are non-negotiable blockers. No merge with Tier 1 performance violations. If a threshold needs adjustment, reconfigure with documented justification. |
| "The benchmark regression is only 6% and it is probably just noise" | The noise margin (default 3%) is applied before flagging. A 6% regression on a perf-critical path exceeds the Tier 1 threshold even after noise consideration. |
| "The working tree has a small uncommitted change but it should not affect benchmark results" | No running benchmarks with a dirty working tree. Uncommitted changes invalidate benchmark results. |
| "I will update the baselines to match the new performance numbers rather than fixing the regression" | Baselines must come from fresh runs against committed code. Silently moving the goalposts defeats the purpose of performance gates. |
Phase 1: ANALYZE
harness check-perf --structural
Result: processOrderBatch() in src/orders/processor.ts has cyclomatic complexity 18 (Tier 1, threshold: 15)
Phase 2: BENCHMARK — skipped (Tier 1 violation found)
Phase 3: REPORT
TIER 1 VIOLATIONS (1):
- src/orders/processor.ts:processOrderBatch — complexity 18 > 15
Recommendation: Extract validation and transformation into separate functions
Phase 4: ENFORCE
Result: FAIL — 1 Tier 1 violation. Commit blocked.
Phase 1: ANALYZE — no structural violations
Phase 2: BENCHMARK
harness perf bench
Baseline: parseDocument 4.2ms, current: 4.8ms (+14.3%)
parseDocument is @perf-critical — Tier 1 threshold applies (>5%)
Phase 3: REPORT
TIER 1 VIOLATIONS (1):
- parseDocument: 14.3% regression on critical path (threshold: 5%)
Recommendation: Profile parseDocument to identify the regression source
Phase 4: ENFORCE
Result: FAIL — 1 Tier 1 violation. Merge blocked.
Phase 1: ANALYZE
harness check-perf --structural --coupling
Result: src/utils/formatter.ts has 320 lines (Tier 3, threshold: 300)
Phase 2: BENCHMARK
harness perf bench — all within noise margin
Phase 3: REPORT
TIER 3 INFO (1):
- src/utils/formatter.ts: 320 lines > 300 line threshold
No Tier 1 or Tier 2 violations.
Phase 4: ENFORCE
Result: PASS — no blocking violations.
- Check @perf-critical annotations in source files and verify graph fan-in thresholds. The critical path set can be overridden in .harness/perf/critical-paths.json.
- Add a // perf-ignore: <reason> comment and add the exception to .harness/perf/exceptions.json.