# /cmetrics (from correctless)

Generates a project health and ROI dashboard aggregating QA findings, bugs caught, token costs, git velocity, and workflow trends from artifacts and session data. Run monthly or for evaluations.

Install: `npx claudepluginhub joshft/correctless --plugin correctless`
Aggregate all accumulated workflow data into a project health dashboard. Shows the value the workflow has delivered over time.
Read everything in the accumulation layer. Skip files that don't exist.
- `glob .correctless/artifacts/qa-findings-*.json` — every QA round from every feature
- `glob .correctless/verification/*-verification.md` — every verification
- `.correctless/meta/workflow-effectiveness.json` — post-merge bug history
- `.correctless/antipatterns.md` — accumulated bug classes
- `.correctless/meta/drift-debt.json` — architectural erosion
- `glob .correctless/artifacts/findings/audit-*-history.md` — all audit runs
- `glob .correctless/specs/*.md` — count of features that went through the workflow
- `glob .correctless/artifacts/summary-*.md` — per-feature summaries (from /csummary)
- `glob .correctless/decisions/*.md` — for staleness checks (revisit-when/revisit-by markers)
- `glob .correctless/artifacts/workflow-state-*.json` — for spec_updates counts per feature
- `glob .correctless/artifacts/token-log-*.json` — per-feature token usage from subagent spawns
- `glob .correctless/artifacts/audit-trail-*.jsonl` — per-branch tool invocation logs from the PostToolUse hook. Contains: timestamp, phase, tool name, file path, branch.
- `glob ~/.claude/usage-data/session-meta/*.json` — filter by project_path matching the current project root. Contains exact token counts, tool usage, duration, error rates per session.
- `glob ~/.claude/usage-data/facets/*.json` — match by session_id to session-meta entries for this project. Contains AI-analyzed session quality: outcome, friction, satisfaction.

Then compute these metrics:

- **Features completed** — count of spec files in `.correctless/specs/`
- **Total issues caught** — sum of issues across the `qa-findings-*.json` files
- **Issues by phase** — categorize each issue by which phase caught it
- **Bug escape rate** — from `workflow-effectiveness.json`: post_merge_bugs count / (total issues caught + post_merge_bugs)
- **Phase effectiveness** — from `workflow-effectiveness.json`
- **Antipattern trends** — from `.correctless/antipatterns.md` (count by category and Frequency field)
- **Drift debt health** — from `.correctless/meta/drift-debt.json`
- **Olympics history** — from the audit history files
- **Feature velocity** — from git log
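A minimal sketch of the headline counts, assuming each `qa-findings-*.json` file holds a JSON array of findings and `workflow-effectiveness.json` keeps escaped bugs under a `post_merge_bugs` key (count or list; both field shapes are assumptions):

```python
# Sketch only: field shapes are assumptions, not the real schemas.
import json
from pathlib import Path

root = Path(".correctless")

features_completed = len(list(root.glob("specs/*.md")))

issues_caught = sum(
    len(json.loads(f.read_text()))
    for f in root.glob("artifacts/qa-findings-*.json")
)

# Skip the effectiveness file if it doesn't exist yet.
eff_path = root / "meta/workflow-effectiveness.json"
pmb = json.loads(eff_path.read_text()).get("post_merge_bugs", []) if eff_path.exists() else []
escaped = pmb if isinstance(pmb, int) else len(pmb)

denominator = issues_caught + escaped
escape_rate = escaped / denominator if denominator else 0.0
print(f"{features_completed} features, {issues_caught} issues caught, "
      f"{escaped} escaped ({escape_rate:.1%})")
```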
Print to conversation AND write to .correctless/artifacts/metrics-{date}.md:
# Correctless Metrics — {Project Name}
# Generated: {date}
## Overview
- **Features completed:** {N} (from spec count)
- **Total issues caught:** {N}
- **Bug escape rate:** {N} escaped / {M} caught ({percentage}%)
- **Workflow active since:** {date of first spec file}
## Issues by Phase
| Phase | Issues Caught | % of Total | Notes |
|-------|--------------|------------|-------|
| Review | {N} | {%} | {e.g., "Security checklist added 40% of these"} |
| Test Audit | {N} | {%} | |
| QA | {N} | {%} | {e.g., "3 class fixes added structural tests"} |
| Verify | {N} | {%} | |
| Audit (Olympics) | {N} | {%} | |
## Bug Escapes
{From workflow-effectiveness.json}
| ID | Severity | What | Phase That Missed | Why |
|----|----------|------|-------------------|-----|
| PMB-001 | {sev} | {desc} | {phase} | {reason} |
**Weakest phase:** {phase with most misses relative to responsibility}
**Recommendation:** {what to improve — more templates? higher intensity? more QA rounds?}
## Antipattern Trends
- **Total entries:** {N}
- **Top categories:**
| Category | Count | Trend |
|----------|-------|-------|
| {category} | {N} | {growing/stable/resolved} |
- **Most frequent:** {AP-xxx} — {description} — seen {N} times
{If any category has 3+ entries: "Consider whether this is an architectural issue, not just a code pattern."}
## Drift Debt
- **Open items:** {N}
- **Oldest:** {DRIFT-xxx} — {age} days — {description}
- **Resolved this period:** {N}
- **Accumulating faster than resolving:** {yes/no}
{If oldest > 60 days: "Flag for /cdevadv analysis — this is rotting."}
## Olympics History
| Run | Preset | Date | Rounds | Findings | Fixed |
|-----|--------|------|--------|----------|-------|
| 1 | QA | {date} | {N} | {N} | {N} |
| 2 | Hacker | {date} | {N} | {N} | {N} |
**Average convergence:** {N} rounds
**Recurring patterns:** {patterns that keep appearing across runs}
## Velocity
- **Average feature duration:** {N} days (branch creation to merge)
- **Features per month:** {N}
- **Workflow overhead per feature:** ~{N} min estimated
## ROI Estimate
- **Issues caught:** {N}
- **Estimated fix time if found in production:** {N} × 2 hours avg = {N} hours
- **Workflow time invested:** {features} × {overhead per feature} = {N} hours
- **Net time saved:** {production fix time} - {workflow time} = {N} hours
{This is a rough estimate. Production bugs take 2-10x longer to fix than pre-merge bugs due to debugging, hotfixes, rollbacks, and incident response. The 2-hour average is conservative.}
## Health Analysis
- **QA Round Trend:** {e.g., "Averaging 2.3 rounds — trending down from 3.1 last quarter."}
- **Antipattern Growth:** {e.g., "Error handling category growing fastest (+4 in 3 months). Consider architectural pattern."}
- **Drift Staleness:** {e.g., "3 drift items older than 90 days. Schedule /cdevadv layers analysis."}
- **Olympics Convergence (at high+ intensity):** {e.g., "Last 5 runs converged in ≤2 rounds — consider rotating presets." Omit this bullet at standard intensity.}
- **Decision Record Staleness:** {e.g., "2 decisions have expired revisit-by dates."}
- **Spec Revision Rate:** {e.g., "Specs revised mid-TDD in 60% of features — spec phase may need more brainstorm time."}
- **Cross-Metric Correlations:** {e.g., "Spec revision rate is high AND antipattern growth is accelerating in error handling — spec phase isn't learning from antipatterns."}
After computing the raw metrics above, analyze them for actionable insights. This is the interpretive layer — raw numbers without analysis are useless.
Work through each dimension below; the Health Analysis bullets above show the expected output shape for each:

- **QA Round Trends**
- **Antipattern Growth**
- **Drift Debt Staleness** — for long-stale items, recommend a "/cdevadv layers analysis."
- **Olympics Convergence Speed (high+ intensity)**
- **Decision Record Staleness** — scan `.correctless/decisions/` for files with revisit-when or revisit-by markers; flag expired conditions (sketched in code after this list).
- **Spec Revision Rate**
- **Cross-Metric Correlations** — the most valuable insights.
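A minimal sketch of the staleness scan, assuming decision files carry a line like `revisit-by: 2026-01-31` (the marker format is an assumption; adapt the regex to the real template):

```python
# Flag decision records whose revisit-by date has passed.
# The "revisit-by: YYYY-MM-DD" line format is an assumption.
import re
from datetime import date
from pathlib import Path

pattern = re.compile(r"revisit-by:\s*(\d{4}-\d{2}-\d{2})")

for f in sorted(Path(".correctless/decisions").glob("*.md")):
    m = pattern.search(f.read_text())
    if m and date.fromisoformat(m.group(1)) < date.today():
        print(f"STALE: {f.name} (revisit-by {m.group(1)} has passed)")
```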
Read all .correctless/artifacts/token-log-*.json files. Correlate token spend with findings data from QA, verification, and audit artifacts.
1. Cost per bug caught: Total tokens across all features / total distinct findings. "Across {N} features, you spent {T} tokens and caught {B} bugs — {T/B} tokens per bug caught pre-merge." (Computations 1 and 2 are sketched in code after this list.)
2. Tokens per feature by phase: Group all token log entries by the skill field and sum total_tokens per skill. Show as a table:
| Phase | Tokens | % of Total | Findings | Tokens/Finding |
|---|---|---|---|---|
| TDD (ctdd) | {N} | {%} | {N} | {N} |
| Review (creview/creview-spec) | {N} | {%} | {N} | {N} |
| Verification (cverify) | {N} | {%} | {N} | {N} |
| Audit (caudit) | {N} | {%} | {N} | {N} |
| Other (all remaining: cspec, cdebug, crefactor, cmodel, credteam, cdevadv, cpostmortem) | {N} | {%} | {N} | {N} |
This shows where the budget goes. If 65% goes to TDD and TDD catches 60% of bugs, the allocation is efficient. If 40% goes to audit and it catches 5% of bugs, consider reducing audit intensity.
3. Bug escape rate: From workflow-effectiveness.json: post_merge_bugs count / (total caught + escaped). "Escape rate: {N}%. {M} caught pre-merge, {K} escaped."
4. Estimated production fix cost avoided: Each caught bug saves an estimated 2-10 hours of production debugging, hotfixes, rollbacks, and incident response. "{N} bugs caught × 2-10 hours = {range} hours saved. At $150/hr developer cost, that's ${range} saved." This is a rough estimate — say so. But even the conservative end usually exceeds the token cost.
5. Tokens per LOC: Total tokens / total lines added (from git diff --stat across features). Track over time — should be stable if overhead scales linearly.
6. Olympics efficiency (high+ intensity): Tokens per finding per round. "Round 1: {N} findings at {T} tokens. Round {M}: {N} findings at {T} tokens." Shows diminishing returns.
7. Token trend: Compare with previous metrics. "Token cost per feature: {stable/growing/shrinking}. Cost per bug: {improving/degrading}."
8. Tool call distribution per phase: From audit trail JSONL files, count tool invocations grouped by phase. Shows where tool activity concentrates — if 80% of Edit calls happen in tdd-impl (GREEN), the implementation phase is the most write-heavy.
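A sketch of computations 1 and 2, assuming each token-log file is a JSON array of entries with `skill` and `total_tokens` fields (the log schema is an assumption), with findings counted from the QA artifacts read earlier:

```python
# Computations 1-2: cost per bug and tokens per phase. The
# {"skill": ..., "total_tokens": ...} entry shape is an assumption.
import json
from collections import Counter
from pathlib import Path

artifacts = Path(".correctless/artifacts")

by_skill = Counter()
for f in artifacts.glob("token-log-*.json"):
    for entry in json.loads(f.read_text()):
        by_skill[entry["skill"]] += entry["total_tokens"]

total = sum(by_skill.values())
if not total:
    print("No token usage data yet.")
else:
    bugs = sum(len(json.loads(f.read_text()))
               for f in artifacts.glob("qa-findings-*.json"))
    print(f"{total} tokens / {bugs} bugs = {total // max(bugs, 1)} tokens per bug")
    print("| Phase | Tokens | % of Total |")
    print("|---|---|---|")
    for skill, tokens in by_skill.most_common():
        print(f"| {skill} | {tokens} | {tokens / total:.0%} |")
```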
Add to the dashboard after the existing ROI Estimate section:
## Token ROI Analysis
### Cost Summary
- **Total tokens tracked:** {N} across {M} features
- **Average tokens per feature:** {N}
- **Cost per bug caught:** {N} tokens ({M} bugs caught)
### Phase Distribution
| Phase | Tokens | % of Total | Findings | Tokens/Finding |
|-------|--------|-----------|----------|----------------|
| TDD (ctdd) | {N} | {%} | {N} | {N} |
| Review (creview, creview-spec) | {N} | {%} | {N} | {N} |
| Verification (cverify) | {N} | {%} | {N} | {N} |
| Audit (caudit) | {N} | {%} | {N} | {N} |
| Other (cspec, cdebug, crefactor, cmodel, credteam, cdevadv, cpostmortem) | {N} | {%} | {N} | {N} |
### Bug Escape Rate
- **Pre-merge bugs caught:** {N}
- **Post-merge bugs escaped:** {M}
- **Escape rate:** {%}
- **Estimated production fix cost avoided:** {N bugs} × 2-10 hours = {range} hours (~${range} at $150/hr)
### Efficiency
- **Tokens per LOC:** {N} (total tokens / lines added across features)
- **Olympics efficiency (high+ intensity):** Round 1: {N} findings at {T}k tokens → Round {M}: {N} findings at {T}k tokens
### Token Trend
- Token cost per feature: {stable/growing/shrinking}
- Cost per bug caught: {improving/degrading}
If no token logs exist, skip this section with: "No token usage data yet. Token tracking starts automatically when skills run — data will appear after the next feature."
Determine the current project root: `git rev-parse --show-toplevel`. Then use `find ~/.claude/usage-data/session-meta/ -name '*.json'` to list all session-meta files (do NOT use Glob with `~` — use find or Bash for tilde expansion). Filter to sessions whose project_path matches the project root (exact string match on the absolute path). For each matching session, look up the corresponding facets file at `~/.claude/usage-data/facets/{session_id}.json` (the facets filename IS the session_id).
Note: Not all sessions have facets files (~26% coverage is typical). When computing facets-based metrics, note the sample size: "Outcome data available for {N} of {M} sessions ({%})."
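A sketch of the discovery step, assuming session-meta files carry `project_path` and `session_id` fields as described above (`Path.home()` handles the tilde expansion that Glob cannot):

```python
# Find this project's sessions and their optional facets files.
import json
import subprocess
from pathlib import Path

root = subprocess.run(["git", "rev-parse", "--show-toplevel"],
                      capture_output=True, text=True, check=True).stdout.strip()

meta_dir = Path.home() / ".claude/usage-data/session-meta"
facets_dir = Path.home() / ".claude/usage-data/facets"

sessions = []
for f in meta_dir.glob("*.json"):
    meta = json.loads(f.read_text())
    if meta.get("project_path") != root:
        continue  # exact match on the absolute project path
    facets_file = facets_dir / f"{meta['session_id']}.json"
    facets = json.loads(facets_file.read_text()) if facets_file.exists() else None
    sessions.append((meta, facets))

with_facets = sum(1 for _, fc in sessions if fc)
print(f"Outcome data available for {with_facets} of {len(sessions)} sessions")
```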
From session-meta:
- Sum `input_tokens + output_tokens` across all project sessions. This is ground truth — cross-check against manual token logs. If they diverge significantly, note: "Session-meta shows {N} tokens total. Token logs show {M}. The difference ({D}) is orchestrator overhead not captured by subagent tracking."
- `mean(duration_minutes)` across sessions.
- Aggregate `tool_counts` across sessions. Show top 6 tools by call count.
- `sum(tool_errors) / sum(all tool calls)`. Break down by `tool_error_categories`.
- `user_response_times`. Long times (>60s) suggest confusion. Short times (<15s) suggest flow.

From facets:

- Rate of `outcome: "fully_achieved"`.
- Distribution of `claude_helpfulness` values.
- Aggregate `friction_counts` across sessions. Flag growing categories.

Correctless vs Freeform comparison:
Identify Correctless sessions by checking whether Correctless artifacts were modified during the session's time window. A session is "Correctless" if:
- Workflow state files (`.correctless/artifacts/workflow-state-*.json`) have phase_entered_at timestamps within the session's start_time to start_time + duration_minutes range, OR
- `tool_counts` includes calls to tools that only Correctless uses (the Task tool with high counts suggests an orchestrated workflow).

If none of these signals are present, the session is "freeform." Note: this heuristic is approximate — some Correctless sessions (e.g., /cstatus or /chelp) may look freeform. Err on the side of undercounting Correctless sessions rather than overcounting.
Important: Slash commands like /cspec are intercepted by Claude Code before reaching the conversation — they do NOT appear in first_prompt. Do not use first_prompt to identify Correctless sessions.
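A sketch of the first signal only, assuming ISO 8601 timestamps in both session-meta (`start_time`) and workflow-state files (`phase_entered_at`); checking just this signal naturally errs toward undercounting:

```python
# Signal 1: a workflow-state phase change inside the session window.
# ISO 8601 timestamp formats are assumptions about both files.
import json
from datetime import datetime, timedelta
from pathlib import Path

def is_correctless(meta: dict) -> bool:
    start = datetime.fromisoformat(meta["start_time"])
    end = start + timedelta(minutes=meta["duration_minutes"])
    for f in Path(".correctless/artifacts").glob("workflow-state-*.json"):
        ts = json.loads(f.read_text()).get("phase_entered_at")
        if ts and start <= datetime.fromisoformat(ts) <= end:
            return True
    return False  # no signal fired: treat as freeform (undercounts)
```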
## Session Analytics
### Overview
- **Sessions tracked:** {N} (from {date} to {date})
- **Average session duration:** {N} minutes
- **Total tokens (ground truth):** {N} input + {N} output
### Tool Distribution
| Tool | Calls | % of Total |
|------|-------|-----------|
| {tool} | {N} | {%} |
### Quality Signals
- **Outcome rate:** {N}% fully achieved
- **Helpfulness:** {distribution}
- **Friction rate:** {N}% tool errors
- **Top friction:** {category from facets friction_counts} ({N} occurrences)
- **User engagement:** avg response time {N}s ({<15s: flowing | 15-60s: normal | >60s: confused})
### Correctless vs Freeform
| Metric | Correctless Sessions | Freeform Sessions |
|--------|---------------------|-------------------|
| Outcome rate | {%} | {%} |
| Friction rate | {%} | {%} |
| Avg duration | {N} min | {N} min |
| Avg tokens | {N} | {N} |
If no session-meta data exists for this project, skip with: "No Claude Code session data found for this project. Session analytics will appear after running a few sessions."
If previous metrics files exist (`.correctless/artifacts/metrics-*.md`), compare against them and note trends: "Bug escape rate: 5% → 3% → 1.5% over 3 months. The workflow is getting more effective."
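A small sketch of that comparison, relying on the Overview line format from the dashboard template above ("- **Bug escape rate:** {N} escaped / {M} caught ({percentage}%)") and on ISO dates in the filenames for chronological ordering:

```python
# Pull escape-rate history out of prior dashboards by matching the
# Overview line format; filename sort gives chronological order.
import re
from pathlib import Path

history = []
for f in sorted(Path(".correctless/artifacts").glob("metrics-*.md")):
    m = re.search(r"Bug escape rate:\*\*.*\((\d+(?:\.\d+)?)%\)", f.read_text())
    if m:
        history.append(float(m.group(1)))

if history:
    print("Bug escape rate: " + " → ".join(f"{r}%" for r in history))
```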
Use TaskCreate/TaskUpdate to track progress through the steps above.
/cmetrics writes a dashboard artifact but does not modify workflow state or source code. Re-run anytime safely. Its inputs come from other skills (/ctdd, verification reports from /cverify, antipatterns from /cpostmortem). Before sharing a dashboard externally, apply templates/redaction-rules.md first.