From traces
Analyze execution traces to cluster failure patterns and propose harness improvements (rules, hooks, skills)
npx claudepluginhub artmin96/forge-studio --plugin tracesThis skill uses the workspace's default tool permissions.
Guides Next.js Cache Components and Partial Prerendering (PPR) with cacheComponents enabled. Implements 'use cache', cacheLife(), cacheTag(), revalidateTag(), static/dynamic optimization, and cache debugging.
Migrates code, prompts, and API calls from Claude Sonnet 4.0/4.5 or Opus 4.1 to Opus 4.5, updating model strings on Anthropic, AWS, GCP, Azure platforms.
Proposes cuts, reorganization, and simplification to improve document structure, clarity, and flow while preserving comprehension. Use for structural or editorial reviews.
Periodic harness evolution skill inspired by NeoSigma's self-improving loop (39.3% improvement from failure mining + clustering + gated changes). Run weekly or when you suspect recurring patterns.
This skill analyzes and proposes only. It does NOT modify harness files.
ls -t ~/.claude/traces/*.jsonl | head -14 (last ~2 weeks)exit_code != "0" or output contains error keywords (Error, Exception, FATAL, failed, denied, not found)Group failures by root cause mechanism, not individual error messages:
| Cluster Type | Signal | Example |
|---|---|---|
| Tool misuse | Same command pattern fails repeatedly | git push blocked 5 times |
| Stale context | Edit failures after many edits without reads | 3+ edit-without-read sequences |
| Environment | Missing binary or wrong path | command not found patterns |
| Test regression | Same test fails across sessions | test_X fails in 4/7 sessions |
| Permission | Blocked by hooks or denied by user | BLOCKED: in output |
| Workflow | Repeated manual corrections after agent actions | Reverts, re-dos |
Prioritize clusters by: total_failures (high) and sessions_affected (high).
For each cluster (top 5 max), propose ONE of:
rules.d/ rule — if the failure is a behavioral pattern (agent keeps doing X when it shouldn't)For each proposal, include:
## Harness Evolution Report
**Period:** [date range]
**Sessions analyzed:** N
**Total trace entries:** N
**Failures found:** N (N% error rate)
---
### Cluster 1: [descriptive name]
- **Type:** [tool misuse | stale context | environment | test regression | permission | workflow]
- **Frequency:** N occurrences across M sessions
- **Example traces:**
[2-3 representative trace entries]
- **Root cause:** [mechanism explanation]
- **Proposed change:** [specific suggestion with file path]
- **Token impact:** [estimated additional chars/message if rules.d/ change]
- **Regression risk:** [low | medium | high] — [why]
### Cluster 2: ...
---
### Summary
| Metric | Value |
|--------|-------|
| Clusters found | N |
| Proposed rule changes | N |
| Proposed hook changes | N |
| Proposed skill changes | N |
| No action needed | N |
### Next Steps
- [prioritized list of what to implement first]
- [suggested re-analysis date]