From harnessml
Use after running several experiments, when you need to step back and connect the learnings. This is also the skill for deciding what to do next — and for recognizing when further experiments aren't adding understanding.
`npx claudepluginhub msilverblatt/harness-ml --plugin harnessml`

This skill uses the workspace's default tool permissions.
Individual experiments produce individual findings. Synthesis produces understanding. After 10 experiments, you should be able to say: "Here is what drives this target, here is where the model struggles, and here is why."
Without synthesis, you have a list of experiments. With synthesis, you have a theory.
Periodically — every 5-10 experiments, or whenever you feel stuck — stop and answer:
Summarize the confirmed findings. What features matter and why? What model types work best and why? What relationships has the data revealed?
This is not a list of experiment results. It's a narrative: "The target is primarily driven by X and Y. Feature Z captures X well, but Y is only partially captured — the model struggles when Y interacts with W."
Ruled-out findings are equally valuable. What hypotheses were tested and refuted? What strategies were exhausted? Recording these prevents revisiting dead ends and focuses future work.
Look at the diagnostic patterns across experiments:
This is where the next breakthrough will come from — not from tuning what works, but from understanding what doesn't.
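One way to surface these cross-experiment patterns is to tally which error slices recur. A minimal sketch, assuming a hypothetical journal format in which each experiment records its worst-performing slices (the real journal schema is whatever the harness produces):

```python
from collections import defaultdict

# Hypothetical journal entries; the real output of
# experiments(action="journal") may be structured differently.
journal = [
    {"id": 1, "change": "add feature Z", "worst_slices": ["high_Y", "missing_W"]},
    {"id": 2, "change": "tune depth", "worst_slices": ["high_Y"]},
    {"id": 3, "change": "new encoder", "worst_slices": ["high_Y", "rare_category"]},
]

# Count how often each slice shows up as a weakness. A slice that stays
# weak across many experiments points at a structural gap, not a tuning
# problem -- exactly the kind of pattern a single experiment can't reveal.
persistent = defaultdict(int)
for exp in journal:
    for slc in exp["worst_slices"]:
        persistent[slc] += 1

for slc, n in sorted(persistent.items(), key=lambda kv: -kv[1]):
    print(f"{slc}: weak in {n}/{len(journal)} experiments")
```

In this toy data, `high_Y` is weak in every experiment regardless of what changed, which is the signature of a structural weakness worth diagnosing.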
State your best understanding of what drives the target. This theory should evolve with each experiment. If it hasn't changed in 5 experiments, either you're confirming it (good) or you're not learning (bad — try something different).
Based on everything above, what would teach you the most? Not what would improve the metric the most — what would improve your understanding the most. Often these are the same thing, but not always.
`experiments(action="journal")`
Read through the journal with fresh eyes. Look for:
There is no "done." But there is a point of diminishing returns where additional experiments aren't adding meaningful understanding.
Signals you're approaching diminishing returns:
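The metric side of this can be checked mechanically. A sketch of a simple plateau test over recent validation scores; the window size and threshold here are illustrative assumptions, not values prescribed by the skill:

```python
# Hypothetical: validation scores from recent experiments, in order.
scores = [0.781, 0.794, 0.801, 0.802, 0.8015, 0.8022, 0.8019]

def plateaued(scores, window=4, min_gain=0.002):
    """Flag diminishing returns: the best score in the recent window
    barely improves on the best score from before that window."""
    if len(scores) <= window:
        return False  # not enough history to judge
    recent_best = max(scores[-window:])
    prior_best = max(scores[:-window])
    return recent_best - prior_best < min_gain

print(plateaued(scores))  # → True for the numbers above
```

A metric plateau alone doesn't mean understanding has plateaued — but when both flatten together, it's a strong signal to synthesize rather than launch another experiment.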
What to do at this point:
Don't declare victory. Instead, present the state of knowledge:
## Project Synthesis — [Date or Experiment Count]
### State of Knowledge
[Narrative of what drives the target and how the model captures it]
### Key Findings
[3-5 most important confirmed learnings, with evidence]
### Ruled Out
[Strategies and hypotheses that were tested and refuted]
### Remaining Weaknesses
[Where the model struggles and current best understanding of why]
### Model Composition
[What's in the ensemble, why each model is there, diversity assessment]
### Recommended Next Steps
[Ranked by expected learning value, not expected metric gain]
### Open Questions
[What you'd investigate with more time/data]
Write the synthesis to the notebook so it carries across sessions:
`notebook(action="write", type="theory", content="[current understanding of what drives the target]")`
`notebook(action="write", type="plan", content="[ranked next steps with reasoning]")`
The theory entry supersedes the previous one — only the latest is shown in the summary. The plan entry becomes the starting point for the next session.
If you ruled out strategies, record those too:
`notebook(action="write", type="decision", content="Abandoned [strategy] because [structural reason]")`
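The supersede-versus-append semantics described above can be sketched as a toy store. This is a hypothetical model for illustration — the real `notebook` tool is provided by the harness:

```python
class Notebook:
    """Toy model of the notebook semantics: `theory` and `plan` entries
    supersede the previous one of the same type, while `decision`
    entries accumulate as an append-only log."""

    SUPERSEDING = {"theory", "plan"}

    def __init__(self):
        self.latest = {}      # type -> most recent superseding entry
        self.decisions = []   # append-only decision log

    def write(self, type, content):
        if type in self.SUPERSEDING:
            self.latest[type] = content  # only the latest is shown
        else:
            self.decisions.append((type, content))

    def summary(self):
        return dict(self.latest, decisions=list(self.decisions))

nb = Notebook()
nb.write("theory", "target driven by X")
nb.write("theory", "target driven by X and Y")  # supersedes the first
nb.write("decision", "Abandoned stacking: no diversity gain")
print(nb.summary()["theory"])  # → target driven by X and Y
```

The practical consequence: a stale theory disappears from the summary automatically, but every abandoned strategy stays on the record, which is what prevents revisiting dead ends.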
Synthesis naturally triggers other skills:
- `diagnosis` to understand the errors
- `domain-research` to generate hypotheses about X's behavior
- `model-diversity` to evaluate whether it would add value
- `feature-engineering` with domain-driven hypotheses
- `eda` to re-examine the data in light of new understanding
- `domain-research` to bring in external knowledge

Synthesis is the hub. Everything flows through understanding.