From mims-harvard-tooluniverse
Orchestrates ToolUniverse self-improvement cycle: dispatches skills for API discovery, tool creation, researcher testing, fixes, optimization, and git pushes. Invoke for full loops, test rounds, coverage expansion, or evolution.
npx claudepluginhub joshuarweaver/cascade-data-analytics --plugin mims-harvard-tooluniverse

This skill uses the workspace's default tool permissions.
Coordinates the full development lifecycle by dispatching to specialized devtu skills.
Discover → Create → Test → Fix → Optimize → Ship → Repeat
Each phase maps to a dedicated skill:
| Phase | Skill | What it does |
|---|---|---|
| Discover | devtu-auto-discover-apis | Gap analysis, web search for APIs, batch discovery |
| Create | devtu-create-tool | Build tool class + JSON config + test examples |
| Test | (this skill) | Launch researcher persona agents to find issues |
| Fix | devtu-fix-tool | Diagnose failures, implement fixes, validate |
| Optimize | devtu-optimize-skills | Improve skill reports, evidence handling, UX |
| Optimize | devtu-optimize-descriptions | Improve tool JSON descriptions for clarity |
| Docs | devtu-docs-quality | Validate documentation accuracy |
| Ship | devtu-github | Branch, commit, push, create PR |
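As a rough sketch, the phase table above can be written as a lookup from phase to the skill invocation strings it implies. The dictionary layout and the `skill_calls` helper are illustrative assumptions, not part of ToolUniverse; only the skill names come from the table.

```python
# Phase-to-skill mapping transcribed from the phase table.
# PHASE_SKILLS and skill_calls are hypothetical names used for illustration.
PHASE_SKILLS = {
    "Discover": ["devtu-auto-discover-apis"],
    "Create": ["devtu-create-tool"],
    "Test": [],  # run directly by this skill, so nothing to dispatch
    "Fix": ["devtu-fix-tool"],
    "Optimize": ["devtu-optimize-skills", "devtu-optimize-descriptions"],
    "Docs": ["devtu-docs-quality"],
    "Ship": ["devtu-github"],
}


def skill_calls(phase: str) -> list[str]:
    """Return the Skill(...) invocation strings for one phase."""
    return [f'Skill(skill="{name}")' for name in PHASE_SKILLS[phase]]


if __name__ == "__main__":
    for phase in PHASE_SKILLS:
        print(f"{phase}: {skill_calls(phase)}")
```

An empty list for Test reflects that this skill runs the testing loop itself rather than dispatching to another skill.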
Pick an entry point based on what's needed:
- Skill(skill="devtu-auto-discover-apis")
- Skill(skill="devtu-create-tool")
- Skill(skill="devtu-fix-tool")
- Skill(skill="devtu-optimize-skills")

Invoke Skill(skill="devtu-auto-discover-apis") to:
Invoke Skill(skill="devtu-create-tool") for each new API:
- _lazy_registry_static.py and default_config.py
- python -m tooluniverse.cli test <ToolName>

This is the core testing loop, run directly by this skill.
- gh pr list --state open
- origin/main
- git fetch origin && git rebase origin/main

Launch 2 agents per round (A + B) using the Agent tool with these parameters:
Each agent gets:
- Feature-{round}{letter}-{num} (e.g., Feature-59A-001)
- Agent prompt template: see references/persona-template.md
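A small helper can keep IDs in the Feature-{round}{letter}-{num} pattern consistent across agents. This is a sketch under two assumptions: the three-digit zero padding is inferred from the single example Feature-59A-001, and the allowed letters are taken from the "A + B" agent pair. The function name `feature_id` is hypothetical.

```python
def feature_id(round_number: int, letter: str, num: int) -> str:
    """Format an ID such as Feature-59A-001.

    Assumes three-digit zero padding (inferred from the example)
    and agent letters limited to A and B.
    """
    if letter not in ("A", "B"):
        raise ValueError(f"unexpected agent letter: {letter!r}")
    if round_number < 1 or num < 1:
        raise ValueError("round_number and num must be positive")
    return f"Feature-{round_number}{letter}-{num:03d}"


if __name__ == "__main__":
    print(feature_id(59, "A", 1))
```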
Before implementing ANY agent-reported issue, verify via CLI:
python3 -m tooluniverse.cli run <ToolName> '<json_args>'
50%+ of agent reports are false positives from MCP interface confusion. Only fix verified issues.
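One way to make the CLI check repeatable is to build the command from a Python dict so the JSON quoting is always valid. The sketch below only wraps the `python3 -m tooluniverse.cli run` command shown above; the helper names, the timeout value, and the tool name in the usage note are assumptions, and deciding whether the output confirms the reported issue is left to the reader.

```python
import json
import subprocess


def build_cli_command(tool_name: str, args: dict) -> list[str]:
    """Build the argv for: python3 -m tooluniverse.cli run <ToolName> '<json_args>'."""
    return ["python3", "-m", "tooluniverse.cli", "run", tool_name, json.dumps(args)]


def run_tool(tool_name: str, args: dict, timeout: int = 120) -> subprocess.CompletedProcess:
    """Run the tool through the CLI and return the raw result for inspection.

    No shell is involved, so the JSON needs no extra quoting.
    The 120 second timeout is an arbitrary illustrative default.
    """
    return subprocess.run(
        build_cli_command(tool_name, args),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

For example, `run_tool("ExampleTool", {"query": "TP53"})` (with a hypothetical tool name) returns an object whose `stdout`, `stderr`, and `returncode` can be compared against what the agent reported.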
Anti-patterns: hint text instead of validation, parameter aliases instead of fixing naming, post-hoc probing instead of pre-validation.
Standard testing verifies tools work. Usefulness testing verifies skills actually solve scientist problems. Run this after standard testing:
Score 1-10 rubric:
Common failure patterns found in usefulness tests:
| Pattern | Score Impact | Fix |
|---|---|---|
| "Call A, then B, then C" without explaining what to DO with results | -3 | Add interpretation tables |
| Tool params wrong (tool works but skill documents wrong names) | -2 | Verify ALL tool params via get_tool_info() |
| Promises data the API can't deliver (e.g., DepMap CRISPR scores) | -2 | Be honest about limitations; add computational procedure workaround |
| No synthesis phase at the end | -2 | Add "so what?" phase that combines all evidence |
| No evidence grading | -1 | Add T1-T4 or similar confidence tiers |
| No computational procedures for things tools can't do | -1 | Add Python code blocks using scipy/pandas/numpy |
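The score impacts in the table can be tallied mechanically. The sketch below assumes a skill starts at 10 and loses the listed points for each observed pattern, with the result clamped to the bottom of the 1-10 scale; the starting value, the clamping, and the short pattern keys are assumptions, since the source gives only the deductions.

```python
# Deductions transcribed from the failure-pattern table.
# The keys are hypothetical short names for each row.
PENALTIES = {
    "no_interpretation": 3,            # "Call A, then B, then C" with no guidance
    "wrong_tool_params": 2,            # skill documents wrong parameter names
    "undeliverable_data": 2,           # promises data the API can't deliver
    "no_synthesis": 2,                 # no synthesis phase at the end
    "no_evidence_grading": 1,          # no confidence tiers
    "no_computational_procedures": 1,  # no code for what tools can't do
}


def usefulness_score(observed: set[str]) -> int:
    """Return a 1-10 score, assuming a start of 10 minus each observed penalty."""
    unknown = set(observed) - set(PENALTIES)
    if unknown:
        raise ValueError(f"unknown patterns: {sorted(unknown)}")
    total = sum(PENALTIES[name] for name in observed)
    return max(1, 10 - total)


if __name__ == "__main__":
    print(usefulness_score({"no_interpretation", "no_synthesis"}))
```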
When tools can't help, add computational procedures. Some analyses need Python code, not API calls. Skills should include working code blocks for:
See devtu-optimize-skills Patterns 14-15 for full guidance.
- Skill(skill="simplify"): always after writing or modifying code
- ruff check src/tooluniverse/<file>.py
- python -c "from tooluniverse.<module> import <Class>"
- python -m tooluniverse.cli run <Tool> '<json>'
- git push origin <branch>

Also see Skill(skill="devtu-code-optimization") for reusable fix patterns and anti-patterns.
After fixes are stable:
- Skill(skill="devtu-optimize-descriptions"): improve tool descriptions
- Skill(skill="devtu-optimize-skills"): improve research skill quality
- Skill(skill="devtu-docs-quality"): validate docs accuracy

Invoke Skill(skill="devtu-github") or manually:
- git fetch origin && git stash && git rebase origin/main && git stash pop
- git push --force-with-lease origin <branch>
- gh pr create / verify with gh pr view <N> --json mergeable
- Confirm "mergeable": "MERGEABLE" before reporting done

GitHub repo: mims-harvard/ToolUniverse. Always verify with git remote -v before pushing.
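The mergeable check can be scripted by parsing the JSON that `gh pr view <N> --json mergeable` prints. This is a minimal sketch; the function name `is_mergeable` is hypothetical, and it treats anything other than the exact value "MERGEABLE" as not ready.

```python
import json


def is_mergeable(gh_json_output: str) -> bool:
    """Return True only if the gh output reports "mergeable": "MERGEABLE".

    Expects the text printed by: gh pr view <N> --json mergeable
    Any other value or a missing field counts as not mergeable.
    """
    data = json.loads(gh_json_output)
    return data.get("mergeable") == "MERGEABLE"


if __name__ == "__main__":
    print(is_mergeable('{"mergeable": "MERGEABLE"}'))
```

Treating every non-"MERGEABLE" value as a failure errs on the side of not reporting done too early.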
git fetch origin && git rebase origin/main

| Category | Signal |
|---|---|
| Silent parameter miss | Wrong-field check; param ignored |
| Always-fires conditional | .get("field") on wrong type |
| Silent normalization | Auto-transform not disclosed |
| Wrong notation/case | Gene fusions, Title Case names |
| Substring match | Short symbol returns multiple targets |
| try/except indent | Mismatched → SyntaxError |
Full patterns → references/bug-patterns.md
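The "Substring match" row is easy to reproduce. The sketch below uses a made-up target list and hypothetical function names, not ToolUniverse code, to show how a short symbol matches several targets under substring comparison and only one under exact comparison.

```python
# Illustrative target list; not taken from any ToolUniverse tool.
TARGETS = ["AR", "ARID1A", "PARP1", "BRCA1"]


def find_targets_substring(symbol: str, targets: list[str]) -> list[str]:
    """Buggy lookup: a short symbol matches every target that contains it."""
    return [t for t in targets if symbol.upper() in t.upper()]


def find_targets_exact(symbol: str, targets: list[str]) -> list[str]:
    """Safer lookup: case-insensitive exact match on the whole symbol."""
    return [t for t in targets if symbol.upper() == t.upper()]


if __name__ == "__main__":
    print(find_targets_substring("AR", TARGETS))
    print(find_targets_exact("AR", TARGETS))
```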
After each round: advance counter, update patterns file, keep this SKILL.md under 150 lines.
Current round: 127 (rounds completed: 52-126)