From mims-harvard-tooluniverse
Optimizes ToolUniverse JSON tool descriptions for clarity and usability by reviewing prerequisites, abbreviations, parameters, and usage guidance. Use when reviewing tool docs or improving API documentation.
npx claudepluginhub joshuarweaver/cascade-data-analytics --plugin mims-harvard-tooluniverseThis skill uses the workspace's default tool permissions.
Optimize tool descriptions in ToolUniverse JSON configuration files to ensure they are clear, complete, and user-friendly.
Conducts multi-round deep research on GitHub repos via API and web searches, generating markdown reports with executive summaries, timelines, metrics, and Mermaid diagrams.
Dynamically discovers and combines enabled skills into cohesive, unexpected delightful experiences like interactive HTML or themed artifacts. Activates on 'surprise me', inspiration, or boredom cues.
Generates images from structured JSON prompts via Python script execution. Supports reference images and aspect ratios for characters, scenes, products, visuals.
Optimize tool descriptions in ToolUniverse JSON configuration files to ensure they are clear, complete, and user-friendly.
Use when:
Tool Description Review:
- [ ] Prerequisites stated (packages, API keys, accounts)
- [ ] Critical abbreviations expanded on first use
- [ ] Required vs optional parameters clear
- [ ] Mutually exclusive options numbered/labeled
- [ ] Parameter guidance includes trade-offs
- [ ] Filter syntax shows available fields
- [ ] File size warnings where relevant
- [ ] Examples show realistic usage
Problem: Users don't know if they need ONE input or ALL inputs.
Fix: Use "Required: Provide ONE input type" for mutually exclusive options.
// Before
"description": "Process BED regions, motifs, or gene lists..."
// After
"description": "Process genomic data. **Required: Provide ONE input type** - (1) BED regions, (2) DNA motif, or (3) gene list. Analyzes..."
Number the options and use bold for "Required".
Problem: Users don't know what to install/configure before use.
Fix: Add prerequisites note to first tool in each family.
"description": "Query single-cell data. Prerequisites: Requires 'package-name' (install: pip install tooluniverse[extra]). Returns..."
Include:
Problem: New users don't understand technical terms.
Fix: Expand on first use with format: "Abbreviation (Full Name)".
Common abbreviations to expand:
// Before
"description": "Download H5AD files..."
// After
"description": "Download H5AD (HDF5-based AnnData) files..."
Problem: Users don't know what fields are available or what syntax to use.
Fix: List operators, common fields, and provide multiple examples.
"parameter_name": {
"type": "string",
"description": "Filter using SQL-like syntax. Format: 'field == \"value\"'. Operators: ==, !=, in, <, >, <=, >=. Combine with 'and'/'or'. Common fields: tissue, cell_type, disease, assay, sex, ethnicity. Examples: 'tissue == \"lung\"', 'disease == \"COVID-19\" and tissue == \"lung\"', 'cell_type in [\"T cell\", \"B cell\"]'."
}
Include:
Problem: Users don't know which value to choose or what trade-offs exist.
Fix: Explain what each value means and provide recommendations.
// Before
"threshold": "Q-value threshold (05=1e-5, 10=1e-10, 20=1e-20)"
// After
"threshold": "Peak calling stringency. '05'=1e-5 (permissive, more peaks, broad features), '10'=1e-10 (moderate, balanced), '20'=1e-20 (strict, high confidence, narrow peaks). Default '05' suitable for most analyses. Higher values = fewer but more confident peaks."
For each parameter option, explain:
Problem: Users provide multiple options when only one is allowed.
Fix: Label options as "Option 1", "Option 2", etc.
"bed_data": {
"description": "**Option 1**: BED format regions (tab-separated: chr, start, end). Example: 'chr1\\t1000\\t2000'."
},
"motif": {
"description": "**Option 2**: DNA sequence motif in IUPAC notation. Use: A/T/G/C, W=A|T, S=G|C. Example: 'CANNTG'."
},
"gene_list": {
"description": "**Option 3**: Gene symbols as array. Example: ['TP53', 'MDM2']."
}
For tools that download or return large files:
"description": "Download contact matrices. Note: Files can be large (GBs), check file_size in metadata before downloading. Returns..."
When tool returns submission URL instead of direct results:
"description": "Perform enrichment analysis. Note: Returns submission URL (web form-based analysis). Analyzes..."
For tools with multiple format options:
"file_type": "File format. Common types: 'cooler' (multi-resolution contact matrices), 'pairs' (aligned read pairs), 'hic' (Juicer format), 'mcool' (multi-resolution cooler)."
{
"name": "Tool_operation_name",
"type": "ToolClassName",
"description": "[Action verb] to [purpose]. [Prerequisites if first tool]. [Key data/features]. [Required inputs if mutually exclusive]. [Note about limitations/requirements]. Use for: [use case 1], [use case 2], [use case 3].",
"parameter": {
"properties": {
"param_name": {
"type": "string",
"description": "[What it does]. [Format/syntax if applicable]. [Options with trade-offs]. [Examples]. [Recommendation if applicable]."
}
}
}
}
To verify description quality, ask:
Can a new user understand what the tool does?
Can a user provide correct inputs on first try?
Can a user choose appropriate parameters?
Are prerequisites obvious?
"description": "Query [data type] from [source]. [Prerequisites]. Filter by [criteria]. Returns [output]. [Data scale]. Use for: [discovery], [analysis], [specific research tasks]."
Key elements:
"description": "Download [file types] from [source]. [Format details]. [File size warning]. [Authentication requirement]. Use for: [offline analysis], [custom processing], [integration]."
Key elements:
"description": "Analyze [input type] to find [results]. **Required: Provide ONE input type** - (1) [option], (2) [option], (3) [option]. Compares against [database/background]. [Result format]. Use for: [identifying], [discovering], [predicting]."
Key elements:
After updating descriptions, validate JSON syntax:
# Validate all tool JSONs
python3 -m json.tool src/tooluniverse/data/your_tools.json > /dev/null && echo "✓ Valid"
# Check all tools in category
for f in src/tooluniverse/data/*_tools.json; do
python3 -m json.tool "$f" > /dev/null && echo "✓ $f valid" || echo "✗ $f invalid"
done
Before (Unclear):
{
"name": "Tool_enrichment",
"description": "Perform enrichment with tool to find factors.",
"parameter": {
"properties": {
"bed": {"description": "BED data"},
"motif": {"description": "Motif"},
"genes": {"description": "Genes"},
"threshold": {"description": "Threshold value"}
}
}
}
After (Clear):
{
"name": "Tool_enrichment_analysis",
"description": "Identify transcription factors enriched in your data. **Required: Provide ONE input type** - (1) BED genomic regions, (2) DNA sequence motif (IUPAC notation), or (3) gene symbol list. Compares against 400,000+ ChIP-seq experiments. Returns ranked proteins with enrichment scores. Note: Returns submission URL (web-based analysis). Use for: identifying regulators of regions, finding proteins bound to motifs, discovering transcription factors regulating genes.",
"parameter": {
"properties": {
"bed_data": {
"description": "**Option 1**: BED format regions (tab-separated: chr, start, end). For finding proteins bound to genomic regions. Example: 'chr1\\t1000\\t2000'."
},
"motif": {
"description": "**Option 2**: DNA motif in IUPAC notation (A/T/G/C, W=A|T, S=G|C, M=A|C, K=G|T, R=A|G, Y=C|T). Example: 'CANNTG' (E-box)."
},
"gene_list": {
"description": "**Option 3**: Gene symbols as array or single gene. Example: ['TP53', 'MDM2', 'CDKN1A']."
},
"threshold": {
"description": "Peak stringency. '05'=1e-5 (permissive, more peaks), '10'=1e-10 (moderate), '20'=1e-20 (strict, high confidence). Default '05' suitable for most analyses."
}
}
}
}
Priority order for optimization:
Critical (fix immediately):
High (fix soon):
Medium (nice to have):
Expected impact: 50-75% reduction in user errors, 50-67% faster time to first successful use.