Skill

tooluniverse-disease-research

Generates detailed disease research reports as markdown files using 100+ ToolUniverse tools across 10 dimensions with source citations. Use for queries about diseases, syndromes, or medical conditions.

ai-ml

documentation

Install

npx claudepluginhub joshuarweaver/cascade-data-analytics --plugin mims-harvard-tooluniverse

Tool Access

This skill uses the workspace's default tool permissions.

Supporting Assets

EXAMPLES.md, REPORT_TEMPLATE.md, RESEARCH_PROTOCOL.md, TOOLS_REFERENCE.md, tool_usage_details.md

SKILL.md

Stats

Stars: 1291

Forks: 199

Last Commit: Mar 29, 2026

ToolUniverse Disease Research

Generate a comprehensive disease research report with full source citations. The report is created as a markdown file and progressively updated during research.

IMPORTANT: Always use English disease names and search terms in tool calls. Respond in the user's language.

LOOK UP, DON'T GUESS

When asked about a disease, query Orphanet/OMIM/DisGeNET FIRST. Don't rely on memory for prevalence, genetics, or treatment — these change over time. When you're not sure about a fact, your first instinct should be to SEARCH for it using tools, not to reason harder from memory.

When to Use

User asks about any disease, syndrome, or medical condition
Needs comprehensive disease intelligence or a detailed research report
Asks "what do we know about [disease]?"

Core Workflow: Report-First Approach

DO NOT show the search process to the user. Instead:

1. Create report file first - Initialize {disease_name}_research_report.md
2. Research each dimension - Use all relevant tools
3. Update report progressively - Write findings after each dimension
4. Include citations - Every fact must reference its source tool
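
The first step above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: the helper names `report_path` and `init_report`, the slug rule for the file name, and the choice of output directory are all assumptions.

```python
import re
from datetime import date
from pathlib import Path


def report_path(disease_name: str, out_dir: str = ".") -> Path:
    """Build {disease_name}_research_report.md with a filesystem-safe name.

    Assumption: non-alphanumeric runs are collapsed to underscores.
    """
    slug = re.sub(r"[^A-Za-z0-9]+", "_", disease_name).strip("_")
    return Path(out_dir) / f"{slug}_research_report.md"


def init_report(disease_name: str, out_dir: str = ".") -> Path:
    """Create the report header before any research is done."""
    path = report_path(disease_name, out_dir)
    header = (
        f"# Disease Research Report: {disease_name}\n\n"
        f"**Report Generated**: {date.today().isoformat()}\n"
        "**Disease Identifiers**: (to be filled)\n"
    )
    path.write_text(header, encoding="utf-8")
    return path
```

Creating the file before any tool call means a partially completed run still leaves a readable report on disk.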

Disease Mechanism Reasoning

When synthesizing disease etiology, trace the full pathogenic cascade:

Genetic basis - Which variants (rare or common) confer risk, and in which genes?
Molecular mechanism - How do those variants alter protein function, expression, or regulation?
Cellular effect - What downstream cellular processes are disrupted (signaling, metabolism, stress response)?
Tissue/organ manifestation - How does cellular dysfunction present as organ-level pathology?

This chain structures the Genetic & Molecular Basis (Section 3) and Biological Pathways (Section 5) sections.

10 Research Dimensions

| Dim | Section | Key Tools |
|---|---|---|
| 1 | Identity & Classification | OSL_get_efo_id_by_disease_name, ols_search_efo_terms, ols_get_efo_term, umls_search_concepts, icd_search_codes, snomed_search_concepts |
| 2 | Clinical Presentation | OpenTargets phenotypes, HPO lookup, MedlinePlus |
| 3 | Genetic & Molecular Basis | OpenTargets targets, ClinVar variants, GWAS associations, gnomAD |
| 4 | Treatment Landscape | OpenTargets drugs, clinical trials, GtoPdb |
| 5 | Biological Pathways | Reactome pathways, humanbase_ppi_analysis, GTEx expression, HPA |
| 6 | Epidemiology & Literature | PubMed, OpenAlex, Europe PMC, Semantic Scholar |
| 7 | Similar Diseases | OpenTargets similar entities |
| 8 | Cancer-Specific (if applicable) | CIViC genes/variants/therapies |
| 9 | Pharmacology | GtoPdb targets/interactions/ligands |
| 10 | Drug Safety | OpenTargets warnings, clinical trial AEs, FAERS |

See: tool_usage_details.md for complete tool calls per section.

Report Template

Create this file structure at the start:

# Disease Research Report: {Disease Name}

**Report Generated**: {date}
**Disease Identifiers**: (to be filled)

---

## Executive Summary
(Brief 3-5 sentence overview - fill after all research complete)

---

## 1. Disease Identity & Classification
### Ontology Identifiers
| System | ID | Source |

### Synonyms & Alternative Names
### Disease Hierarchy

---

## 2. Clinical Presentation
### Phenotypes (HPO)
| HPO ID | Phenotype | Description | Source |

### Symptoms & Signs
### Diagnostic Criteria

---

## 3. Genetic & Molecular Basis
### Associated Genes
| Gene | Score | Ensembl ID | Evidence | Source |

### GWAS Associations
| SNP | P-value | Odds Ratio | Study | Source |

### Pathogenic Variants (ClinVar)

---

## 4. Treatment Landscape
### Approved Drugs
| Drug | ChEMBL ID | Mechanism | Phase | Target | Source |

### Clinical Trials
| NCT ID | Title | Phase | Status | Source |

---

## 5. Biological Pathways & Mechanisms

## 6. Epidemiology & Risk Factors

## 7. Literature & Research Activity

## 8. Similar Diseases & Comorbidities

## 9. Cancer-Specific Information (if applicable)

## 10. Drug Safety & Adverse Events

---

## References
### Tools Used
| # | Tool | Parameters | Section | Items Retrieved |

Citation Format

Every piece of data MUST include its source:

In tables: Add a Source column with tool name
In lists: - Finding [Source: tool_name]
In prose: (Source: tool_name, query: "...")
References section: Complete tool usage log with parameters
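
The citation formats above could be produced by small formatting helpers. The sketch below is only illustrative; the function names are hypothetical and not part of ToolUniverse.

```python
def cite_list_item(finding: str, tool_name: str) -> str:
    """List form: - Finding [Source: tool_name]"""
    return f"- {finding} [Source: {tool_name}]"


def cite_prose(tool_name: str, query: str) -> str:
    """Prose form: (Source: tool_name, query: "...")"""
    return f'(Source: {tool_name}, query: "{query}")'


def cite_table_row(cells: list[str], tool_name: str) -> str:
    """Table form: append the tool name as the final Source column."""
    return "| " + " | ".join([*cells, tool_name]) + " |"
```

Routing every written fact through helpers like these makes it harder to emit an uncited data point by accident.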

Progressive Update Pattern

# After each dimension's research:
# 1. Read current report
# 2. Replace placeholder with formatted content
# 3. Write back immediately
# 4. Continue to next dimension
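
A minimal Python sketch of the read, replace, write cycle described in the comments above. The function name `update_section` and the convention of replacing only the first occurrence of a placeholder are assumptions made for illustration.

```python
from pathlib import Path


def update_section(report: Path, placeholder: str, content: str) -> bool:
    """Replace one placeholder in the report and write the file back at once.

    Returns False (and leaves the file untouched) if the placeholder is absent.
    """
    text = report.read_text(encoding="utf-8")          # 1. read current report
    if placeholder not in text:
        return False
    updated = text.replace(placeholder, content, 1)    # 2. replace placeholder
    report.write_text(updated, encoding="utf-8")       # 3. write back immediately
    return True                                        # 4. caller moves to next dimension
```

Writing after each dimension, rather than once at the end, keeps earlier findings on disk if a later tool call fails.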

Evidence Grading & Interpretation

Every finding in the report should be graded:

| Grade | Criteria | Example |
|---|---|---|
| T1 (Strong) | Replicated genetic evidence (GWAS, rare variants), FDA-approved therapy | BRCA1 → breast cancer; trastuzumab for HER2+ |
| T2 (Moderate) | Single genetic study, phase II+ trial data, strong biological evidence | FOXO3 → longevity (centenarian studies) |
| T3 (Association) | Observational data, gene expression changes, pathway membership | IL-6 elevated in Alzheimer's CSF |
| T4 (Computational) | Network proximity, text mining, predicted associations | DisGeNET text-mined gene-disease link |
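
One way to apply the grades above consistently is a lookup from evidence type to tier. In this sketch the evidence-type labels are hypothetical keys invented for illustration. Two further assumptions are not stated in the table: a finding takes the strongest grade among its evidence types, and unknown labels default to T4.

```python
from enum import Enum


class EvidenceGrade(Enum):
    T1 = "Strong"
    T2 = "Moderate"
    T3 = "Association"
    T4 = "Computational"


# Hypothetical labels mapped onto the criteria column of the table.
GRADE_BY_EVIDENCE = {
    "replicated_genetic": EvidenceGrade.T1,
    "fda_approved_therapy": EvidenceGrade.T1,
    "single_genetic_study": EvidenceGrade.T2,
    "phase2_plus_trial": EvidenceGrade.T2,
    "strong_biological": EvidenceGrade.T2,
    "observational": EvidenceGrade.T3,
    "expression_change": EvidenceGrade.T3,
    "pathway_membership": EvidenceGrade.T3,
    "network_proximity": EvidenceGrade.T4,
    "text_mining": EvidenceGrade.T4,
    "predicted": EvidenceGrade.T4,
}


def grade_finding(evidence_types: list[str]) -> EvidenceGrade:
    """Return the strongest grade among the evidence types (assumed rule).

    Unknown labels and empty input fall back to T4.
    """
    grades = [GRADE_BY_EVIDENCE.get(e, EvidenceGrade.T4) for e in evidence_types]
    # Names sort as T1 < T2 < T3 < T4, so min() picks the strongest tier.
    return min(grades, key=lambda g: g.name, default=EvidenceGrade.T4)
```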

Synthesis Questions (answer in Executive Summary)

After collecting data from all 10 dimensions, the report MUST answer:

What causes this disease? Summarize the genetic architecture (monogenic vs polygenic, key loci, penetrance)
What are the therapeutic options? Ranked by evidence level and approval status
What biomarkers exist? For diagnosis, prognosis, and treatment selection
What's the unmet need? What aspects lack effective treatment or understanding?
What are the active research frontiers? Based on clinical trials and recent publications

Interpreting Cross-Database Concordance

When multiple databases provide different data for the same disease:

OpenTargets + DisGeNET + OMIM agree on a gene: T1 evidence — high confidence
Only OpenTargets reports an association: Check the datasource scores — genetic_association > literature > animal_model
DisGeNET score > 0.5 but not in OpenTargets: May be text-mined; verify with PubMed
Gene in GWAS but not OMIM: Likely a complex disease susceptibility locus, not Mendelian
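
The four rules above can be written as a small decision function. This is a sketch under assumptions: the `GeneEvidence` record and its field names are hypothetical, and real tool outputs would need to be mapped into it first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneEvidence:
    """Hypothetical per-gene summary of which sources report an association."""
    in_opentargets: bool = False
    in_disgenet: bool = False
    in_omim: bool = False
    in_gwas: bool = False
    disgenet_score: Optional[float] = None


def concordance_notes(ev: GeneEvidence) -> list[str]:
    """Return the interpretation notes that apply to one gene."""
    notes = []
    if ev.in_opentargets and ev.in_disgenet and ev.in_omim:
        notes.append("T1: OpenTargets, DisGeNET, and OMIM agree (high confidence)")
    if ev.in_opentargets and not (ev.in_disgenet or ev.in_omim):
        notes.append(
            "OpenTargets only: check datasource scores "
            "(genetic_association > literature > animal_model)"
        )
    if ev.disgenet_score is not None and ev.disgenet_score > 0.5 and not ev.in_opentargets:
        notes.append("DisGeNET only: may be text-mined; verify with PubMed")
    if ev.in_gwas and not ev.in_omim:
        notes.append("GWAS without OMIM: likely complex susceptibility locus, not Mendelian")
    return notes
```

More than one note can apply to the same gene, so the function returns a list rather than a single verdict.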

Handling Conflicting Data

| Conflict | Resolution |
|---|---|
| Different prevalence estimates across sources | Report range; note the most recent/largest study |
| Drug approved in one country but not another | Note regulatory status per region |
| Gene-disease association in one DB but absent in another | Grade by evidence type; text-mining alone is T4 |
| Clinical trial results contradict label indications | The trial result is newer evidence; note both |
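
As an illustration of the first resolution rule, the sketch below reports a prevalence range and highlights one reference study. The record type, the per-100,000 unit, and the tie-breaking choice (most recent year first, then largest sample) are assumptions, since the table does not say how to weigh recency against size.

```python
from dataclasses import dataclass


@dataclass
class PrevalenceEstimate:
    """Hypothetical record for one prevalence figure and where it came from."""
    per_100k: float
    year: int
    sample_size: int
    source: str  # name of the tool that returned the estimate


def summarize_prevalence(estimates: list[PrevalenceEstimate]) -> str:
    """Report the range across sources and note the most recent estimate."""
    if not estimates:
        return "No data available"
    low = min(e.per_100k for e in estimates)
    high = max(e.per_100k for e in estimates)
    # Assumed rule: prefer the latest year, break ties by sample size.
    latest = max(estimates, key=lambda e: (e.year, e.sample_size))
    sources = ", ".join(sorted({e.source for e in estimates}))
    return (
        f"{low:g}-{high:g} per 100,000; "
        f"most recent estimate: {latest.per_100k:g} ({latest.year}) "
        f"[Source: {sources}]"
    )
```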

Final Report Quality Checklist

All 10 sections have content (or are marked "No data available")
Every data point has a source citation
Executive summary reflects key findings
References section lists all tools used
Tables properly formatted
No placeholder text remains
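
Parts of this checklist can be checked mechanically. The sketch below covers only three items (numbered section headings, a References section, leftover placeholders). The placeholder strings are taken from the report template; the function name and the heading pattern are assumptions.

```python
import re

# Placeholder markers that appear in the report template.
PLACEHOLDERS = ("(to be filled)", "(Brief 3-5 sentence overview")


def check_report(text: str) -> list[str]:
    """Return a list of checklist problems; an empty list means these checks pass."""
    problems = []
    for n in range(1, 11):
        # Expect a heading such as "## 3. Genetic & Molecular Basis".
        if not re.search(rf"^## {n}\. ", text, flags=re.MULTILINE):
            problems.append(f"missing section ## {n}.")
    if "## References" not in text:
        problems.append("missing References section")
    for marker in PLACEHOLDERS:
        if marker in text:
            problems.append(f"placeholder text remains: {marker}")
    return problems
```

Citation coverage and table formatting still need a manual read, since they depend on the content of each section.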

Expected Output Scale

For a well-studied disease (e.g., Alzheimer's), the final report should include:

5+ ontology IDs, 10+ synonyms, disease hierarchy
20+ phenotypes with HPO IDs
50+ genes, 30+ GWAS associations, 100+ ClinVar variants
20+ drugs, 50+ clinical trials
10+ pathways, PPI network, expression data
100+ publications
15+ similar diseases
Drug warnings and adverse events

Total: 500+ individual data points, each with source citation.

Cross-Skill References

For rare disease differential diagnosis, run: python3 skills/tooluniverse-rare-disease-diagnosis/scripts/clinical_patterns.py --type differential --symptoms 'symptom1,symptom2'

Reference Files

REPORT_TEMPLATE.md - Full report markdown template and citation format guide
RESEARCH_PROTOCOL.md - Step-by-step code procedures, progressive update pattern, quality checklist
tool_usage_details.md - Complete tool calls for each research dimension
TOOLS_REFERENCE.md - Complete tool documentation
EXAMPLES.md - Sample disease research reports