From tooluniverse
Designs therapeutic proteins using RFdiffusion backbone generation, ProteinMPNN sequence optimization, and structure validation with ESMFold/AlphaFold2. Useful for protein binders, scaffolds, enzyme variants, and miniprotein design.
Slash command: /tooluniverse:tooluniverse-protein-therapeutic-design
AI-guided de novo protein design using RFdiffusion backbone generation, ProteinMPNN sequence optimization, and structure validation for therapeutic protein development.
KEY PRINCIPLES:
Therapeutic protein design starts with the target interaction: what binding surface do you need to cover? A small pocket suits a nanobody or peptide; a large flat surface suits a designed protein. Stability, immunogenicity, and manufacturability constrain the design space.
When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory. A database-verified answer is always more reliable than a guess.
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Apply when user asks to:
Phase 1: Target Characterization
Get structure (PDB, EMDB cryo-EM, AlphaFold), identify binding epitope
Phase 2: Backbone Generation (RFdiffusion)
Define constraints, generate >= 5 backbones, filter by geometry
Phase 3: Sequence Design (ProteinMPNN)
Design >= 8 sequences per backbone, sample with temperature control
Phase 4: Structure Validation (ESMFold/AlphaFold2)
Predict structure, compare to backbone, assess pLDDT/pTM
Phase 5: Developability Assessment
Aggregation, pI, expression prediction
Phase 6: Report Synthesis
Ranked candidates, FASTA, experimental recommendations
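
The pI check in Phase 5 can be approximated without external tools. The following is a minimal sketch, not the skill's own method: it estimates the isoelectric point by bisecting on the Henderson-Hasselbalch net charge. The pKa values and the function names `net_charge` and `estimate_pi` are assumptions chosen for illustration; other published pKa sets shift the result by a few tenths of a pH unit.

```python
# Rough isoelectric point (pI) estimate for a designed sequence.
# ASSUMPTION: these pKa values are an illustrative set, not taken from the skill.
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(sequence, ph):
    """Net charge of the sequence at the given pH (Henderson-Hasselbalch)."""
    # Free N-terminus contributes positive charge, free C-terminus negative.
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for residue in sequence.upper():
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def estimate_pi(sequence, tolerance=0.001):
    """Bisect on pH in [0, 14]; net charge decreases monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged, pI lies higher
        else:
            high = mid  # neutral or negative, pI lies lower
    return round((low + high) / 2, 2)
```

A call such as `estimate_pi(candidate_sequence)` returns a pH between 0 and 14 that can be recorded alongside the other developability metrics.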
Create [TARGET]_protein_design_report.md first, with section headers.
Also produce [TARGET]_designed_sequences.fasta and [TARGET]_top_candidates.csv.
Every design MUST include: Sequence, Length, Target, Method, and Quality Metrics (pLDDT, pTM, MPNN score, binding prediction).
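
One way to keep the required fields attached to every design is to carry them in a single record and put them in the FASTA header. This is a sketch under assumptions: the `Design` dataclass, its field names, and the header layout are hypothetical and not a format the skill prescribes.

```python
from dataclasses import dataclass


@dataclass
class Design:
    # All field names are hypothetical; they mirror the required items above.
    design_id: str
    sequence: str
    target: str
    method: str
    plddt: float
    ptm: float
    mpnn_score: float
    binding_prediction: str


def to_fasta(designs, line_width=60):
    """Render designs as FASTA text with the metrics in each header line."""
    records = []
    for d in designs:
        header = (
            f">{d.design_id} target={d.target} method={d.method} "
            f"length={len(d.sequence)} pLDDT={d.plddt:.1f} pTM={d.ptm:.2f} "
            f"MPNN={d.mpnn_score:.2f} binding={d.binding_prediction}"
        )
        # Wrap the sequence at a fixed width, as is conventional for FASTA.
        body = [
            d.sequence[i:i + line_width]
            for i in range(0, len(d.sequence), line_width)
        ]
        records.append("\n".join([header, *body]))
    return "\n".join(records) + "\n"
```

The returned string can be written directly to the designed-sequences FASTA file; length is derived from the sequence so it cannot drift out of sync.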
| Tool | Purpose | Key Parameter |
|---|---|---|
| NvidiaNIM_rfdiffusion | Backbone generation | diffusion_steps (NOT num_steps) |
| NvidiaNIM_proteinmpnn | Sequence design | pdb_string (NOT pdb) |
| ESMFold_predict_structure | Fast validation | sequence (NOT seq) |
| NvidiaNIM_alphafold2 | High-accuracy validation | sequence, algorithm |
| NvidiaNIM_esm2_650m | Sequence embeddings | sequences, format |
| Tool | Wrong | Correct |
|---|---|---|
| NvidiaNIM_rfdiffusion | num_steps=50 | diffusion_steps=50 |
| NvidiaNIM_proteinmpnn | pdb=content | pdb_string=content |
| ESMFold_predict_structure | seq="MVLS..." | sequence="MVLS..." |
| NvidiaNIM_alphafold2 | seq="MVLS..." | sequence="MVLS..." |
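
The wrong/correct pairs above can be applied mechanically before a tool call. The sketch below encodes only the four renames listed in the table; the helper name `fix_params` is hypothetical and is not part of ToolUniverse.

```python
# Rename table copied from the wrong/correct parameter pairs above.
WRONG_TO_CORRECT = {
    "NvidiaNIM_rfdiffusion": {"num_steps": "diffusion_steps"},
    "NvidiaNIM_proteinmpnn": {"pdb": "pdb_string"},
    "ESMFold_predict_structure": {"seq": "sequence"},
    "NvidiaNIM_alphafold2": {"seq": "sequence"},
}


def fix_params(tool_name, params):
    """Return a copy of params with known-wrong keys renamed (hypothetical helper)."""
    renames = WRONG_TO_CORRECT.get(tool_name, {})
    # Keys without a listed rename pass through unchanged.
    return {renames.get(key, key): value for key, value in params.items()}
```

For example, `fix_params("NvidiaNIM_rfdiffusion", {"num_steps": 50})` yields a dictionary keyed by `diffusion_steps`. Tools that are not in the table are returned untouched.
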
NVIDIA_API_KEY environment variable required

| Tool | Purpose | Key Parameters |
|---|---|---|
| PDBe_get_uniprot_mappings | Find PDB structures | uniprot_id |
| RCSBData_get_entry | Download PDB file | pdb_id |
| alphafold_get_prediction | Get AlphaFold DB structure | accession |
| emdb_search | Search cryo-EM maps | query |
| emdb_get_entry | Get entry details | entry_id |
| UniProt_get_entry_by_accession | Get target sequence | accession |
| InterPro_get_protein_domains | Get domains | accession |
| Tier | Criteria |
|---|---|
| T1 (best) | pLDDT >85, pTM >0.8, low aggregation, neutral pI |
| T2 | pLDDT >75, pTM >0.7, acceptable developability |
| T3 | pLDDT >70, pTM >0.65, developability concerns |
| T4 | Failed validation or major developability issues |
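
The tier table can be turned into a small classifier for ranking candidates. This is a sketch of one reading of the table, not a definitive rule: the pLDDT and pTM thresholds come from the table, while the function name `assign_tier` and the four developability labels are assumptions, since the table does not quantify "low aggregation" or "neutral pI".

```python
def assign_tier(plddt, ptm, developability):
    """Map validation metrics to a tier.

    developability is one of the hypothetical labels
    'good', 'acceptable', 'concerns', or 'major_issues'.
    """
    if developability == "major_issues":
        return "T4"
    if plddt > 85 and ptm > 0.8 and developability == "good":
        return "T1"
    if plddt > 75 and ptm > 0.7 and developability in ("good", "acceptable"):
        return "T2"
    if plddt > 70 and ptm > 0.65:
        return "T3"
    # Anything below the T3 thresholds counts as failed validation.
    return "T4"
```

Under this reading, a design with strong structure metrics but developability concerns lands in T3 rather than T1, which matches the table's ordering.
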
npx claudepluginhub mims-harvard/tooluniverse --plugin tooluniverse