From NVIDIA BioNeMo Agent Toolkit
Predicts monomer protein structures from amino-acid sequences using OpenFold2 via NVIDIA's hosted API or local Docker NIM. Supports A3M MSAs and mmCIF templates.
Slash command
/bionemo-agent-toolkit:openfold2-nim
Predict a single protein-chain structure from an amino-acid sequence, with
optional A3M multiple sequence alignments and mmCIF templates. Use this
SKILL.md for basic hosted/local NIM use; load supplemental files only when
the task needs deeper context:
- references/api.md: exact endpoints, schemas, Docker flags, response fields.
- references/science.md: model scope, strengths, limitations, and handoffs.
- references/parameters.md: MSA, template, model-selection, and relax effects.
- references/validation.md: artifact and scientific sanity checks.
- references/examples.md: compact hosted/local payload patterns.

Ask only when context is unclear:
Hosted NVIDIA API or local Docker NIM?
- Hosted: https://health.api.nvidia.com/v1/biology/openfold/openfold2/predict-structure-from-msa-and-template
- Local: http://localhost:8000/biology/openfold/openfold2/predict-structure-from-msa-and-template
- Local readiness: http://localhost:8000/v1/health/ready

Mode difference: hosted and local use the same prediction path, except that the local path does not include /v1/. Hosted requests use Authorization: Bearer $NGC_API_KEY; local inference requests use no auth header once the container reports ready.
Do not print API keys. Confirm they exist with shell tests, not echoes.
Hosted needs NGC_API_KEY in the request header. Supported local Docker
startup uses NGC_API_KEY, or NVIDIA_API_KEY as a fallback, plus
LOCAL_NIM_CACHE. A repo-root .env file may be sourced as a local override.
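The key rules above can also be checked from Python without ever echoing the secret. The following is a minimal sketch, not part of the toolkit: `resolve_ngc_key` is a hypothetical helper name, and it assumes the same precedence described for local startup (NGC_API_KEY first, then NVIDIA_API_KEY as a fallback).

```python
import os


def resolve_ngc_key(env=None):
    """Return (key, source_name) without printing the key.

    resolve_ngc_key is a hypothetical helper, not a toolkit function.
    It mirrors the fallback described above: NGC_API_KEY, then NVIDIA_API_KEY.
    """
    env = os.environ if env is None else env
    for name in ("NGC_API_KEY", "NVIDIA_API_KEY"):
        value = env.get(name, "")
        if value:
            # Callers should log only the variable name, never the value.
            return value, name
    raise RuntimeError("Set NGC_API_KEY or NVIDIA_API_KEY")
```

A caller would typically unpack the result as `_, source = resolve_ngc_key()` and report only `source`, so the key itself never reaches logs or terminal output.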
Use the official OpenFold2 NIM image and mount LOCAL_NIM_CACHE at
/opt/nim/.cache. Current docs recommend at least 80 GB disk, 64 GB system
RAM, 8 CPU cores, and one supported GPU; the container is roughly 55 GB and
first startup downloads about 10 GB of model parameters.
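Before pulling the image, a quick host check can catch obviously undersized machines. This is a rough sketch under stated assumptions: `check_host_resources` is a hypothetical helper, it covers only free disk and logical CPU count (not RAM or GPU), and which filesystem actually needs the space (Docker's image store or the cache directory) depends on the host layout.

```python
import os
import shutil


def check_host_resources(path, min_free_gb=80, min_cpus=8):
    """Return a list of problems; an empty list means both checks passed.

    Hypothetical helper. Defaults follow the disk and CPU figures quoted
    above; RAM and GPU checks are intentionally left out.
    """
    problems = []
    # Free space on the filesystem that contains `path`, in decimal GB.
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < min_free_gb:
        problems.append(f"free disk {free_gb:.0f} GB < {min_free_gb} GB")
    # os.cpu_count() reports logical CPUs and may return None.
    cpus = os.cpu_count() or 0
    if cpus < min_cpus:
        problems.append(f"{cpus} CPU cores < {min_cpus}")
    return problems
```

Passing the LOCAL_NIM_CACHE directory as `path` checks the cache volume; pass Docker's data root instead if the image store lives on a different disk.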
When writing local setup commands, copy the preflight below exactly. Do not
drop .env, NVIDIA_API_KEY, LOCAL_NIM_CACHE, or the no-auth local request.
```bash
set -a
[ -f .env ] && . ./.env
set +a
if [ -z "${NGC_API_KEY:-}" ] && [ -n "${NVIDIA_API_KEY:-}" ]; then
  export NGC_API_KEY="$NVIDIA_API_KEY"
fi
: "${NGC_API_KEY:?Set NGC_API_KEY or NVIDIA_API_KEY}"
: "${LOCAL_NIM_CACHE:?Set LOCAL_NIM_CACHE}"
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
export NIM_TEST_GPU="${NIM_TEST_GPU:-0}"
mkdir -p "${LOCAL_NIM_CACHE}"
chmod 777 "${LOCAL_NIM_CACHE}"
docker run --rm --name openfold2 \
  --runtime=nvidia \
  --gpus "device=${NIM_TEST_GPU}" \
  -e NGC_API_KEY \
  -v "${LOCAL_NIM_CACHE}:/opt/nim/.cache" \
  -p 8000:8000 \
  nvcr.io/nim/openfold/openfold2:latest
```
Readiness check:
```bash
until curl -sf http://localhost:8000/v1/health/ready; do sleep 5; done
```
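The shell loop above waits forever if the container never becomes ready. A bounded Python equivalent might look like the sketch below; `wait_until_ready`, its `probe` parameter, and the timeout and interval defaults are illustrative assumptions, not documented values.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url="http://localhost:8000/v1/health/ready",
                     timeout_s=1800, interval_s=5, probe=None):
    """Poll until the readiness URL answers 2xx; return False on timeout.

    Hypothetical helper. probe is an optional callable returning True when
    the service is ready, which makes the loop testable without a container.
    """
    def http_probe():
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return 200 <= resp.status < 300
        except (urllib.error.URLError, OSError):
            # Connection refused, HTTP errors, and timeouts all mean "not ready yet".
            return False

    probe = probe or http_probe
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

First startup downloads model parameters, so a generous timeout is a reasonable starting point; tune it to the host's network speed.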
Use Python requests; curl escaping is fragile for A3M/mmCIF text. The
sequence field is required. input_id, alignments, selected_models,
relax_prediction, use_templates, and explicit_templates are optional.
```python
import os

import requests

hosted = True  # set to False to target the local Docker NIM

# The local prediction path omits /v1/.
url = (
    "https://health.api.nvidia.com/v1/biology/openfold/openfold2/predict-structure-from-msa-and-template"
    if hosted
    else "http://localhost:8000/biology/openfold/openfold2/predict-structure-from-msa-and-template"
)

headers = {"Content-Type": "application/json"}
if hosted:
    # Only hosted requests carry the bearer token.
    headers["Authorization"] = f"Bearer {os.environ['NGC_API_KEY']}"

seq = "MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPT"
payload = {
    "sequence": seq,
    "input_id": "kras_fragment",
    "selected_models": [1],
    "relax_prediction": False,
    "alignments": {
        "uniref90": {
            "a3m": {
                # Single-sequence MSA: suitable for a smoke test only.
                "alignment": f">query\n{seq}",
                "format": "a3m",
            }
        }
    },
}

response = requests.post(url, headers=headers, json=payload, timeout=300)
response.raise_for_status()
result = response.json()
```
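The payload above hard-codes the sequence and a single-sequence alignment. When the inputs come from files or another tool, a small pre-flight step can catch problems before the request is sent. The sketch below is an assumption-laden illustration: `validate_sequence` and `build_alignments` are hypothetical helpers, and restricting input to the 20 standard residues is a conservative choice; the service may accept additional IUPAC codes.

```python
# Conservative assumption: only the 20 standard amino acids are allowed.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def validate_sequence(seq):
    """Return the whitespace-stripped, upper-cased sequence or raise ValueError."""
    cleaned = "".join(seq.split()).upper()
    bad = sorted(set(cleaned) - VALID_RESIDUES)
    if not cleaned or bad:
        raise ValueError(f"invalid sequence; unexpected symbols: {bad}")
    return cleaned


def build_alignments(a3m_text, database="uniref90"):
    """Wrap raw A3M text in the alignments shape used by the payload above."""
    return {database: {"a3m": {"alignment": a3m_text, "format": "a3m"}}}
```

With these in place, the payload fields could be filled as `"sequence": validate_sequence(raw_seq)` and `"alignments": build_alignments(a3m_text)`, where `raw_seq` and `a3m_text` are whatever the caller loaded.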
Payload gotchas:
- sequence must use valid amino-acid IUPAC symbols.
- alignments is keyed by database name, then a3m, with alignment and format fields. When the user needs to create or deepen an MSA, hand off to msa-search-nim / MSA Search and map its A3M output into this alignments shape.
- Supply templates through explicit_templates with mmCIF content; do not write new HHR-template examples.
- selected_models chooses AlphaFold2/OpenFold parameter sets 1-5. Select one or two models for smoke tests; use all five for stronger production runs.

The response includes one prediction per selected model, ordered by confidence.
Save every returned structure-like text field and the full JSON response so
field-shape differences are auditable. Production answers should explicitly
write .pdb or .cif artifacts, preserve the response JSON, and print any
confidence/ranking fields the service returns.
```python
import json
from pathlib import Path

# Keep the full response so field-shape differences stay auditable.
Path("openfold2_response.json").write_text(json.dumps(result, indent=2))


def save_strings(obj, prefix="openfold2"):
    """Recursively write structure-like strings to disk; return the number saved."""
    i = 0
    # Walk dict entries by key and list entries by 1-based index.
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj, start=1)
    else:
        return i
    for key, value in items:
        if isinstance(value, str) and ("ATOM" in value or value.lstrip().startswith("data_")):
            i += 1
            # mmCIF text opens with a data_ block; other ATOM-bearing text is saved as PDB.
            ext = "cif" if value.lstrip().startswith("data_") else "pdb"
            Path(f"{prefix}_{key}_{i}.{ext}").write_text(value)
        elif isinstance(value, (dict, list)):
            i += save_strings(value, f"{prefix}_{key}")
    return i


saved = save_strings(result)
print(f"saved {saved} structure artifact(s)")
```
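To print whatever confidence or ranking fields the service returns without hard-coding the response schema, one option is to walk the JSON and pick out scalar numbers by key name. This is only a sketch: `find_score_fields` is a hypothetical helper, and the hint substrings (`confidence`, `plddt`, `rank`) are guesses rather than documented field names. Per-residue arrays are not reported, only scalar values.

```python
def find_score_fields(obj, hints=("confidence", "plddt", "rank"), path="result"):
    """Yield (path, value) for scalar numeric fields whose key contains a hint.

    The hint substrings are assumptions, not documented response fields.
    """
    if isinstance(obj, dict):
        for key, value in obj.items():
            child = f"{path}.{key}"
            # bool is a subclass of int, so exclude it explicitly.
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if is_number and any(h in key.lower() for h in hints):
                yield child, value
            else:
                yield from find_score_fields(value, hints, child)
    elif isinstance(obj, list):
        for idx, value in enumerate(obj):
            yield from find_score_fields(value, hints, f"{path}[{idx}]")
```

Iterating over `find_score_fields(result)` and printing each pair gives a compact audit trail next to the saved JSON; if nothing is printed, inspect the saved response to find the actual field names.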
For production monomer runs:
- selected_models: [1, 2, 3, 4, 5] unless the user requests a smoke test.
- relax_prediction: True in Python payloads when relaxation is desired; JSON examples may show true.

Treat tiny toy sequences and single-sequence MSAs as API smoke tests, not quality evidence. For scientific interpretation and validation, read references/science.md and references/validation.md.
- 401: missing, expired, or unauthorized NGC API key.
- 422: invalid amino-acid characters, sequence too long, malformed A3M, bad selected_models, or malformed mmCIF template object.
- 404 (local): remove /v1/ from the prediction URL.
- LOCAL_NIM_CACHE.
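The status codes listed above can be turned into readable hints when a request fails. A minimal sketch, assuming a hypothetical `explain_status` helper and hint wording paraphrased from that list:

```python
# Hints paraphrased from the troubleshooting list; not an official mapping.
STATUS_HINTS = {
    401: "missing, expired, or unauthorized NGC API key",
    404: "local URL probably includes /v1/; remove it from the prediction path",
    422: ("invalid residues, sequence too long, malformed A3M, "
          "bad selected_models, or malformed mmCIF template"),
}


def explain_status(status_code):
    """Map an HTTP status code to a troubleshooting hint."""
    return STATUS_HINTS.get(
        status_code, f"unexpected HTTP {status_code}; inspect the response body"
    )
```

With the requests library, a caller could catch `requests.HTTPError` around `response.raise_for_status()` and pass `exc.response.status_code` to `explain_status` before re-raising.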