Bioinformatics analysis (Biopython) - sequence, protein, and structure analysis
Analyzes DNA, RNA, protein sequences, and 3D structures using Biopython.
/plugin marketplace add slapglif/theory2-physics-plugin
/plugin install theory2-physics@theory2-physics-plugin
<operation> [options]
Analyze DNA, RNA, or protein sequences for basic properties:
# DNA sequence analysis
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics analyze-sequence \
--sequence="ATGCGATCGATCG"
# Protein sequence analysis (type set explicitly; omit --seq-type to auto-detect)
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics analyze-sequence \
--sequence="MVLSPADKTNVK" --seq-type=protein --explain
# RNA sequence
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics analyze-sequence \
--sequence="AUGCGAUCGAUCG" --seq-type=rna
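As a rough illustration of what "basic properties" can mean, the sketch below computes length, composition, and GC content in plain Python. It is not the plugin's implementation: the function names, the returned keys, and the alphabet-based type guess are assumptions made for this example.

```python
from collections import Counter


def guess_seq_type(seq: str) -> str:
    """Guess 'dna', 'rna', or 'protein' from the alphabet (a heuristic only)."""
    letters = set(seq.upper())
    if letters <= set("ACGTN"):
        return "dna"
    if letters <= set("ACGUN"):
        return "rna"
    return "protein"


def basic_properties(seq: str) -> dict:
    """Length, residue counts, and (for nucleotides) GC percentage."""
    seq = seq.upper()
    counts = Counter(seq)
    seq_type = guess_seq_type(seq)
    props = {"type": seq_type, "length": len(seq), "composition": dict(counts)}
    if seq_type in ("dna", "rna"):
        gc = counts["G"] + counts["C"]
        props["gc_percent"] = round(100.0 * gc / len(seq), 2) if seq else 0.0
    return props


print(basic_properties("ATGCGATCGATCG"))
```

A sequence made only of A, C, and G is ambiguous between DNA and RNA; this sketch reports it as DNA, which is one reason to pass --seq-type explicitly when the type is known.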
Parameters:
--sequence: DNA/RNA/Protein sequence string (required)
--seq-type: auto, dna, rna, or protein (default: auto)
--explain: Include detailed explanation
--json: Output as JSON
Returns:
Translate DNA sequences to protein using genetic code tables:
# Standard genetic code (table 1)
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics translate-dna \
--sequence="ATGGCTAGCTAG"
# Mitochondrial genetic code (table 2)
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics translate-dna \
--sequence="ATGGCTAGCTAG" --table=2
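To show how the codon table affects the result, here is a minimal plain-Python sketch of codon-by-codon translation. It uses NCBI table 1 (Standard) and table 2 (Vertebrate Mitochondrial), which differ in four codons. The function name, the "X" placeholder for unrecognised codons, and the choice to keep stop codons as "*" are assumptions of this sketch, not documented plugin behaviour.

```python
BASES = "TCAG"
# NCBI table 1 amino acids, listed for codons enumerated with bases in T, C, A, G order.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD = dict(zip(CODONS, STANDARD_AAS))
# Table 2 (vertebrate mitochondrial) reassigns four codons relative to table 1.
VERTEBRATE_MITO = {**STANDARD, "TGA": "W", "ATA": "M", "AGA": "*", "AGG": "*"}
TABLES = {1: STANDARD, 2: VERTEBRATE_MITO}


def translate(dna: str, table: int = 1) -> str:
    """Translate frame 0 of `dna`; trailing bases that do not fill a codon are ignored."""
    dna = dna.upper()
    codon_map = TABLES[table]
    usable = len(dna) - len(dna) % 3
    return "".join(codon_map.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


print(translate("ATGGCTAGCTAG"))       # MAS*
print(translate("ATGATATGA", table=1))  # MI*
print(translate("ATGATATGA", table=2))  # MMW
```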
Parameters:
--sequence: DNA sequence to translate (required)
--table: Codon table (1=Standard, 2=Mitochondrial, default: 1)
--json: Output as JSON
Returns:
Find Open Reading Frames (ORFs) in DNA sequences:
# Find ORFs with default 100bp minimum
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics find-orfs \
--sequence="ATGAAATAG..." --min-length=100
# Find shorter ORFs (50bp minimum)
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics find-orfs \
--sequence="ATGAAATAG..." --min-length=50
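The effect of --min-length is easiest to see in a small ORF scanner. The sketch below is a simplified stand-in, not the plugin's code: it scans only the three forward frames, treats an ORF as ATG through the first in-frame stop codon, and counts the stop codon in the length. Whether the plugin also scans the reverse strand or measures length differently is not stated here.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_length: int = 100) -> list:
    """Return forward-strand ORFs (ATG ... stop) of at least `min_length` nucleotides."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the ORF
            elif start is not None and codon in STOPS:
                end = i + 3  # include the stop codon
                if end - start >= min_length:
                    orfs.append({"frame": frame, "start": start,
                                 "end": end, "sequence": dna[start:end]})
                start = None
    return sorted(orfs, key=lambda orf: orf["start"])


print(find_orfs("CCATGAAATGA", min_length=9))
print(find_orfs("CCATGAAATGA", min_length=100))  # too short, so nothing is reported
```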
Parameters:
--sequence: DNA sequence to search (required)
--min-length: Minimum ORF length in nucleotides (default: 100)
--json: Output as JSON
Returns:
Compute detailed protein properties:
# Full protein analysis with explanation
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics analyze-protein \
--sequence="MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSH" \
--explain
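The exact set of properties the command reports is not listed here, so the following is only an assumed sample of typical ones: amino acid composition, GRAVY (mean Kyte-Doolittle hydropathy), and aromaticity (fraction of F, W, Y). The function name and output keys are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def protein_summary(seq: str) -> dict:
    """Composition, GRAVY, and aromaticity for a sequence of the 20 standard residues."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    n = len(seq)
    return {
        "length": n,
        "composition_percent": {aa: round(100.0 * c / n, 1)
                                for aa, c in sorted(counts.items())},
        # Positive GRAVY suggests a hydrophobic protein, negative a hydrophilic one.
        "gravy": round(sum(KD[aa] for aa in seq) / n, 3),
        "aromaticity": round(sum(counts[aa] for aa in "FWY") / n, 3),
    }


print(protein_summary("MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSH"))
```

Non-standard letters raise a KeyError in this sketch; a production tool would need a policy for them.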
Parameters:
--sequence: Protein sequence (one-letter amino acid codes, required)
--explain: Include detailed explanation
--json: Output as JSON
Returns:
Interpretation:
Search for protein domains/motifs using PROSITE-like patterns:
# Search for common domains
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics find-domains \
--sequence="MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSH"
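A PROSITE-style pattern can be matched by translating it into a regular expression. The sketch below shows that translation for the common syntax elements (x, [..], {..}, repeat counts, and the < and > anchors) and uses the N-glycosylation pattern N-{P}-[ST]-{P} as an example. Which patterns the plugin actually searches for is not listed here, and the helper names are hypothetical.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a PROSITE pattern such as 'C-x(2,4)-C' into a Python regex string."""
    regex = pattern.rstrip(".").replace("-", "")
    regex = regex.replace("x", ".")                       # x: any residue
    regex = regex.replace("{", "[^").replace("}", "]")    # {P}: any residue except P
    regex = regex.replace("(", "{").replace(")", "}")     # x(2,4): repeat count
    regex = regex.replace("<", "^").replace(">", "$")     # N- and C-terminal anchors
    return regex


def find_motif(sequence: str, pattern: str) -> list:
    """Return (1-based position, matched text) pairs, including overlapping hits."""
    lookahead = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence.upper())]


print(prosite_to_regex("N-{P}-[ST]-{P}"))
print(find_motif("MNGTANVSP", "N-{P}-[ST]-{P}"))
```

The lookahead wrapper is a design choice that reports overlapping matches, which a plain `finditer` would skip.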
Parameters:
--sequence: Protein sequence to search (required)
--json: Output as JSON
Returns:
Searches for:
Perform pairwise sequence alignment:
# Global alignment (Needleman-Wunsch)
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics align-sequences \
--seq1="ACGT" --seq2="AGCT" --alignment-type=global
# Local alignment (Smith-Waterman)
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics align-sequences \
--seq1="ACGTACGTACGT" --seq2="CGTACGTA" --alignment-type=local
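The difference between the two alignment types shows up in how the dynamic-programming matrix is initialised and read out. The sketch below computes only the optimal score, with no traceback, using a linear gap penalty. The scoring values (match +1, mismatch -1, gap -1) are assumptions for illustration; the plugin's scoring scheme is not stated here.

```python
def alignment_score(a: str, b: str, mode: str = "global",
                    match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Optimal Needleman-Wunsch (global) or Smith-Waterman (local) score."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if mode == "global":
        # Global alignment pays for leading gaps; local alignment starts anywhere for free.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if mode == "local":
                cell = max(cell, 0)       # never extend a negative-scoring prefix
                best = max(best, cell)    # best cell anywhere in the matrix
            score[i][j] = cell
    return best if mode == "local" else score[-1][-1]


print(alignment_score("ACGT", "AGCT", mode="global"))                 # 1
print(alignment_score("ACGTACGTACGT", "CGTACGTA", mode="local"))      # 8
```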
Parameters:
--seq1: First sequence (required)
--seq2: Second sequence (required)
--alignment-type: global or local (default: global)
--json: Output as JSON
Returns:
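Alignment Types: global (Needleman-Wunsch) aligns both sequences end to end; local (Smith-Waterman) finds the best-matching subregions.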
Load and analyze PDB structure files:
# Load PDB structure
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics load-structure \
--pdb-file="/path/to/1abc.pdb"
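For orientation, a PDB file stores atoms as fixed-column ATOM/HETATM records. The sketch below slices those columns by hand and reports chains, residue count, and atom count. It is a simplified stand-in (the plugin presumably relies on Biopython's PDB parser), and the function names and summary keys are hypothetical.

```python
SAMPLE = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842",
    "ATOM      3  N   GLY B   7      10.000  11.000  12.000",
]


def parse_atom(line: str) -> dict:
    """Slice one ATOM/HETATM record using the fixed PDB column layout."""
    return {
        "name": line[12:16].strip(),      # columns 13-16: atom name
        "resname": line[17:20].strip(),   # columns 18-20: residue name
        "chain": line[21],                # column 22: chain ID
        "resseq": int(line[22:26]),       # columns 23-26: residue number
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


def summarize(lines) -> dict:
    """Count chains, residues, and atoms among the ATOM/HETATM records."""
    atoms = [parse_atom(l) for l in lines if l.startswith(("ATOM", "HETATM"))]
    residues = {(a["chain"], a["resseq"]) for a in atoms}
    chains = sorted({a["chain"] for a in atoms})
    return {"chains": chains, "n_residues": len(residues), "n_atoms": len(atoms)}


print(summarize(SAMPLE))
```

For a real file, the same function can be fed an open file handle, since it only iterates over lines. Insertion codes, alternate locations, and multiple models are ignored in this sketch.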
Parameters:
--pdb-file: Path to PDB structure file (required)
--json: Output as JSON
Returns:
Find atomic contacts within a structure:
# Find contacts within 4.0 Å
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics find-contacts \
--pdb-file="/path/to/1abc.pdb" --cutoff=4.0
# Find contacts in specific chain
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics find-contacts \
--pdb-file="/path/to/1abc.pdb" --cutoff=5.0 --chain=A
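Conceptually, a contact search is a distance filter over atom pairs. The brute-force sketch below illustrates the roles of --cutoff and --chain on a toy atom list. The atom dictionaries, the exclusion of pairs within the same residue, and the output tuple layout are assumptions of this example; for real structures a spatial index such as Biopython's `NeighborSearch` avoids the quadratic loop.

```python
from itertools import combinations
from math import dist


def find_contacts(atoms, cutoff=4.0, chain=None):
    """Return atom pairs from different residues that lie within `cutoff` angstroms."""
    if chain is not None:
        atoms = [a for a in atoms if a["chain"] == chain]
    contacts = []
    for a, b in combinations(atoms, 2):
        if (a["chain"], a["resseq"]) == (b["chain"], b["resseq"]):
            continue  # skip pairs inside one residue (an assumption of this sketch)
        d = dist(a["xyz"], b["xyz"])
        if d <= cutoff:
            contacts.append((a["chain"], a["resseq"], a["name"],
                             b["chain"], b["resseq"], b["name"], round(d, 2)))
    return contacts


ATOMS = [
    {"chain": "A", "resseq": 1, "name": "CA", "xyz": (0.0, 0.0, 0.0)},
    {"chain": "A", "resseq": 2, "name": "CA", "xyz": (3.0, 0.0, 0.0)},
    {"chain": "B", "resseq": 5, "name": "CA", "xyz": (10.0, 0.0, 0.0)},
]
print(find_contacts(ATOMS, cutoff=4.0))
print(find_contacts(ATOMS, cutoff=8.0, chain="A"))
```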
Parameters:
--pdb-file: Path to PDB structure file (required)
--cutoff: Distance cutoff in Angstroms (default: 4.0)
--chain: Filter by chain ID (optional)
--json: Output as JSON
Returns:
Analyze residues around a potential binding site:
# Analyze 8Å radius around residue 100 in chain A
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics analyze-binding-site \
--pdb-file="/path/to/1abc.pdb" --center-residue=100 --chain=A --radius=8.0
# Larger binding site (10Å)
/home/mikeb/theory2/.venv/bin/theory --json bioinformatics analyze-binding-site \
--pdb-file="/path/to/1abc.pdb" --center-residue=150 --chain=B --radius=10.0
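One plausible reading of "residues around a binding site" is every residue with at least one atom within --radius of any atom of the center residue. The sketch below implements that reading on a toy atom list; the distance definition, the function name, and the output layout are assumptions, not the plugin's documented behaviour.

```python
from math import dist


def residues_near(atoms, center_resseq, chain="A", radius=8.0):
    """List residues with any atom within `radius` angstroms of the center residue."""
    center = [a["xyz"] for a in atoms
              if a["chain"] == chain and a["resseq"] == center_resseq]
    if not center:
        raise ValueError(f"residue {center_resseq} not found in chain {chain}")
    nearby = {}
    for a in atoms:
        key = (a["chain"], a["resseq"], a["resname"])
        if key[:2] == (chain, center_resseq):
            continue  # the center residue itself is not part of its neighbourhood
        d = min(dist(a["xyz"], c) for c in center)
        if d <= radius:
            nearby[key] = min(d, nearby.get(key, d))  # keep the closest approach
    return sorted(nearby.items(), key=lambda item: item[1])


ATOMS = [
    {"chain": "A", "resseq": 100, "resname": "HIS", "xyz": (0.0, 0.0, 0.0)},
    {"chain": "A", "resseq": 101, "resname": "GLY", "xyz": (5.0, 0.0, 0.0)},
    {"chain": "A", "resseq": 102, "resname": "SER", "xyz": (0.0, 0.0, 7.5)},
    {"chain": "B", "resseq": 150, "resname": "ASP", "xyz": (0.0, 9.0, 0.0)},
]
print(residues_near(ATOMS, 100, chain="A", radius=8.0))
print(residues_near(ATOMS, 100, chain="A", radius=10.0))
```

Neighbours from other chains are kept here, since binding sites often span chain interfaces; --chain is used only to locate the center residue.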
Parameters:
--pdb-file: Path to PDB structure file (required)
--center-residue: Central residue number (required)
--chain: Chain ID (default: A)
--radius: Search radius in Angstroms (default: 8.0)
--json: Output as JSON
Returns:
All commands return structured JSON:
{
"status": "success",
"result": {
// Command-specific results
},
"metadata": {
"tool": "Biopython",
"method": "Analysis method",
"timestamp": "ISO-8601",
"duration_ms": 123,
"gpu_used": false
},
"provenance": {
"method": "Description",
"inputs": {...},
"library": "Bio.X.Y"
},
"next_actions": [
"Suggested next steps"
]
}
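A caller might consume this envelope as sketched below: run the command, parse stdout as JSON, and hand back only the "result" object when "status" is "success". The helper names are hypothetical, the sketch assumes `theory` is on PATH, and the shape of a failure response is not documented here, so any status other than "success" is simply treated as an error.

```python
import json
import subprocess


def unwrap(raw: str) -> dict:
    """Parse the JSON envelope and return its 'result', or raise on a non-success status."""
    payload = json.loads(raw)
    if payload.get("status") != "success":
        raise RuntimeError(f"theory reported status {payload.get('status')!r}")
    return payload["result"]


def run_bioinformatics(operation: str, **options) -> dict:
    """Run `theory --json bioinformatics <operation>`, mapping keywords to --flags."""
    argv = ["theory", "--json", "bioinformatics", operation]
    argv += [f"--{name.replace('_', '-')}={value}" for name, value in options.items()]
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return unwrap(completed.stdout)


# Parsing a canned response; the "length" field is made up for the example.
print(unwrap('{"status": "success", "result": {"length": 13}}'))
```

With the plugin installed, a call such as `run_bioinformatics("find-orfs", sequence="ATGAAATAG", min_length=50)` would build the same command line as the shell examples.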
All bioinformatics commands require Biopython:
# Install Biopython
uv pip install biopython
Example workflow, from genomic DNA to protein domains:
# 1. Find ORFs in genomic DNA
theory --json bioinformatics find-orfs \
--sequence="..." --min-length=300
# 2. Translate ORF to protein
theory --json bioinformatics translate-dna \
--sequence="ORF_SEQUENCE"
# 3. Analyze protein properties
theory --json bioinformatics analyze-protein \
--sequence="TRANSLATED_PROTEIN" --explain
# 4. Find functional domains
theory --json bioinformatics find-domains \
--sequence="TRANSLATED_PROTEIN"
Example workflow, comparing two sequences:
# 1. Analyze both sequences
theory --json bioinformatics analyze-sequence --sequence="SEQ1"
theory --json bioinformatics analyze-sequence --sequence="SEQ2"
# 2. Align sequences
theory --json bioinformatics align-sequences \
--seq1="SEQ1" --seq2="SEQ2" --alignment-type=global
Example workflow, structure and binding site analysis:
# 1. Load structure
theory --json bioinformatics load-structure --pdb-file="protein.pdb"
# 2. Identify binding site
theory --json bioinformatics analyze-binding-site \
--pdb-file="protein.pdb" --center-residue=100 --radius=10
# 3. Find contacts within binding site
theory --json bioinformatics find-contacts \
--pdb-file="protein.pdb" --cutoff=5.0 --chain=A
Python API usage:
from bioinformatics.sequence_analysis import (
analyze_sequence,
translate_dna,
find_orfs,
align_sequences
)
from bioinformatics.protein_analysis import (
analyze_protein,
find_domains
)
from bioinformatics.structure_analysis import (
load_pdb_structure,
find_contacts,
analyze_binding_site
)
# Sequence analysis
seq_result = analyze_sequence("ATGCGATCG", seq_type="dna")
# Protein analysis
protein_result = analyze_protein("MVLSPADK")
# Structure analysis
structure = load_pdb_structure("protein.pdb")
contacts = find_contacts("protein.pdb", cutoff=4.0)