From sciagent-skills
Performs COBRA analysis on genome-scale metabolic models with COBRApy: FBA, FVA, knockouts, flux sampling, production envelopes, gapfilling, media optimization for strain design and essential gene identification.
npx claudepluginhub jaechang-hits/sciagent-skills --plugin sciagent-skillsThis skill uses the workspace's default tool permissions.
COBRApy is a Python package for constraint-based reconstruction and analysis (COBRA) of genome-scale metabolic models. It provides flux balance analysis (FBA), flux variability analysis (FVA), gene and reaction knockout screens, flux sampling, production envelopes, gapfilling, and media optimization on SBML-format metabolic networks.
Loads, analyzes, and simulates metabolic models with COBRApy: FBA, FVA, gene knockouts, flux sampling, SBML/JSON/YAML I/O for systems biology.
Perform constraint-based metabolic modeling with COBRApy: FBA, FVA, gene knockouts, flux sampling on SBML models for systems biology and metabolic engineering.
Build, read, validate, and modify SBML biological network models using libSBML Python API. Supports Levels 1–3, reactions, FBC for flux balance, integrates with COBRApy, Tellurium for ODE/metabolic/signaling models.
Share bugs, ideas, or general feedback.
COBRApy is a Python package for constraint-based reconstruction and analysis (COBRA) of genome-scale metabolic models. It provides flux balance analysis (FBA), flux variability analysis (FVA), gene and reaction knockout screens, flux sampling, production envelopes, gapfilling, and media optimization on SBML-format metabolic networks.
cobra (includes GLPK solver), numpy, pandaspip install cobra
from cobra.io import load_model
model = load_model("textbook") # E. coli core model
print(f"Model: {model.id} — {len(model.reactions)} rxns, {len(model.metabolites)} mets, {len(model.genes)} genes")
solution = model.optimize()
print(f"Growth rate: {solution.objective_value:.4f} /h")
print(f"Status: {solution.status}")
# Model: e_coli_core — 95 rxns, 72 mets, 137 genes
# Growth rate: 0.8739 /h
# Status: optimal
Load bundled models and read/write standard formats.
from cobra.io import load_model, read_sbml_model, write_sbml_model, load_json_model, save_json_model
# Bundled: "textbook" (95 rxns), "ecoli" (2583 rxns), "salmonella"
model = load_model("textbook")
# model = read_sbml_model("my_model.xml") # from SBML file
# model = load_json_model("my_model.json") # from JSON file
write_sbml_model(model, "output_model.xml")
save_json_model(model, "output_model.json")
print(f"Saved model: {model.id}")
Access reactions, metabolites, and genes via DictList containers.
from cobra.io import load_model
model = load_model("textbook")
# Inspect a reaction
rxn = model.reactions.get_by_id("PFK")
print(f"Reaction: {rxn.id} — {rxn.name}")
print(f"Equation: {rxn.reaction}")
print(f"Bounds: {rxn.bounds}, GPR: {rxn.gene_reaction_rule}")
# Inspect a metabolite
met = model.metabolites.get_by_id("atp_c")
print(f"Metabolite: {met.id}, Formula: {met.formula}, Compartment: {met.compartment}")
# Query and list exchange reactions
atp_rxns = model.reactions.query("atp", attribute="name")
print(f"ATP-related reactions: {len(atp_rxns)}, Exchange reactions: {len(model.exchanges)}")
Predict optimal flux distributions by maximizing an objective.
from cobra.io import load_model
from cobra.flux_analysis import pfba
model = load_model("textbook")
# Standard FBA
solution = model.optimize()
print(f"Growth: {solution.objective_value:.4f} /h, Active fluxes: {(solution.fluxes.abs() > 1e-6).sum()}")
# Parsimonious FBA — same growth, minimal total flux
pfba_sol = pfba(model)
print(f"pFBA total flux: {pfba_sol.fluxes.abs().sum():.1f} vs standard: {solution.fluxes.abs().sum():.1f}")
# Change objective; slim_optimize for speed
from cobra.io import load_model
model = load_model("textbook")
with model:
model.objective = "ATPM"
print(f"Max ATPM flux: {model.optimize().objective_value:.2f}")
print(f"Growth (slim): {model.slim_optimize():.4f}") # no flux vector, faster
Determine feasible flux ranges at or near optimality.
from cobra.io import load_model
from cobra.flux_analysis import flux_variability_analysis
model = load_model("textbook")
fva = flux_variability_analysis(model, fraction_of_optimum=1.0)
fva_90 = flux_variability_analysis(model, fraction_of_optimum=0.9)
fva["range"] = fva["maximum"] - fva["minimum"]
fva_90["range"] = fva_90["maximum"] - fva_90["minimum"]
print(f"Mean range at 100%: {fva['range'].mean():.2f}, at 90%: {fva_90['range'].mean():.2f}")
# Loopless FVA on specific reactions
from cobra.io import load_model
from cobra.flux_analysis import flux_variability_analysis
model = load_model("textbook")
fva_ll = flux_variability_analysis(
model, loopless=True, reaction_list=["PFK", "PGI", "FBA", "TPI", "GAPD"],
)
print(fva_ll)
Screen for essential genes/reactions via knockout simulations.
from cobra.io import load_model
from cobra.flux_analysis import single_gene_deletion, double_gene_deletion
model = load_model("textbook")
wt_growth = model.slim_optimize()
# Single gene deletions
gene_results = single_gene_deletion(model)
gene_results["growth_fraction"] = gene_results["growth"] / wt_growth
essential = gene_results[gene_results["growth_fraction"] < 0.01]
print(f"Essential genes: {len(essential)} / {len(model.genes)}")
# Double deletions (synthetic lethality) — use multiprocessing
double_results = double_gene_deletion(model, processes=4)
print(f"Double deletion results: {double_results.shape}")
Modify nutrient availability and compute minimal media.
from cobra.io import load_model
from cobra.medium import minimal_medium
model = load_model("textbook")
# View current medium
for rxn_id, flux in sorted(model.medium.items()):
print(f" {rxn_id}: {flux}")
# Anaerobic switch via context manager
with model:
medium = model.medium
medium["EX_o2_e"] = 0.0
model.medium = medium
print(f"Anaerobic growth: {model.slim_optimize():.4f} /h")
# Minimal medium
min_med = minimal_medium(model, minimize_components=True, open_exchanges=True)
print(f"Minimal medium: {len(min_med)} components")
Sample feasible flux distributions for variability analysis.
from cobra.io import load_model
from cobra.sampling import sample
model = load_model("textbook")
samples = sample(model, n=500, method="optgp")
print(f"Samples shape: {samples.shape}") # (500, n_reactions)
print(f"PFK flux: mean={samples['PFK'].mean():.2f}, std={samples['PFK'].std():.2f}")
Compute phenotype phase planes and fill model gaps.
from cobra.io import load_model
from cobra.flux_analysis import production_envelope
model = load_model("textbook")
envelope = production_envelope(
model,
reactions=model.reactions.get_by_id("EX_ac_e"),
carbon_sources=model.reactions.get_by_id("EX_glc__D_e"),
)
print(f"Envelope: {len(envelope)} points")
print(envelope[["flux_minimum", "flux_maximum", "carbon_yield_minimum", "carbon_yield_maximum"]].head())
# Gapfilling: restore growth after reaction removal
from cobra.io import load_model
from cobra.flux_analysis.gapfilling import gapfill
model = load_model("textbook")
universal = load_model("textbook") # In practice, use a universal reaction DB
with model:
model.remove_reactions([model.reactions.get_by_id("PFK")])
print(f"Growth after removing PFK: {model.slim_optimize():.4f}")
for rxn in gapfill(model, universal)[0]:
print(f" Gapfill suggests: {rxn.id}")
Reactions, metabolites, and genes are stored in DictList — ordered, indexable, and accessible by ID.
rxn = model.reactions[0] # by index
rxn = model.reactions.get_by_id("PFK") # by ID
matches = model.reactions.query("phospho") # keyword search
EX_ prefix reactions represent system boundary. Positive flux = secretion; negative = uptake. Managed via model.medium dict.
Boolean expressions linking genes to reactions: (b0726 and b0727) or b1234. Gene knockout propagates through GPR logic.
with model: snapshots state and reverts all changes on exit (bounds, objective, medium, knockouts). Nesting supported.
Goal: Identify essential, growth-reducing, and neutral genes.
from cobra.io import load_model
from cobra.flux_analysis import single_gene_deletion
model = load_model("textbook")
wt_growth = model.slim_optimize()
results = single_gene_deletion(model)
results["growth_fraction"] = results["growth"] / wt_growth
essential = results[results["growth_fraction"] < 0.01]
reduced = results[(results["growth_fraction"] >= 0.01) & (results["growth_fraction"] < 0.9)]
neutral = results[results["growth_fraction"] >= 0.9]
print(f"Essential: {len(essential)}, Reduced: {len(reduced)}, Neutral: {len(neutral)}")
for idx in essential.index:
print(f" Essential gene: {list(idx)[0]}")
Goal: Find minimal medium at different growth targets; compare aerobic vs anaerobic.
from cobra.io import load_model
from cobra.medium import minimal_medium
import pandas as pd
model = load_model("textbook")
results = []
for frac in [0.1, 0.5, 0.8, 1.0]:
with model:
target = model.slim_optimize() * frac
model.reactions.get_by_id("Biomass_Ecoli_core").lower_bound = target
try:
mm = minimal_medium(model, minimize_components=True, open_exchanges=True)
results.append({"growth_frac": frac, "n_components": len(mm)})
except Exception:
results.append({"growth_frac": frac, "n_components": None})
print(pd.DataFrame(results).to_string(index=False))
for label, o2 in [("Aerobic", 1000.0), ("Anaerobic", 0.0)]:
with model:
medium = model.medium
medium["EX_o2_e"] = o2
model.medium = medium
print(f"{label} growth: {model.slim_optimize():.4f} /h")
Goal: Design a strain with maximized target metabolite production. Combines modules 3, 5, and 8.
| Parameter | Module/Function | Default | Range / Options | Effect |
|---|---|---|---|---|
fraction_of_optimum | flux_variability_analysis | 1.0 | 0.0-1.0 | Fraction of max objective to maintain; lower = wider flux ranges |
loopless | flux_variability_analysis | False | True, False | Eliminate thermodynamically infeasible loops; slower |
method | sample | "optgp" | "optgp", "achr" | Sampling algorithm; optgp supports parallelism |
n | sample | required | 100-10000 | Number of flux samples to draw |
processes | sample, double_gene_deletion | 1 | 1-N_cores | Parallel worker processes |
minimize_components | minimal_medium | False | True, False | True = fewest nutrients (MILP); False = minimize total flux |
open_exchanges | minimal_medium | False | True, False | Allow all exchanges as nutrient candidates |
carbon_sources | production_envelope | None | Reaction object | Compute carbon yield alongside flux envelope |
thinning | sample | 100 | 1-1000 | Steps between kept samples; higher = less correlated |
Use context managers for temporary changes: with model: reverts all modifications on exit.
with model:
model.reactions.PFK.knock_out()
print(model.slim_optimize()) # modified
# model is restored here
Validate with slim_optimize() before analysis: Quick feasibility check before expensive operations (FVA, sampling).
Check solution.status after optimization: Always verify "optimal" before interpreting fluxes.
Use loopless FVA when thermodynamic feasibility matters: Standard FVA can include infeasible internal cycles that inflate flux ranges.
Parallelize expensive operations: Sampling and double deletions support processes parameter.
Prefer SBML for model exchange: Community standard supported by all COBRA tools.
Use slim_optimize() in loops: Skips full flux vector construction, significantly faster for screening.
Validate flux samples: Use sampler.validate(samples) to check stoichiometric and bound constraints.
from cobra.io import load_model
model = load_model("textbook")
print("Glucose_uptake | Growth_rate")
for glc_uptake in [1, 2, 5, 10, 15, 20]:
with model:
model.reactions.get_by_id("EX_glc__D_e").lower_bound = -glc_uptake
growth = model.slim_optimize()
print(f" {glc_uptake:>13} | {growth:.4f}")
from cobra.io import load_model
import pandas as pd
model = load_model("textbook")
conditions = [
{"name": "Rich aerobic", "EX_o2_e": 1000, "EX_glc__D_e": 10},
{"name": "Anaerobic", "EX_o2_e": 0, "EX_glc__D_e": 10},
{"name": "Low glucose", "EX_o2_e": 1000, "EX_glc__D_e": 1},
]
results = []
for c in conditions:
with model:
medium = model.medium
medium["EX_o2_e"], medium["EX_glc__D_e"] = c["EX_o2_e"], c["EX_glc__D_e"]
model.medium = medium
results.append({"condition": c["name"], "growth": round(model.slim_optimize(), 4)})
print(pd.DataFrame(results).to_string(index=False))
from cobra.io import load_model
from cobra.flux_analysis import find_blocked_reactions
model = load_model("textbook")
# Feasibility, mass balance, dead-ends, blocked reactions
print(f"Growth feasible: {model.slim_optimize() > 0}")
print(f"Missing formula: {sum(1 for m in model.metabolites if m.formula is None)}")
print(f"Dead-end metabolites: {sum(1 for m in model.metabolites if len(m.reactions) == 1)}")
print(f"Blocked reactions: {len(find_blocked_reactions(model))} / {len(model.reactions)}")
| Problem | Cause | Solution |
|---|---|---|
solution.status == "infeasible" | Constraints cannot be simultaneously satisfied | Check medium has required nutrients; verify reaction bounds; use model.medium to restore defaults |
solution.status == "unbounded" | No upper bound on fluxes | Set finite upper bounds on exchange reactions |
| Very slow optimization | Large model + default GLPK solver | Install CPLEX or Gurobi: model.solver = "cplex" |
ValueError setting bounds | lower_bound > upper_bound temporarily | Set as tuple: rxn.bounds = (new_lb, new_ub) |
| Gene deletion returns NaN | Knockout makes model infeasible | Expected for essential genes; classify as essential |
IOError reading SBML | Invalid SBML or missing namespace | Validate at sbml.org; try cobra.io.sbml.validate_sbml_model(path) |
| Flux samples fail validation | Numerical solver tolerance | Increase thinning parameter; try method="achr" |
1 reference file:
references/api_workflows.md — Consolidates API quick reference and advanced workflows. Covers: detailed function signatures, solver configuration (GLPK/CPLEX/Gurobi), advanced analysis (find_blocked_reactions, find_essential_genes/find_essential_reactions), model manipulation (adding reactions/metabolites/genes), flux sample validation, and 5 workflow examples (knockout with visualization, media design, flux space exploration, production strain design, model validation). Relocated inline: basic FBA/FVA/deletion/sampling (Core API modules 3-7). Omitted: geometric FBA internals, MIP gap configuration — consult COBRApy docs.