Comprehensive guide for marimo - reactive Python notebooks as pure .py files, uv integration, AI-friendly architecture, reproducible data science workflows, and serverless deployment with WASM
marimo is a next-generation reactive notebook for Python that solves Jupyter's fundamental problems: hidden state, poor Git integration, and lack of reproducibility. It stores notebooks as pure `.py` files with reactive execution based on a DAG (Directed Acyclic Graph), making it ideal for AI-assisted development and production-ready data science workflows.
Key Benefits:
- Pure .py file format: Git-friendly diffs and easy code review
- Reactive DAG-based execution: no hidden state, reproducible by construction
- Deep integration with the uv package manager for sandboxed environments
- AI-friendly structure for tools like Claude Code and Cursor
- Serverless deployment via WASM export
Unlike Jupyter's imperative execution (run cells in any order), marimo uses reactive programming:
How it works: marimo statically analyzes the variables each cell defines and references, builds a DAG from those relationships, and automatically re-runs every downstream cell when one of its inputs changes.
Example:
# Cell 1: Load data
import polars as pl
df = pl.read_csv("data.csv")
# Cell 2: Process (depends on df)
df_clean = df.drop_nulls()
# Cell 3: Visualize (depends on df_clean)
import altair as alt
chart = alt.Chart(df_clean).mark_line().encode(x='date', y='sales')
If you modify Cell 1, marimo automatically re-runs Cells 2 and 3 in the correct order. No manual re-running needed.
marimo notebooks are pure Python scripts, not JSON:
import marimo

__generated_with = "0.9.0"
app = marimo.App()

@app.cell
def __():
    import marimo as mo
    mo.md("# Data Analysis")
    return mo,

@app.cell
def __():
    import polars as pl
    df = pl.read_csv("data.csv")
    return df, pl

@app.cell
def __(df):
    import altair as alt
    chart = alt.Chart(df).mark_bar().encode(x='category', y='sales')
    return alt, chart
Benefits:
- Clean Git diffs and reviewable PRs
- Importable as a module, runnable as a script
- Easy for AI tools to read and edit
marimo enforces clean variable management: each global variable may be defined in exactly one cell, and deleting a cell removes its variables from program memory.
This eliminates Jupyter's "zombie variables" problem where deleted cells leave variables in memory.
# With uv (recommended)
uv tool install marimo
# With pip
pip install marimo
# With pipx
pipx install marimo
# Create and edit new notebook
marimo new notebook.py
# Edit existing notebook
marimo edit notebook.py
# Run as app (hides code)
marimo run notebook.py
# Convert .ipynb to marimo .py
marimo convert notebook.ipynb -o notebook.py
# Then fix any dependency issues
marimo edit notebook.py
marimo has deep integration with the uv package manager for sandboxed, reproducible environments.
marimo uses PEP 723 to embed dependencies in the notebook file:
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "marimo",
# "polars==1.7.1",
# "altair==5.4.1",
# "duckdb==1.1.3",
# ]
# ///
import marimo
# ... rest of notebook
Benefits:
- Dependencies travel with the notebook: share a single .py file and the receiver gets the correct environment
- No separate requirements.txt to keep in sync

Run notebooks in isolated environments with automatic dependency management:
# Create sandbox environment and run
marimo edit --sandbox notebook.py
What happens:
- uv creates a temporary virtual environment (fast!)
- Dependencies from the PEP 723 metadata are installed into it automatically
- The notebook runs isolated from your system Python

Auto-import detection: when you write import pandas, marimo:
- Adds pandas to the PEP 723 metadata
- Installs it with uv on the fly, so no manual pip install is needed

Generate a lock file for exact reproducibility:
# Generate a lock file from the PEP 723 metadata
uv pip compile --script notebook.py -o requirements.lock
# Install from lock file
uv pip sync requirements.lock
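Because the dependencies live inside the file itself, uv can also run the notebook directly as a script in an ephemeral environment (uv reads PEP 723 metadata natively):
# uv resolves the inline dependencies, builds a throwaway env, and runs the file
uv run notebook.py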
marimo is optimized for AI coding (Claude Code, Cursor, Copilot).
1. Pure Python format: LLMs read and edit plain .py files far more reliably than .ipynb JSON.
2. Structured code: every cell is an explicit function under the @app.cell decorator, with its dependencies visible in the signature.
3. File watching: the editor reloads when the .py file changes on disk.

Recommended setup:
# Terminal 1: Start marimo with file watching
marimo edit --watch notebook.py
# Terminal 2: Use Claude Code to edit
claude
Process:
1. Claude Code edits notebook.py directly (pure Python, easy for an LLM)
2. marimo detects the file change and re-runs the affected cells
3. You watch the results update live

This is called "Vibe Coding": watch the results while the AI writes the code.
Cursor excels at refactoring marimo notebooks:
Cursor understands @app.cell structure and makes precise edits.
marimo editor has data-aware AI:
# In marimo editor, AI knows df's schema!
df # Press Ctrl+Shift+I to ask AI
# AI sees: df has columns ['date', 'sales', 'category']
# You ask: "Plot monthly sales trend"
# AI generates: alt.Chart(df).encode(x='month(date):T', y='sum(sales):Q')
The AI has access to runtime state (DataFrame schemas, variable types), so it generates immediately executable code.
marimo provides reactive UI elements for building interactive notebooks:
import marimo as mo
# Slider
slider = mo.ui.slider(start=0, stop=100, value=50, label="Threshold")
# Dropdown
dropdown = mo.ui.dropdown(
options=["A", "B", "C"],
value="A",
label="Category"
)
# Text input
text = mo.ui.text(placeholder="Enter query...")
# Date picker
date = mo.ui.date(label="Start Date")
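A UI element renders when it is the last expression in a cell. To show several elements together, marimo's layout helpers such as mo.hstack can be used; a minimal sketch:
import marimo as mo

slider = mo.ui.slider(start=0, stop=100, value=50, label="Threshold")
dropdown = mo.ui.dropdown(options=["A", "B", "C"], value="A", label="Category")

# Arrange elements side by side; this expression is the cell's output
mo.hstack([slider, dropdown])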
UI elements automatically trigger re-execution:
# Cell 1: Create slider
import marimo as mo
threshold = mo.ui.slider(0, 100, value=50)
threshold
# Cell 2: Use slider value (automatically re-runs when slider changes)
import polars as pl
df_filtered = df.filter(pl.col("sales") > threshold.value)
df_filtered
Key point: threshold.value creates a dependency. When slider moves, this cell auto-updates.
# Interactive table with search, sort, filter
mo.ui.table(df, selection="multi")
# With custom formatters
mo.ui.table(
    df,
    format_mapping={
        "price": lambda x: f"${x:.2f}",
        "date": lambda x: x.strftime("%Y-%m-%d"),
    },
)
# Group multiple inputs; .form() gates updates behind a Submit button
form = mo.ui.dictionary({
    "model": mo.ui.dropdown(["linear", "tree", "neural"], value="linear"),
    "epochs": mo.ui.slider(1, 100, value=10),
    "lr": mo.ui.number(start=0.001, stop=0.1, step=0.001, value=0.01),
}).form()
form

# Access submitted values (None until the user clicks Submit)
if form.value is not None:
    model_type = form.value["model"]
    epochs = form.value["epochs"]
    learning_rate = form.value["lr"]
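If downstream cells should not run until the form is submitted, marimo's mo.stop helper can short-circuit execution; a minimal sketch:
# Halt this cell (and its descendants) until the form is submitted
mo.stop(form.value is None, mo.md("Submit the form to continue."))
model_type = form.value["model"]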
Motivation: Jupyter requires manual re-running when changing parameters.
marimo solution: Bind query parameters to UI elements.
# Cell 1: UI controls
import marimo as mo
date_range = mo.ui.date_range()
category = mo.ui.dropdown(["Electronics", "Clothing", "Food"])
# Cell 2: Query (auto-updates when UI changes)
import duckdb
query = f"""
SELECT date, SUM(sales) as total
FROM sales
WHERE date BETWEEN '{date_range.value[0]}' AND '{date_range.value[1]}'
AND category = '{category.value}'
GROUP BY date
"""
df = duckdb.sql(query).pl()
# Cell 3: Visualization (auto-updates)
import altair as alt
chart = alt.Chart(df).mark_line().encode(x='date', y='total')
Adjust date range or category → entire pipeline re-runs instantly.
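Note that f-string interpolation is vulnerable to SQL injection if values are user-supplied. A safer variant uses DuckDB's parameter binding (? placeholders with execute); a sketch assuming the same sales table:
import duckdb

# Parameter binding avoids quoting issues and injection
df = duckdb.execute(
    """
    SELECT date, SUM(sales) AS total
    FROM sales
    WHERE date BETWEEN ? AND ? AND category = ?
    GROUP BY date
    """,
    [date_range.value[0], date_range.value[1], category.value],
).pl()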
Motivation: Convert experiment notebook to production pipeline without refactoring.
marimo solution: Notebooks are parameterized Python scripts.
# notebook.py can be run as a CLI script:
#   python notebook.py --epochs 50 --model-type transformer
import marimo

app = marimo.App()

@app.cell
def __():
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--model-type", default="linear")
    args = parser.parse_args()
    return args,

@app.cell
def __(args):
    # Train model with args.epochs and args.model_type
    model = train_model(epochs=args.epochs, model_type=args.model_type)
    return model,

if __name__ == "__main__":
    app.run()
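marimo also provides a built-in helper, mo.cli_args(), which reads arguments passed after -- without conflicting with marimo's own argv; a minimal sketch (the default value here is illustrative):
import marimo as mo

# Reads args passed after "--", e.g.:
#   marimo edit notebook.py -- --epochs 100
args = mo.cli_args()
epochs = int(args.get("epochs", 10))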
No rewrite needed; the same file works for:
- Interactive editing: marimo edit notebook.py
- Batch execution: python notebook.py --epochs 100

Motivation: Streamlit re-runs the entire app on every interaction (slow for large data).
marimo solution: Only re-run changed cells (reactive).
# Build dashboard
import marimo as mo
# Cell 1: Load data once (doesn't re-run on interaction)
import polars as pl
df = pl.read_parquet("large_data.parquet") # 10GB file
# Cell 2: UI filters (cheap to re-run)
filters = mo.ui.dictionary({
    "region": mo.ui.dropdown(df["region"].unique().to_list()),
    "date": mo.ui.date_range(),
})
# Cell 3: Filter data (only re-runs when filters change)
df_filtered = df.filter(
    (pl.col("region") == filters.value["region"]) &
    (pl.col("date").is_between(*filters.value["date"]))
)
# Cell 4: Viz (only re-runs when df_filtered changes)
chart = create_chart(df_filtered)
Deploy as app:
marimo run dashboard.py # Code hidden, UI visible
Motivation: Share interactive analysis without server costs.
marimo solution: Export to WASM (runs in browser).
# Export to self-contained HTML with Python runtime
marimo export html-wasm notebook.py -o analysis.html
Upload analysis.html to GitHub Pages or S3: no server required, since Python runs entirely in the browser via WebAssembly (Pyodide).
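To sanity-check the export before uploading, serve it with any static file server, e.g. Python's built-in one:
python -m http.server 8000
# then open http://localhost:8000/analysis.html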
Defer expensive computations until needed:
# Cell 1: Define lazy query (not executed yet)
import duckdb
lazy_query = duckdb.sql("SELECT * FROM large_table WHERE ...")

# Cell 2: A button to trigger materialization
import marimo as mo
run = mo.ui.run_button(label="Run query")
run

# Cell 3: Only execute when the button is clicked
if run.value:
    result = lazy_query.pl()  # Materialize now
Cache expensive computations:
@app.cell
def __():
    import functools

    @functools.lru_cache
    def expensive_computation(param):
        result = param ** 2  # stand-in for heavy processing
        return result
    return expensive_computation,
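marimo also ships its own caching helpers, mo.cache (in-memory memoization) and mo.persistent_cache (on-disk); a minimal sketch with a stand-in computation:
import marimo as mo

@mo.cache  # memoized across re-runs, keyed on arguments
def expensive_computation(param):
    return param ** 2  # stand-in for heavy processing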
Import cells from other notebooks:
# utils.py (marimo notebook)
@app.cell
def load_data():
    import polars as pl
    return pl.read_csv("data.csv")

# analysis.py (another marimo notebook)
@app.cell
def __():
    from utils import load_data
    df = load_data()
    return df,
# Cell 1: Define model
import lightning as L

class Model(L.LightningModule):
    def __init__(self, lr):
        super().__init__()
        self.model = ...
        self.lr = lr

# Cell 2: Interactive LR tuning
import marimo as mo
lr_slider = mo.ui.slider(0.0001, 0.01, step=0.0001, value=0.001)

# Cell 3: Train (re-runs when LR changes)
model = Model(lr=lr_slider.value)
trainer = L.Trainer(max_epochs=10)
trainer.fit(model, train_loader)
# Cell 1: Initialize W&B
import wandb
run = wandb.init(project="marimo-demo")
# Cell 2: Log metrics reactively
wandb.log({"accuracy": model_accuracy, "loss": model_loss})
# Cell 1: Load Hydra config
from hydra import compose, initialize
with initialize(config_path="configs"):
    cfg = compose(config_name="config")
# Cell 2: Interactive config override
import marimo as mo
batch_size = mo.ui.slider(16, 128, value=cfg.batch_size)
# Cell 3: Use config
dataloader = DataLoader(dataset, batch_size=batch_size.value)
Best practices:
- Use the --sandbox flag for reproducibility
- Commit the .py format for clean PRs and code review
- Edit .py files with --watch mode for hot reload
- Use mo.md() for explanatory text
- Avoid globals(): it breaks static analysis

Problem: Same variable defined in multiple cells.
# Cell 1
x = 1
# Cell 2
x = 2 # Error!
Solution: Use unique names or consolidate logic.
# Cell 1
x_initial = 1
# Cell 2
x_processed = x_initial * 2
Problem: Large cells re-run too often.
Solution 1: Split into smaller cells (only changed parts re-run).
Solution 2: Use lazy evaluation or caching.
import functools
import pandas as pd

@functools.lru_cache
def load_large_data():
    return pd.read_parquet("huge.parquet")
Problem: Cell doesn't depend on UI element.
Solution: Reference ui_element.value to create dependency.
# Wrong (no dependency)
slider = mo.ui.slider(0, 100)
result = compute(50) # Hardcoded value
# Correct (reactive)
slider = mo.ui.slider(0, 100)
result = compute(slider.value) # Creates dependency
marimo convert notebook.ipynb -o notebook.py
Open in marimo and resolve any errors it flags (typically variables defined in multiple cells or out-of-order execution):
Add PEP 723 dependencies:
# /// script
# dependencies = ["pandas", "matplotlib"]
# ///
# Test in sandbox
marimo edit --sandbox notebook.py
# Verify reproducibility
rm -rf .venv
marimo edit --sandbox notebook.py # Should work fresh
| Feature | Jupyter | marimo | Advantage |
|---|---|---|---|
| Execution | Manual, any order | Automatic, DAG-based | marimo (reproducibility) |
| File format | JSON (.ipynb) | Python (.py) | marimo (Git, AI-friendly) |
| State management | Hidden state | Always synced | marimo (no stale-state bugs) |
| Package management | Manual pip/conda | uv integration | marimo (speed, reproducibility) |
| AI coding | Poor (JSON noise) | Excellent (pure Python) | marimo |
| Ecosystem | Massive, mature | Growing | Jupyter |
| Learning curve | Low | Medium | Jupyter |
Recommendation:
- New projects, AI-assisted workflows, and production pipelines: choose marimo; the .py format is optimal for LLMs
- Teams invested in the Jupyter ecosystem: weigh marimo's benefits against Jupyter's maturity

Create notebook:
marimo new notebook.py
Edit notebook:
marimo edit notebook.py
marimo edit --sandbox notebook.py # Isolated environment
marimo edit --watch notebook.py # Hot reload for AI editing
Run as app:
marimo run notebook.py # Hide code, show outputs
Convert from Jupyter:
marimo convert notebook.ipynb -o notebook.py
Export:
marimo export html notebook.py -o output.html
marimo export html-wasm notebook.py -o app.html # Serverless
marimo export script notebook.py -o script.py # Pure Python
Minimal notebook:
import marimo

app = marimo.App()

@app.cell
def __():
    import marimo as mo
    mo.md("# Hello, marimo!")
    return mo,

@app.cell
def __(mo):
    slider = mo.ui.slider(0, 100, value=50)
    slider
    return slider,

@app.cell
def __(slider):
    f"Value: {slider.value}"
    return
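Generated files end with a main guard, which is what makes the same file runnable as a plain Python script:
if __name__ == "__main__":
    app.run()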
marimo is the future of Python notebooks: reactive DAG execution, pure-Python files, uv-powered reproducibility, AI-friendly structure, and serverless WASM deployment.
For modern data science workflows that prioritize engineering quality, reproducibility, and AI collaboration, marimo is the superior choice over Jupyter.