baldrick

Autonomous AI coding loops that actually finish.

Baldrick

"I have a cunning plan!" - Baldrick, Blackadder

Quick Start

curl -sO https://raw.githubusercontent.com/jagreehal/baldrick/main/baldrick && chmod +x baldrick && ./baldrick init

Create a spec, run ./baldrick run 5, watch Claude implement it.

What Is This?

Based on Geoffrey Huntley's Ralph pattern — define what "done" looks like, run Claude repeatedly until it gets there. Baldrick adds structure:

Deterministic selection — bash picks tasks mechanically (priority/risk), prevents AI from skipping hard work
Redundant completion — 3 signals must agree (done file + token + all specs pass), prevents false victories
Feedback loops — build/test/lint every iteration, reviewer mode every Nth (configurable, default 5) checks progress and fixes issues, self-correcting
Full observability — progress.txt (what happened), tradeoffs.md (why), logs/ (raw data)
Works with any language — configure build/test/lint commands for your stack (see Configuration)

See design.md for the full philosophy.

Why Loops Work

Every AI coding session accumulates context: files read, commands run, wrong turns taken, half-baked plans. You can add but can't delete. Eventually it starts repeating itself, "fixing" the same bug different ways, confidently undoing its own work. That's context pollution.

Baldrick doesn't try to clean the context. It throws it away and starts fresh. Progress persists in files (specs, progress.txt, git). Failures evaporate with the session. Each iteration reconstructs reality from the filesystem, not chat history.

Getting Started

After running ./baldrick init, you'll have:

specs/                  # Your feature specs go here
templates/              # Spec templates (minimal, standard, detailed, example)
progress.txt            # Session progress log
baldrick-learnings.md   # Permanent patterns
tradeoffs.md            # Decision log (auto-created if missing)
logs/                   # Iteration logs (auto-created if missing)

Requires: Claude Code CLI (configured with appropriate permissions)

Now create a spec. This tells baldrick what "done" looks like:

Quick way:

./baldrick new hello --title "Add Hello Function"

Or manually:

cat > specs/hello.md << 'EOF'
---
title: "Add Hello Function"
passes: false
---
# Add Hello Function
## Done When
- [ ] Function `hello(name)` returns "Hello, {name}!"
- [ ] Tests pass
EOF

For better results, see Spec Templates or use Claude Code skills (spec-create, spec-vibe, spec-detailed)

passes: true means the spec's Done When criteria are met and quality gates (build/test/lint) are green.

Run it:

./baldrick run 1           # Run one iteration
./baldrick run 10 --verbose # Run with streaming output
./baldrick status          # Check progress

That's it. Claude implements the feature, runs quality gates (build/test/lint), and commits when all checks pass.

Run options:

--dry-run - Show what would run without executing
--stream-json - Use Claude stream-json + jq (requires jq)
--verbose or -v - Stream Claude output in real-time (instead of spinner)
--stream=PATH - Enable TUI event streaming (advanced)
--control=PATH - Enable TUI control socket (advanced)

For Non-Technical Users

Don't want to write specs manually? Use any AI assistant (ChatGPT, Gemini, Claude):

Chat about your feature - Describe what you want to build
Paste the generator template - Copy templates/spec-generator.md into your chat
Get a Baldrick spec - The AI generates a properly formatted spec
Save to specs folder - Copy output to specs/your-feature.md (Baldrick auto-discovers specs here)
Run Baldrick - ./baldrick run or ./baldrick once your-feature to run a specific spec

This bridges conversational AI (great for brainstorming) with Baldrick (great for execution).

Tip: If your feature is large, the AI will generate multiple specs. Save each one as a separate file.

The Development Cycle

baldrick

Autonomous AI coding loops that actually finish.

Baldrick

"I have a cunning plan!" - Baldrick, Blackadder

Quick Start

curl -sO https://raw.githubusercontent.com/jagreehal/baldrick/main/baldrick && chmod +x baldrick && ./baldrick init

Create a spec, run ./baldrick run 5, watch Claude implement it.

What Is This?

Based on Geoffrey Huntley's Ralph pattern — define what "done" looks like, run Claude repeatedly until it gets there. Baldrick adds structure:

Deterministic selection — bash picks tasks mechanically (priority/risk), prevents AI from skipping hard work
Redundant completion — 3 signals must agree (done file + token + all specs pass), prevents false victories
Feedback loops — build/test/lint every iteration, reviewer mode every Nth (configurable, default 5) checks progress and fixes issues, self-correcting
Full observability — progress.txt (what happened), tradeoffs.md (why), logs/ (raw data)
Works with any language — configure build/test/lint commands for your stack (see Configuration)

See design.md for the full philosophy.

Why Loops Work

Getting Started

After running ./baldrick init, you'll have:

specs/                  # Your feature specs go here
templates/              # Spec templates (minimal, standard, detailed, example)
progress.txt            # Session progress log
baldrick-learnings.md   # Permanent patterns
tradeoffs.md            # Decision log (auto-created if missing)
logs/                   # Iteration logs (auto-created if missing)

Requires: Claude Code CLI (configured with appropriate permissions)

Now create a spec. This tells baldrick what "done" looks like:

Quick way:

./baldrick new hello --title "Add Hello Function"

Or manually:

cat > specs/hello.md << 'EOF'
---
title: "Add Hello Function"
passes: false
---
# Add Hello Function
## Done When
- [ ] Function `hello(name)` returns "Hello, {name}!"
- [ ] Tests pass
EOF

For better results, see Spec Templates or use Claude Code skills (spec-create, spec-vibe, spec-detailed)

passes: true means the spec's Done When criteria are met and quality gates (build/test/lint) are green.

Run it:

./baldrick run 1           # Run one iteration
./baldrick run 10 --verbose # Run with streaming output
./baldrick status          # Check progress

That's it. Claude implements the feature, runs quality gates (build/test/lint), and commits when all checks pass.

Run options:

--dry-run - Show what would run without executing
--stream-json - Use Claude stream-json + jq (requires jq)
--verbose or -v - Stream Claude output in real-time (instead of spinner)
--stream=PATH - Enable TUI event streaming (advanced)
--control=PATH - Enable TUI control socket (advanced)

For Non-Technical Users

Don't want to write specs manually? Use any AI assistant (ChatGPT, Gemini, Claude):

Chat about your feature - Describe what you want to build
Paste the generator template - Copy templates/spec-generator.md into your chat
Get a Baldrick spec - The AI generates a properly formatted spec
Save to specs folder - Copy output to specs/your-feature.md (Baldrick auto-discovers specs here)
Run Baldrick - ./baldrick run or ./baldrick once your-feature to run a specific spec

This bridges conversational AI (great for brainstorming) with Baldrick (great for execution).

Tip: If your feature is large, the AI will generate multiple specs. Save each one as a separate file.

Help us improve

Find plugins for your project

Help us improve

baldrick

Popularity

Health & Quality

Confidence

What's Inside

Help us improve

README

baldrick

Quick Start

What Is This?

Why Loops Work

Getting Started

For Non-Technical Users

The Development Cycle

Similar Plugins

ralph-specum

claude-pilot

ralph-dev

tandemkit

director-mode-lite

quaestor

More by jagreehal

jagreehal-claude-skills

track-and-improve

baldrick

Quick Start

What Is This?

Why Loops Work

Getting Started

For Non-Technical Users

The Development Cycle