# designdoc
Harness-engineered codebase documentation pipeline.
Walks any repo bottom-up and emits a validated `docs/design/` tree: per-class docs, package rollups, Mermaid diagrams (syntax + semantics validated), a system-design rollup, a tech-debt ledger, and a YAML file of unresolved human-in-the-loop disputes.
## Install

### As a CLI

Install the `designdoc` command into a uv-managed tool environment:

```sh
uv tool install git+https://github.com/SpillwaveSolutions/docgen
designdoc generate --repo /path/to/your/repo --budget 5.00
```

(Alternatively, `pipx install git+https://github.com/SpillwaveSolutions/docgen` if you prefer pipx. Once published to PyPI, `pip install designdoc` will also work.)
### As a Claude Code plugin

```sh
claude plugin marketplace add SpillwaveSolutions/docgen
claude plugin install designdoc
```

Adds a `/designdoc` slash command (generate | resume | status | resolve).
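For example, assuming the slash command mirrors the CLI flags (an assumption; the plugin may expose different options), a generate run inside a Claude Code session might look like:

```
/designdoc generate --repo . --budget 5.00
```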
From a clone (development)
git clone https://github.com/SpillwaveSolutions/docgen
cd docgen
uv sync
uv run designdoc generate --repo /path/to/your/repo --budget 5.00
Output lands in <repo>/docs/design/.
## Status

v1 complete; v1.1 incremental regeneration landed. All nine pipeline stages are implemented, tested, and measured end-to-end. See `plans/2026_04_16_designdoc_gen_v1.md` for the task plan.
## Measured performance

Against the `tests/fixtures/tiny_repo` fixture (5 Python files, 3 classes, 1 dep) on claude-sonnet-4-6 via a Claude Max subscription:
| Run | Wall clock | Cost (SDK-reported) | LLM invocations |
|---|---|---|---|
| Cold (first run, parallelism=3) | ~16 min | ~$3.98 | 58 |
| Cold (parallelism=1 baseline) | ~26 min | ~$4.57 | 60 |
| Warm (no source changes) | < 1 sec | $0.00 | 0 |
Two v1.1 optimizations combine here:
- Parallelism: `config.parallelism` (default 3) caps concurrent doer/checker invocations in Stages 2/3/4/6 via `asyncio.Semaphore`. Cold-run wall clock drops ~37% on tiny_repo; bigger repos with more files will see larger gains. Tune via `--parallelism N` or `[pipeline].parallelism` in `.designdoc.toml`.
- Incremental: the warm run skips every stage via content-hash comparison against `prev_hashes` / `rollup_hashes` in the pipeline state. Any single-file edit regenerates only that file's class doc, its package rollup, and the system rollup, not the whole tree.
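The bounded-parallelism pattern can be sketched with `asyncio.Semaphore` (an illustrative sketch, not the pipeline's code; `generate_doc` is a hypothetical stand-in for one doer/checker invocation):

```python
import asyncio

async def generate_doc(path: str) -> str:
    # Hypothetical stand-in for one doer/checker LLM invocation.
    await asyncio.sleep(0.01)
    return f"doc for {path}"

async def run_stage(paths: list[str], parallelism: int = 3) -> list[str]:
    # At most `parallelism` invocations are in flight at any moment.
    sem = asyncio.Semaphore(parallelism)

    async def bounded(path: str) -> str:
        async with sem:
            return await generate_doc(path)

    # gather preserves input order regardless of completion order.
    return await asyncio.gather(*(bounded(p) for p in paths))

docs = asyncio.run(run_stage(["a.py", "b.py", "c.py", "d.py"]))
```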
Run `designdoc status` to see which caches are primed. Reproduce with `task test-e2e` (requires the `claude` CLI logged in and `npx` on PATH).
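The incremental skip can be sketched as a plain content-hash comparison (a minimal illustration; the actual shape of `prev_hashes` in the pipeline state may differ):

```python
import hashlib

def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()

def stale_files(sources: dict[str, str], prev_hashes: dict[str, str]) -> list[str]:
    # Only files whose content hash changed need their docs regenerated;
    # an unchanged repo yields an empty list, so every stage is skipped.
    return [
        path for path, text in sources.items()
        if prev_hashes.get(path) != content_hash(text)
    ]

prev = {"a.py": content_hash("x = 1"), "b.py": content_hash("y = 2")}
changed = stale_files({"a.py": "x = 1", "b.py": "y = 3"}, prev)
# Only b.py changed, so only its class doc and the rollups above it regenerate.
```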
## Design principles (Gen 3 harness engineering)
- Control flow lives in Python, not prompts.
- Checkers run in their own context window (no self-grading).
- Scopes are small and bounded (file → class → package → system).
- Failures are loud (schema-validated verdicts, HIL YAML on dispute).
- Reliability over speed (`max_attempts=3`, bounded parallelism).
- Mermaid is syntax + semantics validated before shipping.
See CLAUDE.md / AGENT.md for the full invariants.
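Taken together, the doer/checker loop behind these principles can be sketched as follows (illustrative only; the pipeline's real verdict schema and HIL YAML escalation are richer than this):

```python
def run_with_checker(doer, checker, max_attempts: int = 3):
    # The doer produces an artifact; a separate checker returns a verdict.
    # After max_attempts failed verdicts the dispute is raised loudly
    # (the real pipeline records it in a HIL YAML file instead).
    last_reason = None
    for attempt in range(1, max_attempts + 1):
        artifact = doer(attempt)
        ok, reason = checker(artifact)
        if ok:
            return artifact
        last_reason = reason
    raise RuntimeError(f"unresolved after {max_attempts} attempts: {last_reason}")

# Toy doer/checker pair: the checker accepts the second draft.
result = run_with_checker(
    doer=lambda n: f"draft-{n}",
    checker=lambda a: (a == "draft-2", "not draft-2"),
)
```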
## Development

### Prerequisites
- Python 3.12+ (dev machine runs 3.13)
- uv for env management
- Task for running commands
- `@mermaid-js/mermaid-cli` via `npx` (auto-fetched at Stage 5 preflight)
- `claude` CLI (Claude Code) logged in to a Pro/Max subscription, used by the e2e / dogfood runs. No `ANTHROPIC_API_KEY` required.
### Commands

```sh
task install    # uv sync — install deps
task test       # unit + integration, no real API
task test-unit  # unit tests only
task test-e2e   # e2e tests (requires claude CLI login + mmdc)
task lint       # ruff check
task format     # ruff format
task ci         # exactly what CI runs — must be green before push
task dogfood    # real pipeline run against tests/fixtures/tiny_repo
```
Run a single test:
```sh
uv run pytest tests/unit/test_loop.py::test_ships_with_hil_after_3_fails -v
```
### Test-and-commit discipline
Every change follows TWRC: write the test, write the code, run `task ci`, commit.

CI parity: `task ci` must run the exact same commands as `.github/workflows/test.yml`. If you change one, change the other in the same commit. Every commit is a green checkpoint.
## Layout