Help us improve
Share bugs, ideas, or general feedback.
From authoring
Designs CLIs for both human users and LLM agents, covering subcommand structure, output streams, exit codes, JSON modes, TTY-aware color, and structured errors. Use when building or refactoring a CLI, adding machine-readable output, or making a tool agent-friendly.
npx claudepluginhub crouton-labs/crouton-kit --plugin authoringHow this skill is triggered — by the user, by Claude, or both
Slash command
/authoring:cli-designThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
A CLI has two readers now: the human typing at a prompt and the LLM agent invoking it from a shell tool. Most production CLIs were designed only for the first, and the failure modes diverge sharply once an agent starts calling them. The `--help` text is the agent's tool description. The output format is the response schema. Exit codes are error codes. Designed poorly, a CLI costs an agent rough...
Designs CLI surfaces including args/flags/subcommands/help/output/errors/config for new tools. Audits existing CLIs for consistency, composability, and agent ergonomics.
Provides patterns for building production CLI tools in Python with Typer/Click, featuring parseable JSON output, predictable command structure, and composability for agentic AI workflows.
Design spec with 98 rules for building CLI tools that AI agents can safely use. Covers structured JSON output, error handling, input contracts, safety guardrails, exit codes, and agent self-description.
Share bugs, ideas, or general feedback.
A CLI has two readers now: the human typing at a prompt and the LLM agent invoking it from a shell tool. Most production CLIs were designed only for the first, and the failure modes diverge sharply once an agent starts calling them. The --help text is the agent's tool description. The output format is the response schema. Exit codes are error codes. Designed poorly, a CLI costs an agent roughly 35× the tokens of a well-designed one on equivalent workflows (MindStudio, measured against MCP equivalents) — and that's before the agent burns retries on errors it can't parse.
The good news: most of the rules that make a CLI agent-friendly also make it script-friendly and pipeline-friendly. The Unix philosophy and agent-friendliness converge on the same answers. The mistakes are specific and well-cataloged.
For tool-specific patterns, principle citations, and a long-form anti-pattern catalog with GitHub issues, see reference.md.
Every output path on every command should have a default human form and a machine form available on demand. Don't pick one — design both.
Where the split shows up:
git status for humans, git status --porcelain for scripts. Human format can evolve between versions; porcelain format is a versioned contract.NO_COLOR=1; honor FORCE_COLOR=1 as the override.--yes / --non-interactive. A CLI that prompts in a pipe hangs an agent indefinitely — this is the single most common production break.The cost of getting this wrong is concrete and well-documented. A CLI that emits ANSI in pipes corrupts downstream grep. A CLI that prompts in a non-TTY hangs CI and agent workflows. A CLI that decorates JSON output with table borders is unparseable. These bug reports exist against npm, yarn, AWS CLI, Claude Code, codex — every project relearns the lesson.
stdout is data the caller wants to capture. stderr is everything else.
Progress, warnings, "fetching...", success confirmations — all stderr. The contract: sis list -j | jq should produce clean JSON even when the daemon prints "connecting to socket..." along the way. The diagnostic test: cmd > /dev/null should hide nothing the user wanted to see, and cmd 2>/dev/null should hide all decoration.
The Rule of Silence applies: when a command succeeds and has nothing useful to print, say nothing. "✓ Task completed" on stdout is decorative and breaks composition. Modern dev tools relax this for friendliness — fine, but route the friendly message to stderr.
The moment you ship --json or --porcelain, the shape is a versioned API. Renaming a field breaks every script in the wild — and every agent that learned to parse it.
schema_version field (or use a versioned flag like --porcelain=v2, the way git does). Callers can branch on it; you can evolve safely.gh pr list --json number,title is leaner than --json '*' | jq '{number,title}'. Save the caller a step and save tokens — agents pay per byte.--json mode, errors are structured too. {"error":"validation_failed","code":"INVALID_EFFORT","message":"...","valid_values":[...]}. Agents branch on the code; humans read the message.An error is feedback the caller uses to correct itself. An opaque error produces retry loops; a road sign produces recovery. Every error needs three pieces: what was received, what was expected, and what to do next.
# Useless — agent loops on the same call
Error: invalid argument
# Recoverable — exact fix is implied
Error: --effort must be one of: low, medium, high, xhigh (got: 'xmedium')
# Recoverable with a path
Error: deck file not found: ./questions.json
Pass an absolute path, or run from the directory containing the file.
Exit codes are the second channel. 0 on success, non-zero on any failure is universal. Beyond that, document any specific codes the caller should branch on: git diff exits 1 to signal "differences exist" (a useful condition, not an error). Don't invent codes unless callers will branch on them — most CLIs need only 0 and 1. If you do specialize, write them down: 2 for usage error, 5 for conflict, 127 for not-found are the conventional choices.
Past about seven top-level verbs, organize them. Three shapes work; mixing them doesn't.
tar, curl, rg, jq): fast to type, breaks down past ~10 commands. Right for a focused tool with one purpose.kubectl get pods, cargo build): predictable when the same verbs apply to many objects. The shape is exploration-friendly — knowing kubectl get predicts that kubectl describe and kubectl delete exist. For agents, this is deterministic tree search through --help.docker container create, gh pr list): better when objects have many distinct operations that don't share verbs across types. Pairs naturally with --json field,field projections.Pick one and apply it uniformly. The worst CLIs mix shapes (tool deploy --app foo next to tool config set key val next to tool users list). Be deliberate about aliases — npm install, npm add, npm i look helpful in isolation but force scripts and agents to remember three names for one operation. Pick the canonical name and document the others as aliases, not equals.
The standard precedence is flags > env vars > project config > user config > defaults. Flags always win, so scripts can inject overrides without touching state.
A few rules with sharp edges:
- as a positional argument means "read from stdin"; pair it with an explicit --stdin flag when ambiguity matters.ps aux reveals them; shell history saves them. Use --password-file, stdin, or a credentials file.NO_COLOR, FORCE_COLOR, PAGER, EDITOR, XDG_*, TMPDIR. Don't reinvent these.cmd-a writes to ~/.cache/tool/state and cmd-b silently depends on it, agents calling cmd-b in a fresh shell will fail. Pass state explicitly.For destructive or expensive operations, separate preview from execution. The pattern compounds: it makes agents safer (they can plan before acting) and humans saner (no surprise commits).
--dry-run for "show what would happen." Should be cheap, side-effect-free, and structurally identical to the real run.terraform plan -out=file saves a plan; terraform apply file executes exactly that plan. No drift between preview and execute.--yes or fail. Never assume.apply -f config.yaml twice should converge to the same state as once. This is more valuable than perfect rollback.For agents, --help is the only documentation that reliably gets read. Treat it the way you would a function description in a tool schema.
input_examples in tool schemas.{id, name, status} objects."sis status for live state." Helps both humans and agents discover the right command.cmd --help and cmd help <sub>. Both are common patterns and cheap to support.Each of these is a real production bug, repeatedly:
NO_COLOR.--yes.--json output. Keep machine modes strictly plain.schema_version; deprecate over releases.less auto-launches when output is captured. Disable when stdout isn't a TTY.--debug.cd or env from command A. Pass state explicitly.ps and shell history. Use stdin or files.For each anti-pattern with the GitHub issues that filed them and the canonical fix, see reference.md.