From claude-mods
Spawns cheap headless Claude Code workers on non-Anthropic models (GLM via z.ai) for parallel, tool-using subtasks. Pairs with fleet-ops for test-gated landing.
How this skill is triggered — by the user, by Claude, or both
Slash command
/claude-mods:fleet-worker
When to use
Use when you have independent, well-scoped, tool-using subtasks (refactors, test-writing, doc edits, mechanical multi-file changes) that don't need Opus-level judgment, and you want them done cheaply in parallel while the orchestrator reviews and gates the results before they land. Not for tasks needing the orchestrator's conversation context or expensive-if-wrong unreviewed changes.
This skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Run a cheap headless Claude Code worker on a non-Anthropic model and let an Opus orchestrator (this session) fan workers out in parallel, then verify and land their work. The worker keeps Claude Code's entire tool harness (Read/Write/Edit/Bash/Glob/Grep/Task/MCP/hooks) — only the brain is swapped to a cheaper model via env. GLM-5.2 on z.ai is the default worked example; the mechanism is provider-agnostic (any Anthropic-compatible endpoint).
This is the spawning layer. fleet-ops is the landing layer.
fleet-worker produces branches cheaply; fleet-ops lands them through a test gate
with your review. See references/fleet-ops-handoff.md.
ANTHROPIC_BASE_URL and the ANTHROPIC_DEFAULT_*_MODEL mapping vars are
process-global — read once per claude process, applied to every model
call it makes (including in-process Task subagents). There is no per-agent
override. So you cannot keep one Opus session and have its subagents secretly
run on GLM. The only way to pair a GLM-brained agent with an Opus orchestrator is
a separate OS process with its own env block. That process is fleet-worker.
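A minimal sketch of the "separate OS process with its own env block" idea, using plain env rather than the real launcher. spawn_with_env is a hypothetical helper name, and the sh -c child only stands in for a claude process. It illustrates that the variable exists in the child and is never exported into the parent shell.

```shell
# spawn_with_env URL CMD [ARGS...]  (hypothetical helper, not part of fleet-worker)
# Runs CMD as a child process whose environment carries ANTHROPIC_BASE_URL=URL.
# The calling shell's own environment is left untouched.
spawn_with_env() {
  local url="$1"
  shift
  env ANTHROPIC_BASE_URL="$url" "$@"
}

# Stand-in child: a real worker would be a `claude` process instead of `sh -c`.
spawn_with_env "https://api.z.ai/api/anthropic" \
  sh -c 'echo "child sees: $ANTHROPIC_BASE_URL"'
```

Because the override lives only in the child's environment, an orchestrator session started from the same shell keeps talking to whatever endpoint it was already using.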
On any machine also logged into a Claude.ai/Anthropic subscription, the naïve
"just set ANTHROPIC_AUTH_TOKEN" launcher fails with 401 token expired or incorrect — the host's stored subscription OAuth token (~/.claude.json
oauthAccount + forceLoginMethod) takes precedence and gets sent to the
non-Anthropic endpoint, which rejects it. --settings overrides do not fix
it. The fix is a dedicated, empty config dir:
export CLAUDE_CONFIG_DIR="$HOME/.fleet-worker/cfg" # no inherited OAuth/hooks
The launcher sets this automatically. It also gives each worker a clean hook/permission/MCP profile so it can't trip the host's hooks. Full analysis in references/fleet-worker-spec.md §4.
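As a rough illustration of the dedicated config dir, the helper below creates one empty directory per worker and prints its path. worker_cfg_dir and the FLEET_CFG_ROOT override are hypothetical names introduced for this sketch; the real launcher may lay things out differently.

```shell
# worker_cfg_dir ID  (hypothetical helper, not part of fleet-worker)
# Creates, if needed, a per-worker config dir and prints its path.
# FLEET_CFG_ROOT is an assumption of this sketch so the root can be relocated.
worker_cfg_dir() {
  local root="${FLEET_CFG_ROOT:-$HOME/.fleet-worker}"
  local dir="$root/cfg-$1"
  mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Possible use when launching claude by hand instead of via the launcher:
#   CLAUDE_CONFIG_DIR="$(worker_cfg_dir task-a)" claude -p "..."
```

A freshly created directory contains no stored OAuth account, hooks, or MCP config, which is the property the fix relies on.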
After scripts/install.sh, the scripts live at ~/.claude/skills/fleet-worker/scripts/. Either call them by that path, or symlink them onto PATH for convenience:
ln -s ~/.claude/skills/fleet-worker/scripts/fleet-worker ~/.local/bin/fleet-worker
ln -s ~/.claude/skills/fleet-worker/scripts/fleet-collect.sh ~/.local/bin/fleet-collect.sh
- export ANTHROPIC_AUTH_TOKEN=<key>, or
- export FLEET_WORKER_KEYRING_SERVICE=<svc> FLEET_WORKER_KEYRING_KEY=<name> (uses keyring get), or
- export ZHIPU_API_KEY=<key> (or GLM_API_KEY).
bash scripts/fleet-doctor.sh --offline (structural) or --live (pings the endpoint; warns about the §4 oauth trap).
| Var | Default | Purpose |
|---|---|---|
| FLEET_WORKER_BASE_URL | https://api.z.ai/api/anthropic | Anthropic-compatible endpoint |
| FLEET_WORKER_MODEL | GLM-5.2 | main model (opus+sonnet mapping) |
| FLEET_WORKER_SMALL_MODEL | GLM-4.5-Air | background/cheap model (haiku mapping) |
| FLEET_WORKER_CONFIG_DIR | ~/.fleet-worker/cfg | isolated config dir, one per parallel worker |
| FLEET_WORKER_EFFORT | high | seeded effortLevel in the worker's settings |
Point FLEET_WORKER_BASE_URL/FLEET_WORKER_MODEL at any other Anthropic-compatible
gateway (this is the documented Claude Code custom-endpoint mechanism) to drive a
different cheap model.
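The table's defaults and mappings could resolve roughly as follows. This is a sketch of one plausible mapping, not the launcher's actual code: print_worker_env is a hypothetical function, and the assumption is that the "opus+sonnet" and "haiku" mappings correspond to Claude Code's ANTHROPIC_DEFAULT_OPUS_MODEL, ANTHROPIC_DEFAULT_SONNET_MODEL and ANTHROPIC_DEFAULT_HAIKU_MODEL variables.

```shell
# print_worker_env  (hypothetical; shows how FLEET_WORKER_* might map to ANTHROPIC_*)
# Falls back to the documented defaults when a FLEET_WORKER_* var is unset.
print_worker_env() {
  local base="${FLEET_WORKER_BASE_URL:-https://api.z.ai/api/anthropic}"
  local main="${FLEET_WORKER_MODEL:-GLM-5.2}"
  local small="${FLEET_WORKER_SMALL_MODEL:-GLM-4.5-Air}"
  printf 'ANTHROPIC_BASE_URL=%s\n' "$base"
  # Assumed mapping: the main model backs both the opus and sonnet slots.
  printf 'ANTHROPIC_DEFAULT_OPUS_MODEL=%s\n' "$main"
  printf 'ANTHROPIC_DEFAULT_SONNET_MODEL=%s\n' "$main"
  # Assumed mapping: the small model backs the haiku (background) slot.
  printf 'ANTHROPIC_DEFAULT_HAIKU_MODEL=%s\n' "$small"
}
```

Setting FLEET_WORKER_BASE_URL and FLEET_WORKER_MODEL before calling it changes the first three lines of output and leaves the haiku slot on its default.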
| Delegate to a worker | Keep on the orchestrator |
|---|---|
| Independent, well-scoped, tool-using subtasks | Tasks needing this conversation's context |
| Refactors, test-writing, doc edits, mechanical multi-file changes | Judgment calls, architecture, ambiguous specs |
| Work where Opus-quality isn't required and a wrong edit is cheap to discard | Anything expensive-if-wrong and unreviewed |
The safety comes from the cage, not the model: isolated worktree (blast radius), isolated config dir (no host creds/hooks), and the orchestrator's merge gate (nothing lands without review).
cd <target-worktree>
fleet-worker --output-format json "Refactor src/parser.py to use the visitor pattern" \
> result.json
fleet-collect.sh result.json && echo "succeeded — review the diff"
fleet-collect.sh gates on is_error (the real success signal — subtype lies)
and prints the worker's final text. Exit 0 = success, 10 = worker failed.
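The two documented exit codes lend themselves to a small dispatch on the orchestrator side. The sketch below is illustrative only: next_step_for is a hypothetical name, and treating any other exit code as "investigate" is an assumption, since the source documents only 0 and 10.

```shell
# next_step_for STATUS  (hypothetical helper)
# Maps a fleet-collect.sh exit status to what the orchestrator does next.
next_step_for() {
  case "$1" in
    0)  echo "review" ;;       # worker succeeded: review the diff
    10) echo "discard" ;;      # worker failed: drop or retry the task
    *)  echo "investigate" ;;  # assumption: anything else is an unexpected error
  esac
}

# Possible use:
#   fleet-collect.sh result.json >/dev/null; next_step_for "$?"
```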
Each task gets its own git worktree + branch and its own config dir so
N workers never clobber each other. Spawn from the orchestrator's Bash tool with
run_in_background: true, then collect by output file.
delegate() { # $1 = task-id, $2 = prompt
  local id="$1" prompt="$2" wt=".fleet-work/$1"
  # One worktree + branch per task, so workers never share a checkout.
  git worktree add -q -b "fleet/$id" "$wt" HEAD || return 1
  ( cd "$wt" || exit 1
    # Per-worker config dir; result and stderr land in .fleet-work/, next to the worktree.
    FLEET_WORKER_CONFIG_DIR="$HOME/.fleet-worker/cfg-$id" \
      fleet-worker --output-format json "$prompt" > "../$id.result.json" 2> "../$id.err"
  )
}
delegate task-a "Add tests for the auth module" &
delegate task-b "Update the README install section" &
delegate task-c "Refactor utils.py duplications" &
wait # barrier
for id in task-a task-b task-c; do
  if fleet-collect.sh ".fleet-work/$id.result.json" >/dev/null; then
    echo "fleet/$id OK"
  else
    echo "fleet/$id FAILED (see .fleet-work/$id.err)" >&2
  fi
done
Keep concurrency modest (≤ 4–6) — the binding constraint is endpoint quota, not
local CPU. .gitignore the scratch dirs (.fleet-work/, .fleet-worker/).
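One simple way to respect the concurrency ceiling is to launch tasks in batches and wait between them. The sketch below assumes a hypothetical run_limited helper and a one-argument task command; it is a batching approach, not a true rolling pool, so a slow task holds up its whole batch.

```shell
# run_limited MAX CMD ARG...  (hypothetical helper)
# Runs CMD once per ARG in the background, at most MAX at a time.
# Batching: after MAX launches it waits for the whole batch before continuing.
run_limited() {
  local max="$1" cmd="$2"
  shift 2
  local running=0 arg
  for arg in "$@"; do
    "$cmd" "$arg" &
    running=$((running + 1))
    if [ "$running" -ge "$max" ]; then
      wait        # barrier for the current batch
      running=0
    fi
  done
  wait            # barrier for the final, possibly partial, batch
}

# Possible use, where run_task is a wrapper that looks up the prompt for an id:
#   run_limited 4 run_task task-a task-b task-c task-d task-e
```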
The winning branches are ordinary git branches — land them with the sibling skill instead of merging by hand:
fleet track fleet/task-a fleet/task-b fleet/task-c # register as lanes
fleet land fleet/task-a # sequential, test-gated, you review each diff
Full walkthrough + recovery in references/fleet-ops-handoff.md.
Headless -p can't answer a permission prompt — it would stall. The launcher
bakes in --permission-mode bypassPermissions; safety comes from the cage
(worktree + isolated config + merge gate), not the prompt. Optionally constrain
with --disallowedTools (e.g. block WebFetch) or --add-dir scoping.
Worktree-under-.claude/ gotcha: Claude Code's sensitive-file guard runs before bypassPermissions for anything under .claude/. Keep manual worker worktrees at the repo top (e.g. .fleet-work/), not under .claude/.
FLEET_WORKER_SMALL_MODEL.
--max-turns N and an orchestrator-side wall-clock timeout per worker. Collect via background + notification; never block.
total_cost_usd is Claude Code's internal pricing table applied to a model it doesn't know. Ignore it; account by usage.*_tokens and your provider's plan.
The key is pulled at spawn time into a process-local env var, never written to the script, args (ps-safe), or logs. The isolated config dir keeps worker creds/session separate from the host, and the worker can't read the host's subscription credentials. Avoid --debug in shared logs (may print headers).
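The wall-clock timeout mentioned above can be sketched with the coreutils timeout command. run_with_deadline is a hypothetical wrapper, and the usage line assumes the launcher forwards --max-turns to Claude Code; timeout itself may be missing on systems without GNU coreutils.

```shell
# run_with_deadline SECONDS CMD [ARGS...]  (hypothetical wrapper)
# Runs CMD under coreutils `timeout`. Exit status 124 means the deadline
# was hit and the command was terminated; otherwise CMD's own status is returned.
run_with_deadline() {
  local secs="$1"
  shift
  timeout "$secs" "$@"
}

# Possible use (assumes the launcher passes --max-turns through):
#   run_with_deadline 900 fleet-worker --max-turns 30 --output-format json "$prompt"
```

The turn cap bounds how much work the model attempts, while the deadline bounds how long the orchestrator waits for it.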
Using Claude Code with a custom ANTHROPIC_BASE_URL is a documented feature,
and the worker's inference never touches Anthropic's API/subscription. But terms
change and vary by plan — verify both your Anthropic terms and your model
provider's terms for your own use. Two specifics worth knowing:
- scripts/fleet-worker / scripts/fleet-worker.ps1: the launcher (bash + PowerShell). fleet-worker --help for the full env/flag contract.
- scripts/fleet-collect.sh: gate a --output-format json result; exit 0 success / 10 worker-failed; prints the final text. fleet-collect.sh --help.
- scripts/fleet-doctor.sh: --offline structural preflight + doc-consistency (CI-safe); --live pings the endpoint to confirm the model still resolves and flags the §4 oauth trap. fleet-doctor.sh --help.
- fleet track → fleet land walkthrough and recovery.
- settings.json the launcher drops into a fresh config dir (effortLevel: high).
npx claudepluginhub 0xdarkmatter/claude-mods --plugin claude-mods