From harness-engineer
Scaffold a complete harness for any new project or repo so AI coding agents can work effectively across multiple sessions. Use whenever starting a new project with Claude Code, Codex, or any AI agent, or when an existing project has no harness and agents keep losing context between sessions. Creates: AGENTS.md table-of-contents, docs/ knowledge base, features.json with all features pre-marked failing, init.sh dev server startup script, claude-progress.txt handoff file, current_tasks/ multi-agent lock directory, layers.json architectural dependency graph, and wires the circuit breaker hooks. Trigger on: "set up harness", "scaffold this project", "init harness", "new project", "agent keeps forgetting context", "set up for agent-first development".
How this skill is triggered — by the user, by Claude, or both
Slash command
/harness-engineer:harness-initThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Scaffolds a production-ready harness for agent-first development. Run once per project.
Scaffolds a production-ready harness for agent-first development. Run once per project. Subsequent agent sessions use harness-onboard to orient themselves from the scaffold.
project-root/
├── AGENTS.md ← 100-line TOC pointing into docs/
├── docs/
│ ├── architecture.md ← Domain map + package layering
│ ├── design/ ← Feature design docs
│ ├── plans/ ← Versioned execution plans (one per complex feature)
│ ├── history/ ← Archived progress notes (GC moves old entries here)
│ ├── quality.md ← Per-domain quality grades (updated each GC run)
│ └── beliefs.md ← Core agent-first operating principles
├── features.json ← All features, initially passes=false
├── claude-progress.txt ← Handoff log between sessions
├── init.sh ← Start dev server + smoke test
├── current_tasks/ ← Multi-agent task locks
├── layers.json ← Architectural dependency constraints
└── .harness/
├── hooks/ ← Circuit breaker scripts
├── config.json ← Dev server port, thresholds
└── state/ ← Runtime state (gitignored)
Before generating any files, ask the agent (or read from existing repo):
.harness/config.json model)browser_automation.enabled=true if yes)If run on an existing repo: run find . -type f | head -50 and cat package.json (or equivalent) to infer stack answers automatically. Do not ask if you can infer.
Keep it under 100 lines. It is a MAP, not an encyclopedia.
Use the templates/AGENTS.md.template as the base. Key sections to fill in:
The example in the template is the canonical form. Do not simplify it.
Expand the user's prompt into a comprehensive feature list. For a web app, aim for 50–200 features. Each feature must be testable end-to-end by a human executing the steps in a real browser — not by reading code.
Use the templates/features.json.template as the base. The _verification_contract field is mandatory and must not be removed — it is the slop-prevention mechanism.
Feature schema:
{
"id": "feat-001",
"category": "functional",
"priority": 1,
"description": "User can open the app and see the home screen",
"steps": [
"Navigate to localhost:[port]",
"Verify home screen loads without errors",
"Check that primary navigation is visible"
],
"passes": false,
"in_progress": false,
"circuit_broken": false,
"break_reason": null
}
Step-writing rules:
Agent field permissions (embed in _instructions):
passes, in_progress, circuit_broken, break_reason: agent-writable#!/usr/bin/env bash
# Harness init script — run at the start of every agent session
set -e
echo "=== HARNESS: Starting dev environment ==="
# 1. Install dependencies if needed
[[ -f package.json ]] && npm install --silent
[[ -f requirements.txt ]] && pip install -r requirements.txt -q
# 2. Start dev server in background (customize per stack)
npm run dev &> .harness/state/dev-server.log &
DEV_PID=$!
echo $DEV_PID > .harness/state/dev-server.pid
# 3. Wait for server to be ready
echo "Waiting for server on port [PORT]..."
for i in {1..30}; do
curl -s http://localhost:[PORT] > /dev/null 2>&1 && break
sleep 1
done
# 4. Smoke test — verify baseline still works
echo "Running smoke test..."
curl -sf http://localhost:[PORT] > /dev/null || { echo "ERROR: App not responding"; exit 1; }
echo "=== HARNESS: Environment ready ==="
echo "Dev server: http://localhost:[PORT] (PID: $DEV_PID)"
Customize PORT and start command from the detected stack.
Default web app layer structure (customize as needed):
{
"layers": [
{ "name": "types", "order": 1, "description": "Type definitions, interfaces, constants" },
{ "name": "config", "order": 2, "description": "Configuration, environment variables" },
{ "name": "repo", "order": 3, "description": "Data access, database queries" },
{ "name": "service", "order": 4, "description": "Business logic, domain operations" },
{ "name": "runtime", "order": 5, "description": "API routes, controllers, middleware" },
{ "name": "ui", "order": 6, "description": "Frontend components, pages" }
],
"rule": "Each layer may only import from layers with lower order numbers",
"enforcement": ".harness/scripts/check-layers.sh"
}
Create these files at scaffold time:
docs/plans/README.md — explains the plans directory:
# Execution Plans
Versioned work artifacts for complex features. One plan per feature that requires
more than one agent session or has non-obvious architectural decisions.
**When to create a plan:** Any feature that touches 3+ files, requires a decision
between competing approaches, or is likely to span multiple sessions.
**Template:** See `.harness/../templates/execution-plan.md.template`
**Naming:** `plan-[feature-id].md` (e.g., `plan-feat-042.md`)
Plans are the memory of WHY decisions were made. Future agents read them.
Any decision made in your head that isn't here doesn't exist.
docs/quality.md — structured grading template (see Step 8 below)
docs/beliefs.md — core operating principles for this specific project:
# Agent Operating Beliefs
These are the core principles that govern how agents work on this project.
Updated as the project evolves — add new beliefs when patterns emerge.
1. **Repo is the world.** Nothing outside this repository exists for agents.
2. **One feature, one commit.** Atomicity prevents cascading failures.
3. **Test behavior, not code.** The steps array is the spec. Execute it.
4. **Failure = signal.** Every agent mistake points to a missing harness component.
5. **Clean state is non-negotiable.** Never leave the codebase in a state you wouldn't merge.
# Quality Grades
Updated by harness-gc on each run. Tracks health per domain and layer.
Scale: A (excellent) → B (good) → C (needs work) → D (broken) → F (critical)
| Domain/Layer | Grade | Last Updated | Notes |
|---|---|---|---|
| types | — | [date] | Not yet assessed |
| config | — | [date] | Not yet assessed |
| repo | — | [date] | Not yet assessed |
| service | — | [date] | Not yet assessed |
| runtime | — | [date] | Not yet assessed |
| ui | — | [date] | Not yet assessed |
## Assessment Criteria
- **A**: All features in domain passing, no layer violations, clean test coverage
- **B**: Minor issues, no critical failures, 1–2 tech debt items
- **C**: Some features failing or untested, architectural concerns present
- **D**: Multiple failures, layer violations, agent struggles to work in this area
- **F**: Domain is broken, blocks other work, requires human intervention
## Open Issues by Domain
[Updated by harness-gc — leave empty until first GC run]
Copy hooks from the plugin and run install.sh to wire .claude/settings.json.
See ../../hooks/ and ../../install.sh.
git add AGENTS.md features.json init.sh claude-progress.txt layers.json docs/ current_tasks/ .harness/
git commit -m "harness: initial scaffold via harness-engineer plugin"
After init, tell the user:
.harness/config.jsonThen stop. Your role as initializer is complete.
The next agent session is a coding agent — it will pick up features via harness-onboard. Do NOT begin implementing features in this session. Do NOT write application code. Commit the scaffold and hand off.
First command for the user to run next: bash init.sh to verify the environment is wired correctly.
npx claudepluginhub lauraflorentin/skills-marketplace --plugin harness-engineerCreates bite-sized, testable implementation plans from specs or requirements, with file structure and task decomposition. Activates before coding multi-step tasks.