Dual-track workflow plugin for Claude Code: surgical fixes and spec-driven features with TDD, context budgets, and micro-task decomposition.
Install via:

```
npx claudepluginhub emingenc/harness-engineering
```

Dual-track workflow plugin for Claude Code: Track 1 (surgical fixes) and Track 2 (spec-driven features). Enforces TDD, context budgets, PTC scripts, and micro-task decomposition via the MACHINE framework.
A Claude Code plugin that keeps AI output quality high by keeping context clean.
Stop one-shotting entire apps. Start engineering the harness.
You've seen it happen. Claude starts strong — clean code, sharp reasoning — then 40 minutes in, it loses the thread. Repeats itself. Forgets decisions it made 10 messages ago. Hallucinates file states. The code quality drops off a cliff.
This isn't a model failure. It's context rot.
```
┌─────────────────────────┐
│                         │
│    Context Rot Zone     │ ← Quality degrades here.
│  ·····················  │   The model is "drunk"
│  ·····················  │   on its own noise.
│                         │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤ ← ~50% utilization threshold
│                         │
│                         │
│      Quality Zone       │ ← Sharp, coherent output.
│                         │   This is where you want
│                         │   to stay.
│                         │
└─────────────────────────┘
       Context Window
```
Research from both Anthropic and OpenAI confirms it: past ~40-50% context utilization, model performance degrades. The bigger the task, the faster you hit the rot zone. That's why "just asking Claude to build the whole thing" doesn't scale.
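As a back-of-the-envelope illustration of that threshold (the window size, token heuristic, and function names here are assumptions for the sketch, not the plugin's actual accounting), a harness can watch utilization and compact before entering the rot zone:

```python
# Sketch: keep context below the ~50% "rot" threshold.
# CONTEXT_WINDOW and the 4-chars-per-token heuristic are rough
# assumptions; a real harness would use the model's tokenizer.

CONTEXT_WINDOW = 200_000   # assumed window size, in tokens
ROT_THRESHOLD = 0.50       # quality degrades past ~40-50% utilization

def estimate_tokens(text: str) -> int:
    """Cheap heuristic: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def utilization(messages: list[str]) -> float:
    """Fraction of the context window currently in use."""
    return sum(estimate_tokens(m) for m in messages) / CONTEXT_WINDOW

def should_compact(messages: list[str]) -> bool:
    """True once the conversation crosses the rot threshold."""
    return utilization(messages) >= ROT_THRESHOLD

history = ["x" * 4000] * 150   # ~150k tokens of conversation
print(should_compact(history))  # → True (0.75 utilization)
```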
The LLM is a brain. Its "IQ" — the quality of its output — depends entirely on what's in its context window.
```mermaid
%%{init: {"theme": "base", "themeVariables": {"primaryTextColor": "#111827", "clusterTextColor": "#111827", "clusterBkg": "#f8fafc", "clusterBorder": "#e2e8f0", "lineColor": "#94a3b8", "fontFamily": "sans-serif"}}}%%
graph TD
    classDef default fill:#ffffff,stroke:#94a3b8,stroke-width:2px,color:#0f172a,rx:4,ry:4;
    subgraph "LLM — The Brain"
        IQ["Output Quality<br/><i>what you actually get</i>"]
    end
    W["1. Weights<br/><small>Training data — fixed</small>"] --> IQ
    P["2. Prompt & History<br/><small>Your instructions + conversation</small>"] --> IQ
    D["3. Dynamic Sources<br/><small>RAG, MCP servers, tools, files</small>"] --> IQ
    style IQ fill:#2d7d46,stroke:#1a5c30,color:#fff
    style W fill:#4a4a4a,stroke:#333,color:#fff
    style P fill:#2563eb,stroke:#1e4fba,color:#fff
    style D fill:#7c3aed,stroke:#5b21b6,color:#fff
```
You can't change the weights. But you can engineer what goes into the prompt, history, and dynamic context. That's what this plugin does.
All we're trying to do is optimize context to maximize output quality.
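Concretely, "optimizing context" means controlling what gets concatenated into each request. A minimal, hypothetical sketch of the three controllable inputs from the diagram above (names and structure are illustrative, not the plugin's API):

```python
# Sketch: assemble a request from the two inputs you control —
# prompt/history and dynamic sources. Weights are fixed.
# All names here are illustrative, not the plugin's actual API.

def build_context(system_prompt: str,
                  history: list[str],
                  dynamic: list[str],
                  keep_last: int = 6) -> str:
    """Assemble a request context, trimming stale history turns."""
    recent = history[-keep_last:]   # keep only the newest turns
    sources = "\n".join(dynamic)    # RAG / tool / file snippets
    return "\n\n".join([system_prompt, sources, *recent])

ctx = build_context(
    "You are a senior engineer. Write tests first.",
    history=[f"turn {i}" for i in range(20)],
    dynamic=["<file: utils.py — 12 lines>"],
)
print(ctx.count("turn"))  # → 6 (14 stale turns trimmed)
```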
This isn't a new idea — it's the natural evolution of how we work with LLMs:
```mermaid
%%{init: {"theme": "base", "themeVariables": {"primaryTextColor": "#111827", "clusterTextColor": "#111827", "clusterBkg": "#f8fafc", "clusterBorder": "#e2e8f0", "lineColor": "#94a3b8", "fontFamily": "sans-serif"}}}%%
graph LR
    classDef default fill:#ffffff,stroke:#94a3b8,stroke-width:2px,color:#0f172a,rx:4,ry:4;
    subgraph HE["Harness Engineering"]
        subgraph CE["Context Engineering"]
            subgraph PE["Prompt Engineering"]
                pe_desc["Craft better prompts<br/><small>roles, examples, formatting</small>"]
            end
            ce_desc["Manage what enters the<br/>context window<br/><small>RAG, tools, trimming, MCP</small>"]
        end
        he_desc["Orchestrate the full<br/>development lifecycle<br/><small>tasks, TDD, state, hooks</small>"]
    end
    style PE fill:#dbeafe,stroke:#2563eb,color:#1e3a5f
    style CE fill:#ede9fe,stroke:#7c3aed,color:#3b1d6e
    style HE fill:#fef3c7,stroke:#d97706,color:#78350f
```
| Discipline | What it optimizes | Example |
|---|---|---|
| Prompt Engineering | The instruction itself | "You are a senior engineer. Write tests first." |
| Context Engineering | What's in the window | PTC scripts return 50 tokens instead of 2000. Sub-agents get fresh context. |
| Harness Engineering | The entire workflow | Track routing, TDD gates, micro-task decomposition, state recovery across sessions. |
Each layer contains the previous. Prompt engineering alone can't save you from context rot. Context engineering alone can't enforce TDD. You need the full harness.
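To make the context-engineering row in the table concrete: instead of letting the model read a full test log, a PTC-style helper can run the command and return only a verdict. This is a hedged sketch of the idea, not the plugin's actual scripts (the summarization rules are invented for illustration):

```python
# Sketch: condense verbose tool output into a few tokens before
# it enters the context window. Illustrative only — not the
# plugin's actual PTC scripts.
import subprocess
import sys

def run_tests_compact(cmd: list[str], max_fail_lines: int = 5) -> str:
    """Run a test command; return a short verdict, not the full log."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode == 0:
        return "PASS"
    # On failure, surface only the first few failure lines.
    lines = (proc.stdout + proc.stderr).splitlines()
    failures = [ln for ln in lines if "FAIL" in ln or "Error" in ln]
    return "FAIL:\n" + "\n".join(failures[:max_fail_lines])

print(run_tests_compact([sys.executable, "-c", "print('ok')"]))  # → PASS
```

The model sees "PASS" (one token) instead of the full runner output, keeping the window in the quality zone.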
%%{init: {"theme": "base", "themeVariables": {"primaryTextColor": "#111827", "clusterTextColor": "#111827", "clusterBkg": "#f8fafc", "clusterBorder": "#e2e8f0", "lineColor": "#94a3b8", "fontFamily": "sans-serif"}}}%%
graph TB
classDef default fill:#ffffff,stroke:#94a3b8,stroke-width:2px,color:#0f172a,rx:4,ry:4;
subgraph PE_COL["Prompt Engineering"]
direction TB
POUR["Pour water<br/><small>craft tokens</small>"]
BOTTLE1["🫙 One bottle"]
end