Skill

Harness Engineering: From Claude Code to AI Coding

Reads, navigates, and applies insights from Harness Engineering book analyzing Claude Code architecture, agent loops, prompt engineering, context management, and AI coding best practices.

Anthropic

Rust

Github

ai-ml

Install

npx claudepluginhub joshuarweaver/cascade-ai-ml-agents-misc-1 --plugin aradotso-trending-skills-37

Tool Access

This skill uses the workspace's default tool permissions.

Preview

```markdown

SKILL.md

Similar Skills

cache-components

Guides Next.js Cache Components and Partial Prerendering (PPR) with cacheComponents enabled. Implements 'use cache', cacheLife(), cacheTag(), revalidateTag(), static/dynamic optimization, and cache debugging.

cache-components

139.2k

mcp-builder

9 files

Guides building MCP servers enabling LLMs to interact with external services via tools. Covers best practices, TypeScript/Node (MCP SDK), Python (FastMCP).

anthropics-skills-13

124.2k

canvas-design

20 files

Generates original PNG/PDF visual art via design philosophy manifestos for posters, graphics, and static designs on user request.

anthropics-skills-13

124.2k

Stats

Stars36

Forks8

Last CommitApr 7, 2026

Actions

View Source View Plugin View on GitHub View README

Clone & Serve

git clone https://github.com/ZhangHanDong/harness-engineering-from-cc-to-ai-coding.git
cd harness-engineering-from-cc-to-ai-coding

# Build the book
mdbook build book

# Serve locally with live reload
mdbook serve book
# Default: http://localhost:3000

Build Output

book/
├── src/           # Markdown source files
│   ├── SUMMARY.md # Table of contents / navigation
│   ├── assets/    # Images (cover, diagrams)
│   └── **/*.md    # Chapter files
└── book.toml      # mdBook configuration

Book Structure

The book has 7 main parts and 4 appendices:

Part	Topic
Part 1	Architecture — Claude Code's overall design, Agent Loop, tool orchestration
Part 2	Prompt Engineering — System prompts, tool prompts, model-specific tuning
Part 3	Context Management — Auto-compression, micro-compression, token budgets
Part 4	Prompt Caching — Cache strategies, prompt cache design patterns
Part 5	Security & Permissions — Permission modes, rule systems, YOLO classifier, Hooks
Part 6	Advanced Subsystems — Multi-agent orchestration, skill systems, Feature Flags, unreleased pipelines
Part 7	Lessons for AI Agent Builders — Production best practices, limitations, inspirations

Key Concepts Covered

1. Agent Loop Architecture

Claude Code's agent loop processes tasks through repeated cycles of:

Observe — Read context (files, shell output, user input)
Plan — Decide next action via LLM reasoning
Act — Execute tool calls (bash, file read/write, search)
Reflect — Incorporate results into next context window

Key insight from the book: The agent loop is not a simple REPL — it manages token budgets dynamically and compresses context when approaching limits.

2. Context Management Strategies

Context Window
├── System Prompt (static, cached)
├── Tool Definitions (static, cached)
├── Conversation History (dynamic)
│   ├── Full messages (recent)
│   └── Compressed summaries (older)  ← Auto-compression kicks in here
└── Current Task Buffer

Auto-compression trigger: When context approaches ~80% of max tokens, Claude Code summarizes older conversation turns to free space.

Micro-compression: Individual long tool outputs (e.g., large file reads, bash stdout) are truncated/summarized inline before being added to history.

3. Permission System

Claude Code uses a layered permission model:

Permission Levels (least → most privileged)
├── read-only       — File reads, directory listing, search
├── write           — File writes, creation, deletion
├── execute         — Shell commands (bash tool)
└── network         — HTTP requests, external services

YOLO mode: Bypasses interactive permission prompts. The book analyzes the YOLO classifier that determines which operations are auto-approved.

Hooks system: Pre/post action hooks allow custom permission logic, logging, or side effects to be injected into the tool execution pipeline.

4. Prompt Caching

Claude Code aggressively caches:

System prompt — Rarely changes; always at top of cache
Tool definitions — Static per session; cached after system prompt
File contents — Frequently accessed files cached as pseudo-static context

Pattern from the book:

[CACHED BOUNDARY]
System Prompt
Tool Definitions
Common File Contents    ← Extended cache prefix
[DYNAMIC BOUNDARY]
Conversation turns      ← Not cached (changes each turn)
Current user message

5. Multi-Agent Orchestration

The book reveals Claude Code's sub-agent spawning for parallel tasks:

Orchestrator Agent
├── Sub-agent A: "Search codebase for X"
├── Sub-agent B: "Write tests for Y"
└── Sub-agent C: "Fix linting errors in Z"
        ↓
Results merged back into orchestrator context

Applying Book Insights: Code Patterns

Pattern 1: Implement a Token Budget Manager

Based on Part 3 (Context Management):

class TokenBudgetManager:
    def __init__(self, max_tokens: int, compression_threshold: float = 0.8):
        self.max_tokens = max_tokens
        self.compression_threshold = compression_threshold
        self.history: list[dict] = []

    def should_compress(self, current_token_count: int) -> bool:
        return current_token_count / self.max_tokens > self.compression_threshold

    def compress_history(self, llm_client, history: list[dict]) -> list[dict]:
        """Summarize older turns to free token budget."""
        if len(history) <= 4:
            return history  # Keep at least 2 exchange pairs

        to_compress = history[:-4]  # Keep last 4 messages verbatim
        recent = history[-4:]

        summary_prompt = (
            "Summarize the following conversation history concisely, "
            "preserving key decisions, file paths, and technical details:\n\n"
            + "\n".join(f"{m['role']}: {m['content']}" for m in to_compress)
        )

        summary = llm_client.complete(summary_prompt)

        compressed = [{"role": "system", "content": f"[History Summary]: {summary}"}]
        return compressed + recent

    def add_message(self, message: dict, token_count: int, llm_client=None):
        self.history.append(message)
        if self.should_compress(token_count) and llm_client:
            self.history = self.compress_history(llm_client, self.history)

Pattern 2: Layered Permission Gate

Based on Part 5 (Security & Permissions):

from enum import IntEnum
from typing import Callable

class PermissionLevel(IntEnum):
    READ = 1
    WRITE = 2
    EXECUTE = 3
    NETWORK = 4

class PermissionGate:
    def __init__(self, auto_approve_level: PermissionLevel = PermissionLevel.READ):
        self.auto_approve_level = auto_approve_level
        self.hooks: list[Callable] = []

    def register_hook(self, hook: Callable):
        """Register a pre-execution hook (logging, auditing, etc.)"""
        self.hooks.append(hook)

    def request(self, action: str, level: PermissionLevel, details: dict) -> bool:
        # Run pre-execution hooks
        for hook in self.hooks:
            hook(action=action, level=level, details=details)

        # Auto-approve if within threshold
        if level <= self.auto_approve_level:
            return True

        # Interactive prompt for elevated permissions
        response = input(
            f"\n⚠️  Permission required: {action}\n"
            f"   Level: {level.name}\n"
            f"   Details: {details}\n"
            f"   Allow? [y/N]: "
        )
        return response.strip().lower() == "y"


# Usage
gate = PermissionGate(auto_approve_level=PermissionLevel.READ)

# Audit hook
def audit_log(action, level, details):
    print(f"[AUDIT] {action} ({level.name}): {details}")

gate.register_hook(audit_log)

# Auto-approved (READ <= READ threshold)
gate.request("read_file", PermissionLevel.READ, {"path": "src/main.py"})

# Prompts user (EXECUTE > READ threshold)
gate.request("run_bash", PermissionLevel.EXECUTE, {"command": "rm -rf /tmp/old"})

Pattern 3: Prompt Cache Prefix Builder

Based on Part 4 (Prompt Caching):

def build_cached_prompt_prefix(
    system_prompt: str,
    tool_definitions: list[dict],
    pinned_files: dict[str, str],  # {path: content}
) -> list[dict]:
    """
    Constructs the static (cacheable) portion of the context.
    Place this before dynamic conversation history.
    """
    messages = []

    # 1. System prompt — always first for maximum cache reuse
    messages.append({
        "role": "system",
        "content": system_prompt,
        "cache_control": {"type": "ephemeral"}  # Anthropic API cache hint
    })

    # 2. Tool definitions block
    if tool_definitions:
        tools_block = "# Available Tools\n" + "\n".join(
            f"## {t['name']}\n{t['description']}" for t in tool_definitions
        )
        messages.append({
            "role": "system",
            "content": tools_block,
            "cache_control": {"type": "ephemeral"}
        })

    # 3. Pinned file contents (frequently accessed files)
    for path, content in pinned_files.items():
        messages.append({
            "role": "system",
            "content": f"# File: {path}\n```\n{content}\n```",
            "cache_control": {"type": "ephemeral"}
        })

    return messages

Contributing to the Book

The book source is Markdown files in book/src/. To contribute:

# 1. Fork and clone
git clone https://github.com/YOUR_USERNAME/harness-engineering-from-cc-to-ai-coding.git

# 2. Edit chapters
vim book/src/chapter-name.md

# 3. Update SUMMARY.md if adding new chapters
vim book/src/SUMMARY.md

# 4. Preview changes
mdbook serve book

# 5. Build to verify no errors
mdbook build book

# 6. Submit PR

SUMMARY.md Format

# Summary

[Introduction](./intro.md)

# Part 1: Architecture
- [Chapter 1: Agent Loop](./part1/agent-loop.md)
- [Chapter 2: Tool Orchestration](./part1/tools.md)

# Appendix
- [Appendix A: Glossary](./appendix/glossary.md)

Troubleshooting

mdBook not found

# Verify cargo installation
cargo --version

# Install mdbook
cargo install mdbook

# Add cargo bin to PATH if needed
export PATH="$HOME/.cargo/bin:$PATH"

Build fails with missing file

error: Chapter 'X' not found at expected path 'src/...'

Check book/src/SUMMARY.md — every linked .md file must exist.

Port 3000 already in use

mdbook serve book --port 3001

Chinese characters display incorrectly

Ensure your terminal and browser use UTF-8 encoding. The book is UTF-8 throughout.

Key Takeaways for AI Agent Builders

From Part 7 of the book:

Token budget is first-class — Design your agent loop with token counting from day one, not as an afterthought
Cache the static, compress the dynamic — System prompts and tool definitions should never eat into your conversation budget
Permission granularity matters — Distinguish read/write/execute/network at the tool level, not just at the agent level
Hooks over inheritance — Claude Code uses hooks for extensibility; prefer composition over deep inheritance in agent frameworks
Fail loud on context overflow — Silent truncation causes subtle bugs; always surface context pressure to the operator
Multi-agent for parallelism, single-agent for coherence — Don't spawn sub-agents unless tasks are truly independent

Resources

Book online: https://zhanghandong.github.io/harness-engineering-from-cc-to-ai-coding/
English preview: https://zhanghandong.github.io/harness-engineering-from-cc-to-ai-coding/en/
GitHub repo: https://github.com/ZhangHanDong/harness-engineering-from-cc-to-ai-coding
mdBook docs: https://rust-lang.github.io/mdBook/
Claude Code (analyzed version): v2.1.88
License: MIT