Run spec-driven AI development cycles: sketch projects into testable, implementation-agnostic kits; map them to dependency-ordered task graphs; autonomously implement tasks with parallel agents, git commits, test gates, and design enforcement; iteratively check, peer-review, and revise until convergence and spec compliance.
npx claudepluginhub juliusbrussee/cavekit
Inspect the last loop: gap analysis against kits plus peer code review for bugs, security, and quality
Show or update Cavekit execution model presets and runtime-layer keys (budgets, parallelism, hooks)
Create or update the project's DESIGN.md — the visual design system spec that constrains all UI implementation
Show Cavekit commands and usage
Bootstrap the context hierarchy and runtime — creates context/, .cavekit/ state dir, detects capabilities, writes .gitignore entries. Use --tools-only to just re-detect available tools.
Like /ck:make but uses parallel ck:task-builder subagents in isolated git worktrees (opt-in; the safe default is /ck:make)
Implement a build site or plan — automatically parallelizes independent tasks and progresses through tiers autonomously
Generate a build site from kits — the task dependency graph that drives building
Deep research for grounding kits in evidence — current best practices, library landscape, and codebase analysis
Resume a Cavekit loop after a crash, lock conflict, or manual interrupt. Rebuilds context from .cavekit/ and picks up at the first unblocked task.
Review the current branch end-to-end — kit compliance and code quality, with optional Codex second opinion. Modes: full (default), --mode gap, --codex, --tier.
Trace recent manual code fixes back into kits and context files. With --trace, run the single-failure backpropagation protocol.
One-shot end-to-end: describe a feature, get it built — sketch, map, make, check with no user gates. For tiny features and throwaways.
Write kits: decompose what you're building into domains with testable requirements
Show build site progress and live runtime state. With --watch, tails the dashboard until interrupted. `--team` delegates to `cavekit team status`.
Team coordination for Cavekit. Opt-in: run `/ck:team init` first, then `/ck:team join` in each checkout.
Generates framework-specific implementation plans from kits. Use when running /ck:map.
Implements the highest-priority unblocked work from plans. Use when running /ck:make.
Reviews cavekit documents for completeness, consistency, and readiness for the Architect phase. Dispatched automatically after cavekit generation in the Draft phase review loop.
Lightweight classifier. Scores a task description against the five-axis rubric from the `complexity-detection` skill and returns JSON (score, depth, axes, overrides). Always runs on haiku.
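A minimal sketch of the JSON shape such a classifier might return. The axis names and depth thresholds below are illustrative assumptions, not the actual rubric from the `complexity-detection` skill:

```javascript
// Hypothetical five-axis scorer. Axis names and thresholds are illustrative;
// the real rubric lives in the complexity-detection skill.
function classifyTask(axes) {
  // axes: five 0-2 ratings, e.g. { scope, novelty, risk, integration, ambiguity }
  const score = Object.values(axes).reduce((sum, v) => sum + v, 0);
  const depth = score <= 3 ? 'quick' : score <= 6 ? 'standard' : 'thorough';
  return { score, depth, axes, overrides: [] };
}

const result = classifyTask({ scope: 1, novelty: 2, risk: 1, integration: 1, ambiguity: 0 });
console.log(JSON.stringify(result));
// → {"score":5,"depth":"standard",...}
```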
Monitors iteration loop progress, detects convergence vs ceiling, reports on test pass rates and change velocity.
Reviews DESIGN.md for completeness, consistency, and actionability. Validates the 9-section Stitch format. Dispatched automatically after design system generation in the Design command review loop.
Generates implementation-agnostic kits from reference materials or existing code. Use when running /ck:sketch (including --from-code mode).
Reviews another agent's work with a critical eye, finding bugs, missed requirements, security issues, and cavekit gaps.
Multi-source research before complex or novel tasks. Queries the repo first (optionally via the graphify knowledge graph), then external references, then web. Produces a brief with citations. Invoked by /ck:research and as an upstream dependency of thorough tasks.
Compares built software against kits to find gaps, over-builds, and missing coverage.
Implements a single task from a build site. Dispatched by /ck:make for parallel execution.
Goal-backward verification. Given a set of kit acceptance criteria and a set of code changes, confirm that each criterion is actually met by the code (not just that tasks are marked done). Runs at tier boundaries and at /ck:check time.
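The idea can be sketched in a few lines: rather than trusting task status, demand evidence in the changed code for each criterion. The evidence model here (one pattern per criterion) is a deliberate simplification and an assumption, not Cavekit's actual verifier:

```javascript
// Goal-backward check sketch: a criterion is "met" only if its evidence
// pattern appears in the changed code, regardless of task status.
function verifyCriteria(criteria, changedFiles) {
  // criteria: [{ id, evidencePattern }]; changedFiles: { path: contents }
  const allCode = Object.values(changedFiles).join('\n');
  return criteria.map(({ id, evidencePattern }) => ({
    id,
    met: evidencePattern.test(allCode),
  }));
}

const report = verifyCriteria(
  [
    { id: 'AC-1', evidencePattern: /validateEmail/ },
    { id: 'AC-2', evidencePattern: /rateLimit/ },
  ],
  { 'src/auth.js': 'function validateEmail(x) { /* ... */ }' }
);
console.log(report); // AC-1 met, AC-2 unmet despite any "done" marker
```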
How Cavekit's autonomous execution loop works — state machine, stop hook, completion sentinel, lock, budgets, iteration cap. Read this skill any time you are about to run /ck:make, debug a stuck session, or write a new command that needs the loop. Trigger phrases: "how does the loop work", "stop hook", "autonomous", "completion sentinel", "why is the session looping".
Step-by-step process for adopting Cavekit on an existing codebase. Covers the 6-step brownfield process, bootstrap prompt design, spec validation against existing behavior, and the decision between brownfield adoption vs deliberate rewrite. Trigger phrases: "brownfield", "existing codebase", "add Cavekit to existing project", "adopt Cavekit", "layer kits on code", "retrofit kits"
Detect which MCP servers, Claude Code plugins, and CLI tools are available in the current environment, so kits and build sites can bind to real capabilities instead of imagined ones. Runs via /ck:init --tools-only or `cavekit-tools.cjs discover`. Writes .cavekit/capabilities.json. Trigger phrases: "what do we have available", "what's installed", "detect tools", "can we use X", "setup tools".
How to write Cavekit-quality kits that AI agents can consume effectively. Covers implementation-agnostic cavekit design, testable acceptance criteria, hierarchical structure, cross-referencing, cavekit templates, greenfield and rewrite patterns, cavekit compaction, and gap analysis. Trigger phrases: "write kits", "create kits", "cavekit this out", "define requirements for agents", "how to write kits for AI"
Internal token-compression protocol for Cavekit agent artifacts — handoff notes, artifact summaries, context bundles, review notes, status lines. NOT the user-facing /caveman skill; that one compresses assistant replies to the user. This skill compresses machine-to-machine prose so the next agent in the loop reads fewer tokens. Three intensities (lite / full / ultra) with automatic selection based on budget pressure. Used by task-builder, builder, inspector, verifier, convergence-monitor, and the stop-hook status block.
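The "automatic selection based on budget pressure" could look like the sketch below. The thresholds are invented for illustration; the real selection logic and cut points are not specified here:

```javascript
// Hypothetical intensity picker: the closer the loop is to its token
// budget, the harder machine-to-machine prose gets compressed.
function compressionIntensity(usedTokens, budgetTokens) {
  const pressure = usedTokens / budgetTokens;
  if (pressure > 0.9) return 'ultra';
  if (pressure > 0.6) return 'full';
  return 'lite';
}

console.log(compressionIntensity(95_000, 100_000)); // "ultra"
```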
Ultra-compressed communication mode. Cuts token usage ~75% by speaking like caveman while keeping full technical accuracy. Supports intensity levels: lite, full (default), ultra. Use when user says "caveman mode", "talk like caveman", "use caveman", "less tokens", "be brief", or invokes /caveman. Also auto-triggers when token efficiency is requested. Integrated into Cavekit: enabled by default for build, inspect, and subagent phases via caveman_mode config. See scripts/bp-config.sh for caveman_mode and caveman_phases.
Classify a task or feature as quick, standard, or thorough so the rest of the pipeline (budget sizing, model routing, review depth) can right-size itself. Used by /ck:sketch to default the kit's complexity, by /ck:map to assign task depth, and by /ck:make for per-task budgets. Also invoked by the ck:complexity agent with the haiku model. Trigger phrases: "how complex", "what depth", "pick a depth", "classify this task".
Progressive disclosure architecture for organizing project context as a DAG (directed acyclic graph). Agents enter at the root and traverse only the subgraph relevant to their task. Covers the 4-tier information flow (refs → kits → plans → impl), CLAUDE.md hierarchy across context/ and source tree, index files as DAG hub nodes, nesting rules, and backward compatibility. Trigger phrases: "context architecture", "progressive disclosure", "organize context for agents", "context directory structure", "how to structure docs for AI", "context hierarchy"
Detecting whether agent iterations are converging toward a stable solution or hitting a ceiling. Covers convergence signals, ceiling detection, non-convergence diagnosis, test pass rate as a convergence metric, and forward progress tracking for large projects. Trigger phrases: "convergence", "is the agent converging", "ceiling detection", "when to stop iterating", "diminishing returns"
How to write and maintain a DESIGN.md in the 9-section Google Stitch format. Covers the 9-section structure, design token conventions, quality standards, integration with kits and build tasks, revision patterns, and collection import. Trigger phrases: "design system", "DESIGN.md", "visual design spec", "design tokens", "create design system", "import design system", "visual identity", "UI spec", "design language"
Inverts the traditional documentation flow from code-to-wiki-for-humans (which rots) into code-to-CLAUDE.md-to-skills-for-agents (which stays current). Each module gets a machine-readable CLAUDE.md, navigation skills teach agents how to explore libraries, and plugins package skills for on-demand loading. Documentation structured for machine consumption -- hierarchical, cross-referenced, with clear entry points -- rather than narrative human reading. This is a fundamental shift: build documentation for agents, not people. Triggers: "documentation inversion", "skills as docs", "living documentation", "docs for agents", "machine-readable docs", "agent-first documentation".
Optional knowledge-graph integration. When `graphify-out/graph.json` is present, architect/researcher/reviewer/task-builder agents can query symbol-level dependencies (IMPORTS, CALLS, EXTENDS, IMPLEMENTS, DEPENDS_ON) instead of grepping. Degrades gracefully when the graph is missing. Trigger phrases: "knowledge graph", "graphify", "blast radius", "dependency graph".
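A blast-radius query over such a graph can be sketched as a reverse transitive closure. The edge shape `{ from, to, type }` is an assumption about graphify's output format:

```javascript
// Blast radius sketch: every symbol that transitively depends on `target`
// through the listed edge types. Edge shape is assumed, not graphify's spec.
function blastRadius(edges, target, types = ['IMPORTS', 'CALLS', 'EXTENDS', 'IMPLEMENTS', 'DEPENDS_ON']) {
  const affected = new Set([target]);
  let grew = true;
  while (grew) {
    grew = false;
    for (const { from, to, type } of edges) {
      if (types.includes(type) && affected.has(to) && !affected.has(from)) {
        affected.add(from); // `from` depends on something already affected
        grew = true;
      }
    }
  }
  affected.delete(target);
  return [...affected];
}

const edges = [
  { from: 'api/users', to: 'lib/db', type: 'IMPORTS' },
  { from: 'api/orders', to: 'lib/db', type: 'CALLS' },
  { from: 'web/app', to: 'api/users', type: 'DEPENDS_ON' },
];
console.log(blastRadius(edges, 'lib/db')); // users, orders, and the app above them
```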
Implementation tracking documents for maintaining living records of what was built, what is pending, what failed, and what dead ends were explored. Covers the full tracking document template, dead ends prevention, cross-iteration continuity, spec compaction, and inter-session feedback protocol. Trigger phrases: "implementation tracking", "track progress", "session tracking", "what did the agent do", "dead ends", "failed approaches"
Behavioral guardrails for Cavekit agents. Four principles — think before coding, simplicity first, surgical changes, goal-driven execution — that prevent over-engineering, silent assumptions, scope creep, and unfocused work. Every task-builder, reviewer, planner, and inspector must internalize these before writing a single line. Trigger phrases: "guardrails", "karpathy", "scope creep", "over-engineering", "stop adding features", "surgical fix".
Core Cavekit methodology — the master skill that teaches the Hunt lifecycle and routes to all sub-skills. Covers the Specify Before Building principle, the scientific method analogy, the four-phase Hunt lifecycle, decision matrix for when to use Cavekit, and build pipeline analogy. Trigger phrases: "use Cavekit", "cavekit methodology", "start Cavekit project", "how should I structure this project for AI agents"
Patterns for using a second AI agent or model to challenge the primary builder agent's work. Covers six review modes (Diff Critique, Design Challenge, Threaded Debate, Delegated Scrutiny, Deciding Vote, Coverage Audit), how to set up peer review with any model via MCP server, peer review iteration loops that alternate builder and reviewer prompts, the Codex Loop Mode (Cavekit + Ralph Loop + Codex as reviewer via CLI or MCP fallback), and prompt templates for each strategy. The peer reviewer's job is to find what the builder missed, not to agree. Triggers: "peer review", "peer review agent", "use another model to review", "second opinion on code", "cross-model review", "peer review loop", "ralph loop with codex", "cavekit ralph", "cross-model loop", "codex peer reviewer".
How to design the numbered prompt pipeline that drives Hunt phases in Cavekit. Covers greenfield 3-prompt patterns, rewrite 6-9 prompt patterns, shared principles, prompt engineering best practices, task templates, and time guards. Trigger phrases: "prompt pipeline", "design prompts for SDD", "create Hunt prompts", "pipeline prompts", "how many prompts do I need"
Trace bugs and manual fixes back to kits and prompts, then fix at the source so the iteration loop can reproduce the fix autonomously. Covers the 6-step revision process for commit sweeps AND the single-failure backpropagation protocol (six steps: TRACE, ANALYZE, PROPOSE, GENERATE, VERIFY, LOG) that runs automatically via the auto-backprop hook on test failure and manually via /ck:revise --trace. Trigger phrases: "revise", "revision", "trace bug to cavekit", "fix the cavekit not the code", "why did this bug happen", "update kits from bug", "trace this bug", "backprop", "why wasn't this caught", "fix the kit", "regression test".
A pipeline execution strategy where downstream stages start before upstream stages finish, using staggered timing with configurable delays. The leader begins first, and followers start after a delay, building from whatever partial output exists. Combined with convergence loops, early follower output self-corrects as upstream artifacts solidify. Cuts total pipeline time dramatically -- a 3-stage pipeline that takes 12 hours sequentially can finish in roughly 7 hours with speculative-pipeline staggering. Triggers: "speculative-pipeline", "staggered pipeline", "parallel prompts with delay", "overlap pipeline stages", "faster pipeline".
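The headline numbers fall out of simple timing arithmetic. Assuming three equal 4-hour stages and a 1.5-hour stagger between stage starts (illustrative values consistent with the 12h → ~7h claim):

```javascript
// Speculative-pipeline timing sketch: stage i starts i * delayHours after
// the leader; the pipeline ends when the slowest (start + duration) finishes.
function sequentialFinish(durations) {
  // Baseline: stages run back to back.
  return durations.reduce((sum, d) => sum + d, 0);
}

function staggeredFinish(durations, delayHours) {
  return Math.max(...durations.map((d, i) => i * delayHours + d));
}

console.log(sequentialFinish([4, 4, 4]));     // → 12 (hours)
console.log(staggeredFinish([4, 4, 4], 1.5)); // → 7 (hours)
```

Note the model assumes follower durations don't grow from rework; in practice the convergence loop absorbs that cost.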
Authoritative guide for implementing stunning, accessible, performant UI. Synthesizes design engineering philosophy, accessibility standards, animation principles, spatial design, typography, color systems, and component craft into a single actionable reference. Complements the design-system skill (which covers DESIGN.md spec writing) by covering the HOW of implementation. Trigger phrases: "build UI", "create component", "landing page", "make it look good", "frontend", "design", "polish UI", "implement design", "make it beautiful", "UI implementation", "component styling", "animation", "accessibility"
Validation-first design for AI agent output — every spec requirement must be automatically verifiable. Covers the 6-gate validation pipeline, phase gates between Hunt phases, merge protocol, completion signals, and acceptance criteria design patterns. Trigger phrases: "validation gates", "quality gates", "validation-first design", "how to validate agent output", "acceptance criteria design"
Comprehensive .NET development skills for modern C#, ASP.NET, MAUI, Blazor, Aspire, EF Core, Native AOT, testing, security, performance optimization, CI/CD, and cloud-native applications
Matches all tools
Hooks run on every tool call, not just specific ones
Executes bash commands
Hook triggers when Bash tool is used
Uses power tools
Uses Bash, Write, or Edit tools
Complete developer toolkit for Claude Code
Comprehensive skill pack with 66 specialized skills for full-stack developers: 12 language experts (Python, TypeScript, Go, Rust, C++, Swift, Kotlin, C#, PHP, Java, SQL, JavaScript), 10 backend frameworks, 6 frontend/mobile, plus infrastructure, DevOps, security, and testing. Features progressive disclosure architecture for 50% faster loading.
Access thousands of AI prompts and skills directly in your AI coding assistant. Search prompts, discover skills, save your own, and improve prompts with AI.
Orchestrate multi-agent teams for parallel code review, hypothesis-driven debugging, and coordinated feature development using Claude Code's Agent Teams
Comprehensive toolkit for developing Claude Code plugins. Includes 7 expert skills covering hooks, MCP integration, commands, agents, and best practices. AI-assisted plugin creation and validation.