Guides writing Cavekit-quality kits for AI agents: agnostic design, testable criteria, hierarchical structure, templates, greenfield/rewrite patterns, compaction, gap analysis.
Install: `npx claudepluginhub juliusbrussee/cavekit`. This skill uses the workspace's default tool permissions.
Kits are implementation-agnostic. They define what the system must do and how to verify it, but never prescribe a specific framework, language, or architecture.
This is the fundamental distinction in Cavekit:
When kits avoid prescribing HOW, they stay portable across stacks, remain valid as implementations change, and can be verified through observable behavior.
**Bad cavekit requirement:** "Use React useState hook to manage form state"

**Good cavekit requirement:** "Form state persists across user interactions within a session. Acceptance: entering values, navigating away, and returning preserves all entered values."
This is the single most important rule in Cavekit writing. If an agent cannot automatically validate a requirement, that requirement will not be met.
Every requirement must answer: "How would an automated test verify this?"
| Weak Criterion | Strong Criterion |
|---|---|
| "UI should look good" | "All interactive elements have minimum 44x44px touch targets" |
| "System should be fast" | "API responses return within 200ms at p95 under 100 concurrent users" |
| "Handle errors gracefully" | "Network failures display a retry prompt with exponential backoff (1s, 2s, 4s)" |
| "Support authentication" | "Valid credentials return a session token; invalid credentials return 401 with error message" |
Each criterion should be binary (it clearly passes or fails), observable from outside the system, and checkable without human judgment.
**Acceptance Criteria:**
- [ ] {Action} results in {observable outcome}
- [ ] Given {precondition}, when {action}, then {result}
- [ ] {Metric} meets {threshold} under {conditions}
Kits must be organized as a hierarchy — one index file linking to domain-specific sub-kits. This enables progressive disclosure: agents read the index first, then only the sub-kits relevant to their task.
Create a cavekit-overview.md as the entry point:
# Cavekit Overview
## Domains
| Domain | Cavekit File | Summary |
|--------|-----------|---------|
| Authentication | cavekit-auth.md | User registration, login, session management, OAuth |
| Data Models | cavekit-data-models.md | Core entities, relationships, validation rules |
| API | cavekit-api.md | REST endpoints, request/response formats, error handling |
| UI Components | cavekit-ui-components.md | Shared components, accessibility, responsive behavior |
| Notifications | cavekit-notifications.md | Email, push, in-app notification delivery |
## Cross-Cutting Concerns
- Security requirements: see cavekit-auth.md R3, cavekit-api.md R7
- Performance budgets: see cavekit-api.md R12, cavekit-ui-components.md R5
- Accessibility: see cavekit-ui-components.md R8-R10
Related kits must link to each other. Cross-references prevent requirements from being lost at domain boundaries.
## Cross-References
- **Depends on:** cavekit-auth.md R1 (session tokens required for API access)
- **Depended on by:** cavekit-notifications.md R4 (uses user preferences from this cavekit)
- **Related:** cavekit-ui-components.md R6 (error display components used by this domain)
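Because cross-references use a regular shape (`cavekit-{domain}.md R{N}`), they can be checked mechanically. A minimal sketch, assuming the `context/kits/` layout described in this guide and the `### R{N}:` requirement headings from the template; the function name is hypothetical:

```python
import re
from pathlib import Path

REF = re.compile(r"(cavekit-[\w-]+\.md)\s+R(\d+)")   # e.g. "cavekit-auth.md R1"
REQ = re.compile(r"^### R(\d+):", re.MULTILINE)       # requirement headings

def missing_cross_refs(kits_dir):
    """Return (source_file, target_file, requirement) for dangling references."""
    kits = {p.name: p.read_text() for p in Path(kits_dir).glob("cavekit-*.md")}
    missing = []
    for name, text in kits.items():
        for target, rnum in REF.findall(text):
            if rnum not in REQ.findall(kits.get(target, "")):
                missing.append((name, target, f"R{rnum}"))
    return missing
```

Running a check like this before implementation starts catches requirements lost at domain boundaries early.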
Use this template for every domain cavekit:
# Cavekit: {Domain Name}
## Scope
{One paragraph describing what this spec covers and its boundaries.}
## Requirements
### R1: {Requirement Name}
**Description:** {What must be true — stated in terms of behavior, not implementation.}
**Acceptance Criteria:**
- [ ] {Testable criterion 1}
- [ ] {Testable criterion 2}
- [ ] {Testable criterion 3}
**Dependencies:** {Other specs/requirements this depends on, or "None"}
### R2: {Requirement Name}
**Description:** {What must be true}
**Acceptance Criteria:**
- [ ] {Testable criterion 1}
- [ ] {Testable criterion 2}
**Dependencies:** {Dependencies}
### R3: ...
## Out of Scope
{Explicit list of things this cavekit does NOT cover. This is critical — it prevents
agents from over-building and clarifies domain boundaries.}
- {Thing explicitly excluded and why}
- {Another exclusion}
## Cross-References
- See also: cavekit-{related-domain}.md — {why it is related}
- Depends on: cavekit-{dependency}.md R{N} — {what is needed}
- Depended on by: cavekit-{dependent}.md R{N} — {what depends on this}
When building from scratch, you start with reference materials and derive kits from them.
context/refs/ context/kits/
├── prd.md → ├── cavekit-overview.md
├── design-doc.md → ├── cavekit-auth.md
├── api-draft.md → ├── cavekit-api.md
└── research/ → ├── cavekit-data-models.md
└── ... → └── cavekit-ui.md
From the references, derive:
- cavekit-overview.md — index with domain summaries
- cavekit-{domain}.md per identified domain

The first prompt in a greenfield pipeline (typically 001-generate-kits-from-refs.md) should read everything in context/refs/, identify the domains, and produce cavekit-overview.md as the index plus one cavekit file per domain.

When rewriting an existing system, the existing code becomes your reference material. But you never go directly from old code to new code — you always extract kits first.
Existing codebase context/refs/ context/kits/
├── src/ → ├── ref-apis.md → ├── cavekit-overview.md
├── tests/ → ├── ref-data-models.md → ├── cavekit-auth.md
└── docs/ → ├── ref-ui-components.md → ├── cavekit-api.md
└── ref-architecture.md → └── cavekit-data.md
Rewrites typically use more prompts because of the reverse-engineering step:
- 001: Generate reference materials from old code
- 002: Generate kits from references + feature scope
- 003: Validate kits against existing codebase
- 004+: Plans and implementation

The key difference from greenfield: step 003 validates that your kits actually describe what the old system does, before you start building the new one.
When implementation tracking or cavekit files grow beyond approximately 500 lines, they become unwieldy for agents to process efficiently. Spec compaction compresses large files while preserving active context.
Move retired content to a versioned archive file (e.g. impl/archive/impl-domain-v1.md). Never delete information — move it to an archive. Agents can still find archived context if needed, but it will not consume context window during normal operations.
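A minimal sketch, assuming the context/ layout described in this guide, that flags files past the approximate 500-line compaction threshold. The function name and threshold constant are illustrative:

```python
from pathlib import Path

COMPACTION_THRESHOLD = 500  # approximate line budget suggested by this guide

def compaction_candidates(context_dir: str, threshold: int = COMPACTION_THRESHOLD):
    """List (path, line_count) for kit and impl files above the threshold."""
    found = []
    for pattern in ("kits/*.md", "impl/*.md"):
        for path in Path(context_dir).glob(pattern):
            count = path.read_text().count("\n") + 1
            if count > threshold:
                found.append((str(path), count))
    # Largest files first: the best compaction candidates
    return sorted(found, key=lambda item: -item[1])
```

Running this periodically (or in CI) turns compaction from a judgment call into a routine maintenance step.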
Gap analysis compares what was built against what was intended, identifying where kits, plans, or validation fell short.
| Status | Meaning |
|---|---|
| Complete | All acceptance criteria pass |
| Partial | Some criteria pass, others do not |
| Missing | Requirement not implemented at all |
| Over-built | Implementation exceeds cavekit (may indicate cavekit gap) |
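Three of the four statuses can be derived mechanically from a requirement's acceptance-criteria checkboxes. A hedged sketch (checkbox conventions follow the template in this guide; the function name is hypothetical):

```python
import re

def gap_status(requirement_md: str) -> str:
    """Derive gap status from "- [x]" (passing) and "- [ ]" (failing) criteria.

    "Over-built" cannot be detected from checkboxes; it requires reviewing
    the implementation against the cavekit.
    """
    passed = len(re.findall(r"- \[[xX]\]", requirement_md))
    failed = len(re.findall(r"- \[ \]", requirement_md))
    if passed and not failed:
        return "Complete"
    if passed:
        return "Partial"
    return "Missing"
```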
Gap analysis is not a one-time activity; run it at regular checkpoints throughout implementation, not only at the end.
The Draft phase (/ck:sketch) now embeds brainstorming principles directly. When running in interactive mode (no arguments), the drafter follows a collaborative design process before generating any files.
This process applies to EVERY project regardless of perceived simplicity. The design can be short for simple projects, but it must happen.
Visual companion: For projects involving visual elements (UI, architecture diagrams), the Draft phase can use a browser-based visual companion to show mockups and diagrams during the design conversation. See references/visual-companion.md.
YAGNI enforcement: During the design conversation and cavekit generation, actively strip requirements the user did not ask for. Smaller kits are better kits.
**ck:design-system:** When DESIGN.md exists at the project root, kits for UI domains should reference design tokens in acceptance criteria. This creates a traceable chain: DESIGN.md -> cavekit acceptance criterion -> plan task -> implementation.
| Acceptance Criterion Type | Design Reference |
|---|---|
| "Button has primary CTA appearance" | DESIGN.md Section 4, primary button variant |
| "Text follows heading hierarchy" | DESIGN.md Section 3, type scale |
| "Card has subtle elevation" | DESIGN.md Section 6, elevation level 1 |
| "Layout uses 12-column grid" | DESIGN.md Section 5, grid system |
| "Colors adapt for dark mode" | DESIGN.md Section 2, dark mode mapping |
Do NOT duplicate DESIGN.md content into kits. Reference by section/token name only. If a color changes in DESIGN.md, kits should not need updating.
When a cavekit needs a visual pattern not yet defined in DESIGN.md, note it in the acceptance criterion:
- [ ] Component uses card-like container [DESIGN.md: pattern not yet defined — flag for design update]
**ck:validation-first:** Every acceptance criterion in a cavekit must map to at least one validation gate. When writing kits, think about which gate will verify each requirement:
| Acceptance Criterion Type | Likely Gate |
|---|---|
| "Code compiles without errors" | Gate 1: Build |
| "Function returns correct output for input X" | Gate 2: Unit Tests |
| "User can complete workflow end-to-end" | Gate 3: E2E/Integration |
| "Response time under N ms" | Gate 4: Performance |
| "Application starts and displays main screen" | Gate 5: Launch Verification |
| "UI matches design intent" | Gate 6: Human Review |
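As an illustrative heuristic only (not part of Cavekit), the table above can be approximated with a keyword lookup to pre-sort criteria before a human or agent assigns gates; all names below are hypothetical:

```python
# Keyword hints mirroring the gate table above; ambiguous criteria
# should be assigned by review, not by this sketch.
GATE_KEYWORDS = [
    ("Gate 1: Build", ("compile", "build")),
    ("Gate 4: Performance", (" ms", "latency", "p95", "concurrent")),
    ("Gate 3: E2E/Integration", ("end-to-end", "e2e", "workflow")),
    ("Gate 5: Launch Verification", ("starts", "launches")),
    ("Gate 6: Human Review", ("design intent", "looks")),
]

def suggest_gate(criterion: str) -> str:
    text = criterion.lower()
    for gate, keywords in GATE_KEYWORDS:
        if any(k in text for k in keywords):
            return gate
    return "Gate 2: Unit Tests"  # default for plain functional behavior
```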
**ck:context-architecture:** Kits live in the context/kits/ directory. See ck:context-architecture for the full context directory structure, CLAUDE.md conventions, and multi-repo strategies.
**ck:impl-tracking:** As kits are implemented, progress is tracked in context/impl/ documents. Dead ends discovered during implementation should be recorded to prevent future agents from retrying failed approaches.
**Wrong:** "Use PostgreSQL with a users table containing columns: id (UUID), email (VARCHAR), ..."

**Right:** "User accounts have a unique identifier and email. Email must be unique across all accounts. Acceptance: creating two accounts with the same email fails with a duplicate error."
**Wrong:** "System handles errors properly"

**Right:** "When a network request fails, the UI displays an error message within 2 seconds and offers a retry action. Acceptance: simulating network failure shows error banner with retry button."
Every cavekit needs explicit exclusions. Without them, agents will over-build or make assumptions.
Domains do not exist in isolation. If cavekit-auth defines session tokens that cavekit-api uses, both kits must cross-reference each other.
A single 1000-line cavekit file defeats progressive disclosure. Decompose into domains with a clear index.
Writing kits for AI agents comes down to these rules: stay implementation-agnostic, make every acceptance criterion testable, organize kits hierarchically with cross-references, state what is out of scope, compact files before they grow unwieldy, and run gap analysis throughout.