Help us improve
Share bugs, ideas, or general feedback.
From kernel
Explores 2-3 solution approaches, evaluates tradeoffs like code size and deps, picks simplest after research verification, then implements. Triggers: build, implement, create, feature, add.
npx claudepluginhub ariaxhan/kernel-claude --plugin kernelHow this skill is triggered — by the user, by Claude, or both
Slash command
/kernel:buildThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Minimal code through maximum research. The best code is code you don't write.
Implements features using parallel subagents with scope control, reflection, and MCP servers for memory/context. Activates on implement/build/create requests in JS/TS projects.
Guides multi-AI development using the Double Diamond framework with Codex and Gemini CLIs. Detects dev vs knowledge context and enforces an execution contract.
Orchestrates Commands, Agents, and Skills through research, planning, implementation, and review phases for complex, multi-file features.
Share bugs, ideas, or general feedback.
Minimal code through maximum research. The best code is code you don't write. Your first solution is never right. Explore, compare, choose simplest.
Prerequisite: AgentDB read-start has already run. Tier classification done via /kernel:ingest.
Reference: skills/build/reference/build-research.md
GOAL: [What are we building?]
CONSTRAINTS: [Limitations, requirements, must-haves]
INPUTS: [What do we have to work with?]
OUTPUTS: [What should exist when done?]
DONE-WHEN: [How do we know it's complete?]
Interview pattern for large features: For ambiguous or complex features, instead of filling in the template manually, prompt Claude to interview you:
"I want to build [brief description]. Interview me using the AskUserQuestion tool. Ask about technical implementation, UI/UX, edge cases, and tradeoffs. Don't ask obvious questions — dig into the hard parts. Then write a complete spec to _meta/plans/."
Start a fresh session after the spec is written. The implementation session gets clean context focused entirely on execution, with a written spec to reference throughout.
Rule: Generate 2-3 approaches minimum. Never implement first idea.
Per solution, document:
Evaluation criteria (ordered):
Rules:
_meta/plans/{feature}.mdRULE: Research without verification is theory fiction. (LRN-F11) Every research finding must be verified with a minimal test before it drives implementation. Cached research is a starting point, not a conclusion. If the cache says "use approach X", build a 10-line proof before committing to X across your codebase.
Before web search, check for cached research in _meta/research/.
Cache format — research files use frontmatter:
---
query: "{original search query}"
date: "YYYY-MM-DD"
ttl: 7 # days
domain: "{tech domain}"
---
TTL rules:
Cache check protocol:
ls _meta/research/ for topic matchestoday - date < ttl: use cached result, skip web searchCold start: No behavior change when cache empty — search normally, create cache entry.
Note: Cache hits still check agentdb for learnings. Learnings are never cached — always fresh.
Confirm (not guess) max 6 per category:
BEFORE each step: review research doc, check if fewer lines possible. DURING: use researched package, minimal changes, follow existing patterns, one commit per logical unit. AFTER: verify works, count lines (can reduce?), commit, update plan.
If tier 2+: You are the surgeon. Follow contract scope exactly.
Automated (run what exists):
npm test / pytest / cargo test / go testeslint / ruff / clippytsc --noEmit / mypyManual: walk through done-when criteria. Document how verified.
Edge cases (at least 3): empty/null, boundary, error/failure path.
git checkout or git stashReport: feature name, branch, files changed, validation results, next steps.
agentdb write-end '{"skill":"build","feature":"X","files":["Y"],"approach":"Z"}'
The field has shifted from "prompt engineering" to context engineering: optimizing everything the model sees, not just the instruction.
The context payload = system prompt + tools + examples + conversation history + available files. Each element is a lever. Optimize all of them, not just the user message.
Explore-Plan-Act loop (three permission-escalating phases):
Anthropic internal data: unguided attempts (Act only, skip Explore+Plan) succeed ~33% of the time. Structured Explore-Plan-Act: ~80%+. The phases aren't ceremony — they're the multiplier.
CLAUDE.md hygiene: Reference separate files for large domain docs. Inlining >1KB into CLAUDE.md
consumes token budget before work starts. Use _meta/reference/ and load on demand.
Writer/Reviewer session split: Implementation session and review session use separate contexts to avoid confirmation bias. Session A implements. Session B opens a fresh context, reads only the diff and the spec, and reviews without any memory of the implementation choices.
Without this split, the implementing agent unconsciously validates its own code. Fresh context has no investment in prior decisions — it catches what the implementing session rationalized past.
Worktree isolation for parallel features: Use claude --worktree <branch-name> to create
an isolated branch + context per feature. Multiple worktrees run in parallel without file conflicts.
Preferred over spawning agents on the same working tree when features touch overlapping files.
Subagent scoping: When spawning agents for implementation, scope each agent to a single file or function boundary. Cross-file agents produce merge conflicts and silent overrides.
Prefer Read before Write: Always read the target file before editing it, even when the task is purely additive. Prevents format drift and ensures you match existing style.
Minimal footprint: Request only the permissions and file access actually needed. Touch the minimum viable set of files. Unanticipated side effects compound across agents.
Interrupt-safe commits: Commit every working state, not just at milestone boundaries. If an agent is interrupted mid-task, the last commit must be valid and buildable.
Clarify before long tasks: For tasks estimated >5 min, surface ambiguities before starting. Mid-task clarification requests cause partial-state problems.
Long build sessions degrade model performance as context fills. Mitigate:
/compact before context degrades. Signal: responses
getting shorter, earlier instructions being ignored, more mistakes per edit.
Use /compact <instructions> to control what survives: e.g. /compact Focus on API changes only.
For partial compaction, use Esc+Esc → /rewind, select a checkpoint, choose "Summarize from here"./clear between unrelated tasks./btw: Quick questions that don't need to stay in context go in /btw.
The answer appears as a dismissible overlay and never enters conversation history.
Use it for "what does this function do?" or "what's the flag for X?" without growing context._meta/context/progress.json before context fills. Include: completed steps, current position,
next action, key decisions made. New context reads this file first and resumes where the previous left off.
Models now understand their own token budget in real-time — use this to trigger state saves proactively.The Velocity Paradox (METR 2025): Developers with AI assistance feel ~20% faster but measure ~19% slower. Root cause: shifting to ~10% planning / ~90% implementation — AI makes coding cheap, so people skip planning. Fix: 50-70% planning / 30-50% implementation → 50% fewer refactors, 3x overall velocity.
Invest in:
Verification-first multiplier: Providing tests, screenshots, or expected outputs BEFORE asking Claude to implement changes quality dramatically. Claude can run verification against its own output throughout the session — not just at the end. State verification criteria at session START.
Adaptive thinking (Claude 4.6): Claude Opus/Sonnet 4.6 uses adaptive thinking, not budget_tokens.
When spawning agents for deep reasoning, guide effort via instruction:
"After reviewing tool results, reflect carefully before proceeding""This is straightforward, implement directly"Effort parameter (Opus 4.7): Opus 4.7 uses an explicit effort parameter, not instruction-based guidance.
xhigh: production code, architecture decisions, deep multi-file reasoninghigh: test generation, code review, moderate complexity tasksmedium: standard implementation (default if unspecified)low: validation, linting, trivial edits — model does only what's asked, no elaboration
Use effort: xhigh when spawning surgeon agents for tier 2+ work. At low, the model won't generalize — be explicit about scope.Explicit scope instructions (Opus 4.7 literal following): Opus 4.7 interprets instructions precisely — it won't auto-generalize.
--quick: skip confirmations, minimal prompts--plan-only: stop after planning--resume: continue in-progress work--validate-only: skip to validation