By incubyte
Bee — a spec-driven TDD workflow navigator for AI-assisted development. Guides developers through the right level of process: triage, spec, architecture, TDD planning, execution, verification, and review.
npx claudepluginhub incubyte/ai-plugins --plugin bee
Architectural health assessment grounded in domain language. Compares how a product describes itself against how the code is structured.
Start a brainstorming session. Open-ended, collaborative idea generation for product, architecture, UX, or any problem space. Researches online, builds on your ideas, and helps narrow to the best path forward.
Run browser-based regression tests against specs. Verifies acceptance criteria in a running app via Chrome MCP, produces pass/fail reports with screenshots. Read-only — does not modify code.
Analyze your development sessions and get actionable coaching insights.
Start a discovery session. A PM persona that interviews you (or synthesizes from transcripts) and produces a client-shareable PRD. Use standalone or let /bee invoke it automatically.
Explain Bee's features interactively — what each command does, when to use it, what artifacts it produces. Adapts to your project context.
Analyze a legacy and new codebase to produce a prioritized, independently-shippable migration plan.
Interactive developer onboarding for existing projects. Analyzes the codebase and delivers an adaptive walkthrough with knowledge checks.
Run ping-pong TDD on a spec. Two agents alternate — one writes a failing test, the other makes it pass — until all acceptance criteria are implemented.
Quality coverage analysis with hotspot-driven test planning. Finds high-risk untested code and produces a prioritized test plan.
Standalone code review with hotspot analysis, tech debt prioritization, and developer coaching.
Use spec to drive development. Works with or without a pre-built spec. With a spec path, skips to context → architecture → slice loop. Without a spec (or with a task description), runs the full workflow including triage → discovery → spec → architecture → code → test → verify → review.
**IMPORTANT — Deferred Tool Loading:** Before calling `AskUserQuestion`, you MUST first call `ToolSearch` with query `"select:AskUserQuestion"` to load it. It is a deferred tool and will fail if called without loading first. Do this once at the start of your work.
You are an architecture test generator. You turn architecture assessment findings into runnable boundary tests that document good structure and expose where architecture leaks from the domain model.
You are a codebase analyst. Quick and efficient.
**IMPORTANT — Deferred Tool Loading:** Before calling `AskUserQuestion`, `WebSearch`, or `WebFetch`, you MUST first call `ToolSearch` with query `"select:AskUserQuestion,WebSearch,WebFetch"` to load them. These are deferred tools and will fail if called without loading first. Do this once at the start of your work.
**IMPORTANT — Deferred Tool Loading:** Before calling `AskUserQuestion` or `WebFetch`, you MUST first call `ToolSearch` with query `"select:AskUserQuestion,WebFetch"` to load them. These are deferred tools and will fail if called without loading first. Do this once at the start of your work.
You are Bee's programmer. Your job: turn a reviewed TDD plan into working, clean code — one test at a time, no shortcuts.
You are a specialist agent that synthesizes review outputs into a prioritized test plan. You receive analysis from three review agents (behavioral, tests, coupling) and produce a plan that tells the developer exactly where to invest in test coverage for maximum impact.
You are Bee handling a trivial fix.
You are the recap agent. After an SDD iteration completes, you produce a structured walkthrough of what was built — so the developer understands the code they now own without digging through git logs or spec files.
You are a specialist review agent focused on AI ergonomics — how well this codebase supports LLM-assisted development. Code that's ergonomic for AI is faster to work with, produces fewer hallucinations, and generates more correct results.
You are a specialist review agent focused on behavioral analysis — what the git history reveals about where problems cluster and what's coupled.
You are a specialist review agent focused on code quality — the craftsmanship principles that make code maintainable, readable, and correct.
You are a specialist review agent focused on structural coupling — how tightly connected are the modules, and where does coupling create unnecessary change cost?
You are a specialist review agent focused on organization and project standards — the rules and conventions defined in the target project's CLAUDE.md and related documentation.
You are a specialist review agent focused on team practices — the habits that show up in git history and PR reviews. These are team health signals, not individual judgments.
You are a specialist review agent focused on test quality — not just "are there tests?" but "are they the right tests, testing the right things, in the right way?"
You are Bee doing the final review. All slices are verified. Now step back and look at the complete body of work as a whole — not slice by slice, but the full picture.
You are Bee verifying a completed SDD slice. Your job: confirm the work is solid before moving on — or catch what needs fixing while the context is fresh.
You are Bee's slice-coder — the SDE in the spec-driven development workflow. Your job: write production code for one slice's worth of acceptance criteria, guided by the spec and architecture.
You are Bee's slice-tester — the SDET in the spec-driven development workflow. Your job: write tests for production code that was just written by the slice-coder.
You are the coding half of a ping-pong TDD pair. Your ONLY job: write the minimum production code to make a failing test pass, then refactor.
You are the ping-pong TDD orchestrator. You manage a RED-GREEN-REFACTOR cycle by alternating between two specialist sub-agents:
You are an expert TDD Coach specializing in Split Test-Driven Development for CQRS (Command Query Responsibility Segregation) architectures. You use TDD as a design tool — the tests don't just verify code, they force a clean separation between the write side (commands that change state) and the read side (queries that return data).
You are an expert TDD Coach specializing in Contract-First Test-Driven Development for event-driven architectures. You use TDD as a design tool — the tests don't just verify code, they force the code into clean event contracts, reliable producers, and resilient consumers.
You are an expert TDD Coach specializing in Outside-In Test-Driven Development for MVC architectures. You use TDD as a design tool — the tests don't just verify code, they force the code into clean MVC layers with thin controllers, focused services, and well-tested models.
You are an expert TDD Coach specializing in Outside-In Test-Driven Development that drives Onion (Hexagonal) Architecture. You use TDD as a design tool — the tests don't just verify code, they force the code into the correct architectural shape.
You are an expert TDD Coach creating simple, behavior-driven test plans. No layers, no ports, no architecture ceremony. Just test → implement → refactor, one behavior at a time.
You are the test-writing half of a ping-pong TDD pair. Your ONLY job: write exactly ONE failing test, run it, confirm it fails, and report back.
You are Bee verifying a completed slice. Your job: confirm the work is solid before moving on — or catch what needs fixing while the context is fresh.
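The ping-pong TDD agents described above (a test writer, a coder, and an orchestrator that alternates between them) can be pictured with a small loop. This is only a sketch under stated assumptions, not Bee's implementation: `run_ping_pong`, `toy_test_writer`, and `toy_coder` are hypothetical names, the two agents are modeled as plain Python callables, and each acceptance criterion is assumed to need exactly one test.

```python
from typing import Callable, Iterable, List

Test = Callable[[], bool]  # returns True when the test passes


def run_ping_pong(
    criteria: Iterable[str],
    write_failing_test: Callable[[str], Test],
    make_test_pass: Callable[[str, Test], None],
) -> List[str]:
    """Alternate RED and GREEN turns, one test per acceptance criterion (an assumption)."""
    turns: List[str] = []
    for criterion in criteria:
        # RED: the test writer produces one test, and we confirm it fails.
        test = write_failing_test(criterion)
        if test():
            raise RuntimeError(f"test for {criterion!r} passed before any code was written")
        turns.append(f"red: {criterion}")

        # GREEN: the coder writes the minimum code; refactoring would also happen in this turn.
        make_test_pass(criterion, test)
        if not test():
            raise RuntimeError(f"test for {criterion!r} still fails after the coder's turn")
        turns.append(f"green: {criterion}")
    return turns


# Toy stand-ins for the two agents: "production code" is just a set of implemented criteria.
implemented = set()


def toy_test_writer(criterion: str) -> Test:
    return lambda: criterion in implemented


def toy_coder(criterion: str, test: Test) -> None:
    implemented.add(criterion)


if __name__ == "__main__":
    for turn in run_ping_pong(["adds item to cart", "rejects empty cart"], toy_test_writer, toy_coder):
        print(turn)
```

The two checks around each turn mirror the descriptions above: a test has to be seen failing before the coder starts, and the loop moves to the next criterion only once that test passes.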
This skill should be used when evaluating how well code supports LLM-assisted development. Contains context window friendliness, explicitness, module boundaries, test-as-spec, and naming criteria.
This skill should be used when explaining why Bee follows spec-first TDD workflows, or when teaching mode is on. Contains the reasoning behind spec → plan → test → code ordering.
This skill should be used when evaluating architecture options, checking dependency direction, or deciding between onion, MVC, and simple patterns. Contains YAGNI-based decision criteria.
This skill should be used when updating workflow state in bee-state.local.md. Contains the full command reference for update-bee-state.sh including init, set, get, clear, all flags, and multi-line field syntax.
This skill should be used after discovery completes on a greenfield project when the discovery document contains a Module Structure section. Contains the full procedure for parsing module boundaries, generating BOUNDARIES.md, and optionally linking it from CLAUDE.md.
This skill should be used when the user wants to brainstorm, explore ideas, think through options, says 'let's brainstorm', 'what are our options', 'how might we', 'what if we', 'help me think through', 'explore approaches', or presents an open-ended problem without a clear solution. Covers cross-domain thinking where changing one dimension (product, tech, UX) can simplify another.
This skill should be used when running browser-based verification. Supports two providers: Claude-in-Chrome (primary) and chrome-devtools-mcp (fallback). Contains tool reference, dev server detection, screenshot conventions, and graceful degradation.
This skill should be used when writing, reviewing, or refactoring code. Contains SRP, DRY, YAGNI, naming, error handling, dependency direction, and Kent Beck's four rules of simple design.
This skill should be used when performing code reviews, analyzing git hotspots, or detecting coupling. Contains Adam Tornhill's code-as-crime-scene methodology and effort sizing.
This skill should be used when processing @bee annotations in documents or managing [ ] Reviewed gates. Contains the exact comment card format and review loop rules. Load before handling any @bee comment.
This skill should be used when diagnosing failures, fixing bugs, or investigating why tests don't pass. Contains systematic debugging: reproduce first, read before you change, assume nothing, find root cause.
This skill should be used when making UI/UX decisions, designing layouts, choosing colors and typography, writing UI specs, or reviewing visual components. Contains the two-path design flow (existing system vs greenfield discovery), accessibility rules, typography pairing, color palette guidance using Sanzo Wada's Dictionary of Color Combinations, spatial composition, anti-generic rules, and visual quality checklist.
Interview the user relentlessly about a plan, design, or idea until reaching shared understanding. Use when the user wants to stress-test their thinking, asks to be grilled, says 'poke holes in this', 'challenge my design', 'what am I missing', or presents a plan and wants it pressure-tested before committing to it.
This skill should be used when performing precise dependency analysis with LSP tools. Contains availability checking, graph reasoning, and graceful degradation.
Rewrites software requirements to lead to better code design and architecture. Use this skill whenever a user shares requirements, user stories, feature specs, acceptance criteria, or PRDs and wants them improved before development begins. Also trigger when a user says 'review my requirements', 'improve this spec', 'make this requirement better', 'rewrite this for better design', 'architecture-aware requirements', or asks how to write requirements that produce cleaner code. Trigger even when users paste requirements without explicitly asking for review — if the requirements contain design-limiting patterns (conditional chains, implementation details, ambiguous boundaries), proactively suggest improvements.
This skill should be used when resolving git merge conflicts. Parses conflict markers, shows clean side-by-side diffs, explains why each conflict occurred using git history, and presents resolution options via AskUserQuestion with reasoning for the recommended choice. Use this skill whenever the user mentions merge conflicts, says 'resolve conflicts', 'fix conflicts', or when git status shows unmerged paths.
This skill should be used when writing specs, acceptance criteria, or slicing features into vertical slices. Contains adaptive depth by risk and out-of-scope capture.
This skill should be used when writing a TDD plan, reviewing test quality, choosing between unit and integration tests, or applying red-green-refactor. Contains outside-in double-loop, test isolation, and test naming.
This skill should be used after all slices are verified and before review. Contains templates for generating contextual 'Try it yourself' manual verification steps based on what was built (frontend, backend, CLI, greenfield, library).
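The hotspot analysis mentioned in the code-review skill above combines how often a file changes with how complex it is. A minimal sketch of that idea follows, assuming the change history comes from the output of `git log --name-only --pretty=format:` and that lines of code stand in for complexity. The function names and the multiplication used for scoring are illustrative assumptions, not the skill's actual procedure.

```python
from collections import Counter
from typing import Dict, List, Tuple


def change_frequency(git_log_output: str) -> Counter:
    """Count how often each path appears in `git log --name-only --pretty=format:` output."""
    return Counter(line.strip() for line in git_log_output.splitlines() if line.strip())


def rank_hotspots(changes: Counter, lines_of_code: Dict[str, int]) -> List[Tuple[str, int]]:
    """Rank files by commits touching the file times lines of code (a rough complexity proxy)."""
    scores = {path: count * lines_of_code.get(path, 0) for path, count in changes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical history: two commits, the first touching two files.
    sample_log = "billing.py\nutils.py\n\nbilling.py\n"
    sample_loc = {"billing.py": 400, "utils.py": 50}
    for path, score in rank_hotspots(change_frequency(sample_log), sample_loc):
        print(path, score)
```

Files that score high on both dimensions are the ones a review or test plan would look at first. A file that is large but never changes, or small and frequently changed, ranks lower.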
A collection of Claude Code plugins by Incubyte.
Spec-driven development that scales process to match the task. Triages by size and risk, then navigates you through the right workflow: triage, context gathering, spec, architecture, code, test, verify, review.
@bee inline annotations
Entry point: /bee:sdd
Learn any technology by building real projects. Claude guides you step-by-step — you write every line of code yourself.
Entry point: /learn:start
Builds a compounding wiki from raw documents. Drop articles into clippings/, ingest them into an Obsidian-compatible wiki with cross-references, then query your knowledge base with synthesized answers and citations.
[[wikilinks]] and YAML frontmatter
Entry point: /second-brain:ingest
Take a raw product idea through ten guided phases and walk away with a structured PRD. Pushes back on vague metrics, refuses to write PRDs for ideas that should die.
discovery-state.md
Entry point: /discovery:start
# Add the Incubyte marketplace
/plugin marketplace add incubyte/ai-plugins
# Install a plugin
/plugin install bee@incubyte-plugins
/plugin install learn@incubyte-plugins
/plugin install discovery@incubyte-plugins
See individual plugin directories for license details.