From bee
Guides writing specs, acceptance criteria, and vertical feature slices with rules for testable behaviors, error cases, code contracts, and shippable increments over horizontal layers.
npx claudepluginhub incubyte/ai-plugins --plugin bee
Slash command: /bee:spec-writing
Each criterion is a single, testable statement. It describes one observable behavior in one sentence.
Good:
- [ ] User can create an account with email and password
- [ ] Shows error when email is already taken
- [ ] Password must be at least 8 characters
- [ ] Sends welcome email after successful signup
Bad:
- [ ] Handle user registration (How? What happens? What does the user see?)
- [ ] The registration feature works correctly (What does "correctly" mean?)
- [ ] Use bcrypt for password hashing with 12 salt rounds (Implementation, not behavior)
The test: Can a developer write a test from this criterion without asking clarifying questions? If yes, it's good. If no, it needs more detail.
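As a rough illustration of that check, here is a minimal sketch of checks written directly from three of the good criteria above, with no clarifying questions needed. The `signUp` function, its result type, and the error codes are hypothetical stand-ins invented for this example, not part of any real codebase, and the code belongs in a test file rather than in the spec itself.

```typescript
// Hypothetical in-memory signup, present only so the checks below can run.
// signUp, SignUpResult, and the error codes are assumptions for illustration.
type SignUpResult =
  | { ok: true }
  | { ok: false; error: "EMAIL_TAKEN" | "PASSWORD_TOO_SHORT" };

const registeredEmails = new Set<string>();

function signUp(email: string, password: string): SignUpResult {
  if (password.length < 8) return { ok: false, error: "PASSWORD_TOO_SHORT" };
  if (registeredEmails.has(email)) return { ok: false, error: "EMAIL_TAKEN" };
  registeredEmails.add(email);
  return { ok: true };
}

// Criterion: "Password must be at least 8 characters"
const shortPassword = signUp("a@example.com", "1234567");

// Criterion: "User can create an account with email and password"
const firstSignUp = signUp("a@example.com", "12345678");

// Criterion: "Shows error when email is already taken"
const duplicateSignUp = signUp("a@example.com", "another-password");
```

Each criterion maps to exactly one call and one expected outcome, which is what "one observable behavior in one sentence" buys you. A criterion like "Handle user registration" gives no such mapping.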
Small indicative code is fine when it clarifies intent:
GOOD — API shape:
POST /api/orders
{ userId, items: [{ productId, qty }] }
→ 201 { orderId, total, status }
GOOD — Data structure:
Order: { id, userId, items[], total, status, createdAt }
BAD — Implementation logic:
const order = await db.insert('orders', { ... });
Use code to show contracts and shapes. Never use code to show logic — that's the TDD planner's job.
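If the codebase is typed, the same contracts could be written as type declarations. The sketch below restates the API shape and data structure above in TypeScript. The field names come from the examples; the concrete types, the `OrderStatus` values, and the timestamp format are assumptions made for illustration.

```typescript
// Field names come from the API shape above; the concrete types and the
// status values are assumptions, not a confirmed contract.
type OrderStatus = "pending" | "paid" | "cancelled"; // hypothetical values

interface OrderItem {
  productId: string;
  qty: number;
}

// Request body for POST /api/orders
interface CreateOrderRequest {
  userId: string;
  items: OrderItem[];
}

// Body of the 201 response
interface CreateOrderResponse {
  orderId: string;
  total: number;
  status: OrderStatus;
}

interface Order {
  id: string;
  userId: string;
  items: OrderItem[];
  total: number;
  status: OrderStatus;
  createdAt: string; // assumed to be an ISO 8601 timestamp
}

// Example payloads that satisfy the contract; no behavior is described.
const exampleRequest: CreateOrderRequest = {
  userId: "user-1",
  items: [{ productId: "sku-42", qty: 2 }],
};

const exampleResponse: CreateOrderResponse = {
  orderId: "order-1",
  total: 19.98,
  status: "pending",
};
```

Declarations like these stay on the contract side of the line: they say what goes in and what comes out, and nothing about how the order is computed or stored.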
These principles apply at every size — SMALL, FEATURE, and EPIC. They are not optional techniques for large tasks.
Always slice by user-visible capability, not by technical layer.
Vertical (correct):
Horizontal (wrong):
Why vertical wins:
A good slice is independently releasable, testable in isolation, and small enough for one TDD plan, and it delivers user-visible value.
Ordering: Start with the walking skeleton — the thinnest end-to-end path. Each subsequent slice adds capability. Later slices can assume earlier slices work.
Outside-in within each slice: Order ACs from what the user experiences inward — UI behavior first, then API contract, then data. This ensures the spec reads like a user journey, not a technical blueprint.
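The outside-in ordering can be sketched as a simple sort. This is only an illustration: tagging each criterion with a `layer` is a hypothetical convention, and the `Criterion` type, `orderOutsideIn` function, and sample criteria are invented for this example.

```typescript
// Hypothetical tagging of acceptance criteria by the layer they describe.
type Layer = "ui" | "api" | "data";

interface Criterion {
  text: string;
  layer: Layer;
}

// Outside-in rank: what the user experiences first, storage last.
const layerRank: Record<Layer, number> = { ui: 0, api: 1, data: 2 };

function orderOutsideIn(criteria: Criterion[]): Criterion[] {
  // Copy before sorting so the caller's array is not mutated.
  return [...criteria].sort((a, b) => layerRank[a.layer] - layerRank[b.layer]);
}

// Illustrative criteria, written in the "wrong" inside-out order.
const unordered: Criterion[] = [
  { text: "Order is stored with status and createdAt", layer: "data" },
  { text: "POST /api/orders returns 201 with orderId", layer: "api" },
  { text: "User sees a confirmation after placing an order", layer: "ui" },
];

const ordered = orderOutsideIn(unordered);
```

After sorting, the list opens with what the user sees and ends with what is persisted, so the slice reads as a journey from the outside in.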
Not every spec needs the same rigor:
Low risk (internal tool, easy to revert):
Moderate risk (user-facing, business logic):
High risk (payments, auth, data migration):
Explicitly state what you are NOT doing. This prevents scope creep during implementation and gives the AI clear boundaries.
Examples:
Without an explicit out-of-scope list, the AI may build features you didn't ask for.
# Spec: [Feature Name]
## Overview
[1-2 sentences: what and why]
## Acceptance Criteria
- [ ] [Testable behavior]
- [ ] [Error case]
- [ ] [Edge case]
## API Shape (if applicable)
[Indicative code — endpoints, request/response shapes]
## Out of Scope
- [What we're explicitly not doing]
## Technical Context
- Patterns to follow: [from codebase]
- Key dependencies: [existing code this integrates with]
- Risk level: [LOW/MODERATE/HIGH]
# Spec: [Feature Name]
## Overview
[1-2 sentences: what and why]
## Slice 1: [Name — the walking skeleton]
- [ ] [AC]
- [ ] [AC]
## Slice 2: [Name — builds on Slice 1]
- [ ] [AC]
- [ ] [AC]
## Out of Scope
- [What we're explicitly not doing]
## Technical Context
- Patterns to follow: [from codebase]
- Risk level: [LOW/MODERATE/HIGH]
Checkboxes (- [ ]) track progress. Each AC gets checked off [x] when the verifier confirms it has a passing test.
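Because the checkboxes are plain markdown task-list items, progress can be read mechanically. The following is a minimal sketch of such a counter. The `countProgress` function and the sample spec are hypothetical and say nothing about how the plugin's verifier actually works.

```typescript
interface Progress {
  done: number;
  total: number;
}

// Counts markdown task-list items: "- [ ]" is open, "- [x]" is checked.
// Hypothetical helper for illustration, not the plugin's verifier.
function countProgress(spec: string): Progress {
  let done = 0;
  let total = 0;
  for (const line of spec.split("\n")) {
    const match = /^\s*- \[( |x|X)\]\s+\S/.exec(line);
    if (!match) continue; // headings and prose are ignored
    total += 1;
    if (match[1] !== " ") done += 1;
  }
  return { done, total };
}

// Illustrative spec fragment; the slice name is made up.
const sampleSpec = [
  "## Slice 1: Sign up",
  "- [x] User can create an account with email and password",
  "- [x] Shows error when email is already taken",
  "- [ ] Sends welcome email after successful signup",
].join("\n");

const progress = countProgress(sampleSpec);
```

A count like this only reports which boxes are ticked. Whether each ticked AC really has a passing test is still the verifier's call.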