Sharpen a single work item into a /spec-ready story seed — problem framing (SCR), multi-dimensional value with intersection reasoning, falsifiable invariants, temporally-tagged non-goals, acceptance criteria, assumptions with verification plans. Use when: refining a story before /spec, sharpening an ad-hoc idea into a structured seed, enumerating invariants and non-goals for a work item, preparing a handoff from product thinking to technical specification.
Sharpen a single work item into a story seed — a structured artifact precise enough that a downstream specification process can start investigating the solution without re-deriving the problem.
The seed captures WHAT and WHY at the product level. It does NOT make technical architecture decisions, design APIs, choose data models, or specify implementation approaches — those belong to the specification process.
Load (on entry): the /structured-thinking skill. If the skill is not available (Skill tool returns an error), stop and inform the user: "The /stories skill requires /structured-thinking for shared vocabulary (SCR format, disambiguation protocol, value dimensions, decision taxonomy). Cannot proceed without it."
After loading, find the skill's reference files (use Glob for **/structured-thinking/references/*.md). Read:
- references/challenge-posture.md (co-driver stance, anti-sycophancy rules, investigate-vs-judgment boundary)
- references/extraction-protocol.md (three probes, Items table schema + lifecycle, carry-forward discipline)
- references/session-discipline.md (investigation escalation ladder, multi-answer parsing, progress scorecard, interaction cadence)

Before starting any work, create a task for each phase using TaskCreate with addBlockedBy to enforce ordering.
Mark each task in_progress when starting and completed when done. On re-entry, check TaskList first and resume from the first non-completed task.
If input is rich (from a project decomposition output with dimensional value and constraints), mark task #3 as deleted — grounding is not needed.
Determine what the user brought and how to proceed.
Input quality heuristic: Input is "rich" when it includes multi-dimensional value articulation and explicit constraints (typically from a project decomposition output or a detailed brief). All other input is "bare."
Routing check — detect wrong-tool scenarios:
If input comes from a PROJECT.md: Read the upstream Items table. Respect Decided (Locked) items as constraints — do not reopen without new evidence. Carry Assumed items into this story's Items table for early verification. Note Parked items for context. See extraction-protocol.md §5 (carry-forward discipline).
If the input is a story: proceed. Determine bare vs rich and move to Phase 2 (Scaffold) → Phase 3 or Phase 4.
Create the living document infrastructure before any substantive work. This ensures event-driven writing has a home from the first finding.
- <stories-dir>/<story-name>/STORY.md with section headers from the output template (empty — populated progressively through Phases 3-4)
- evidence/ directory
- meta/_changelog.md with initial entry: date, story description, input source (upstream PROJECT.md reference if applicable)

If carrying forward items from an upstream PROJECT.md, populate the Items table with carried items (Decided → constraint context in Notes, Assumed → verify early, Parked → awareness).
Where to save:
| Priority | Source |
|---|---|
| 1 | User says so in the current session |
| 2 | Env var CLAUDE_STORIES_DIR (check for resolved-stories-dir in the SessionStart hook output at the top of your conversation context; if not present, the hook may not be configured — fall back to priority 3-5) |
| 3 | AI repo config (CLAUDE.md, AGENTS.md, etc.) declares stories-dir: |
| 4 | Default (in a repo): <repo-root>/stories/<story-name>/STORY.md |
| 5 | Default (no repo): ~/.claude/stories/<story-name>/STORY.md |
The directory name uses kebab-case semantic naming (e.g., stories/add-cli-agent-runner/).
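The priority order above can be sketched as a small resolver. This is an illustrative sketch only — the function and parameter names are assumptions, not part of the skill's actual tooling:

```python
import os
from pathlib import Path

def resolve_stories_dir(user_override=None, repo_config_dir=None, repo_root=None):
    """Illustrative resolver for the save-location priority table (1-5)."""
    if user_override:                               # 1. user said so in this session
        return Path(user_override)
    env_dir = os.environ.get("CLAUDE_STORIES_DIR")
    if env_dir:                                     # 2. env var surfaced by the SessionStart hook
        return Path(env_dir)
    if repo_config_dir:                             # 3. stories-dir: declared in CLAUDE.md / AGENTS.md
        return Path(repo_config_dir)
    if repo_root:                                   # 4. default inside a repo
        return Path(repo_root) / "stories"
    return Path.home() / ".claude" / "stories"      # 5. default with no repo
```

Note that each rule short-circuits the ones below it, which is what the table's priority column means.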
When the user arrives with a bare idea (no project context, no structured upstream), build grounding before sharpening.
Dispatch /worldmodel --depth light as a subagent: spawn a general-purpose subagent via the Agent tool. Include --depth light in the subagent's prompt text:
"Before doing anything, load /worldmodel skill. Run with --depth light on [topic]. [Include the story description and any user-provided links or context.]"
Light mode runs all channels at reduced depth — inline code scanning, 2 web probes, report catalogue scan, OSS README. This surfaces: what currently exists in the codebase, the connection landscape, 3P context, and dimensional awareness.
Read the worldmodel output. Use it to probe the user with informed questions rather than blank interrogation.
Write findings to evidence/ immediately. Grounding findings are facts — they don't need user validation to be captured. Use frontmatter to distinguish raw proof from synthesized understanding (see extraction-protocol.md §8).
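A minimal sketch of what such an evidence file might look like. The frontmatter keys shown here are an assumption for illustration — the actual schema lives in extraction-protocol.md §8:

```markdown
---
kind: raw            # raw = verbatim proof; synthesis = agent interpretation
source: api/v2/routes.ts
captured: 2025-01-15
---

# Existing read-only endpoints

Verbatim route list from api/v2/routes.ts showing the current
read-only API surface the story would extend.
```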
If /worldmodel is unavailable: Fall back to direct investigation — Read/Grep/Glob for codebase scanning, WebSearch for web context, read the reports catalogue manually. Note in the seed: "automated grounding not performed — manual investigation used."
Proceed to Phase 4 once the problem space is sufficiently explored — the 5-probe stress test (Phase 4, criterion #1) is the exit gate for grounding.
Read these reference files from /structured-thinking (already loaded on entry):
- references/disambiguation-protocol.md — the 5-step protocol (challenge/probe/surface/explore/verify) applied throughout
- references/problem-framing.md — SCR format and 5-probe stress test for criterion #1
- references/value-dimensions.md — dimension-trace diagnostic and intersection reasoning for criterion #2
- references/decision-taxonomy.md — temporal non-goals and confidence vocabulary for criteria #3-#4 and #7

Read references/quality-examples.md from this skill's directory for incorrect/correct pairs on the highest-risk criteria. Use these to calibrate your quality enforcement before working through the criteria.
From this phase onward, write to artifacts as items surface and resolve — not at the end.
Log each update in meta/_changelog.md. Apply the three probes from extraction-protocol.md at story level as items surface during sharpening:
Capture items in the Items table. Follow the load-bearing heuristic: track formally when the item creates precedent, is customer-facing, is foundational tech, is a one-way door, is cross-cutting, or creates divergence.
Follow the session discipline from session-discipline.md:
Work through the 7 completeness criteria in whatever order the input demands, spending effort where the input is weakest. The user may redirect at any time. Check scope coherence (the 2-3 sentence test) as soon as the problem statement takes shape — don't wait until all criteria are done to discover the story is actually 3 stories.
The criteria fall into two conceptual groupings — user story aspects (problem, value, connections) and technical implications (invariants, non-goals, AC, assumptions). Default to user story first, but follow the energy when input demands it. This is organizational guidance, not a hard sequential gate.
| # | Criterion | What to do | Quality gate |
|---|---|---|---|
| 1 | Problem clarity | Draft SCR at story level (Situation → Complication → Resolution). Run the 5-probe stress test: demand reality, status quo, narrowest wedge, observation, future-fit. Before accepting this framing, check: is this a problem or a solution-in-disguise? "Add webhook support" is a solution. "Enable deployment event visibility" is the problem. Which framing gives the downstream specification process more room to find the right solution? When input is rich: verify the framing holds at this granularity. When bare: elicit through conversation. | Problem is real (not hypothetical), correctly scoped (not 3 stories bundled), and worth doing (cost of inaction is concrete). |
| 2 | Dimensional value and goals | Run the dimension-trace diagnostic: does this trace to at least one value dimension? Probe across dimensions — customer, platform, GTM, internal. Ensure intersection reasoning is present, not just dimension labels. Define observable success criteria. | An engineer reading the value section can articulate the tradeoff space and make informed decisions when dimensions conflict. Success criteria are observable. |
| 3 | Invariants and constraints | Extract what MUST be true. Push every invariant to be falsifiable. Check claims against the codebase when possible. Extract what bounds the solution space — technical limitations, dependencies, appetite. | Every invariant has an observable definition. No subjective language without concrete criteria. Every constraint identifies what it bounds and why. |
| 4 | Non-goals and boundaries | Actively probe: "You said CLI — does that include Windows? Interactive or headless only? Plugin support?" Each answer becomes an invariant, non-goal, or constraint. Tag every non-goal: NEVER / NOT NOW / NOT UNLESS. | Every non-goal has a temporal tag with rationale and (for NOT NOW / NOT UNLESS) a revisit trigger or condition. The section is actively probed, not passively collected. |
| 5 | Acceptance criteria | Derive observable, testable outcomes from invariants, goals, and non-goals. Every invariant maps to at least one AC. AC describe outcomes, not implementation. | An engineer could write tests from the AC without guessing intent. |
| 6 | Connections and context | Capture pointers: what bet/project this traces to, what siblings share dependencies, what future work this enables. Not deep analysis — enough for a downstream specification process to understand blast radius. | Downstream consumers can see the blast radius of their design decisions without re-discovering connections. |
| 7 | Assumptions | Surface what we're treating as true but haven't verified. Each assumption gets: the claim, confidence (HIGH/MEDIUM/LOW), and a verification plan. These become the specification process's investigation agenda. Track assumptions as Items with Assumed status in the Items table. | The specification process checks assumptions early rather than building on them blindly. |
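As a sketch of the quality gate for criterion #5, a well-formed acceptance criterion translates directly into a check. The criterion, function name, and threshold below are hypothetical examples for illustration, not drawn from any real story:

```python
# AC (observable outcome): "Developer can retrieve traces for an agent run
# within 5 minutes of run completion." No implementation is prescribed,
# so the check tests only the outcome.

def traces_within_budget(completed_at_s, first_queryable_at_s, budget_s=300):
    """True when traces became queryable within the AC's 5-minute budget."""
    return (first_queryable_at_s - completed_at_s) <= budget_s
```

If an AC cannot be rendered as a check like this without guessing intent, it fails the quality gate and needs sharpening.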
When the user can't provide information for a criterion — investigate before accepting the gap. Follow the investigation escalation ladder from session-discipline.md:
- Dispatch /analyze or /research as a subagent (Pattern C). For /analyze: include any worldmodel output in the prompt and tell it to skip its own worldmodel phase — subagents can't nest further subagents. For /research: include --headless in the prompt (research's scoping gate needs auto-confirmation since no human is present in the subagent).
- If the gap survives investigation, record it as an Item with Assumed status.

If /analyze or /research are unavailable: skip the dispatch. Note: "deep investigation not performed — gap flagged as LOW confidence assumption with verification plan."
When you fill a gap through autonomous investigation rather than user input, mark the fill with its source:
This lets downstream consumers distinguish user-provided context (high trust) from agent-inferred context (needs verification).
Provenance marking and evidence references are complementary. Provenance marks WHO inferred the content and flags it for verification. Evidence references ((evidence/<filename>.md)) point to WHERE the proof is. When an inference is captured in an evidence file, include both: the provenance marking in the text and the evidence reference after the claim. Example: "Platform value: this establishes the API pattern for marketplace integrations. [Inferred from existing read-only endpoints in api/v2/ — verify with product owner.] (evidence/api-patterns.md)"
If the "2-3 sentence test" fails — you can't describe the story in 2-3 sentences — it may be multiple stories bundled. Split into separate seeds if the stories are trivially separable; flag the bundle to the user if they are intertwined.
Before finalizing, verify that each of the 7 completeness criteria is addressed or explicitly marked N/A with a reason. If any criterion is missing without an N/A reason, return to Phase 4 to address it.
Simulate — can an engineer take this seed to a specification process without re-deriving the problem framing? If they'd need to ask "what problem does this solve?", "what's out of scope?", or "why does this matter beyond the obvious dimension?" — the seed isn't done.
Writing is lighter because most content was written progressively during Phase 4. Review STORY.md for completeness, coherence, and consistency. Fill any remaining gaps. Log completion in meta/_changelog.md.
When a downstream specification process reads this STORY.md, it maps Items by status:
- Decided → Decision Log entries (the spec's separate-table model)
- Parked → Future Work entries with context
- Assumed → Assumptions table entries (extract confidence + verification from Notes)
- Open/Exploring → Open Questions (should be rare — this skill should resolve before handoff)

Present the completed seed to the user for review. The seed is the deliverable — not the conversation.
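The status mapping above can be read as a simple routing table. The dictionary below is an illustrative restatement for clarity, not an API the specification process actually exposes:

```python
STATUS_DESTINATIONS = {
    "Decided": "Decision Log",        # the spec's separate-table model
    "Parked": "Future Work",          # carried with context
    "Assumed": "Assumptions",         # confidence + verification from Notes
    "Open": "Open Questions",         # should be rare at handoff
    "Exploring": "Open Questions",
}

def route_item(status):
    """Map an Items-table Status value to its downstream spec section."""
    return STATUS_DESTINATIONS[status]
```

A status outside this table would indicate a malformed Items row and should be fixed in the seed, not papered over downstream.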
No headless mode. This skill requires interactive human input (probing for invariants, non-goals, dimensional gaps). Defer headless support to a future version if orchestrator invocation is needed.
# Story: [verb-first title]
**Last verified:** YYYY-MM-DD <!-- date this seed's content was last verified as current -->
## Problem (SCR-lite)
**Situation:** [what exists today]
**Complication:** [why this matters — intersection of dimensions, not just one]
**Resolution:** [what must change]
## Value and goals
[Multi-dimensional articulation — which dimensions apply and how they intersect.
Prose that connects them, not a bullet list of individual dimensions.
Observable success criteria: what "done" looks like across dimensions.]
## Invariants
- [Each is falsifiable with an observable definition]
## Constraints
- [Each identifies what it bounds and why — technical, dependency, appetite]
## Non-goals
- [NEVER] [item + why it's fundamentally out]
- [NOT NOW] [item + revisit trigger]
- [NOT UNLESS] [item + condition that changes the calculus]
## Acceptance criteria
- [Observable, testable outcomes — not implementation prescriptions]
## Items
| ID | Item | Type | Priority | Status | Notes |
|---|---|---|---|---|---|
| PQ1 | ... | Product | P0 | Decided | Decision + rationale (evidence/auth-patterns.md) |
| TQ1 | ... | Technical | P0 | Assumed | Claim. Confidence: Medium. Verify by: [plan] (evidence/api-surface.md) |
| XQ1 | ... | Cross-cutting | P2 | Parked | Options + why not now + trigger |
## Context
- Traces to: [bet/project if known]
- Lateral: [sibling stories that depend on or share with this]
- Forward: [future work this enables]
## Evidence & References
### Evidence Files
- [evidence/<file>.md](evidence/<file>.md) — [one-line: what it contains]
### Research Reports
- [reports/<name>/REPORT.md](reports/<name>/REPORT.md) — [what it covers]
### Code Repositories
- [org/repo](URL) — [what was examined]
### External Sources
- [Title](URL) — [brief description]
### Upstream Artifacts
- [<PROJECT.md path>](<path>) — source project
| Anti-pattern | What it looks like | Correction |
|---|---|---|
| Separate tracking tables | Creating separate Open Questions and Decision Log and Assumptions tables instead of using the unified Items table | One Items table. Status column distinguishes item types. Assumptions are Items with Assumed status — confidence and verification plan go in the Notes column. |
| Accepting claims without verification | User says "the auth layer supports this" → agent proceeds without checking | Check the codebase. The cheapest time to discover a false assumption is now. |
| Accepting "I don't know" without investigation | User can't provide dimensional value → agent flags as assumption immediately | Investigate first: codebase, reports, web. Only flag as assumption after investigation fails. |
| Vague invariants | "It should be fast" / "Auth must be transparent" | Push for falsifiable definitions: "Sub-100ms p95 latency" / "Developer never encounters an auth prompt during agent execution." |
| Dimension lists without intersection reasoning | "Customer: query traces. Platform: API patterns. GTM: none." | Connect them: "Trace querying (customer) AND the API pattern it establishes (platform) — the pattern is load-bearing because the marketplace story needs it." |
| Non-goals without temporal tags | "No Windows support" | Tag it: "[NOT NOW] No Windows CLI — 95% macOS/Linux. Revisit if: Windows devs exceed 20% of active users." |
| Implementation prescriptions as acceptance criteria | "Use OpenTelemetry SDK for trace querying" | Rewrite as observable outcome: "Developer can retrieve traces for a specific agent run within 5 minutes of completion." |
| Bureaucratic interrogation | 15 questions before doing any work | Investigate autonomously first. Only surface questions that require human judgment. When you do need to ask, group related questions in a single turn rather than asking one at a time. |
| Attempting technical design | Proposing API shapes, data models, or architecture | Stop. The seed captures WHAT and WHY. The specification process investigates HOW. |
| Silently accepting scope incoherence | Story hides 4-5 features but the skill proceeds | Run the 2-3 sentence test. If it fails, split (if trivially separable) or flag (if intertwined). |
| Deferring all writing to the end | 60 minutes of conversation, then "let me write the seed" | Event-driven writing from Phase 3 onward. Evidence files written immediately. STORY.md sections updated after user confirms synthesis. |
| Items table bloat | 40+ items where most are implementation details | Apply the load-bearing heuristic: track formally only when the item creates precedent, is customer-facing, is foundational tech, is a one-way door, is cross-cutting, or creates divergence. |