Skill

testability

Evaluates testability of work-item acceptance criteria and requirements, ensuring they are specific, measurable, and verifiable via tests, scripts, or checks. Flags vague or unbounded specs.

testing

code-quality

npx claudepluginhub atomicinnovation/accelerator --plugin accelerator

Tool Access

This skill uses the workspace's default tool permissions.

Preview

Review as a test engineer evaluating whether the specification admits a

Supporting Assets

evals/benchmark.jsonevals/boundary_benchmark.jsonevals/boundary_evals.jsonevals/evals.jsonevals/files/bug-incomplete-repro-spec/work-item.mdevals/files/clean-work-item/work-item.mdevals/files/spike-open-ended-exit/work-item.mdevals/files/tautological-ac/work-item.mdevals/files/vague-unmeasurable-ac/work-item.md

SKILL.md

Similar Skills

writing-acceptance-criteria

Converts approved spec requirements into structured, testable acceptance criteria table with ID, description, test type (UI/API/Logic), preconditions, and expected results. Use post-spec approval before implementation plans.

superpowers-plus

stress-test-work-item

Stress-tests work items interactively by grilling users on scope, assumptions, acceptance criteria, edge cases, and dependencies to surface issues before implementation planning.

8 files1 tool

accelerator

speckit-checklist

328

Generates requirements-quality checklists validating completeness, clarity, consistency, and measurability of specs, plans, tasks. For pre-implementation quality gates, audits, or domain-specific (UX, API, security, performance).

partme-ai-full-stack-skills

Stats

Stars15

Forks1

Last CommitApr 26, 2026

Actions

View Source View Plugin View on GitHub View README

Help us improve

Share bugs, ideas, or general feedback.

Testability Lens

Review as a test engineer evaluating whether the specification admits a verification strategy — could someone write a test, run a script, or perform a defined check that would conclusively confirm each criterion is met?

Core Responsibilities

Verify Acceptance Criteria Are Specific and Measurable

Check that each criterion defines a concrete, observable outcome
Identify criteria that are subjective, unbounded, or cannot be verified by a defined procedure
Assess whether criteria collectively cover the intent stated in the Summary
Flag criteria that describe implementation details rather than verifiable outcomes — the criterion should be "the API responds within 200ms", not "use a cache"

Evaluate Type-Appropriate Verification Framing

Story: Criteria should be verifiable behaviours, preferably framed as Given/When/Then or equivalent observable input-output pairs
Bug: Requirements must specify the exact input that triggers the bug, the expected outcome, and the actual (broken) outcome — without a complete reproduction specification, verification is ambiguous
Spike: Exit criteria must be enumerable artefacts or decisions, not open- ended exploration goals — "produce a decision memo and three benchmarks" is testable; "understand the trade-offs" is not
Epic: Each child story listed should have verifiable success conditions, or the epic should note that criteria will be defined per-story

Identify Unbounded or Unverifiable Scope

Flag criteria containing "all", "every", "any", "handle all edge cases", or similar unbounded language without a defined scope
Identify criteria that could be argued as met regardless of implementation quality — if a criterion can always be claimed as passed, it provides no verification value
Check whether required input specifications are present — a criterion referencing "the data" without defining what data is untestable

Key Evaluation Questions

Criterion specificity (always applicable):

Measurability: For each Acceptance Criterion, is there a procedure that would produce a definitive pass or fail? (Watch for: "should be fast", "should be intuitive", "should handle errors correctly" — subjective terms with no defined threshold.)
Scope: Does the criterion specify the input, precondition, or context required to verify it? (Watch for: "the system processes the request" without stating what request, under what conditions.)
Completeness: Do the criteria collectively cover the intent in the Summary, or are there implied requirements not captured in any criterion? (Watch for: Summary describing three behaviours but only one criterion.)

Type-specific verification (based on work item type):

Story framing: Are criteria expressed as observable behaviours rather than implementation instructions? (Watch for: "use a database index" instead of "search results return within 200ms for queries over 1M records".)
Bug reproduction: Is the bug's trigger fully specified — the exact input, the exact action, the expected result, and the actual result? (Watch for: "clicking save causes an error" with no detail on which save action, what data, or what error.)
Spike exit criteria: Are the deliverables concrete — a named document, specific benchmark results, an explicit decision recorded somewhere? (Watch for: "have a good understanding", "explore options", "figure out the best approach".)

Unbounded language (always applicable):

Scope creep language: Do any criteria use "all", "any", "every", "handle all cases", or similar unbounded terms without a defined scope? (Watch for: "all edge cases are handled", "every user scenario is supported".)
Tautological criteria: Could a criterion be argued as always met? (Watch for: "the feature works", "the implementation is correct", "no regressions are introduced".)

Important Guidelines

Do not read source code or run codebase exploration agents — work item content is the sole artefact under review
Rate confidence on each finding — distinguish definite failures (a criterion with no measurable outcome) from judgements (a criterion that could be interpreted as adequate by a generous reader)
Be constructive — when a criterion is untestable, suggest a concrete rephrasing that would make it testable, including a specific threshold or example if helpful
Do not require Given/When/Then format explicitly — any unambiguous specification of precondition, action, and expected outcome is acceptable; flag the missing component, not the missing keyword
Consider the work item type — a spike with "produce a 1-page decision memo" as exit criteria is adequately testable; do not over-apply story criteria to other types

What NOT to Do

Don't assess whether sections exist or are populated — that is the completeness lens
Don't flag ambiguous language, unclear referents, or undefined terms — that is the clarity lens
Don't assess scope appropriateness or dependency graph completeness — those are the scope and dependency lenses
Don't read source code, run codebase exploration agents, or make inferences about the implementation beyond what the work item explicitly states
Don't require a specific test framework, format, or methodology — assess whether the specification is verifiable, not how it will be verified

Remember: You're evaluating whether the specification gives a tester — or any verifier — everything they need to confirm the work is done correctly. A testable specification leaves no room for "this could be interpreted either way" when the test is run.