Your AI writes fast. Temper makes it last.
Intent-driven development with behavioral testing and quality gates for AI-generated code
Website | Getting Started | Releases
AI writes code fast. But "fast" without "right" creates bugs, technical debt, and features that miss the point.
"Why not just tell Claude to be careful?"
You can, and it helps. But AI-generated code has structural failure patterns that "be careful" doesn't address. These aren't sloppiness; they're limitations of how LLMs generate code, and they map to three unanswered questions:
| Question | What Goes Wrong Without It |
|---|---|
| Did we solve the problem? | Feature works but nobody uses it. Wrong problem solved. |
| Does it do the right things? | Happy path works, edge cases ship broken. |
| Does the code work? | Tests pass, but they test implementation details, not behaviors. |
Most AI tools answer only the third. Temper answers all three.
Temper combines three development methodologies in a single artifact called intent.md. Each layer answers a different question and is enforced at a different stage of the pipeline:
```
intent.md
|
+-- Intent Section (IDD)         WHY are we building this?
|     Problem statement
|     Success criteria (each with a Validate: type)
|     Constraints
|
+-- Scenarios Section (BDD)      WHAT should it do?
|     Gherkin Given/When/Then
|     Derived BEFORE architecture
|     Every planned file traces to a scenario
|
+-- /temper:build (TDD)          HOW do we build it?
      Tests written from scenarios
      RED -> GREEN -> REFACTOR
```
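To sketch the BDD-to-TDD handoff described above, here is what a test derived from a "Successful password reset" scenario might look like. All names here (`FakeAuthService`, the method signatures) are hypothetical illustrations, not Temper's actual output or API:

```python
# Hypothetical sketch: a test derived from a Gherkin scenario such as
#   Given a user with a registered email
#   When they reset their password
#   Then they can log in with the new password
# Names are illustrative, not Temper's API.

class FakeAuthService:
    """Minimal in-memory stand-in so the sketch is runnable."""

    def __init__(self):
        self.users = {}

    def register(self, email, password):
        self.users[email] = password

    def reset_password(self, email, new_password):
        if email not in self.users:
            raise KeyError("unknown email")
        self.users[email] = new_password

    def login(self, email, password):
        return self.users.get(email) == password


def test_successful_password_reset():
    # Given a user with a registered email
    auth = FakeAuthService()
    auth.register("user@example.com", "old-secret")
    # When they reset their password
    auth.reset_password("user@example.com", "new-secret")
    # Then they can log in with the new password (and not the old one)
    assert auth.login("user@example.com", "new-secret")
    assert not auth.login("user@example.com", "old-secret")


test_successful_password_reset()
```

Note that the test asserts the behavior the scenario promises (login succeeds with the new password), not implementation details like hashing internals; this is the distinction the "Does the code work?" row above is drawing.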
**Question:** Did we solve the problem?
**When:** Defined during /temper:plan, validated during /temper:review
IDD captures the why behind a feature. Not "add a password reset endpoint" but "users should be able to reset their password without contacting support, completing the flow in under 2 minutes."
The Intent section of intent.md contains:

- A problem statement
- Success criteria, each with a Validate: type that tells review how to check it
- Constraints

Each success criterion gets a validation type. This is what makes IDD mechanical instead of subjective:
| Type | What It Means | How Review Checks It | Example |
|---|---|---|---|
| scenario | Criterion is satisfied when a linked BDD scenario's test passes | Finds the test, runs it, checks PASS | "Users can reset password" -> linked to scenario "Successful password reset" |
| code | Criterion is satisfied when specific code exists | Greps the codebase for the pattern | "POST /api/reset endpoint exists" -> greps for route definition |
| metric | Can't be verified before deployment | Flags for post-deploy monitoring | "Support tickets decrease 30%" -> requires production data |
| manual | Requires human judgment | Flags for human review, non-blocking | "Reset flow feels intuitive" -> UX review needed |
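Putting the table together, the success criteria block of an intent.md might be written like this (the exact syntax is illustrative, not Temper's canonical format):

```markdown
## Success Criteria

- Users can reset their password without contacting support.
  Validate: scenario -> "Successful password reset"
- A POST /api/reset endpoint exists.
  Validate: code -> route definition for POST /api/reset
- Password-reset support tickets decrease 30%.
  Validate: metric -> post-deploy monitoring
- The reset flow feels intuitive.
  Validate: manual -> UX review
```

Under this reading, /temper:review can mechanically resolve the first two criteria (run the linked test, grep for the route) and flags the last two for monitoring and human judgment respectively.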