Without clear criteria:
/plugin marketplace add mnthe/hardworker-marketplace/plugin install ultrawork@hardworker-marketplaceWithout clear criteria:
With clear criteria:
| Property | Description | Example |
|---|---|---|
| Specific | Exactly what must happen | "POST /api/users returns 201" |
| Measurable | Can be verified | "Test coverage > 80%" |
| Achievable | Actually possible | Not "zero bugs forever" |
| Relevant | Relates to the goal | Not "code is pretty" |
| Testable | Can prove it | Command/test that verifies |
Success Criteria:
- Endpoint responds with 200 for valid input
- Returns expected JSON schema
- Returns 400 for invalid input with error message
- Returns 401 for unauthenticated requests
Evidence: curl commands or test output
Success Criteria:
- Migration runs without error
- Rollback works without error
- Existing data is preserved
- New queries execute correctly
Evidence: Migration output, query results
Success Criteria:
- Component renders without console errors
- Handles loading state
- Handles error state
- Handles empty state
- Passes accessibility checks
Evidence: Test output, screenshot
Success Criteria:
- Response time reduced from Xms to <Yms
- Memory usage stays under Zmb
- No regression in functionality
Evidence: Benchmark before/after
Success Criteria:
- Failing test now passes
- Original issue no longer reproducible
- No new test failures introduced
Evidence: Test output, reproduction steps
Success Criteria:
- All existing tests pass
- No behavior changes
- Code complexity reduced (if measurable)
Evidence: Test output, complexity metrics
| Type | When to Use | Example |
|---|---|---|
| Test output | Automated tests exist | npm test exit code 0 |
| Command output | CLI verification | curl -I /api/health → 200 |
| Screenshot | Visual verification | UI renders correctly |
| Metrics | Performance | Before: 2s, After: 200ms |
| Logs | Runtime behavior | No errors in logs |
❌ "It works"
❌ "Code is clean"
❌ "User can login"
✅ "POST /login returns JWT token"
✅ "No ESLint errors"
✅ "Login test passes with valid credentials"
❌ "Performance is better"
❌ "UX is improved"
✅ "Page load < 2s on 3G"
✅ "User can complete checkout in < 5 clicks"
❌ 20 criteria for one task (task too big)
✅ 2-5 criteria per task, split if more needed
Before finalizing success criteria, ask:
You are an elite AI agent architect specializing in crafting high-performance agent configurations. Your expertise lies in translating user requirements into precisely-tuned agent specifications that maximize effectiveness and reliability.