From oracle
> Generate a framework-appropriate test from a natural-language
npx claudepluginhub bri-stevenski/oracle-test-ai-agent --plugin oracle
Slash command
/oracle:oracle-generate-test
Generate a framework-appropriate test from a natural-language requirement. Routes through Oracle's classify → recommend → generate pipeline, writes the test under
tests/generated/, and optionally executes it.
If the requirement does not make the test type clear (unit | api | e2e | performance), ask one targeted question
before invoking the pipeline. The classifier will guess, but a wrong
guess costs a regeneration.
Invoke the orchestrator via CLI or programmatically:
python -m agent.cli generate "<requirement>"
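The CLI call above can also be assembled from a script. The following is a minimal sketch, not the project's own wrapper: `build_generate_command` is a hypothetical helper, and placing `--execute` after the requirement is an assumption about argument order.

```python
import sys


def build_generate_command(requirement: str, execute: bool = False) -> list[str]:
    """Build the argv for `python -m agent.cli generate "<requirement>"`.

    Hypothetical helper for illustration; it only assembles arguments.
    """
    if not requirement.strip():
        raise ValueError("requirement must not be empty")
    # Use the current interpreter so the call runs in the active environment.
    argv = [sys.executable, "-m", "agent.cli", "generate", requirement]
    if execute:
        # Assumption: the flag is accepted after the requirement.
        argv.append("--execute")
    return argv
```

Passing the result to `subprocess.run(argv, check=True)` from the repository root would then run the pipeline. Handing the requirement over as a single list element avoids shell-quoting problems with prompts that contain spaces or quotes.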
Read the printed classification + recommendation. Verify the
resolved test_type and framework match intent. If they don't,
refine the prompt and re-run — do not hand-edit the generated file
to compensate for a misclassification.
Locate the output. The CLI prints an absolute path under
tests/generated/<category>/. The orchestrator return dict's
output_path is authoritative.
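When calling the orchestrator programmatically, the two checks above (routing matches intent, output lands under tests/generated/) might be sketched as follows. The flat `test_type` key and the helper name are assumptions for illustration; only `output_path` is named in this document, so adapt the lookup to the actual shape of the result dict.

```python
from pathlib import PurePosixPath


def check_pipeline_result(result: dict, expected_test_type: str) -> PurePosixPath:
    """Return the generated file's path after checking the routing.

    Assumption: `result` exposes "test_type" and "output_path" at the top level.
    """
    resolved = result.get("test_type")
    if resolved != expected_test_type:
        # A mismatch means the prompt should be refined and the pipeline re-run.
        raise ValueError(
            f"classifier resolved {resolved!r}, expected {expected_test_type!r}"
        )
    output_path = PurePosixPath(result["output_path"])
    parts = output_path.parts
    # The path must contain the consecutive components tests/generated.
    under_generated = any(
        parts[i:i + 2] == ("tests", "generated") for i in range(len(parts) - 1)
    )
    if not under_generated:
        raise ValueError(f"{output_path} is not under tests/generated/")
    return output_path
```

Raising on a mismatch, rather than returning a flag, keeps a misrouted run from quietly proceeding to review.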
Execute the generated test (pass --execute to the generator). For api/e2e tests, run against a
known-good environment first.
Update docs/ORACLE_STATE.md so downstream sessions can
pick up context.
If the generated test passes review and belongs in the committed suite,
use the oracle-promote-test skill.
Promotion is its own workflow — don't collapse it into this one.
python -m agent.cli generate <prompt>: Primary entry. Runs the
full pipeline; supports --execute to immediately run the generated
test.
OracleOrchestrator.run(prompt, execute=False): Programmatic
entry. Returns the structured pipeline-result dict.
agent/frameworks/registry.json: Maps test_type → framework.
Edit here when adding a new framework; never hard-code framework
choices in callers.
agent/llm/factory.py: Provider selection. Override via
ORACLE_LLM_PROVIDER=<anthropic|gemini|openai|mock> env.
The resolved test_type matches the user's actual intent
The generated test has been executed (--execute or execute=True)

| Rationalization | Why It Is Wrong |
|---|---|
| "The classifier picked the wrong type but I'll hand-fix the output" | The hand-fix masks a real classifier gap. Refine the prompt or file a classifier issue — don't paper over routing bugs in the generated file. |
| "I'll commit the generated test as-is, it's good enough" | Generated tests live in tests/generated/ for a reason: they're unreviewed scratch. Promote intentionally. |
| "The registry doesn't have an entry for this test_type, I'll add framework: null" | Null breaks the contract. Every test_type must map to a framework. Add a real entry or change the classifier output. |
| "I'll skip the validation phase, the test looks right" | LLM output that looks right and LLM output that runs correctly are different things. Always execute (or dry-run) before promoting. |
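The registry contract in the table above (every test_type maps to a framework, never null) can be checked mechanically. This sketch assumes registry.json is a flat object mapping each test_type to a framework name; the real file's schema may differ, and `find_unmapped_test_types` is a hypothetical helper.

```python
import json


def find_unmapped_test_types(registry: dict) -> list[str]:
    """Return test types whose framework is null, empty, or not a string.

    Assumption: the registry is a flat {test_type: framework_name} mapping.
    """
    return sorted(
        test_type
        for test_type, framework in registry.items()
        if not isinstance(framework, str) or not framework.strip()
    )


# Illustrative data only; the "e2e": null entry is the violation being caught.
registry = json.loads(
    '{"api": "requests-pytest", "performance": "k6", "e2e": null}'
)
unmapped = find_unmapped_test_types(registry)
```

For the real file, load it with `json.load` from agent/frameworks/registry.json and treat a non-empty result as a reason to stop and fix the registry.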
Prompt: Test that POST /v1/orders returns 201 with a valid payload
Pipeline trace:
Classification: intent=generate_tests, test_type=api, confidence=0.85
Recommendation: framework=requests-pytest, ext=.py, category=api
Output: tests/generated/api/orders_post_201.py
Execution: returncode=0, 1 passed in 0.42s
Action: Review the generated file, promote to
tests/api/orders_post_201.py, drop the timestamped header.
Prompt: Test the new orders feature
Action: Do NOT invoke the pipeline yet. Ask: "Is this an end-to-end
UI test of the checkout flow, an API contract test for /v1/orders, or
a unit test of the order-validation function?" Only after the user picks
should you run generate.
Prompt: Load test /v1/search at 200 RPS for 5 minutes, p95 latency < 300ms
Pipeline trace:
Classification: test_type=performance, confidence=0.95
Recommendation: framework=k6, ext=.js, category=performance
Output: tests/generated/performance/search_load.js
Validation: Run against a staging environment, not prod. Compare p95 to the threshold; if the test passes locally but the threshold was unrealistic, surface that to the requester before promoting.
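To make the p95 comparison concrete, here is a small sketch that computes a nearest-rank percentile from recorded latencies and checks it against the 300 ms threshold. The function names and the choice of the nearest-rank method are assumptions; a load tool's own summary may compute percentiles differently and should take precedence when available.

```python
def percentile_nearest_rank(samples_ms: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in 1..100")
    ordered = sorted(samples_ms)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_p95_threshold(samples_ms: list[float], threshold_ms: float = 300.0) -> bool:
    """True when the p95 latency is strictly below the threshold."""
    return percentile_nearest_rank(samples_ms, 95) < threshold_ms
```

With 20 samples the p95 is the 19th-smallest value, so a single slow outlier does not fail the check but two do.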
If the registry has no framework for the test_type:
Stop and file a registry update. Do not invent a framework name.
Check ORACLE_LLM_PROVIDER before assuming a bug.
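A quick way to check the provider setting is to inspect the environment variable directly. The sketch below uses the four values listed for ORACLE_LLM_PROVIDER in this document; `describe_provider` is a hypothetical helper, and the remarks about a factory default and about what the mock provider returns are assumptions inferred from the names, not confirmed behavior.

```python
import os

# Values listed for ORACLE_LLM_PROVIDER in this document.
KNOWN_PROVIDERS = {"anthropic", "gemini", "openai", "mock"}


def describe_provider(env=None) -> str:
    """Summarize the provider setting so odd output can be triaged first."""
    env = os.environ if env is None else env
    value = env.get("ORACLE_LLM_PROVIDER")
    if value is None:
        # Assumption: agent/llm/factory.py falls back to some default.
        return "ORACLE_LLM_PROVIDER is unset; the factory default applies"
    if value not in KNOWN_PROVIDERS:
        return f"ORACLE_LLM_PROVIDER={value!r} is not one of {sorted(KNOWN_PROVIDERS)}"
    if value == "mock":
        # Assumption: the mock provider does not call a real model.
        return "ORACLE_LLM_PROVIDER=mock; output is not from a real model"
    return f"ORACLE_LLM_PROVIDER={value}"
```

Calling `describe_provider()` with no argument reads the live environment; passing a dict makes the check easy to exercise in isolation.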