From bopen-tools
Runs randomized, human-like exploratory testing on a running app to surface new bugs, broken flows, and UX issues, filing deduplicated tickets. Invoke for free roam, monkey testing, or autonomous discovery.
Slash command
/bopen-tools:free-roam-testing
Scripted tests check what you already thought to check. **Free roam finds what you didn't.** This skill drives the actual running app along randomized, human-like paths to surface new issues, then files them as deduplicated tickets that an execution loop works systematically. It is the **discovery / producer** half of the two-loop architecture in `Skill(bopen-tools:loop-engineering)`.
The value comes from unpredictability: a real user clicks the wrong thing, pastes an emoji into a number field, hits back mid-submit, opens two tabs, and abandons a checkout. Reproduce that texture and you find the bugs that scripted suites never touch.
Free roam is randomized, so it is only safe inside a known blast-radius boundary. Establish these before the first click:
loop-engineering/references/blast-radius.md). Ask the project for app-specific additions.
If you cannot confirm a boundary, ask. Do not roam against prod with an unknown blast radius.
read open tickets → pick an entry point → roam (randomized) →
observe & capture anomaly → dedup vs open tickets → file NEW ticket → repeat
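The dedup step in the loop above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: the `Anomaly` type, the `fingerprint` normalization rules, and the `file_new_tickets` helper are all hypothetical names, and a real session would read open tickets from whatever state backend the project uses.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Anomaly:
    """One observed issue. Field names are illustrative, not from the skill."""
    url: str
    summary: str


def fingerprint(url: str, summary: str) -> str:
    """Normalize a finding so trivially different reports collide."""
    # Drop the query string and trailing slash, and ignore case.
    path = url.split("?", 1)[0].rstrip("/").lower()
    # Collapse digit runs so /orders/123 and /orders/456 dedup together.
    path = re.sub(r"\d+", "N", path)
    text = re.sub(r"\s+", " ", summary.strip().lower())
    return f"{path}|{text}"


def file_new_tickets(open_fingerprints: set, observed: list) -> list:
    """Return only the anomalies not already covered by an open ticket."""
    seen = set(open_fingerprints)
    filed = []
    for anomaly in observed:
        key = fingerprint(anomaly.url, anomaly.summary)
        if key in seen:
            continue  # known issue: skip instead of refiling
        seen.add(key)  # also dedup within this pass
        filed.append(anomaly)
    return filed
```

The normalization is deliberately coarse. A fingerprint that is too strict refiles known issues every pass, while one that is too loose hides distinct bugs, so the rules would need tuning per app.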
loop-engineering/references/state-backends.md). You need them to dedup; refiling a known issue every pass is the #1 way discovery loops waste money.
Drive the real app with agent-browser / chrome-cdp / webapp-testing (or the CLI for a CLI app). Inject entropy deliberately, varying by run so you don't retread the same path:
<script>), SQL-ish strings, wrong types, leading/trailing whitespace, paste-bombs, boundary numbers (0, -1, MAX_INT).
There is no Math.random() to lean on. Generate variety from the persona plus an explicit "do something you haven't tried yet this session" instruction, and track visited paths in scratch state.
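Tracking visited paths in scratch state could look like the sketch below. It avoids a random number generator on purpose and simply walks the untried combinations in a fixed order. The payload strings are illustrative stand-ins for the categories named above, and `next_untried` is a hypothetical helper, not part of the skill.

```python
from itertools import product
from typing import Optional, Set, Tuple

# Illustrative payloads, one per category named in the text.
PAYLOADS = [
    "<script>alert(1)</script>",  # markup injection
    "' OR 1=1 --",                # SQL-ish string
    "  padded  ",                 # leading/trailing whitespace
    "0",
    "-1",
    str(2**31 - 1),               # stand-in for MAX_INT
]


def next_untried(
    entry_points: list, visited: Set[Tuple[str, str]]
) -> Optional[Tuple[str, str]]:
    """Return the first (entry point, payload) pair not yet tried.

    The pair is recorded in `visited` before returning, so repeated calls
    never retread a combination. Returns None once everything is exhausted.
    """
    # Payload is the outer loop, so consecutive picks change the entry point.
    for payload, entry in product(PAYLOADS, entry_points):
        pair = (entry, payload)
        if pair not in visited:
            visited.add(pair)
            return pair
    return None
```

Persisting `visited` between runs (for example as a JSON file in the scratch directory) would let a later session resume where the previous one stopped instead of starting over.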
For each issue, record enough for a cold-start agent to reproduce:
read_console_messages), failed network requests (read_network_requests), and the URL/state.
Skill(bopen-tools:linear-planning) for Linear.
discovery, severity).
Free roam has no natural "done", so bound it explicitly:
You are the producer. You do not fix what you find — you file it. The execution loop (Skill(superpowers:subagent-driven-development) / Skill(bopen-tools:wave-coordinator)) consumes the tickets and works them systematically with a verification gate. Keeping discovery and execution separate is the point: one surfaces breadth, the other resolves depth.
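Because the execution loop starts cold, the ticket is the entire handoff. A possible shape for that record is sketched below, carrying the evidence mentioned earlier (console messages, failed network requests, URL/state) plus discovery and severity labels. Every field name and default here is an assumption for illustration; the actual ticket format depends on the tracker in use.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Finding:
    """Hypothetical ticket payload for one discovered issue."""
    title: str
    url: str
    steps: list  # ordered reproduction steps, written for a cold-start agent
    console_messages: list = field(default_factory=list)
    failed_requests: list = field(default_factory=list)
    severity: str = "medium"  # assumed scale; use the project's own
    labels: list = field(default_factory=lambda: ["discovery"])

    def to_ticket_body(self) -> str:
        """Serialize the finding as JSON suitable for a ticket description."""
        return json.dumps(asdict(self), indent=2)
```

A finding would then be built at capture time and filed as-is:

```python
finding = Finding(
    title="Checkout total shows NaN after emoji in quantity field",
    url="/checkout",
    steps=["Open /checkout", "Paste an emoji into quantity", "Tab out of the field"],
    console_messages=["TypeError: quantity is not a number"],
)
print(finding.to_ticket_body())
```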
references/entropy-techniques.md: persona scripts, input-fuzz payloads, and path-variation tactics.
npx claudepluginhub b-open-io/claude-plugins --plugin bopen-tools