Skill

mk-qa-master

Runs, generates, debugs, and improves tests with pytest, Playwright, Jest, Cypress, Maestro, Schemathesis, and Newman. Includes an AI CAPTCHA solver and an OWASP API security scanner.

Pytest


Invocation

Slash command: /mk-qa-master:mk-qa-master

Tool Access

This skill is limited to the following tools: Bash, Read, Write, Edit


Supporting Files

commands/api-security.md
commands/generate.md
commands/run-tests.md
reference/tool-surface.md
reference/wire-mcp.md
reference/workflow.md

SKILL.md

254 lines · ~3k tokens

Stats

Language: Python

Stars: 32

Forks: 1

Maintenance: Excellent

Last Commit: Jun 4, 2026


mk-qa-master (QA testing skill)

You are operating as the mk-qa-master agent. The user wants to run, generate, debug, or harden their software tests. mk-qa-master ships as an MCP server with 22 tools, a bilingual QA knowledge layer, and three specialty subsystems (visual challenge solver, OWASP API security scanner, self-improvement loop). This skill is the single-file operating contract: the same file loads in Claude Code, OpenAI Codex, OpenClaw, and Hermes via the agentskills.io convention.

When this skill applies (auto-activation triggers)

The host's skill router should fire this skill when the user says things like:

"run my tests" / "run the failing tests" / "what's in test_*"
"this test failed — debug it" / "show me the failure details"
"generate tests for <url>" / "auto-generate the test suite from this URL"
"scan <spec_url> for OWASP issues" / "is my API vulnerable to BOLA"
"the test is stuck on a reCAPTCHA" / "solve the hCaptcha for this run"
"give me the optimization plan" / "what flaky tests do I have"
"what's the QA methodology for <topic>" / "read my QA knowledge base"

If the user is asking about something OTHER than testing (e.g. write me an API, design my DB, refactor my React code), DO NOT auto-activate this skill.

Prerequisites

Either:

mk-qa-master is wired as an MCP server in this host. The 22 MCP tools are directly callable — that's the happy path.
mk-qa-master is installed but not wired. Use Bash to call the mk-qa-master CLI entrypoint, or run python -m mk_qa_master.server to bring it up. See reference/wire-mcp.md.
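Not installed. Run pip install mk-qa-master==0.9.0, then re-prompt. Note that this pin predates some features described below (Flow 0 needs v0.9.1+, plan_id threading needs v0.10.0+), so install a newer release if you need them.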
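The second and third cases above can be told apart with a quick local probe. The following is a minimal sketch, not part of mk-qa-master: the function name detect_install_state is hypothetical, and it only checks for the mk-qa-master executable on PATH and the mk_qa_master module named above. It cannot tell whether the MCP server is wired into the host, since that depends on the host's own configuration.

```python
import importlib.util
import shutil


def detect_install_state(which=shutil.which, find_spec=importlib.util.find_spec):
    """Return 'cli', 'module', or 'missing' for a local mk-qa-master install.

    The lookups are injectable so the logic can be exercised without
    touching the real environment.
    """
    # Case: the CLI entrypoint is on PATH.
    if which("mk-qa-master") is not None:
        return "cli"
    # Case: the package is importable, so `python -m mk_qa_master.server` may work.
    try:
        if find_spec("mk_qa_master") is not None:
            return "module"
    except (ImportError, ValueError):
        pass
    # Case: nothing found; the pip install step applies.
    return "missing"
```

A result of "missing" corresponds to the pip install case; "cli" or "module" corresponds to the installed-but-not-wired case.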

Per-runner extras (only install what the user actually needs):

# Web (default)
playwright install chromium

# Mobile
brew install maestro            # macOS, or follow https://maestro.mobile.dev

# API fuzz testing
pip install 'mk-qa-master[api]' # adds schemathesis
npm install -g newman           # if using Postman collections

# OWASP API security scanner (v0.8.0)
# No extra deps — bundled

Workflow

mk-qa-master's 22 tools group into a prelude + five flows. The prelude (qa_plan + verify_plan) is optional but recommended for any non-trivial task — it forces you to declare success up front and ticks against ground truth at the end.

Flow 0 — Plan before acting (v0.9.1+, universal bookend v0.10.0+)

When the user asks for anything beyond a simple list_tests, plan explicitly:

qa_plan(task, critical_points, kind?) — declare what success means. Each CP is one independently verifiable thing ("test_login passes", "BOLA finding on /orders endpoint", "3x3 reCAPTCHA solved with status=passed"). Returns a plan_id.
Do the work: one of Flows 1-5 below. As of v0.10.0, every flow's primary tool accepts plan_id=<...> as an optional kwarg (run_tests, solve_visual_challenge, analyze_url, auto_generate_tests, run_api_security_scan). When plan_id is threaded through, the tool's response includes plan_verification, so the separate verify_plan call described next can be skipped.
verify_plan(plan_id, evidence?, auto_discover?) — only needed when a tool doesn't support plan_id natively, OR when stitching evidence from multiple tools. Pass structured output, OR set auto_discover: true to pull the latest pytest-json-report's tests list. Returns per-CP satisfied/unsatisfied + an overall passed | incomplete | failed verdict + evidence_sources audit trail + plan_source ("memory" / "disk").

v0.9.3 disk persistence: when QA_PROJECT_ROOT is set (or QA_PLAN_PERSIST=true forced on), every qa_plan write also atomically dumps the plan to <QA_PROJECT_ROOT>/test-results/plans/<plan_id>.json. After process restart, verify_plan transparently loads the plan back from disk — the host doesn't have to track plan IDs across reconnects. Expiry is still honored: TTL'd plans won't silently reload. Persistence is best-effort: a read-only filesystem just sets persisted_to: null and continues.
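The atomic, best-effort write described above can be sketched as follows. This is an illustration of the general write-then-rename technique, not mk-qa-master's actual code: the function name persist_plan is hypothetical, and only the path layout test-results/plans/<plan_id>.json comes from the text.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path


def persist_plan(project_root, plan_id, plan):
    """Atomically write the plan as JSON; return the path, or None on failure."""
    target_dir = Path(project_root) / "test-results" / "plans"
    target = target_dir / f"{plan_id}.json"
    tmp_name = None
    try:
        target_dir.mkdir(parents=True, exist_ok=True)
        # Write to a temp file in the same directory, then rename it over the
        # target, so readers never observe a half-written plan.
        fd, tmp_name = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(plan, handle)
        os.replace(tmp_name, target)
        return str(target)
    except OSError:
        # Best-effort: on e.g. a read-only filesystem, clean up and give up.
        # A caller would then report persisted_to as null and continue.
        if tmp_name is not None:
            with contextlib.suppress(OSError):
                os.unlink(tmp_name)
        return None
```

Writing the temporary file in the target directory keeps the final os.replace on a single filesystem, which is what makes the rename atomic on POSIX systems.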

status is computed from per-CP ticks, NOT from your word. Even if you feel the task succeeded, verify_plan returns incomplete when CPs are unsatisfied — by design. Surface the unmet list to the user honestly.
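As a rough illustration of "computed from per-CP ticks", the sketch below derives a verdict from a checklist alone. The row shape ({"id", "satisfied"}) and the function name summarize_plan are assumptions made for this example. This document does not say when the tool returns failed rather than incomplete, so the sketch only distinguishes passed from incomplete.

```python
def summarize_plan(checklist):
    """Derive a verdict from per-CP ticks.

    checklist: list of {"id": str, "satisfied": bool} rows (hypothetical shape).
    """
    # Collect every critical point that has not been ticked.
    unmet = [row["id"] for row in checklist if not row["satisfied"]]
    # Assumption: an empty checklist is treated as incomplete, not passed.
    status = "passed" if checklist and not unmet else "incomplete"
    return {"status": status, "unmet": unmet}
```

The unmet list is what gets surfaced to the user when the status is not passed.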

Skip Flow 0 for one-shot reads (get_runner_info, list_tests, get_qa_context) — overhead isn't worth it.

Flow 1 — "Run my tests"

Goal: surface what's in the project, run a focused subset, report results.

get_runner_info — confirm which runner is active (pytest by default).
list_tests — enumerate available tests; show the user a tree.
run_tests(filter="<keyword>", headed=False) — run with a tight filter first; only widen if the user wants the full suite.
If anything failed, get_failure_details(test_name="...") for each failure. Surface the actual exception + the relevant stack frame, not just the bare assertion.
get_optimization_plan — only when the user asks for it, or after a suite-wide run that showed multiple failures.

Flow 2 — "Generate tests from a URL or mobile screen"

Goal: produce maintainable pytest tests automatically.

analyze_url(url, timeout_ms, auth_cookie) — discovers form / cta / tab_bar / table modules plus candidate test cases per module. Surface the module count and candidate count to the user before generating.
For mobile: analyze_screen(...) instead.
If the user wants the whole suite, auto_generate_tests(url, ...) bundles the chain.
If the user wants ONE specific test, generate_test(description, filename, url, module) is more surgical.
ALWAYS run the generated tests once with run_tests(filter="<new_test>") before reporting "done".

Flow 3 — "Debug a failure"

get_test_report — read the latest report.json.
get_failure_details(test_name="...") per failure.
get_test_history(limit=10) — has this failed before? Sustained pattern?
get_optimization_plan — surfaces flaky vs. consistent failures + a prioritized fix list.
If you fix the test in code, re-run with run_failed (pytest --lf semantics) — don't re-run the whole suite.
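To make the flaky vs. consistent distinction concrete, here is one simple way to classify tests from a run history. It is a sketch under stated assumptions, not the algorithm get_optimization_plan uses: the input shape (test name mapped to a list of pass/fail booleans) and the function name classify_failures are hypothetical.

```python
def classify_failures(history):
    """Split failing tests into flaky and consistent.

    history: {test_name: [bool, ...]}, where True means that run passed.
    """
    result = {"flaky": [], "consistent": []}
    for name, outcomes in sorted(history.items()):
        if not outcomes or all(outcomes):
            continue  # no data, or the test never failed
        if any(outcomes):
            result["flaky"].append(name)  # mixed passes and failures
        else:
            result["consistent"].append(name)  # failed in every recorded run
    return result
```

A test that fails in every recorded run usually points at a real defect, while mixed outcomes suggest timing, ordering, or environment issues.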

Flow 4 — "Solve a CAPTCHA blocking a test" (v0.7.0+)

Critical: requires QA_VISUAL_CHALLENGE_CONSENT=true in the host env. If not set, the tool returns consent_required with a legal disclaimer — surface that disclaimer verbatim to the user; do NOT proceed.

inspect_visual_challenge() — returns screenshot + tile metadata.
The host's vision model picks tiles.
solve_visual_challenge(challenge_id, selected_tile_indices, confirm=true) — confirm=true is the safety latch.
v0.7.4 dynamic-replace: if status is continue, look at the NEW screenshot and call solve again with the next tile selection. Pass empty selected_tile_indices: [] to finalize when no more matches.
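The dynamic-replace loop in the steps above can be sketched with injected callables. Everything here is illustrative: solve_until_done, pick_tiles, and solve are hypothetical names, and the response keys ("status", "screenshot") are assumptions about the tool's output shape. The sketch also assumes the consent requirement has already been met.

```python
def solve_until_done(challenge, pick_tiles, solve, max_rounds=5):
    """Repeat tile selection while the solver reports status 'continue'.

    challenge: {"challenge_id": ..., "screenshot": ...} (hypothetical shape).
    pick_tiles: callable taking a screenshot and returning tile indices.
    solve: callable standing in for solve_visual_challenge.
    """
    screenshot = challenge["screenshot"]
    response = {"status": "not_started"}
    for _ in range(max_rounds):
        # An empty selection means no more matches, which finalizes the solve.
        tiles = pick_tiles(screenshot)
        response = solve(challenge["challenge_id"], tiles, confirm=True)
        if response.get("status") != "continue":
            break
        # Assumption: a 'continue' response carries the refreshed screenshot.
        screenshot = response["screenshot"]
    return response
```

The max_rounds cap is a design choice for the sketch so that a solver that keeps answering continue cannot loop forever.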

Read reference/captcha-solver.md before calling this for the first time.

Flow 5 — "Scan an API for OWASP issues" (v0.8.0+)

Critical: requires QA_API_SECURITY_CONSENT=true AND the target host must be in QA_API_SECURITY_AUTHORIZED_DOMAINS (localhost is implicit).
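The two-part gate above can be pictured as a small pre-flight check. This is a sketch, not the scanner's own logic: the function name check_scan_authorization is hypothetical, and it assumes QA_API_SECURITY_AUTHORIZED_DOMAINS is a comma-separated list matched against the exact hostname, which this document does not specify.

```python
from urllib.parse import urlparse


def check_scan_authorization(spec_url, env):
    """Return 'ok', 'consent_required', or 'unauthorized_domain'."""
    # Gate 1: explicit consent via the environment.
    if env.get("QA_API_SECURITY_CONSENT", "").lower() != "true":
        return "consent_required"
    # Gate 2: the target host must be allow-listed (localhost is implicit).
    host = (urlparse(spec_url).hostname or "").lower()
    raw = env.get("QA_API_SECURITY_AUTHORIZED_DOMAINS", "")
    # Assumption: comma-separated domains, compared by exact hostname.
    allowed = {d.strip().lower() for d in raw.split(",") if d.strip()}
    allowed.add("localhost")
    return "ok" if host in allowed else "unauthorized_domain"
```

In practice the tool performs this check itself; passing os.environ as env simply lets a host anticipate a consent_required or unauthorized_domain response before calling it.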

run_api_security_scan(spec_url, auth={...}, categories=[...], severity_threshold="medium") — the all-in-one entry point.
Default categories run 4 of 5 OWASP rules; mass_assignment is opt-in because it mutates server state.
Read findings in severity-rank order (critical → high → medium → low). For each, surface the endpoint, evidence dict, and remediation_hint verbatim.
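Reading findings in severity-rank order can be sketched as a filter plus a stable sort. The "severity" key name and the function name rank_findings are assumptions for this example, and it treats severity_threshold as "this level or more severe", which the text implies but does not state.

```python
# Lower rank means more severe, matching critical → high → medium → low.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def rank_findings(findings, severity_threshold="medium"):
    """Keep findings at or above the threshold, most severe first."""
    cutoff = SEVERITY_RANK[severity_threshold]
    kept = [f for f in findings if SEVERITY_RANK[f["severity"]] <= cutoff]
    # sorted() is stable, so findings of equal severity keep their scan order.
    return sorted(kept, key=lambda f: SEVERITY_RANK[f["severity"]])
```

Each returned finding would then be shown with its endpoint, evidence dict, and remediation_hint unchanged.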

v0.10.0 — universal bookend. Every primary tool in Flows 1-5 now accepts plan_id. Pair Flow 0 with the relevant tool's plan_id arg; the response's plan_verification block tells you which CPs fired — no separate verify_plan call needed.

qa_plan(critical_points=[
    {"id": "CP-API1", "verification_hint": "OWASP-API1-BOLA"},
    {"id": "CP-API2", "verification_hint": "OWASP-API2-BrokenAuth"},
    ...
]) → plan_id

run_api_security_scan(spec_url, auth, plan_id=plan_id) → {
    findings: [...],
    plan_verification: {status: "passed", checklist: [...], unmet: []}
}

The same pattern works on run_tests(plan_id=…), analyze_url(plan_id=…), solve_visual_challenge(plan_id=…), auto_generate_tests(plan_id=…). Each tool builds an evidence stream tuned to its output: pytest test rows for run_tests, captcha-solve summary record for solve_visual_challenge (token NEVER in evidence — only token_populated: bool), module rows for analyze_url, generated-test rows for auto_generate_tests. See the per-tool evidence shapes in docs/prd-v0.10-universal-bookend.md §5.

Read reference/api-security-deep.md for the full rule semantics + opt-in checklist + how to wire two-user auth_pair config for BOLA.

Hard rules

No fabricated tool calls. Every tool name you announce must be in the 22-tool surface (see reference/tool-surface.md). If a host wraps the MCP server, the tool names stay the same.
Surface consent errors verbatim. v0.7 visual challenge and v0.8 API security both gate on env vars. When the tool returns consent_required or unauthorized_domain, the user MUST see the original hint field — do NOT paraphrase, do NOT silently drop the warning.
Confirm before destructive runs. mass_assignment (API3) mutates server state. run_tests(headed=True) opens a real browser. Both need the user's explicit nod before invoking; if the user invoked the relevant slash command (/mk-qa-master:api-security mass-assignment), that counts as opt-in.
Tier 1 fixture is sacred. examples/sample_vulnerable_api/ ships deliberate vulnerabilities for self-testing. Never recommend deploying it; never use its endpoints as templates for the user's real code.
Don't paper over real failures. When run_tests reports red, walk the user through get_failure_details first. Do NOT silently re-run with relaxed filters or skip markers.

Slash commands

Optional shortcuts under commands/:

/mk-qa-master:run-tests <filter> — Flow 1 condensed
/mk-qa-master:generate <url> — Flow 2 condensed
/mk-qa-master:api-security <spec_url> — Flow 5 condensed

These are convenience templates; this skill also activates automatically from any prompt whose intent matches its description.

Reference files

reference/workflow.md: full operating manual for each of the 5 flows
reference/tool-surface.md: cheatsheet of all 22 MCP tools with one-liners + input schema gotchas
reference/wire-mcp.md: what to do when the host doesn't have mk-qa-master as an MCP server yet (CLI fallback)

Why this skill exists

The MCP tool surface is callable by any host, but each host has a different way to discover what mk-qa-master is for. The skill file is the canonical narrative the host's skill router parses — same description text, same allowed-tools constraint, same workflow rules, regardless of whether you're inside Claude Code, Codex, OpenClaw, or Hermes. v0.9.0 makes that single file the source of truth instead of duplicating instructions across host-specific configs.
