From cli-anything-web
Performs Phase 4 quality gate for cli-web Python CLIs: 3-agent implementation review, 75-check checklist, pip package publishing, and read/write smoke tests.
Slash command: /cli-anything-web:standards
Quality gate for cli-web-* CLIs. This skill owns the complete Phase 4: independent implementation review, structural quality checklist, publishing, and end-user smoke testing. Nothing ships until this phase passes.
Do NOT start unless:
- <APP>.md (API map) exists and documents all endpoints

If tests are not passing, invoke the testing skill first.
Not all checks apply to every CLI. When evaluating, consider the site profile:
- auth login takes a key argument, not playwright-cli.
- auth refresh is not applicable; use auth logout instead.

Mark inapplicable checks as "N/A — [reason]" rather than creating dead-code stubs.
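One way to decide which checks to mark as not applicable is to derive the reason from the site profile. The sketch below is illustrative only: the check names and the auth_type values ("none", "api_key", "browser") are assumptions, not names taken from the checklist.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SiteProfile:
    auth_type: str       # ASSUMPTION: one of "none", "api_key", "browser"
    is_read_only: bool


def na_reason(check: str, profile: SiteProfile) -> Optional[str]:
    """Return why a check does not apply, or None if it must be evaluated."""
    # Check names here are hypothetical labels for illustration.
    if check.startswith("auth ") and profile.auth_type == "none":
        return "site requires no authentication"
    if check == "auth refresh" and profile.auth_type == "api_key":
        return "API keys are not refreshed; use auth logout"
    if check == "write smoke test" and profile.is_read_only:
        return "site is read-only"
    return None


profile = SiteProfile(auth_type="api_key", is_read_only=False)
print(na_reason("auth refresh", profile))
print(na_reason("auth login", profile))
```

A reviewer would record the returned reason next to the check instead of adding a stub command that does nothing.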
Before checking structure or publishing, verify the code actually does the right thing. Tests prove it runs; this step proves it's correct.
Dispatch 3 plugin agents in the same message using the Agent tool:
- traffic-fidelity-reviewer: API coverage (reads <APP>.md + client.py + commands/)
- harness-compliance-reviewer: Code conventions (reads HARNESS.md + all source)
- output-ux-reviewer: User experience (runs --help, checks REPL, validates JSON)

Pass each agent: APP_PATH={app}/agent-harness, APP_NAME={app}, and site profile (auth_type, is_read_only). The agents are defined in the plugin's agents/ directory.
| Agent | Focus | What it reads | What it catches |
|---|---|---|---|
| Traffic Fidelity | API coverage | <APP>.md + client.py + commands/ | Missing endpoints, wrong params, broken response parsing, dead client methods, stale API map |
| HARNESS Compliance | Code quality | HARNESS.md + checklist + all source | click.ClickException bypass, missing to_dict(), retry_after lost, auth retry missing, stderr UTF-8 |
| Output & UX | User experience | --help output, --json output, REPL | Protocol leaks, stale REPL help, dead command files, broken entry points |
Each agent scores findings on a 0-100 confidence scale. When all 3 return:
Gate: Do not proceed to Step 2 until Critical count = 0.
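The gate can be expressed as a small check over the merged findings. This is a minimal sketch under stated assumptions: the Finding fields, the severity labels, and the min_confidence parameter are hypothetical and are not defined by the agents themselves.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Finding:
    agent: str        # which reviewer reported it
    severity: str     # ASSUMPTION: "critical", "important", or "minor"
    confidence: int   # 0-100, as scored by the reviewing agent
    message: str


def critical_findings(findings: Iterable[Finding], min_confidence: int = 0) -> List[Finding]:
    """Collect critical findings at or above the confidence threshold."""
    return [
        f for f in findings
        if f.severity == "critical" and f.confidence >= min_confidence
    ]


def gate_passes(findings: Iterable[Finding], min_confidence: int = 0) -> bool:
    """The gate passes only when the critical count is zero."""
    return len(critical_findings(findings, min_confidence)) == 0


findings = [
    Finding("traffic-fidelity-reviewer", "critical", 90, "endpoint in API map has no command"),
    Finding("output-ux-reviewer", "minor", 60, "stale REPL help text"),
]
print(gate_passes(findings))
```

With the default threshold of 0, every critical finding blocks Step 2 regardless of its confidence score.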
Run the 75-check quality checklist from references/quality-checklist.md.
This covers directory structure, required files, CLI patterns, core modules,
tests, documentation, packaging, code quality, REPL, error handling, and UX.
1. setup.py with:
   - find_namespace_packages for cli_web.*
   - console_scripts entry point: cli-web-<app>
   - click>=8.0, httpx
   - extras_require={"browser": ["playwright>=1.40.0"]}
2. pip install -e .
3. which cli-web-<app>
4. cli-web-<app> --help

This is the most critical verification step. The agent MUST simulate what a real end user would do after pip install cli-web-<app>. If this fails, the pipeline is NOT complete. Go back and fix the issue.
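The setup.py requirements listed above (namespace packages, console script, dependencies, browser extra) could be assembled roughly as follows. This is a sketch, not the project's actual setup.py: the version string and the entry-point target cli_web.<pkg>.cli:main are assumptions made for illustration.

```python
def build_setup_kwargs(app: str, version: str = "0.1.0") -> dict:
    """Build setuptools keyword arguments for a cli-web-<app> package."""
    pkg = app.replace("-", "_")  # Python package names cannot contain hyphens
    return {
        "name": f"cli-web-{app}",
        "version": version,  # ASSUMPTION: placeholder version
        "entry_points": {
            # ASSUMPTION: the module path and function name are hypothetical
            "console_scripts": [f"cli-web-{app}=cli_web.{pkg}.cli:main"],
        },
        "install_requires": ["click>=8.0", "httpx"],
        "extras_require": {"browser": ["playwright>=1.40.0"]},
        # Ships skills/SKILL.md and README.md inside the installed package
        "package_data": {"": ["skills/*.md", "*.md"]},
    }


# In setup.py itself, the kwargs would be combined with package discovery:
#   from setuptools import setup, find_namespace_packages
#   setup(packages=find_namespace_packages(include=["cli_web.*"]),
#         **build_setup_kwargs("<app>"))
```

Keeping the arguments in a plain function makes them easy to inspect before running pip install -e .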
If no-auth site: Skip steps 5-6 (auth). Go directly to step 7 (READ).
If read-only site: Skip step 8 (WRITE). Verify reads return real data.
5. Authenticate as an end user would:
cli-web-<app> auth login
This uses Python sync_playwright() -- opens a browser, user logs in, cookies saved. This is what end users will run. If this fails, the CLI is broken for end users.
6. Verify auth status shows LIVE VALIDATION OK:
cli-web-<app> auth status
Must show: cookies present, tokens valid. If it shows "expired", "redirect", or any auth failure -- STOP. Fix auth before proceeding.
7. Run a READ operation and verify real data:
cli-web-<app> --json <first-resource> list
This must return real data from the live API -- NOT an error, NOT empty, NOT "auth not configured". Verify the JSON response contains expected fields.
8. Run a WRITE operation and verify it actually worked:
This is the step the agent most commonly skips. Reading data is easy; the real test is whether the CLI can CREATE, UPDATE, or GENERATE something.
# For CRUD apps (Monday, Notion, Jira):
cli-web-<app> --json <resource> create --name "smoke-test-$(date +%s)"
cli-web-<app> --json <resource> list # verify the created item appears
cli-web-<app> --json <resource> delete --id <id-from-create>
# For generation apps (Suno, Midjourney, NotebookLM audio):
cli-web-<app> --json <resource> generate --prompt "test" --wait
# Verify: JSON response contains a real ID, status=complete, not an error
# If the command has --output, verify the file was downloaded and size > 0
# For search/query apps:
cli-web-<app> --json search "test query"
# Verify: results array is non-empty
If ANY write/generate command fails, the pipeline is NOT complete. Reading a list of existing items only proves auth works -- it does NOT prove the CLI can actually do useful work. The whole point is to CREATE things, not just read them.
9. Only after steps 5-8 ALL pass, declare the pipeline complete.
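For CRUD apps, the create, list, delete round trip from step 8 can be scripted. The sketch below assumes that create returns a JSON object with an "id" field and that list returns a JSON array of objects; neither shape is guaranteed by the source, and the app and resource names in the usage comment are placeholders.

```python
import json
import subprocess
import time


def run_cli(app: str, *args: str):
    """Run cli-web-<app> --json <args> and return the parsed JSON output."""
    proc = subprocess.run(
        [f"cli-web-{app}", "--json", *args],
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)


def crud_smoke(run, resource: str):
    """Create an item, confirm it is listed, then delete it. Returns the new id."""
    name = f"smoke-test-{int(time.time())}"
    created = run(resource, "create", "--name", name)
    item_id = created["id"]          # ASSUMPTION: create returns {"id": ...}
    listed = run(resource, "list")   # ASSUMPTION: list returns an array of objects
    if not any(item.get("id") == item_id for item in listed):
        raise AssertionError(f"created item {item_id} not found in list output")
    run(resource, "delete", "--id", str(item_id))
    return item_id


# Hypothetical usage against an installed CLI:
#   crud_smoke(lambda *a: run_cli("<app>", *a), "<resource>")
```

Passing the runner as a parameter keeps the round-trip logic testable without a live site.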
- auth login works (Python playwright, API key, or N/A for no-auth)
- auth status shows valid (or N/A for no-auth)
- --json output (see below)

Run every command with --json and check for raw protocol leaks (wrb.fr, af.httprm, empty [], null required fields). See methodology/SKILL.md "Mandatory Smoke Check" for the full red flags list.
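The red flags named above can be checked mechanically on each command's --json output. This is a minimal sketch covering only the four flags listed here, not the full list in methodology/SKILL.md; it assumes the CLI prints either a JSON array of objects or a single object, and the required field names are supplied by the caller.

```python
import json

PROTOCOL_MARKERS = ("wrb.fr", "af.httprm")


def find_red_flags(raw_output: str, required_fields=()):
    """Return a list of red-flag descriptions found in one --json output."""
    flags = []
    for marker in PROTOCOL_MARKERS:
        if marker in raw_output:
            flags.append(f"protocol leak: {marker}")
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        flags.append("output is not valid JSON")
        return flags
    if data == []:
        flags.append("empty list")
    # Treat a single object the same way as a one-element list
    records = data if isinstance(data, list) else [data]
    for i, record in enumerate(records):
        if not isinstance(record, dict):
            continue
        for field in required_fields:
            if record.get(field) is None:
                flags.append(f"record {i}: required field {field!r} is missing or null")
    return flags


print(find_red_flags('[{"id": "a1", "name": null}]', ["id", "name"]))
```

An empty result means none of these specific flags were found; it does not prove the data is correct.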
#1 gap to watch for: Agent runs list (GET with auth — easy), declares done, but
never tests create/generate (POST with CSRF, encoding). Always test at least one write.
After smoke tests pass, these tasks remain — all independent, dispatch in parallel:
┌─ Agent 1: Generate Claude Skill (.claude/skills/<app>-cli/SKILL.md)
│ ALSO copy to cli_web/<app>/skills/SKILL.md (package-portable)
├─ Agent 2: Update repository README.md (add CLI to examples table)
├─ Agent 3: Write/update cli_web/<app>/README.md (package docs)
├─ Agent 4: Update registry.json + CLAUDE.md Generated CLIs table
└─ Agent 5: Add CLI to CI test matrix (.github/workflows/tests.yml)
│ + Add entry to CHANGELOG.md under [Unreleased]
All are independent — launch in one message with run_in_background: true
Use the templates at cli-anything-web-plugin/templates/ as the canonical
structure for SKILL.md and README.md — fill in the {{placeholders}} with
actual CLI data from <app> --help and <APP>.md.
Goal: Create a project-local Claude skill so that Claude can use this CLI automatically in future conversations — no manual lookup required.
IMPORTANT: The skill must exist in TWO locations:
1. .claude/skills/<app>-cli/SKILL.md: for Claude Code discovery (project-level)
2. <app>/agent-harness/cli_web/<app>/skills/SKILL.md: portable with pip install (included via package_data in setup.py)

Create the skill once, then copy it to both locations.
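Copying the skill into both locations can be scripted so the two files never drift apart. The sketch below follows the two paths exactly as written above; the function name and the idea of starting from a separate draft file are assumptions.

```python
import shutil
from pathlib import Path


def install_skill(skill_source, git_root, app: str):
    """Copy one SKILL.md into the project-level and package-level locations."""
    source = Path(skill_source)
    root = Path(git_root)
    targets = [
        root / ".claude" / "skills" / f"{app}-cli" / "SKILL.md",
        root / app / "agent-harness" / "cli_web" / app / "skills" / "SKILL.md",
    ]
    for target in targets:
        target.parent.mkdir(parents=True, exist_ok=True)
        # Skip the copy when the source already is one of the targets
        if target.resolve() != source.resolve():
            shutil.copyfile(source, target)
    return targets
```

Running it again after editing the source file refreshes both copies.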
Create <git-root>/.claude/skills/<app>-cli/SKILL.md:
- cli-web-<app> --help + <resource> --help
- <app>-cli, description with specific trigger phrases ("whenever the user asks about X, Y, Z. Always prefer cli-web- over manually fetching the website.")
- --json
- (notebooklm-cli, futbin-cli) as reference examples

Add the new CLI to the examples table in README.md (CLI name, website, protocol, auth type, description) and add a quick-start example in the "Try Them" section.
Add the new CLI to registry.json at the repo root:
{
  "name": "cli-web-<app>",
  "website": "<website>",
  "protocol": "<detected protocol>",
  "auth": "<auth type>",
  "directory": "<app>/agent-harness",
  "namespace": "cli_web.<app>",
  "commands": ["<cmd1>", "<cmd2>", ...],
  "install": "pip install -e <app>/agent-harness"
}
Also add to the Generated CLIs table in CLAUDE.md.
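Adding the registry entry can be made idempotent so that re-running the pipeline does not create duplicates. This sketch assumes registry.json is a top-level JSON array of entry objects keyed by "name"; the real file layout may differ, so check it before using anything like this.

```python
import json
from pathlib import Path


def upsert_registry_entry(registry_path, entry: dict):
    """Insert or replace the entry whose "name" matches, then rewrite the file."""
    path = Path(registry_path)
    # ASSUMPTION: the registry is a JSON array; a missing file means an empty registry
    entries = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    entries = [e for e in entries if e.get("name") != entry["name"]]
    entries.append(entry)
    path.write_text(json.dumps(entries, indent=2) + "\n", encoding="utf-8")
    return entries
```

The same entry data can then be reused for the Generated CLIs table row in CLAUDE.md.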
The pipeline is NOT done until ALL of these are checked:
- .claude/skills/<app>-cli/SKILL.md exists (Claude Code discovery)
- cli_web/<app>/skills/SKILL.md exists (portable with pip install)
- cli-anything-web-plugin/templates/SKILL.md.template as starting point
- setup.py has package_data={"": ["skills/*.md", "*.md"]}
- __main__.py exists for python -m cli_web.<app> support
- cli_web/<app>/README.md exists (used templates/README.md.template)
- <APP>.md API map exists
- tests/TEST.md has Part 1 (plan) + Part 2 (results)
- README.md: new row in examples table + "Try them" section
- README.md: badge count updated (CLIs_generated-N and N_CLIs hero badge)
- CLAUDE.md: new row in Generated CLIs table
- registry.json: entry with name, website, protocol, auth, commands, install
- docs/registry/index.html: entry added to JS data array with correct category
- CHANGELOG.md: entry added under [Unreleased] → Added
- .github/workflows/tests.yml: new CLI added to CI test matrix (see below)

Every new CLI MUST be added to .github/workflows/tests.yml so unit tests run on every push/PR. Do both steps; missing either blocks merges.
Step 1: Add to test matrix in .github/workflows/tests.yml:
- { name: <app>, dir: <app>/agent-harness, pkg: <app_underscore> }
Where <app_underscore> replaces hyphens with underscores (e.g., gh-trending → gh_trending).
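The hyphen-to-underscore rule is simple enough to generate the matrix line directly. A small sketch (the function name is hypothetical):

```python
def matrix_entry(app: str) -> str:
    """Format the tests.yml matrix line for one CLI."""
    pkg = app.replace("-", "_")  # e.g. gh-trending -> gh_trending
    return f"- {{ name: {app}, dir: {app}/agent-harness, pkg: {pkg} }}"


print(matrix_entry("gh-trending"))
# - { name: gh-trending, dir: gh-trending/agent-harness, pkg: gh_trending }
```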
Step 2: Add to branch protection required checks so PRs require the new check:
# Get current checks, append the new one, update
gh api repos/<owner>/<repo>/branches/main/protection/required_status_checks \
-X PATCH --input - <<EOF
{"strict": true, "contexts": [...existing..., "<app>"]}
EOF
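The JSON body for the PATCH request can be built from the current list of required checks so that existing entries are preserved. This sketch only constructs the payload; fetching the current contexts and sending the request are left to the gh command above, and the function name is an assumption.

```python
import json


def required_checks_payload(existing_contexts, app: str, strict: bool = True) -> str:
    """Return the PATCH body with the new check appended exactly once."""
    contexts = list(existing_contexts)
    if app not in contexts:
        contexts.append(app)
    return json.dumps({"strict": strict, "contexts": contexts})


print(required_checks_payload(["notebooklm", "futbin"], "gh-trending"))
# {"strict": true, "contexts": ["notebooklm", "futbin", "gh-trending"]}
```

The output can be piped to the gh api call through --input - in place of the hand-written heredoc.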
Verify the entry runs: python -m pytest <dir>/cli_web/<pkg>/tests/test_core.py -v
All key rules (naming, auth, --json, REPL, rate limits) are defined in HARNESS.md "Critical Rules" and CLAUDE.md "Critical Conventions".
| Relationship | Skill |
|---|---|
| Preceded by | testing (Phase 3) |
| Followed by | None — this is the final phase |
| References | HARNESS.md (Generated CLI Structure, Naming Conventions) |
- testing skill -- Phase 3 test planning/writing/documentation
- methodology skill -- Phase 2 analyze/design/implement
- capture skill -- Phase 1 traffic recording
- /cli-anything-web:validate -- Command to run the full 75-check validation