From universal-scraping-architect
Route, extract, and validate a scraping job (URL or local file) via the universal-scraping-architect skill — refuses to deliver unvalidated data.
Slash command
/universal-scraping-architect:cs-scrape <url-or-file-path> [desired output: csv|json|markdown]
# /cs-scrape
Run a gated extraction pipeline for `$ARGUMENTS` using `skills/universal-scraping-architect/SKILL.md`.
## Pre-flight gates (stop if any fails)
1. **Target stated?** If `$ARGUMENTS` is empty, ask for the URL or file path plus the desired output format — do not guess.
2. **Live-site etiquette:** for URLs, check `robots.txt` and plan rate limits; refuse disallowed targets.
3. **Privacy:** if the target is a local/sensitive file, do not send it to an external API — force Mode 2 (local Python).
4. **Secrets:** Firecrawl key only via `os.getenv('FIRECRAWL_API_KEY')`; if a key appears inline anywhere, fix that first.
**Route:** state the mode and why (per the skill's routing rules): Mode 1 Firecrawl (public/JS-heavy URL, bulk crawl) · Mode 2 local Python (local files, private data, simple static HTML) · Mode 3 hybrid (Firecrawl extract + pandas clean).
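The gates and the routing step above can be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's own code: the function names, the boolean flags (`sensitive`, `js_heavy`, `bulk`, `needs_cleaning`), and the mode labels are hypothetical, and the real routing rules live in `SKILL.md`. Only the standard library is used, and `robots.txt` is passed in as text so the check runs offline.

```python
import os
from urllib import robotparser
from urllib.parse import urlparse


def robots_allows(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Etiquette gate: True if the given robots.txt text permits fetching url."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def firecrawl_key() -> str:
    """Secrets gate: read the key from the environment only, never from source."""
    key = os.getenv("FIRECRAWL_API_KEY")
    if not key:
        raise RuntimeError("FIRECRAWL_API_KEY is not set")
    return key


def choose_mode(target: str, *, sensitive: bool = False, js_heavy: bool = False,
                bulk: bool = False, needs_cleaning: bool = False) -> tuple[str, str]:
    """Return (mode, reason). All flags are hypothetical inputs for this sketch."""
    is_url = urlparse(target).scheme in ("http", "https")
    if not is_url or sensitive:
        # Privacy gate: local or sensitive data never goes to an external API.
        return "mode2-local-python", "local or private data stays on this machine"
    if needs_cleaning:
        return "mode3-hybrid", "Firecrawl extract followed by pandas cleaning"
    if js_heavy or bulk:
        return "mode1-firecrawl", "public URL that is JS-heavy or a bulk crawl"
    return "mode2-local-python", "simple static HTML"


robots = "User-agent: *\nDisallow: /private/\n"
print(robots_allows(robots, "https://example.com/products"))
print(choose_mode("https://example.com/app", js_heavy=True))
```

Returning the reason alongside the mode keeps the "state the mode and why" requirement visible in the job summary.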
**Budget:** estimate API quota / token limits before multi-page jobs; add checkpointing + pagination.
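One possible shape for checkpointed pagination is sketched below. It assumes a caller-supplied `fetch_page` function (hypothetical) that returns a list of rows for a page number and an empty list when the pages run out; `max_pages` stands in for the quota cap. The checkpoint file format is also an assumption made for this example.

```python
import json
from pathlib import Path
from typing import Callable


def crawl_with_checkpoint(fetch_page: Callable[[int], list[dict]],
                          checkpoint: Path, max_pages: int) -> list[dict]:
    """Fetch pages up to max_pages, saving progress after every page."""
    state = {"next_page": 1, "rows": []}
    if checkpoint.exists():
        # Resume a previous run instead of spending quota on pages already fetched.
        state = json.loads(checkpoint.read_text())
    while state["next_page"] <= max_pages:
        rows = fetch_page(state["next_page"])
        if not rows:  # an empty page ends pagination
            break
        state["rows"].extend(rows)
        state["next_page"] += 1
        checkpoint.write_text(json.dumps(state))
    return state["rows"]
```

Writing the checkpoint after each page means an interrupted job loses at most one page of work, at the cost of one small file write per page.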
**Extract:** start from the matching runner template (run from the plugin root; `--sample` previews the summary shape offline):
python3 skills/universal-scraping-architect/scripts/firecrawl_example.py --sample
python3 skills/universal-scraping-architect/scripts/local_bs4_example.py --sample
**Validate** (mandatory, exit-code gated):
python3 skills/universal-scraping-architect/scripts/validate_extraction.py extracted_output.json --json
Pass (`status: ok`) → continue.
Fail (`warning` = empty output, `error` = malformed JSON) → fix and re-extract; never deliver unvalidated data.
Then check required fields and duplicates against the job spec.
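The status meanings and the follow-up checks described above might look like the sketch below. It is not the logic of `validate_extraction.py`, whose internals are not shown here; `classify` and `check_rows` are hypothetical helpers, and treating `None` or an empty string as a missing value is an assumption.

```python
import json


def classify(raw: str) -> str:
    """Map raw extractor output to 'ok', 'warning', or 'error'."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return "error"    # malformed JSON
    if not data:
        return "warning"  # empty output
    return "ok"


def check_rows(rows: list[dict], required: list[str]) -> dict:
    """Count rows missing a required field and exact duplicate rows."""
    missing = sum(
        1 for row in rows
        if any(row.get(field) in (None, "") for field in required)
    )
    seen, duplicates = set(), 0
    for row in rows:
        key = json.dumps(row, sort_keys=True)  # order-independent row fingerprint
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {"missing_required": missing, "duplicates": duplicates}
```

A job would proceed to delivery only when `classify` returns `ok` and both counts from `check_rows` are acceptable under the job spec.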
**Deliver:** CSV (tabular) / JSON (nested) / Markdown (docs, chunked), per the user's requested format, with a summary of the mode chosen, row counts, empty values, and the validation verdict.
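As a rough illustration of the delivery step, the sketch below renders rows as CSV or JSON and builds the accompanying summary. The function names and the summary keys are assumptions made for this example, and Markdown chunking is left out for brevity.

```python
import csv
import io
import json


def render(rows: list[dict], fmt: str) -> str:
    """Render rows as 'json' or 'csv' text."""
    if fmt == "json":
        return json.dumps(rows, indent=2)
    if fmt == "csv":
        # Use the union of keys so rows with missing fields still fit the header.
        fields = sorted({key for row in rows for key in row})
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


def summarize(rows: list[dict], mode: str, verdict: str) -> dict:
    """Build the delivery summary: mode, row count, empty values, verdict."""
    empty = sum(1 for row in rows for value in row.values() if value in (None, ""))
    return {"mode": mode, "rows": len(rows), "empty_values": empty,
            "validation": verdict}
```

Keeping the summary as a plain dictionary makes it easy to print next to the delivered file or to attach to a JSON payload.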
npx claudepluginhub richyboy170/agentic-sdlc-internship --plugin universal-scraping-architect