Writes, fixes, or updates Python scripts using shub_workflow base classes for Scrapy Cloud operations: scheduling spiders, querying jobs, aggregating stats, or running as crawl managers/monitors.
Slash command: /shub-workflow-toolkit:shub-workflow-scripts
shub-workflow scripts subclass a base class in
shub_workflow/script.py
and get, for free: argument parsing (+ reusable -g/-v "programs"), project-id resolution, the
ScrapinghubClient, job scheduling with flow/name tagging, paginated+retrying job queries, a Scrapy
stats collector and an FSHelper. The deep reference is the wiki:
Appendix B: Script Classes.
A brand-new file isn't yet registered in the project's setup.py, so the repo alone can't tell you whether it is meant to run on Scrapy Cloud.
When asked to "create a script", confirm with the user whether it will deploy to or operate on
Scrapy Cloud (run as an SC job, schedule spiders/scripts, scan/query SC jobs, aggregate stats,
etc.). If yes → use these base classes. If it's just a local utility with no SC interaction, a plain
script is fine and this skill doesn't apply. When editing an existing file that already imports
shub_workflow.script, this skill applies directly.
| Base class | Use when | Template |
|---|---|---|
| BaseScript | one-shot: parse args, do work in run(), exit (the default) | examples/plain_script.py |
| BaseLoopScript | must repeat work on an interval / run continuously until stopped | examples/loop_script.py |
| BaseLoopScriptAsyncMixin (+ BaseLoopScript) | the loop cycle is asyncio-based (schedules/awaits many things at once) | examples/async_loop_script.py |
| ArgumentParserScript | only argparse + PROGRAMS, no SC access (rare; the base the others build on) | none |
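
The loop contract of BaseLoopScript (a workflow_loop() that returns a bool, plus optional on_start()/on_close()) can be pictured with a small stand-in driver. This is a sketch of the semantics only, not shub_workflow's implementation: StandInLoopScript, CountdownScript, and the sleep parameter are hypothetical, and treating loop_mode as a number of seconds between cycles is an assumption.

```python
import time


class StandInLoopScript:
    """Hypothetical stand-in for a loop-style base class (not shub_workflow code)."""

    # Assumption: seconds to wait between cycles; 0 means "run the body once".
    loop_mode = 0

    def __init__(self, sleep=time.sleep):
        self._sleep = sleep  # injectable so the sketch can run without waiting

    def on_start(self):
        pass

    def on_close(self):
        pass

    def workflow_loop(self) -> bool:
        raise NotImplementedError

    def run(self):
        self.on_start()
        while True:
            keep_going = self.workflow_loop()
            # Stop when the body asks to, or when looping is disabled.
            if not keep_going or not self.loop_mode:
                break
            self._sleep(self.loop_mode)
        self.on_close()


class CountdownScript(StandInLoopScript):
    loop_mode = 30

    def __init__(self, pending, **kwargs):
        super().__init__(**kwargs)
        self.pending = pending
        self.cycles = 0

    def workflow_loop(self) -> bool:
        self.cycles += 1
        self.pending -= 1
        return self.pending > 0  # False ends the loop immediately


script = CountdownScript(pending=3, sleep=lambda seconds: None)
script.run()
print(script.cycles)  # 3
```

With loop_mode left at 0, the same script would run its body exactly once even though workflow_loop() returned True.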
Projects usually add a shared base mixin (common CLI options/helpers) that every concrete script inherits — see examples/project_base_mixin.py. Check whether the project already has one and build on it rather than re-adding shared options.
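
As a rough illustration of how a shared mixin and the super() chain keep every layer's options alive, here is a minimal sketch built on plain argparse. StandInBaseScript, ProjectMixin, ExportScript, the --env flag, and the spider argument are all hypothetical; the stand-in only borrows the method names description, add_argparser_options(), and run() from the text. The stand-in mixin is a plain class, so the requirement that real mixins inherit the typing-only BaseScriptProtocol is not modelled here.

```python
import argparse


class StandInBaseScript:
    """Hypothetical stand-in for the framework base (not shub_workflow code)."""

    def __init__(self, argv=None):
        self.argparser = argparse.ArgumentParser(description=self.description)
        self.add_argparser_options()
        self.args = self.argparser.parse_args(argv)

    @property
    def description(self):
        return "stand-in script"

    def add_argparser_options(self):
        # Framework-level flag, named after the --project-id flag in the text.
        self.argparser.add_argument("--project-id", type=int)


class ProjectMixin:
    """Hypothetical shared mixin: options every script in the project needs."""

    def add_argparser_options(self):
        super().add_argparser_options()  # keep the framework flags
        self.argparser.add_argument("--env", default="staging")


class ExportScript(ProjectMixin, StandInBaseScript):
    """Concrete script: mixin first, concrete base last."""

    @property
    def description(self):
        return "Export items for one spider."

    def add_argparser_options(self):
        super().add_argparser_options()  # mixin options + framework flags
        self.argparser.add_argument("spider")

    def run(self):
        return f"{self.args.spider}@{self.args.env} in project {self.args.project_id}"


script = ExportScript(["--project-id", "123", "myspider"])
print(script.run())  # myspider@staging in project 123
```

If ExportScript.add_argparser_options() skipped the super() call, neither --env nor --project-id would be registered and parsing the arguments above would fail.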
1. Set description (property) and add your arguments in add_argparser_options(). Always call super() first so --project-id/-g/-v/etc. survive.
2. Implement run() for BaseScript; for loop scripts, implement workflow_loop() (returns bool; async def for the async mixin) plus optional on_start()/on_close().
3. Add the __main__ boilerplate (below).
4. Register the script in the project's setup.py. Deployment itself is handled by the scrapy-cloud-deployment skill, not here.

- Call super().add_argparser_options() before adding arguments, or you lose the framework flags (--project-id, --flow-id, -g/-v, loop flags).
- Mixin order is class X(ProjectMixin, BaseScript): mixin first, concrete base last. The mixin must inherit the typing-only BaseScriptProtocol (never BaseScript), so the implementation isn't duplicated in the MRO.
- A BaseLoopScriptAsyncMixin script's run() is a coroutine: launch it with asyncio.run(script.run()), and make workflow_loop an async def.
- self.project_id is the target (where you schedule/query, from --project-id); the script's own running project can differ. Don't hardcode ids; pass --project-id or set default_project_id.
- workflow_loop() returns bool: True keeps looping (with --loop-mode/loop_mode); False stops immediately. A loop with loop_mode = 0 runs its body once.
- Set project_required = False only for scripts that genuinely never touch SC.
- For the -g/-v PROGRAMS mechanism (predefined command-line shortcuts), see the scanjobs-programs skill; scanjobs.py is the canonical example of a heavily PROGRAMS-driven script.

The __main__ boilerplate:

    if __name__ == "__main__":
        import logging
        from shub_workflow.utils import get_kumo_loglevel
        logging.basicConfig(format="%(asctime)s %(name)s [%(levelname)s]: %(message)s", level=get_kumo_loglevel())
        script = MyScript()
        script.run()  # ... or asyncio.run(script.run()) for a BaseLoopScriptAsyncMixin script
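
The asyncio variant of the launch line can be illustrated with a stand-in whose run() is a coroutine. This is only a sketch of the calling pattern: StandInAsyncLoopScript, FanOutScript, and _check() are hypothetical, none of it is shub_workflow code, and loop_mode as seconds between cycles is an assumption.

```python
import asyncio


class StandInAsyncLoopScript:
    """Hypothetical stand-in for an asyncio-based loop script (not shub_workflow code)."""

    loop_mode = 0  # assumption: seconds between cycles; 0 runs the body once

    async def workflow_loop(self) -> bool:
        raise NotImplementedError

    async def run(self):
        while True:
            keep_going = await self.workflow_loop()
            if not keep_going or not self.loop_mode:
                break
            await asyncio.sleep(self.loop_mode)


class FanOutScript(StandInAsyncLoopScript):
    def __init__(self, names):
        self.names = names
        self.results = []

    async def _check(self, name):
        await asyncio.sleep(0)  # placeholder for real awaitable work
        return f"{name}: ok"

    async def workflow_loop(self) -> bool:
        # One cycle awaits many things at once; gather keeps input order.
        self.results = await asyncio.gather(*(self._check(n) for n in self.names))
        return False  # nothing left to do, stop looping


script = FanOutScript(["spider_a", "spider_b"])
asyncio.run(script.run())  # run() is a coroutine, so it needs an event loop
print(script.results)  # ['spider_a: ok', 'spider_b: ok']
```

Calling script.run() without asyncio.run() would only create a coroutine object; the loop body would never execute.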
npx claudepluginhub scrapinghub/shub-workflow --plugin shub-workflow-toolkit