Migrate Dagster assets, ops, graphs, jobs, and software-defined asset workflows to idiomatic ZenML pipelines. Handles concept mapping (asset->step output, job->pipeline, IOManager->artifact store/materializer + explicit IO steps), asset-boundary planning, code translation, scheduling, retry config, resources/config migration, and flags unsupported patterns (asset selection, partitions/backfills, sensors, declarative automation, freshness policies, observable source assets) for human review. Use this skill whenever the user mentions Dagster migration, converting Dagster assets or jobs, porting workflows from Dagster, replacing Dagster with ZenML, or asks how a Dagster concept maps to ZenML -- even if they do not explicitly say "migrate". Also use when they paste Dagster code and ask to make it work with ZenML, or when they describe a workflow using Dagster terminology (`@asset`, `@multi_asset`, `Definitions`, `IOManager`, `ConfigurableResource`, partitions, sensors, asset checks) in a ZenML context. If the user just asks a quick conceptual question ("what is the ZenML equivalent of an IOManager?" or "how should I think about Dagster assets in ZenML?"), answer it directly from the concept map -- no need to run the full migration workflow.
`npx claudepluginhub joshuarweaver/cascade-ai-ml-engineering --plugin zenml-io-skills`

This skill uses the workspace's default tool permissions.
This skill translates Dagster projects into idiomatic ZenML pipelines. It handles the full migration workflow: analyzing Dagster code, classifying each pattern, deciding where Dagster asset boundaries become ZenML pipeline boundaries, translating what maps cleanly, flagging what needs redesign, and producing a working ZenML project.
Dagster and ZenML are both orchestration systems, but they organize work around different primary objects.
That means a Dagster -> ZenML migration is not mainly a decorator rename. The hard part is semantic:
Think of it like moving a library into a workshop. In Dagster, the shelves themselves are the first-class object. In ZenML, the assembly line is the first-class object. The books may stay the same, but the floor plan changes.
Every Dagster concept falls into one of these categories:
| Type | Meaning | Action |
|---|---|---|
| Direct | Clean 1:1 mapping exists | Translate automatically |
| Approximate | Conceptual equivalent exists but semantics differ | Translate with caveats noted in migration report |
| Absent | No ZenML equivalent | Flag for human review with redesign suggestions |
See references/concept-map.md for the full mapping tables.
Ask the user for their Dagster codebase, or the relevant files if it is too large. Read the code thoroughly before doing anything else. Inventory the project in two passes:
Determine whether the codebase is primarily:
- Asset-centric: `@asset`, `@multi_asset`, `define_asset_job`, asset checks, partitions, and asset automation
- Op/job-centric: `@op`, `@graph`, `@job`

This matters because `@op` / `@job` code usually maps more cleanly to ZenML than asset-heavy code.
For each module, identify:
- `@asset`, `@multi_asset`, `@graph_asset`, `@graph_multi_asset`, `@op`, `@graph`, `@job`, `Definitions`
- `IOManager`, `ConfigurableIOManager`, `SourceAsset`, observable source assets, metadata, asset checks
- `Config`, `RunConfig`, `ConfigurableResource`, `EnvVar`
- `DynamicOut`, `DynamicOutput`, runtime fan-out, asset subsets, dynamic partitions

When the codebase uses asset-heavy features, open references/gaps-and-flags.md early. That file is the safety rail for migration.
For each component identified in Phase 1, classify it using the mapping type (direct / approximate / absent). Use the quick guide below and the full tables in references/concept-map.md.
Direct translations (translate automatically):
- `@op` -> `@step`
- `@job` -> `@pipeline`
- `RetryPolicy` -> `StepRetryConfig`

Approximate translations (translate with caveats):
- `@asset` -> step output artifact inside a pipeline
- `@multi_asset` -> multi-output step
- `@graph_asset` -> helper steps plus a terminal output artifact
- `ConfigurableResource` -> stack components + secrets + service connectors + step-local helper objects
- Dagster schedules -> `Schedule(...)` on supported orchestrators
- `IOManager` -> artifact store/materializer plus explicit source/sink steps
- `SourceAsset` -> `ExternalArtifact` or explicit source-loading step
- `DynamicOutput` fan-out -> dynamic pipeline or explicit redesign

Absent / needs redesign (flag for human review):

- `@multi_asset` subset semantics
- Asset selection / subset materialization
- Partitions, partition mappings, and backfills
- Sensors and declarative automation
- Freshness policies
- Observable source assets

Before writing any code, make an explicit decision about the migration shape. This is the single most important Dagster-specific step.
Choose one of these:
- Single ZenML pipeline
- Multiple ZenML pipelines in one migrated project
- Partial migration + flagged redesign: translate what maps cleanly, mark unsupported patterns with `# TODO(migration)` markers, and make the redesign requirements explicit.

Present this decision clearly to the user before generating code:
"Here is the migration shape I recommend:
- Pipeline boundary decision: [single pipeline / multiple pipelines / partial migration]
- Why: [concrete explanation tied to the Dagster code]
- Direct translations: [list]
- Approximate translations: [list]
- Needs redesign: [list with brief explanation]
Shall I proceed with this migration plan?"
If there are HIGH-severity flags, explain each one concretely: what the Dagster code does, why ZenML cannot replicate it directly, and what redesign would preserve the intent most honestly.
Translate the Dagster project into a ZenML project. Follow these conventions strictly.
Every migrated project MUST use this layout:
migrated_pipeline/
├── steps/ # One file per step
├── pipelines/
│ ├── __init__.py
│ ├── main_pipeline.py
│ └── extra_pipeline.py # If the Dagster project becomes multiple pipelines
├── materializers/ # Custom materializers (if needed)
├── configs/
│ ├── dev.yaml
│ └── prod.yaml
├── run.py # CLI entry point (argparse, not click)
├── README.md
└── pyproject.toml
This matches the zenml-pipeline-authoring skill's conventions. Key rules:
- One file per step in `steps/`
- `run.py` uses argparse
- `pyproject.toml` uses `zenml>=0.94.1` and `requires-python = ">=3.12"`
- `zenml init` at the project root
- `configs/dev.yaml` and `configs/prod.yaml`
- `README.md` explaining the migrated pipeline(s), how to run them, and what still needs manual attention
- `# Migration note:` comments for semantic differences
- `# TODO(migration):` comments only where genuine redesign work remains

Unlike Airflow- or Databricks-style migrations, a Dagster migration may honestly need multiple ZenML pipelines. Do not force a single pipeline just for symmetry.
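To make the conventions concrete, here is a minimal sketch of a `run.py` entry point. The pipeline names, config paths, and parameters are hypothetical placeholders, not taken from any real Dagster project; the behavior rules below determine which flags a given migration actually needs.

```python
# Hypothetical run.py sketch -- pipeline names, config paths, and parameters
# are placeholders, not taken from any real Dagster project.
import argparse

from pipelines.main_pipeline import main_pipeline
from pipelines.extra_pipeline import extra_pipeline

PIPELINES = {"main": main_pipeline, "extra": extra_pipeline}


def main() -> None:
    parser = argparse.ArgumentParser(description="Run a migrated ZenML pipeline.")
    parser.add_argument("--pipeline", choices=sorted(PIPELINES), default="main")
    parser.add_argument("--config", default="configs/dev.yaml")
    parser.add_argument(
        "--partition-key",
        default=None,
        help="Former Dagster partition key, passed through as a plain parameter.",
    )
    args = parser.parse_args()

    # with_options(config_path=...) applies the YAML run configuration;
    # the partition key is forwarded only because this hypothetical pipeline
    # declares it as a parameter.
    pipeline_fn = PIPELINES[args.pipeline].with_options(config_path=args.config)
    pipeline_fn(partition_key=args.partition_key)


if __name__ == "__main__":
    main()
```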
`run.py` behavior:

- If the migration produces a single pipeline, `run.py` may run that pipeline by default.
- If the migration produces multiple pipelines, `run.py` should expose a `--pipeline` argument so the user can choose which pipeline entry point to run.
- If the original project used partitions or date windows, `run.py` should also expose the relevant parameters (`--partition-key`, `--start-date`, `--end-date`, etc.).

Move the compute body into a `@step` function, type-hint the inputs and outputs, and wire steps through function calls in a `@pipeline`.
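For example, a pair of Dagster assets might translate along these lines. This is a sketch only: the `load_orders` / `clean_orders` names and the pandas payload are illustrative, not taken from a real project.

```python
import pandas as pd
from zenml import pipeline, step

# Dagster (before):
#
#   @asset
#   def orders() -> pd.DataFrame: ...
#
#   @asset
#   def cleaned_orders(orders: pd.DataFrame) -> pd.DataFrame: ...


@step
def load_orders() -> pd.DataFrame:
    # Former `orders` asset: the compute body moves here and the return value
    # becomes a versioned output artifact instead of a materialized asset.
    return pd.DataFrame({"order_id": [1, 2, 3], "amount": [10.0, 20.0, 15.0]})


@step
def clean_orders(orders: pd.DataFrame) -> pd.DataFrame:
    # Former `cleaned_orders` asset: the upstream asset dependency becomes a
    # type-hinted step input, wired explicitly in the pipeline below.
    return orders.dropna()


@pipeline
def orders_pipeline():
    orders = load_orders()
    clean_orders(orders)
```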
See references/code-patterns.md for side-by-side examples covering:
- `@multi_asset`, `@graph_asset`, and dynamic fan-out

When translating approximate patterns, add brief inline comments in the generated code explaining the semantic difference:
import pandas as pd
from zenml import step


@step
def load_orders() -> pd.DataFrame:
    # Migration note: the original Dagster asset could be materialized
    # independently inside the asset graph. In ZenML this data is produced
    # as part of a pipeline run and persisted as an artifact.
    ...
For patterns that have no ZenML equivalent, do NOT silently approximate them. Instead:
- Add a `# TODO(migration)` comment in the generated code
- Call the pattern out in the migration report's "Flagged for Review" table with a suggested redesign

Example:
# TODO(migration): UNSUPPORTED -- Dagster asset selection / subset materialization
# was part of the original workflow here. ZenML does not support first-class
# asset selection semantics. Recommended redesign: split the asset group into
# separate pipelines and load shared upstream results via ExternalArtifact or
# an explicit source step.
After generating the ZenML project, produce a MIGRATION_REPORT.md in the project root:
# Migration Report: [Dagster Project] -> [ZenML Project]
## Summary
- **Source**: Dagster project `[name]`
- **Target**: ZenML project `[name]`
- **Project style**: asset-centric / op-job-centric / mixed
- **Components migrated**: X direct, Y approximate, Z flagged
## Pipeline Boundary Decisions
| Dagster run unit / asset slice | ZenML pipeline | Why split or combine |
|---|---|---|
| daily_orders assets | pipelines/orders_daily.py | Dagster users materialized this slice independently |
## Direct Translations
| Dagster Component | ZenML Component | Notes |
|---|---|---|
| `train_model` op | `steps/train_model.py` | Clean op -> step translation |
## Approximate Translations
| Dagster Component | ZenML Component | What Changed |
|---|---|---|
| `cleaned_orders` asset | `steps/clean_orders.py` | Asset became a step output artifact inside a pipeline |
| warehouse IOManager | `steps/load_orders.py` + artifact store | Business logic moved from IO manager into explicit step |
## Flagged for Review
| Dagster Pattern | Severity | Issue | Suggested Redesign |
|---|---|---|---|
| Asset selection | HIGH | No first-class subset materialization in ZenML | Split into multiple pipelines |
| Daily partitions + partition mappings | HIGH | No native partition engine | Explicit partition-key params + external backfill driver |
| Sensor cursor | HIGH | No sensor/cursor API | External event trigger service |
## IO / Storage Migration
[Summarize what was preserved and what moved out of IO managers]
## Partition / Backfill Strategy
[Explain how partition keys and backfills are handled after migration]
## Automation and Scheduling Gaps
[Explain schedules, sensors, freshness, declarative automation, and what changed]
## What's NOT Migrated
[List the Dagster semantics or platform features left outside the migrated code]
## What You Get for Free After Migration
- **Artifact versioning and lineage**
- **Step caching**
- **Stack abstraction**
- **Service connectors**
- **Model Control Plane** (if relevant)
## Recommended Next Steps
1. Run the `zenml-quick-wins` skill for metadata logging, experiment tracking, and alerters
2. Install the ZenML docs MCP server: `claude mcp add zenmldocs --transport http https://docs.zenml.io/~gitbook/mcp`
3. Follow the docs links for flagged patterns
4. Use `zenml-pipeline-authoring` for deeper customization
After migration is complete, always include a "Recommended Next Steps" section in the migration report AND communicate it to the user.
Always suggest the `zenml-quick-wins` skill as the immediate next step:

"Now that the migration is done, I recommend running the `zenml-quick-wins` skill to add metadata logging, experiment tracking, alerts, and other production improvements."
For every flagged pattern, include relevant ZenML documentation links. Prefer stable, high-level docs areas when the exact implementation path depends on the user's stack:
- https://docs.zenml.io/user-guides/starter-guide/manage-artifacts
- https://docs.zenml.io/concepts/steps_and_pipelines/dynamic_pipelines
- https://docs.zenml.io/concepts/steps_and_pipelines/scheduling
- https://docs.zenml.io/concepts/deployment
- https://docs.zenml.io/stacks/orchestrators
- https://docs.zenml.io/stacks/service-connectors
- https://docs.zenml.io/user-guides/best-practices

"For easier access to ZenML documentation while you work, you can install the ZenML docs MCP server: `claude mcp add zenmldocs --transport http https://docs.zenml.io/~gitbook/mcp`"
When the migration has 2+ HIGH-severity flags, generate a pre-made Slack message for zenml.io/slack. Include:
When the migration reveals a genuine missing feature in ZenML -- not just "this works differently", but a real capability gap that multiple users would benefit from -- offer to open a GitHub issue on zenml-io/zenml.
After migration is complete, always suggest running `/simplify` on the generated code to reduce migration noise, consolidate repetitive helper code, and make the result feel more like production code.
Use the `zenml-pipeline-authoring` skill for deeper customization of the migrated pipelines.
These are the most common sources of confusion after migration. Always mention the relevant ones in the migration report.
A Dagster asset is a named data product with graph semantics around materialization, selection, partitions, and checks. A ZenML step is a unit of compute. The closest migration shape is usually a `@step` whose output artifact stands in for the asset, grouped into a pipeline along the asset graph's real operational boundaries.
@stepIf the original Dagster IO manager says, in effect, "when someone asks for this asset, go load table X from warehouse Y", then the real story is not serialization. The real story is data access logic. That logic usually belongs in a ZenML source/sink step, not only in a materializer.
Passing `partition_key="2026-04-07"` into a ZenML pipeline preserves the label. It does not automatically preserve partition mappings, backfills, freshness, or asset-reconciliation rules. Those must be rebuilt explicitly.
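As a sketch of what "explicit partition-key params plus an external backfill driver" can look like in practice, assuming a migrated pipeline that accepts a `partition_key` parameter (the pipeline name and date range are placeholders):

```python
from datetime import date, timedelta

# Hypothetical migrated pipeline that accepts an explicit partition_key parameter.
from pipelines.orders_daily import orders_daily_pipeline


def backfill(start: date, end: date) -> None:
    # What Dagster's partition/backfill engine did declaratively is now an
    # explicit driver loop: one pipeline run per partition key.
    day = start
    while day <= end:
        orders_daily_pipeline(partition_key=day.isoformat())
        day += timedelta(days=1)


if __name__ == "__main__":
    backfill(date(2026, 4, 1), date(2026, 4, 7))
```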
A Dagster sensor is usually better reimagined as an external trigger or polling service. Otherwise you risk turning a lightweight orchestration rule into an expensive long-running container.
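For instance, a file-arrival sensor might become a small script invoked by cron or a webhook handler; the watched directory, pipeline name, and `input_paths` parameter below are assumptions for illustration only:

```python
from pathlib import Path

# Hypothetical migrated pipeline that accepts the new file paths as a parameter.
from pipelines.orders_daily import orders_daily_pipeline

LANDING_DIR = Path("/data/incoming")  # placeholder for whatever the sensor watched


def check_and_trigger() -> None:
    # Runs once per external invocation (cron, webhook, event bus) rather than
    # as a long-running polling step inside the orchestrator.
    new_files = sorted(LANDING_DIR.glob("*.csv"))
    if not new_files:
        return
    orders_daily_pipeline(input_paths=[str(p) for p in new_files])
    # The Dagster sensor's cursor becomes whatever state this script keeps;
    # here, processed files are simply renamed out of the way.
    for p in new_files:
        p.rename(p.with_name(p.name + ".done"))


if __name__ == "__main__":
    check_and_trigger()
```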
| Anti-pattern | Why it is wrong | What to do instead |
|---|---|---|
| Treating every asset as its own pipeline | Destroys meaningful execution grouping | Group assets by real operational boundary |
| Forcing the entire asset graph into one pipeline | Hides the loss of subset materialization semantics | Split into multiple pipelines when needed |
| Translating every IOManager into a materializer | Loses business/data-access behavior | Separate serialization from explicit source/sink logic |
| Replacing sensors with infinite polling steps | Burns compute and changes operational behavior | Use external triggers or bounded polling logic |
| Collapsing partition logic into a single untyped string without documenting the loss | Drops critical orchestration semantics | Preserve partition parameters explicitly and document gaps |
| Treating asset checks as comments instead of executable validation | Loses enforcement | Create validation steps and log metadata (see the sketch below) |
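For the last row, here is a minimal sketch of an asset check recreated as a validation step, assuming a recent ZenML version where `log_metadata` is available; the column name and rule are invented for illustration:

```python
import pandas as pd
from zenml import log_metadata, step


@step
def validate_orders(orders: pd.DataFrame) -> pd.DataFrame:
    # Former Dagster asset check, now an executable validation step that fails
    # the run on violation and records the outcome as step metadata.
    null_ids = int(orders["order_id"].isna().sum())
    if null_ids > 0:
        raise ValueError(f"order_id contains {null_ids} null values")
    log_metadata(metadata={"null_order_ids": null_ids, "row_count": len(orders)})
    return orders
```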
- https://docs.dagster.io/
- https://docs.zenml.io/