From zenml-io-skills
Migrate Vertex AI Pipelines (Kubeflow Pipelines v2 / PipelineJob workflows) to idiomatic ZenML pipelines. Handles concept mapping (`@dsl.pipeline` -> `@pipeline`, `@dsl.component` -> `@step`, `PipelineJob.create_schedule(...)` -> `Schedule(...)`), artifact-contract translation (`Input[Dataset]`, `InputPath`, `.uri`, `.path`), Google Cloud Pipeline Components (GCPC) rewrites, dynamic control flow (`dsl.If`, `dsl.ParallelFor`, `dsl.Collected`), resource/config migration, and flags unsupported patterns (compiled template workflows, `dsl.ExitHandler`, path-coupled artifacts, schedule lifecycle parity) for human review. Use this skill whenever the user mentions Vertex AI Pipelines migration, KFP v2 to ZenML, PipelineJob migration, GCPC migration, or asks how a Vertex/KFP concept maps to ZenML — even if they do not explicitly say "migrate". Also use when they paste KFP DSL code, compiled pipeline YAML/JSON, Vertex submission code, or describe a workflow using Vertex/KFP terminology (`dsl.component`, `dsl.pipeline`, `dsl.If`, `dsl.ParallelFor`, `PipelineJob`, GCPC) in a ZenML context. If the user just asks a quick conceptual question ("what is the ZenML equivalent of `dsl.importer`?"), answer it directly from the concept map — no need to run the full migration workflow.
```
npx claudepluginhub joshuarweaver/cascade-ai-ml-engineering --plugin zenml-io-skills
```

This skill uses the workspace's default tool permissions.
This skill translates **Vertex AI Pipelines / KFP v2 / PipelineJob workflows** into idiomatic ZenML pipelines.
The shape looks familiar at first: both systems use decorated Python functions, typed inputs and outputs, and DAG-style orchestration. But under the hood the artifact model, the compile-and-submit lifecycle, the control-flow mechanics, and the scheduling surface all differ.
That means this migration is not a search-and-replace job. Some patterns map directly, some map approximately, and some must be treated as redesign boundaries.
There are two practical strategies:

1. A full rewrite into idiomatic ZenML steps and pipelines
2. Wrapping the existing `PipelineJob` submission inside a ZenML step as a temporary black-box migration

For most teams, the first is the safest migration story. The skill should optimize for the first strategy and document the second honestly as a partial migration escape hatch.
| Type | Meaning | Action |
|---|---|---|
| Direct | Clean 1:1 mapping exists | Translate automatically |
| Approximate | Similar intent, different semantics | Translate with caveats and record the difference |
| Absent | No ZenML equivalent | Flag for human review and suggest a redesign |
See `references/concept-map.md` for the full mapping tables.
Ask the user for the actual source artifacts before rewriting anything. In practice that may be KFP DSL source files, compiled pipeline YAML/JSON templates, or Vertex submission code such as `PipelineJob(...)`.

Read all of it carefully and identify:

- `@dsl.pipeline` definitions, subpipelines, graph helpers, imported components
- `@dsl.component`, `@dsl.container_component`, importer components, GCPC operators
- `Input[T]`, `Output[T]`, `InputPath`, `OutputPath`, `.path`, `.uri`, metadata helpers
- `dsl.If`, `dsl.Elif`, `dsl.Else`, `dsl.Condition`, `dsl.ParallelFor`, `dsl.Collected`, `dsl.ExitHandler`
- `compiler.Compiler().compile()`, template registries, `template_path=...`, schedule creation from compiled templates
- `PipelineJob.create_schedule(...)`, cron strings, timezones, start/end windows, concurrency knobs

Classify each concept as direct / approximate / absent using `references/concept-map.md` and `references/gaps-and-flags.md`.
Direct translations (translate automatically):
- `@dsl.pipeline` -> `@pipeline`
- `@dsl.component` -> `@step` when the contract is value-centric

Approximate translations (translate with caveats):
- `InputPath` / `OutputPath` -> explicit file handling or a location-aware artifact contract
- `dsl.If` on upstream outputs -> `@pipeline(dynamic=True)` + `.load()`
- `dsl.ParallelFor` -> `.map()` in a dynamic pipeline
- `dsl.Collected` -> reducer step over mapped outputs
- `dsl.importer` -> `ExternalArtifact(...)` or explicit artifact lookup
- `PipelineJob.create_schedule(...)` -> `Schedule(...)`
- per-step resource requests -> `ResourceSettings(...)` plus Vertex-specific orchestrator settings

Absent / must-flag patterns (never silently approximate):
- `dsl.ExitHandler`
- compiled-template workflows and template registries
- path-coupled artifact contracts
- full schedule lifecycle parity

Before writing code, summarize the findings for the user:
"Here is what I found in your Vertex / KFP pipeline:
- Direct translations: [list]
- Approximate translations: [list]
- Needs redesign: [list]
The main risk areas are [artifact contract / GCPC / control flow / template lifecycle]. Shall I proceed with the migration?"
If there are high-risk flags, explain them concretely in story form: what the original KFP code does, why ZenML cannot preserve that behavior 1:1, and what the least-bad redesign looks like.
Translate the workflow into a ZenML project. Follow these conventions strictly.
Every migrated project MUST use this layout:
```
migrated_pipeline/
├── steps/            # One file per step
│   ├── extract.py
│   ├── train.py
│   └── deploy.py
├── pipelines/
│   └── my_pipeline.py
├── materializers/    # Custom materializers when path/URI semantics matter
├── configs/
│   ├── dev.yaml
│   └── prod.yaml
├── run.py            # CLI entry point (argparse, not click)
├── README.md
└── pyproject.toml
```
Key rules:
- One file per step under `steps/`
- `run.py` uses argparse
- `pyproject.toml` uses `zenml>=0.94.1` and `requires-python = ">=3.12"`
- Ship both `configs/dev.yaml` and `configs/prod.yaml`
- Include a `README.md`
- Run `zenml init` at the project root

1. Artifact-contract-first translation
Do not start by translating decorators. Start by asking what the step really exchanges.
- If `.path` / `.uri` is only a serialization boundary, convert to a normal Python type and let ZenML materialize it.
- If the component is location-aware (the URI itself is the contract), preserve that contract explicitly and flag it.

2. GCPC rewrite rule
Never pretend ZenML has a matching GCPC operator. Rewrite each GCPC node as a plain step that calls the relevant Google Cloud SDK or API directly, and record the replacement in the migration report.
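As a sketch of what "rewrite as an SDK-calling step" looks like, the example below replaces a hypothetical GCPC model-upload node with a plain function that would call `google.cloud.aiplatform` directly. The ZenML and SDK imports are commented out, and every name here is an assumption for illustration, not a drop-in replacement:

```python
# Hypothetical rewrite of a GCPC ModelUploadOp node as a plain step.
# from zenml import step
# from google.cloud import aiplatform


def build_upload_kwargs(display_name: str, artifact_uri: str, image: str) -> dict:
    # Make the former GCPC node's implicit contract explicit as keyword
    # arguments (these mirror aiplatform.Model.upload; adjust to your case).
    return {
        "display_name": display_name,
        "artifact_uri": artifact_uri,
        "serving_container_image_uri": image,
    }


# @step
# def upload_model(display_name: str, artifact_uri: str, image: str) -> str:
#     aiplatform.init(project="my-project", location="us-central1")
#     model = aiplatform.Model.upload(
#         **build_upload_kwargs(display_name, artifact_uri, image)
#     )
#     return model.resource_name  # record the Vertex resource for the report
```

The payoff is that the cloud call becomes ordinary, testable Python inside a step, instead of an opaque operator from a catalog ZenML does not have.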
3. Control-flow rule
- A plain Python `if` is only safe when the condition depends on pipeline parameters
- Conditions on upstream outputs need `@pipeline(dynamic=True)` plus `.load()`
- `dsl.ParallelFor` should map to `.map()` only when the resulting dynamic-pipeline semantics are acceptable

4. Compile / template rule
- Never keep `compiler.Compiler().compile()` in migrated ZenML code
- If a compiled template must survive, wrap the `PipelineJob` submission in a ZenML step and be explicit that ZenML sees it as one black-box node

5. Comment style
Use concise migration comments:
- `# Migration note:` for semantic caveats
- `# TODO(migration):` for required user actions

Keep detailed prose in the migration report, not in the code.
When semantics change, leave a short inline note:
```python
@step
def evaluate_model(metrics_path_hint: str | None = None) -> dict[str, float]:
    # Migration note: the original KFP component wrote metrics to a managed
    # artifact path. This ZenML step returns metrics as a typed artifact instead.
    return {"accuracy": 0.91}
```
For unsupported patterns, leave a `# TODO(migration):` comment in code:

```python
# TODO(migration): UNSUPPORTED — original pipeline relied on dsl.ExitHandler
# to guarantee cleanup after failure. ZenML has no exact equivalent.
# Redesign this as idempotent cleanup plus hooks / external failure handling.
```
After generating the project, always create MIGRATION_REPORT.md in the project root.
Use this structure:
# Migration Report: [Vertex Pipeline] -> [ZenML Pipeline]
## Summary
- **Source**: Vertex AI Pipelines / KFP v2 workflow `[name]`
- **Target**: ZenML pipeline `[pipeline_name]`
- **Steps migrated**: X direct, Y approximate, Z flagged
## Direct Translations
| Source Concept | ZenML Target | Notes |
|---|---|---|
## Approximate Translations
| Source Concept | ZenML Target | What Changed |
|---|---|---|
## Flagged for Review
| Pattern | Severity | Issue | Suggested Redesign |
|---|---|---|---|
## Artifact Contract Decisions
| Source Component | Old Contract | New Contract | Notes |
|---|---|---|---|
## GCPC Rewrite Summary
| GCPC Node | Replacement Step | SDK / API Used | Notes |
|---|---|---|---|
## Vertex Platform Integration Mapping
| Vertex Feature | ZenML Mapping | Notes |
|---|---|---|
## Compiled Template / Schedule Lifecycle Gaps
- [list exactly what is not preserved]
## What's NOT Migrated
- template registries
- Vertex schedule lifecycle management
- Vertex Model Registry parity unless explicitly implemented
- Vertex Experiment wiring unless explicitly configured
- ML Metadata equivalence
## What You Get for Free After Migration
- artifact versioning and lineage
- stack portability
- step caching
- model / artifact control-plane capabilities
- a cleaner separation between pipeline logic and cloud SDK calls
## Recommended Next Steps
1. Run `zenml-quick-wins`
2. Install the ZenML docs MCP server
3. Validate the migrated pipeline on the ZenML Vertex orchestrator
4. Run the same pipeline on a second stack if portability matters
5. Run `/simplify` on the generated code
After migration, always guide the user toward the next layer of cleanup and hardening.
Point the user to the `zenml-quick-wins` skill. Say this explicitly:
"Now that the migration is done, I recommend running
`zenml-quick-wins` to add metadata logging, experiment tracking, secrets, and other production features."
When relevant, include specific links:
- https://docs.zenml.io/stacks/stack-components/orchestrators/vertex
- https://docs.zenml.io/concepts/steps_and_pipelines/scheduling
- https://docs.zenml.io/concepts/steps_and_pipelines/dynamic_pipelines
- https://docs.zenml.io/user-guides/starter-guide/manage-artifacts#consuming-external-artifacts-within-a-pipeline
- https://docs.zenml.io/concepts/artifacts/materializers
- https://docs.zenml.io/stacks/stack-components/experiment-trackers

"For easier access to current ZenML docs while you refine the migration, install the docs MCP server:
`claude mcp add zenmldocs --transport http https://docs.zenml.io/~gitbook/mcp`"
When there are 2+ HIGH-severity flags, generate a copy-pasteable Slack message for zenml.io/slack summarizing the flagged patterns and the proposed redesigns.
If the migration exposes a real missing feature in ZenML, offer to open an issue on zenml-io/zenml describing the gap and the migration context.
`/simplify`

Migration output is often correct but bulky. Always suggest `/simplify` to remove scaffolding comments, reduce duplication, and make the code feel production-ready.
These are the places where users are most likely to think "this looks the same" when it is not the same.
KFP artifacts are runtime-managed references with .path, .uri, and metadata attached. ZenML artifacts are Python values loaded and saved by materializers.
Practical consequence: a KFP Input[Dataset] is not automatically the same as a ZenML pd.DataFrame. Sometimes it is. Sometimes that translation erases the real contract.
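A small, hedged illustration of that classification decision. The `@step` decorators are commented out so the sketch runs without a ZenML install, and every function and URI here is invented for the example:

```python
import csv
import io


# Value-centric contract: the KFP Output[Dataset] was only a serialization
# boundary, so the ZenML version just returns the value and lets a
# materializer handle storage.
# @step
def load_rows(raw_csv: str) -> list[dict]:
    return list(csv.DictReader(io.StringIO(raw_csv)))


# Location-aware contract: a downstream system reads the URI itself, so the
# URI must stay a first-class output rather than being erased by an
# in-memory translation.
# @step
def export_dataset(rows: list[dict], target_uri: str) -> str:
    # TODO(migration): actually write `rows` to target_uri (e.g. GCS) here.
    return target_uri
```

The first step is a safe `Input[Dataset]`-to-value translation; the second is the case where converting to `pd.DataFrame` would have silently destroyed the contract.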
KFP authoring usually ends in compilation and submission of a template. ZenML runs from Python definitions through its orchestrator abstraction.
Practical consequence: compiled-template workflows, template registries, and template_path=... usage are redesign boundaries.
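When a compiled template must survive as-is, the escape hatch looks roughly like this sketch. The `aiplatform` and ZenML calls are commented out, the helper and all names are illustrative, and the result should be treated as a partial migration, not a goal state:

```python
def build_job_args(display_name: str, template_path: str) -> dict:
    # Arguments intended for aiplatform.PipelineJob(...); illustrative only.
    return {"display_name": display_name, "template_path": template_path}


# from zenml import step
# from google.cloud import aiplatform
#
# @step
# def run_legacy_template(template_path: str) -> str:
#     # TODO(migration): partial migration. ZenML sees this whole Vertex
#     # pipeline as one black-box node with no per-step lineage or caching.
#     job = aiplatform.PipelineJob(**build_job_args("legacy-run", template_path))
#     job.run(sync=True)
#     return job.resource_name
```

Returning the job's resource name gives the migration report something concrete to point at until the template is rewritten properly.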
KFP runtime branching and fan-out are backend orchestration concepts. ZenML can do similar work with dynamic pipelines, but the mechanics differ.
Practical consequence: never translate dsl.If into a plain Python if in a static pipeline, and never translate dsl.ParallelFor into a plain for loop if you need backend fan-out semantics.
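A hedged before/after sketch of both rules. The static branch is safe only because it reads a pipeline parameter; the dynamic variants follow the `@pipeline(dynamic=True)` / `.load()` / `.map()` surface described above and are commented out (with invented step names) since they need a ZenML install:

```python
# Safe static branching: the condition is a pipeline *parameter*, known
# before any step runs, so a plain Python `if` keeps its meaning.
# @pipeline
def training_pipeline(use_gpu: bool = False) -> str:
    return "train_on_gpu" if use_gpu else "train_on_cpu"


# Runtime branching and fan-out need a dynamic pipeline (illustrative):
# @pipeline(dynamic=True)
# def scoring_pipeline(shards: list[str]):
#     results = score_shard.map(shards)        # replaces dsl.ParallelFor
#     summary = collect_scores(results)        # replaces dsl.Collected
#     if summary.load()["accuracy"] > 0.9:     # replaces dsl.If on an output
#         deploy_model()
```

The boundary to watch: the moment a condition or loop depends on a step's output rather than a parameter, the static form silently changes semantics.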
Vertex exposes richer schedule lifecycle and concurrency controls than ZenML’s generic Schedule(...) surface.
Practical consequence: cron schedule creation may migrate, but full lifecycle parity usually does not.
Vertex Experiments, Vertex Model Registry, and Vertex ML Metadata overlap with ZenML concepts, but they are not the same objects or the same metadata plane.
Practical consequence: migrate the intent explicitly. Do not tell the user that ZenML artifacts automatically become Vertex Model resources or that ZenML metadata is the same thing as Vertex MLMD.
| Anti-pattern | Why it is wrong | What to do instead |
|---|---|---|
| Translating every `Input[Dataset]` to `pd.DataFrame` | Erases URI/path semantics when the original component was location-aware | Classify the artifact contract first |
| Pretending GCPC has ZenML equivalents | ZenML has no GCPC-style operator catalog | Rewrite as SDK-calling steps |
| Keeping `compiler.Compiler().compile()` in migrated code | ZenML does not need user-facing compilation | Remove it, or treat compiled-template submission as a partial migration |
| Replacing `dsl.ParallelFor` with a plain `for` loop | Loses backend fan-out and observability | Use dynamic pipelines with `.map()` when appropriate |
| Replacing runtime `dsl.If` with static Python branching | Changes control-flow semantics | Use `dynamic=True` plus `.load()` |
| Treating `dsl.ExitHandler` as "just add a last step" | Exit handlers can run after failure; a last step may never run | Redesign cleanup / notification semantics explicitly |
| Assuming ZenML model artifacts equal Vertex Model resources | They are different resource models | Add an explicit model-upload step if Vertex Model Registry matters |
| Assuming ZenML schedule updates delete/update Vertex schedules | ZenML does not fully manage Vertex schedule lifecycle | Document manual schedule management |
| Assuming caching semantics are identical | Cache identity and execution model differ | Revalidate caching behavior after migration |
For topics beyond migration, query the ZenML docs at https://docs.zenml.io.