REQUIRED Phase 3 of the /ds workflow. Enforces output-first verification for data science work: subagents execute the analysis while the main chat orchestrates and verifies that every step produces visible output before proceeding.
Install with `/plugin marketplace add edwinhu/workflows`, then `/plugin install workflows@edwinhu-plugins`.

This skill inherits all available tools. When active, it can use any tool Claude has access to.
Announce: "I'm using ds-implement (Phase 3) to build with output-first verification."

MAIN CHAT MUST NOT WRITE ANALYSIS CODE. This is not negotiable.
Main chat orchestrates. Subagents analyze. If you catch yourself about to write Python/R code, STOP.
Allowed in main chat: orchestration only, such as reading PLAN.md and LEARNINGS.md, spawning Task agents, and verifying the output they report.
NOT allowed in main chat: writing or running Python/R analysis code.
If you're about to write analysis code directly, STOP and spawn a Task agent instead.
These thoughts mean STOP—you're rationalizing:
| Thought | Reality |
|---|---|
| "It's just a quick plot" | Quick plots hide data issues. Delegate. |
| "I'll just check the shape" | Shape checks need output-first protocol. Delegate. |
| "The subagent will take too long" | Subagent time is cheap. Your context is expensive. |
| "I already know this data" | Knowing ≠ verified. Delegate. |
| "Let me just run this merge" | Merges silently fail. Delegate with verification. |
| "This is too simple for a subagent" | Simple is exactly when errors hide. Delegate. |
| "I'm already looking at the data" | Looking ≠ analyzing. Delegate. |
| "The user wants results fast" | Wrong results are worse than slow results. Delegate. |
For each task in PLAN.md, spawn a subagent to execute it. Why delegate? Subagent time is cheap, your context is expensive, and delegation keeps verification honest.
REQUIRED SUB-SKILL: For Task templates and detailed flow:
Skill(skill="workflows:ds-delegate")
Implement analysis with mandatory visible output at every step. NO TDD - instead, every code step MUST produce and verify output.
<EXTREMELY-IMPORTANT>
## The Iron Law of DS Implementation
EVERY CODE STEP MUST PRODUCE VISIBLE OUTPUT. This is not negotiable.
Before moving to the next step, you MUST produce the output, look at it, and verify it matches expectations.
This applies even when the step seems trivial, when you are confident, or when you are in a hurry.
If you catch yourself about to write code without outputting results, STOP.
</EXTREMELY-IMPORTANT>
| DO | DON'T |
|---|---|
| Print shape after each transform | Chain operations silently |
| Display sample rows | Trust transformations work |
| Show summary stats | Wait until end to check |
| Verify row counts | Assume merges worked |
| Check for unexpected nulls | Skip intermediate checks |
| Plot distributions | Move on without looking |
The Mantra: If you can't see it, you can't trust it.
| Thought | Why It's Wrong | Do Instead |
|---|---|---|
| "I'll check at the end" | Errors compound silently | Check after every step |
| "This transform is simple" | Simple code can still be wrong | Output and verify |
| "I know merge worked" | Merges often fail silently | Check row counts |
| "Data looks fine" | "Looks" isn't verification | Print stats, show samples |
| "I'll batch the outputs" | Loses ability to isolate issues | Output per operation |
```python
# BEFORE
print(f"Before: {df.shape}")

# OPERATION
df = df.merge(other, on='key')

# AFTER - MANDATORY
print(f"After: {df.shape}")
print(f"Nulls introduced: {df.isnull().sum().sum()}")
print(df.head())  # print() so the sample is visible outside notebooks too
```
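Where merges are involved, pandas can also report match status directly via the `indicator` option, which makes silent row loss and fan-out visible. A minimal sketch under assumed toy data (`df`, `other`, and the `key` column are illustrative):

```python
import pandas as pd

# Illustrative toy frames; in practice df and other come from earlier steps
df = pd.DataFrame({"key": [1, 2, 3, 4], "x": [10, 20, 30, 40]})
other = pd.DataFrame({"key": [1, 2, 2, 5], "y": ["a", "b", "c", "d"]})

before = len(df)
merged = df.merge(other, on="key", how="left", indicator=True)

print(f"Rows: {before} -> {len(merged)}")       # fan-out reveals duplicate keys
print(merged["_merge"].value_counts())          # how many rows actually matched
print(f"Nulls introduced: {merged['y'].isna().sum()}")
```

Here the row count grows from 4 to 5 (duplicate key 2 in `other`) and two rows gain nulls (keys 3 and 4 have no match), exactly the kind of silent change this protocol exists to catch.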
| Operation | Required Output |
|---|---|
| Load data | shape, dtypes, head() |
| Filter | shape before/after, % removed |
| Merge/Join | shape, null check, sample |
| Groupby | result shape, sample groups |
| Transform | before/after comparison, sample |
| Model fit | metrics, convergence info |
| Prediction | distribution, sample predictions |
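The required outputs above can be bundled into a small helper so no step skips them. A sketch only; `verify_step` and the toy frame are illustrative names, not part of the workflow's API:

```python
import pandas as pd

def verify_step(df: pd.DataFrame, label: str, min_rows: int = 1) -> pd.DataFrame:
    """Print the per-operation outputs required above and fail fast on empty results."""
    print(f"--- {label} ---")
    print(f"Shape: {df.shape}")
    print(f"Dtypes:\n{df.dtypes}")
    print(f"Nulls per column:\n{df.isna().sum()}")
    print(df.head())
    assert len(df) >= min_rows, f"{label}: expected >= {min_rows} rows, got {len(df)}"
    return df

# Illustrative toy frame standing in for freshly loaded data
raw = pd.DataFrame({"id": [1, 2, 3], "value": [1.5, None, 3.0]})
checked = verify_step(raw, "load data")
```

Returning the frame lets the helper drop into a pipeline between operations without changing the data flow.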
Read the plan with `cat .claude/PLAN.md` and follow the task order it defines.
For each task:
```python
# Task N: [Description]
print("=" * 50)
print("Task N: [Description]")
print("=" * 50)

# Before state
print(f"Input shape: {df.shape}")

# Operation
result = do_operation(df)

# After state - MANDATORY
print(f"Output shape: {result.shape}")
print("Sample output:")
display(result.head())  # use print(result.head()) outside notebooks

# Verification
assert result.shape[0] > 0, "No rows returned!"
print("Task N complete")
```
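As a concrete instance of the template, a filter step with the required before/after shape and percent-removed output might look like this (toy data and an illustrative threshold):

```python
import pandas as pd

df = pd.DataFrame({"score": [0.2, 0.9, 0.4, 0.95, 0.7]})  # illustrative input

before = len(df)
filtered = df[df["score"] >= 0.5]  # hypothetical threshold for the example
removed = before - len(filtered)

print(f"Filter: {before} -> {len(filtered)} rows ({removed / before:.1%} removed)")
assert len(filtered) > 0, "Filter removed every row!"
```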
Document every significant step:
```markdown
## Step N: [Task Description]

**Input:** DataFrame with shape (10000, 15)
**Operation:** Merged with reference table on 'id'
**Output:**
- Shape: (9500, 20)
- 500 rows dropped (no match)
- 5 new columns added
- No new nulls introduced
**Verification:**
- Row count reasonable (5% drop expected due to filtering)
- Sample output matches expected format
- Key columns preserved
**Notes:** [Any observations, issues, or decisions]
```
Main chat spawns Task agent:
```python
Task(subagent_type="general-purpose", prompt="""
Implement [TASK] following output-first protocol.

Context:
- Read .claude/LEARNINGS.md for prior steps
- Read .claude/PLAN.md for task details
- Read .claude/SPEC.md for objectives

Output-First Protocol:
1. Print state BEFORE each operation
2. Execute the operation
3. Print state AFTER with verification
4. Display sample output
5. Document in LEARNINGS.md

Required outputs per operation:
- Shape before/after
- Null counts
- Sample rows (head)
- Sanity checks (row counts, value ranges)

DO NOT proceed to next task without:
- Visible output showing operation worked
- LEARNINGS.md entry documenting the step

Report back: what was done, output observed, any issues.
""")
```
See references/verification-patterns.md for detailed code patterns.
| Failure | Why It Happens | Prevention |
|---|---|---|
| Silent data loss | Merge drops rows | Print row counts before/after |
| Hidden nulls | Join introduces nulls | Check null counts after joins |
| Wrong aggregation | Groupby logic error | Display sample groups |
| Type coercion | Pandas silent conversion | Verify dtypes after load |
| Off-by-one | Date filtering edge cases | Print min/max dates |
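Two of these failure modes, type coercion and date edge cases, can be caught with a couple of lines right after load. A sketch with hypothetical CSV-like data:

```python
import pandas as pd

# Hypothetical CSV-like input: dates arrive as strings, ids as zero-padded strings
df = pd.DataFrame({"date": ["2024-01-01", "2024-03-31"], "id": ["001", "002"]})

print(df.dtypes)  # catch silent type coercion right after load
df["date"] = pd.to_datetime(df["date"])
print(f"Date range: {df['date'].min()} -> {df['date'].max()}")  # guards date filters
```

Printing min/max dates before filtering is the cheapest defense against the off-by-one edge cases in the table above.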
Append each step to .claude/LEARNINGS.md:
```markdown
## Step N: [Description] - [STATUS]

**Input:** [Describe input state]
**Operation:** [What was done]
**Output:** [Shape, stats, sample]

[Paste actual output here]

**Verification:** [How you confirmed it worked]
**Next:** [What comes next]
```
Never hide failures. Bad output documented is better than silent failure.
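Appending entries can even be scripted so the format stays consistent across steps. A sketch; `log_step` and the demo path are illustrative, not part of the workflow:

```python
from pathlib import Path

def log_step(step: int, description: str, status: str, body: str,
             path: str = "LEARNINGS_demo.md") -> None:
    """Append one step entry in the template format above."""
    entry = f"\n## Step {step}: {description} - {status}\n{body}\n"
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    with open(path, "a") as f:
        f.write(entry)

log_step(1, "Load raw data", "OK", "**Output:** shape (10000, 15)")
```

In the real workflow the path would be `.claude/LEARNINGS.md`; appending (mode `"a"`) preserves prior steps, including documented failures.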
<EXTREMELY-IMPORTANT>
| Thought | Reality |
|---|---|
| "Task done, let me check in with user" | NO. User wants ALL tasks done. Keep going. |
| "User might want to see intermediate results" | User will see results at the END. Continue. |
| "Natural pause point" | Only pause when ALL tasks complete or blocked. |
| "Let me summarize this step" | Summarize AFTER all tasks. Keep moving. |
Pausing between tasks is procrastination disguised as courtesy. </EXTREMELY-IMPORTANT>
REQUIRED SUB-SKILL: After all analysis steps complete with verified output, IMMEDIATELY invoke:
Skill(skill="workflows:ds-review")