Systematic marketing experimentation process - discover concepts, generate hypotheses, coordinate multiple experiments, synthesize results, generate next-iteration ideas through rigorous validation cycles
Orchestrates rigorous marketing experimentation cycles from concept validation to data-driven iteration.
```
npx claudepluginhub tilmon-engineering/claude-skills
```

This skill inherits all available tools. When active, it can use any tool Claude has access to.
Templates:

- templates/01-discovery.md
- templates/02-hypothesis-generation.md
- templates/03-prioritization.md
- templates/04-experiment-tracker.md
- templates/05-synthesis.md
- templates/06-iteration-plan.md

Use this skill when you need to validate marketing concepts or business ideas through rigorous experimental cycles. This skill orchestrates the complete Build-Measure-Learn cycle from concept to data-driven signal.
When to use this skill:
What this skill does:
What this skill does NOT do:
Integration with existing skills:
- hypothesis-testing for quantitative experiment design and execution (metrics, A/B tests, statistical analysis)
- qualitative-research for qualitative experiment design and execution (interviews, surveys, focus groups, observations)
- interpreting-results to synthesize findings across multiple experiments
- creating-visualizations to communicate aggregate results
- market-researcher agent for concept validation via internet research

Multi-conversation persistence: This skill is designed for campaigns spanning days or weeks. Each phase documents completely enough that new conversations can resume after extended breaks. The experiment tracker (04-experiment-tracker.md) serves as the living coordination hub.
Required skills:
- hypothesis-testing - Quantitative experiment design and execution (invoked for metric-based experiments)
- qualitative-research - Qualitative experiment design and execution (invoked for interviews, surveys, focus groups, observations)
- interpreting-results - Result synthesis and pattern identification (invoked in Phase 5)
- creating-visualizations - Aggregate result visualization (invoked in Phase 5)

Required agents:
- market-researcher - Concept validation via internet research (invoked in Phase 1)

Required knowledge:
Data requirements:
CRITICAL: This is a 6-phase process skill. You MUST complete all phases in order. Use TodoWrite to track progress through each phase.
TodoWrite template:
When starting a marketing-experimentation session, create these todos:
- [ ] Phase 1: Discovery & Asset Inventory
- [ ] Phase 2: Hypothesis Generation
- [ ] Phase 3: Prioritization
- [ ] Phase 4: Experiment Coordination
- [ ] Phase 5: Cross-Experiment Synthesis
- [ ] Phase 6: Iteration Planning
Workspace structure:
All work for a marketing-experimentation session is saved to:
```
analysis/marketing-experimentation/[campaign-name]/
├── 01-discovery.md
├── 02-hypothesis-generation.md
├── 03-prioritization.md
├── 04-experiment-tracker.md
├── 05-synthesis.md
├── 06-iteration-plan.md
└── experiments/
    ├── [experiment-1]/   # hypothesis-testing session
    ├── [experiment-2]/   # hypothesis-testing session
    └── [experiment-3]/   # hypothesis-testing session
```
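A minimal sketch for scaffolding this workspace. The helper name is illustrative and the campaign/experiment names are placeholders; adapt them to your session:

```python
from pathlib import Path

def scaffold_workspace(campaign, experiments):
    """Create the standard marketing-experimentation directory layout.

    `campaign` and `experiments` are placeholder names supplied by the caller.
    """
    root = Path("analysis/marketing-experimentation") / campaign
    (root / "experiments").mkdir(parents=True, exist_ok=True)
    for name in experiments:
        (root / "experiments" / name).mkdir(exist_ok=True)
    # Touch the six phase documents so every session starts from the same layout
    for doc in ["01-discovery.md", "02-hypothesis-generation.md",
                "03-prioritization.md", "04-experiment-tracker.md",
                "05-synthesis.md", "06-iteration-plan.md"]:
        (root / doc).touch()
    return root

scaffold_workspace("example-campaign", ["landing-page-test"])
```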
Phase progression rules:
Multi-conversation resumption:
CHECKPOINT: Before proceeding, you MUST have:
Output: 01-discovery.md

Gather the business concept
Invoke market-researcher agent for concept validation
Dispatch the market-researcher agent with the concept description:
Document agent findings in 01-discovery.md under "Market Research Findings"
Conduct asset inventory
Work with user to inventory existing assets that could be leveraged:
Content Assets:
Campaign Assets:
Audience Assets:
Data Assets:
Define success criteria and validation signals
Work with user to define:
Document known constraints
Capture any constraints that will affect experimentation:
Create 01-discovery.md with: ./templates/01-discovery.md
STOP and get user confirmation
Common Rationalization: "I'll skip discovery and go straight to testing - the concept is obvious" Reality: Discovery surfaces assumptions, constraints, and existing assets that dramatically affect experiment design. Always start with discovery.
Common Rationalization: "I don't need market research - I already know this market" Reality: The market-researcher agent provides current, data-driven validation signals that prevent building experiments around false assumptions. Always validate.
Common Rationalization: "Asset inventory is busywork - I'll figure out what's available as I go" Reality: Existing assets can dramatically reduce experiment cost and time. Inventorying first prevents reinventing wheels and enables building on proven foundations.
CHECKPOINT: Before proceeding, you MUST have:
Output: 02-hypothesis-generation.md

Generate 5-10 testable hypotheses
For each hypothesis, use this format:
Hypothesis [N]: [Brief statement]
Example hypothesis:
Hypothesis 1: Value proposition clarity drives conversion
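If you track hypotheses programmatically alongside the markdown document, a lightweight structure can keep entries consistent. The field names below are illustrative, not prescribed by the templates:

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    id: str                 # e.g. "H1"
    statement: str          # brief, testable claim
    tactic: str             # landing page, ads, email, content, ...
    expected_outcome: str   # what result would validate the hypothesis
    aarrr_stages: list = field(default_factory=list)

h1 = Hypothesis(
    id="H1",
    statement="Value proposition clarity drives conversion",
    tactic="landing page",
    expected_outcome="Clearer value prop increases conversion rate",
    aarrr_stages=["Acquisition", "Activation"],
)
print(h1.id, "-", h1.statement)
```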
Ensure tactic coverage
Verify hypotheses cover multiple marketing tactics:
Acquisition Tactics:
Activation Tactics:
Retention Tactics:
Don't generate ten hypotheses that all target a single tactic such as ads. Aim for diversity across tactics.
Reference experimentation frameworks
Lean Startup Build-Measure-Learn:
AARRR Pirate Metrics:
Map each hypothesis to one or more AARRR stages.
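Stage coverage can be checked in a few lines of Python so gaps are explicit rather than eyeballed. The hypothesis names and stage assignments below are placeholders:

```python
# The five AARRR stages, in funnel order
AARRR_STAGES = ["Acquisition", "Activation", "Retention", "Referral", "Revenue"]

# Placeholder mapping: each hypothesis lists the stage(s) it touches
hypothesis_stages = {
    "H1: Value prop clarity": ["Acquisition", "Activation"],
    "H2: Ad targeting": ["Acquisition"],
    "H3: Email sequence": ["Retention"],
}

covered = {s for stages in hypothesis_stages.values() for s in stages}
uncovered = [s for s in AARRR_STAGES if s not in covered]
print("Covered:", sorted(covered))
print("Uncovered:", uncovered)
```

An empty `uncovered` list is not required, but any stage listed there is a deliberate, documented gap rather than an accidental one.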
ICE/RICE Prioritization (used in Phase 3):
Create 02-hypothesis-generation.md with: ./templates/02-hypothesis-generation.md
STOP and get user confirmation
Common Rationalization: "I'll generate hypotheses as I build experiments - more efficient" Reality: Generating hypotheses before prioritization enables strategic selection of highest-impact tests. Generating ad-hoc leads to testing whatever's easiest, not what matters most.
Common Rationalization: "I'll focus all hypotheses on one tactic (ads) since that's what we know" Reality: Tactic diversity reveals which channels work for this concept. Single-tactic testing creates blind spots and missed opportunities.
Common Rationalization: "I'll write vague hypotheses and refine them during experiment design" Reality: Vague hypotheses lead to vague experiments that produce vague results. Specific hypotheses with expected outcomes enable clear signal detection.
Common Rationalization: "More hypotheses = better coverage, I'll generate 20+" Reality: Too many hypotheses dilute focus and create analysis paralysis in prioritization. 5-10 high-quality hypotheses enable strategic selection of 2-4 tests.
CHECKPOINT: Before proceeding, you MUST have:
Output: 03-prioritization.md

CRITICAL: You MUST use computational methods (Python scripts) to calculate scores. Do NOT estimate or manually calculate scores.
Choose prioritization framework
ICE Framework (simpler, faster):
RICE Framework (more comprehensive):
Choose ICE for speed, RICE for precision when reach varies significantly.
Score each hypothesis using Python script
For ICE Framework:
Create a Python script to compute and sort ICE scores:
```python
#!/usr/bin/env python3
"""
ICE Score Calculator for Marketing Experimentation
Computes ICE scores: Impact × Confidence × Ease
(all three on a 1-10 scale, so an easier experiment raises the score)
Sorts hypotheses by score (highest to lowest)
"""

hypotheses = [
    {"id": "H1", "name": "Value proposition clarity drives conversion",
     "impact": 8, "confidence": 7, "ease": 9},
    {"id": "H2", "name": "Ad targeting refinement",
     "impact": 7, "confidence": 6, "ease": 5},
    {"id": "H3", "name": "Email sequence optimization",
     "impact": 6, "confidence": 8, "ease": 8},
    {"id": "H4", "name": "Content marketing expansion",
     "impact": 5, "confidence": 4, "ease": 3},
]

# Calculate ICE scores (product of the three 1-10 ratings)
for h in hypotheses:
    h['ice_score'] = h['impact'] * h['confidence'] * h['ease']

# Sort by ICE score (descending)
sorted_hypotheses = sorted(hypotheses, key=lambda x: x['ice_score'], reverse=True)

# Print results table
print("| Hypothesis | Impact | Confidence | Ease | ICE Score | Rank |")
print("|------------|--------|------------|------|-----------|------|")
for rank, h in enumerate(sorted_hypotheses, 1):
    print(f"| {h['id']}: {h['name'][:30]} | {h['impact']} | {h['confidence']} | {h['ease']} | {h['ice_score']} | {rank} |")
```
Usage:

```
python3 ice_calculator.py
```
For RICE Framework:
Create a Python script to compute and sort RICE scores:
```python
#!/usr/bin/env python3
"""
RICE Score Calculator for Marketing Experimentation
Computes RICE scores: (Reach × Impact × Confidence) / Effort
Sorts hypotheses by score (highest to lowest)
"""

hypotheses = [
    {"id": "H1", "name": "Value proposition clarity drives conversion",
     "reach": 10000,     # users affected
     "impact": 3,        # 0.25=minimal, 1=low, 2=medium, 3=high, 5=massive
     "confidence": 80,   # percentage (50, 80, 100)
     "effort": 2},       # person-weeks
    {"id": "H2", "name": "Ad targeting refinement",
     "reach": 50000, "impact": 1, "confidence": 50, "effort": 4},
    {"id": "H3", "name": "Email sequence optimization",
     "reach": 5000, "impact": 2, "confidence": 80, "effort": 3},
    {"id": "H4", "name": "Content marketing expansion",
     "reach": 20000, "impact": 1, "confidence": 50, "effort": 8},
]

# Calculate RICE scores
for h in hypotheses:
    # Convert confidence percentage to decimal
    confidence_decimal = h['confidence'] / 100
    h['rice_score'] = (h['reach'] * h['impact'] * confidence_decimal) / h['effort']

# Sort by RICE score (descending)
sorted_hypotheses = sorted(hypotheses, key=lambda x: x['rice_score'], reverse=True)

# Print results table
print("| Hypothesis | Reach | Impact | Confidence | Effort | RICE Score | Rank |")
print("|------------|-------|--------|------------|--------|------------|------|")
for rank, h in enumerate(sorted_hypotheses, 1):
    print(f"| {h['id']}: {h['name'][:30]} | {h['reach']} | {h['impact']} | {h['confidence']}% | {h['effort']}w | {h['rice_score']:.2f} | {rank} |")
```
Usage:

```
python3 rice_calculator.py
```
Scoring Guidance:
Impact (1-10 for ICE, 0.25-5 for RICE):
Confidence (1-10 for ICE, 50-100% for RICE):
Ease (1-10 for ICE):
Effort (person-weeks for RICE):
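These ranges can be enforced programmatically before scoring. A minimal validator sketch, assuming the ICE scales run 1-10 and the RICE scales follow the values annotated in the calculator script (function names are illustrative):

```python
def validate_ice(h):
    """Return a list of problems with an ICE-scored hypothesis dict."""
    errors = []
    for field in ("impact", "confidence", "ease"):
        value = h.get(field)
        if not isinstance(value, (int, float)) or not 1 <= value <= 10:
            errors.append(f"{h.get('id', '?')}: {field} must be 1-10, got {value!r}")
    return errors

def validate_rice(h):
    """Return a list of problems with a RICE-scored hypothesis dict."""
    errors = []
    if h.get("reach", 0) <= 0:
        errors.append(f"{h.get('id', '?')}: reach must be positive")
    if h.get("impact") not in (0.25, 1, 2, 3, 5):  # scale assumed from script comments
        errors.append(f"{h.get('id', '?')}: impact must use the RICE scale")
    if h.get("confidence") not in (50, 80, 100):
        errors.append(f"{h.get('id', '?')}: confidence must be 50, 80, or 100")
    if h.get("effort", 0) <= 0:
        errors.append(f"{h.get('id', '?')}: effort (person-weeks) must be positive")
    return errors

print(validate_ice({"id": "H1", "impact": 8, "confidence": 7, "ease": 9}))
print(validate_rice({"id": "H2", "reach": 50000, "impact": 1, "confidence": 50, "effort": 4}))
```

Running the validator before the calculator catches typos (e.g. an ICE confidence entered as a percentage) that would otherwise silently distort the ranking.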
Run scoring script and document results
1. Update the `hypotheses` list with the actual hypothesis data from Phase 2
2. Run `python3 [ice|rice]_calculator.py`
3. Document the results in 03-prioritization.md:

## Prioritization Calculation

**Method:** ICE Framework

**Calculation Script:**

```python
[paste full script here]
```

**Results:**

```
[paste output table here]
```
Select 2-4 highest-priority hypotheses
Considerations for selection:
Selection criteria:
Document experiment sequence
Determine execution strategy:
Example sequence plan:
Week 1-2: Launch H1 (landing page) and H3 (email) in parallel
Week 3-4: Analyze H1 and H3 results
Week 5-6: Launch H2 (ads) based on H1 learnings
Week 7-8: Analyze H2 results
Create 03-prioritization.md with: ./templates/03-prioritization.md
STOP and get user confirmation
Common Rationalization: "I'll test all hypotheses - don't want to miss opportunities" Reality: Resource constraints make testing everything impossible. Prioritization ensures highest-value experiments get resources. Unfocused testing produces weak signals across too many fronts.
Common Rationalization: "Scoring is subjective and arbitrary - I'll just pick what feels right" Reality: Scoring frameworks force explicit reasoning about trade-offs. "Feels right" selections optimize for recency bias and personal preference, not business value. Computational methods ensure consistency.
Common Rationalization: "I'll skip prioritization and go straight to easiest test" Reality: Easiest test rarely equals highest value. Prioritization prevents optimizing for ease at the expense of impact.
Common Rationalization: "I'll estimate scores mentally instead of running the script" Reality: Manual estimation introduces calculation errors and inconsistency. Python scripts ensure exact, reproducible results that can be audited and verified.
CHECKPOINT: Before proceeding, you MUST have:
Output: 04-experiment-tracker.md

CRITICAL: This phase is designed for multi-conversation workflows. The experiment tracker is a LIVING DOCUMENT that you will update throughout experimentation. New conversations should ALWAYS read this file first.
Determine experiment type for each hypothesis
CRITICAL: Before creating the tracker, classify each hypothesis as Quantitative or Qualitative.
Quantitative experiments use hypothesis-testing skill:
Qualitative experiments use qualitative-research skill:
Decision criteria:
Ask: "What do we need to learn?"
Ask: "What data will we collect?"
Mixed methods:
Create experiment tracker
The tracker is your coordination hub for managing multiple experiments over time.
Create 04-experiment-tracker.md with: ./templates/04-experiment-tracker.md
Tracker format:
For each selected hypothesis, create an entry:
### Experiment 1: [Hypothesis Brief Name]
**Status:** [Planned | In Progress | Complete]
**Hypothesis:** [Full hypothesis statement from Phase 2]
**Tactic/Channel:** [landing page | ads | email | etc.]
**Priority Score:** [ICE/RICE score from Phase 3]
**Start Date:** [YYYY-MM-DD or "Not started"]
**Completion Date:** [YYYY-MM-DD or "In progress"]
**Location:** `analysis/marketing-experimentation/[campaign-name]/experiments/[experiment-name]/`
**Signal:** [Positive | Negative | Null | Mixed | "Not analyzed"]
**Key Findings:** [Brief summary when complete, "TBD" otherwise]
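Because the tracker is plain markdown in a fixed format, its status fields can be extracted mechanically, for example to build the resumption summary shown later in this phase. A sketch, assuming entries follow the format above:

```python
import re

def tracker_statuses(markdown):
    """Map each experiment heading in the tracker to its Status field."""
    statuses = {}
    current = None
    for line in markdown.splitlines():
        heading = re.match(r"### Experiment \d+: (.+)", line)
        if heading:
            current = heading.group(1).strip()
        status = re.match(r"\*\*Status:\*\* (.+)", line)
        if status and current:
            statuses[current] = status.group(1).strip()
    return statuses

# Placeholder tracker excerpt for demonstration
sample = """### Experiment 1: Value prop
**Status:** Complete
### Experiment 2: Email sequence
**Status:** In Progress
"""
print(tracker_statuses(sample))
```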
Invoke appropriate skill for each experiment
For each hypothesis marked "Planned" or "In Progress":
Step 1: Read the hypothesis details from 02-hypothesis-generation.md
Step 2a: If Quantitative experiment, invoke hypothesis-testing skill:
Use hypothesis-testing skill to test: [Hypothesis statement]
Context for hypothesis-testing:
- Session name: [descriptive-name-for-experiment]
- Save location: analysis/marketing-experimentation/[campaign-name]/experiments/[experiment-name]/
- Success criteria: [From Phase 1 discovery]
- Expected outcome: [From Phase 2 hypothesis]
- Metric to measure: [CTR, conversion rate, etc.]
- Data source: [Google Analytics, database, etc.]
Step 2b: If Qualitative experiment, invoke qualitative-research skill:
Use qualitative-research skill to conduct: [Hypothesis statement]
Context for qualitative-research:
- Session name: [descriptive-name-for-experiment]
- Save location: analysis/marketing-experimentation/[campaign-name]/experiments/[experiment-name]/
- Research question: [From Phase 2 hypothesis]
- Collection method: [Interviews | Surveys | Focus Groups | Observations]
- Success criteria: [What insights validate/invalidate hypothesis?]
Step 3: Update experiment tracker:
Step 4a: Let hypothesis-testing skill complete its 5-phase workflow:
Step 4b: Let qualitative-research skill complete its 6-phase workflow:
Step 5: When skill completes, update tracker:
Step 6: Commit tracker updates after each status change
Handle multi-conversation resumption
At the start of EVERY conversation during Phase 4:
Verify that 04-experiment-tracker.md exists and read it fully before taking any other action.

Example resumption:
I've read the experiment tracker. Current status:
- Experiment 1 (H1: Value prop): Complete, Positive signal
- Experiment 2 (H3: Email sequence): In Progress, currently in hypothesis-testing Phase 3
- Experiment 3 (H2: Ad targeting): Planned, not yet started
What would you like to do?
a) Continue Experiment 2 (in hypothesis-testing Phase 3)
b) Start Experiment 3
c) Move to synthesis (Phase 5) since Experiment 1 is complete
Coordinate parallel vs. sequential experiments
Parallel execution (multiple experiments simultaneously):
Sequential execution (one at a time):
Hybrid execution:
Progress through all experiments
Continue invoking hypothesis-testing and updating the tracker until:
Only when ALL experiments are complete should you proceed to Phase 5.
Common Rationalization: "I'll keep experiment details in my head - the tracker is just busywork" Reality: Multi-day campaigns lose context between conversations. The tracker is the ONLY source of truth that persists across sessions. Without it, you'll re-ask questions and lose progress.
Common Rationalization: "I'll wait until all experiments finish before updating the tracker" Reality: Batch updates create opportunity for lost data. Update the tracker IMMEDIATELY after status changes. Real-time tracking prevents confusion and missed experiments.
Common Rationalization: "I'll design the experiment myself instead of using hypothesis-testing" Reality: hypothesis-testing skill provides rigorous experimental design, statistical analysis, and signal detection. Skipping it produces weak experiments with ambiguous results.
Common Rationalization: "All experiments are done, I don't need to update the tracker before synthesis" Reality: The tracker is your input to Phase 5. Incomplete tracker means incomplete synthesis. Update ALL fields (status, dates, signals, findings) before proceeding.
CHECKPOINT: Before proceeding, you MUST have:
Output: 05-synthesis.md

CRITICAL: This phase synthesizes results ACROSS multiple experiments. Do NOT proceed until ALL Phase 4 experiments are complete with documented signals.
Verify experiment completion
Read 04-experiment-tracker.md and verify:
If any experiments are incomplete, return to Phase 4 to finish them.
Create aggregate results table
Compile findings from all experiments into a summary table:
Example Aggregate Table (Mixed Quantitative & Qualitative):
| Experiment | Type | Hypothesis | Tactic | Signal | Key Finding | Confidence |
|---|---|---|---|---|---|---|
| E1 | Quant | Value prop clarity | Landing page A/B test | Positive | Conversion rate +18% (p<0.05) | High |
| E2 | Qual | Customer pain points | Discovery interviews | Positive | 8 of 10 cited onboarding complexity | High |
| E3 | Quant | Ad targeting | Ads | Null | CTR +2% (not sig., p=0.12) | Medium |
| E4 | Qual | Ad message resonance | Focus groups | Negative | 6 of 8 found messaging confusing | High |
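From a table like this, the signal distribution can be tallied programmatically before classification. The entries below mirror the example table; this is a convenience step, not a substitute for the presenting-data synthesis:

```python
from collections import Counter

# Per-experiment signals, transcribed from the aggregate table above
experiments = [
    {"id": "E1", "type": "Quant", "signal": "Positive"},
    {"id": "E2", "type": "Qual", "signal": "Positive"},
    {"id": "E3", "type": "Quant", "signal": "Null"},
    {"id": "E4", "type": "Qual", "signal": "Negative"},
]

distribution = Counter(e["signal"] for e in experiments)
print(dict(distribution))
```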
For Quantitative experiments (hypothesis-testing):
For Qualitative experiments (qualitative-research):
Invoke presenting-data skill for comprehensive synthesis
Use the presenting-data skill to create complete synthesis with visualizations and presentation materials:
Use presenting-data skill to synthesize marketing experimentation results:
Context:
- Campaign: [campaign name]
- Experiments completed: [count]
- Results table: [paste aggregate table]
- Audience: [stakeholders/decision-makers]
- Format: [markdown report | slides | whitepaper]
- Focus: Pattern identification across experiments (what works, what doesn't, what's unclear)
presenting-data skill will handle:
Focus areas for synthesis:
Document patterns and insights
The presenting-data skill will create 05-synthesis.md (or slides/whitepaper) with:
What Worked (Positive Signals):
What Didn't Work (Negative Signals):
What's Unclear (Null/Mixed Signals):
Cross-Experiment Patterns:
Visualizations (created by presenting-data):
Classify overall campaign signal
Based on aggregate analysis (from presenting-data output), classify the campaign:
Positive: Campaign validates concept, proceed to scaling
Negative: Campaign invalidates concept, pivot or abandon
Null: Campaign results inconclusive, needs refinement
Mixed: Some aspects work, some don't, iterate strategically
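One possible way to make this classification mechanical, as a starting point for discussion rather than a replacement for judgment:

```python
def classify_campaign(signals):
    """Heuristic campaign-level classification from per-experiment signals.

    A simple decision rule: any mix of Positive and Negative is Mixed;
    non-directional signals (Null, per-experiment Mixed) don't tip the balance.
    """
    positives = signals.count("Positive")
    negatives = signals.count("Negative")
    if positives and not negatives:
        return "Positive"
    if negatives and not positives:
        return "Negative"
    if positives and negatives:
        return "Mixed"
    return "Null"

print(classify_campaign(["Positive", "Positive", "Null", "Negative"]))  # Mixed
```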
Review presenting-data output and finalize synthesis
After presenting-data skill completes:
STOP and get user confirmation
Common Rationalization: "I'll synthesize results mentally - no need to document patterns" Reality: Mental synthesis loses details and creates false confidence. Documented synthesis with presenting-data skill ensures intellectual honesty and identifies confounding factors you'd otherwise miss.
Common Rationalization: "I'll skip synthesis for experiments with clear signals" Reality: Individual experiment signals don't reveal cross-experiment patterns. Synthesis identifies why some tactics work while others don't - the strategic insight that guides iteration.
Common Rationalization: "Visualization is optional - the data speaks for itself" Reality: Tabular data obscures patterns. Visualization reveals signal distribution, effect size clusters, and confidence patterns that inform strategic decisions. presenting-data handles this systematically.
CHECKPOINT: Before proceeding, you MUST have:
Output: 06-iteration-plan.md

CRITICAL: Phase 6 generates experiment IDEAS, NOT hypotheses. Ideas feed into new marketing-experimentation sessions where Phase 2 formalizes hypotheses. Do NOT skip the discovery and hypothesis generation steps.
Generate 3-7 new experiment ideas
Based on Phase 5 synthesis, generate ideas for next iteration:
Idea Format:
Idea [N]: [Brief descriptive name]
Example Ideas:
Idea 1: Scale value prop landing page to paid ads
Idea 2: Investigate email sequence timing sensitivity
Idea 3: Pivot from broad ad targeting to lookalike audiences
Categorize ideas by strategy
Scale Winners:
Investigate Nulls:
Pivot from Failures:
Explore New:
Document campaign-level signal and strategic recommendation
Campaign Summary:
Strategic Recommendations by Signal:
Positive Signal:
Negative Signal:
Null Signal:
Mixed Signal:
Explain feed-forward pattern
CRITICAL: Ideas from Phase 6 are NOT ready for testing. They MUST go through a new marketing-experimentation session:
Feed-Forward Cycle:
Phase 6 generates IDEAS
↓
Start new marketing-experimentation session with idea
↓
Phase 1: Discovery (validate idea with market-researcher)
↓
Phase 2: Hypothesis Generation (formalize idea into testable hypotheses)
↓
Phase 3-6: Complete full experimental cycle
Why this matters:
Example: Phase 6 generates "Scale value prop to ads" (idea) → New session Phase 1: Market research on ad platform best practices → New session Phase 2: Generate hypotheses like "H1: Simplified value prop in ad headline increases CTR by 10%+" (specific, testable) → Continue with Phase 3-6 to test formal hypotheses
Create 06-iteration-plan.md with: ./templates/06-iteration-plan.md
STOP and review with user
Common Rationalization: "I'll turn ideas directly into experiments - skip the new session" Reality: Ideas need discovery and hypothesis generation. Skipping these steps leads to untested assumptions and vague experiments. Always run ideas through a new marketing-experimentation session.
Common Rationalization: "I'll generate hypotheses in Phase 6 for efficiency" Reality: Phase 6 generates IDEAS, Phase 2 (in a new session) generates hypotheses. Conflating these skips critical validation and formalization steps. Ideas → new session → hypotheses.
Common Rationalization: "Campaign signal is obvious from results, no need to document strategic recommendation" Reality: Documented recommendation provides clear guidance for stakeholders and future sessions. Without it, insights are lost and decisions become ad-hoc.
These are rationalizations that lead to failure. When you catch yourself thinking any of these, STOP and follow the skill process instead.
Why this fails: Discovery surfaces assumptions, constraints, and existing assets that dramatically affect experiment design. "Obvious" concepts often hide critical assumptions that need validation.
Reality: Market-researcher agent provides current, data-driven validation signals. Asset inventory reveals resources that reduce experiment cost and time. Success criteria definition prevents ambiguous results. Always start with discovery.
What to do instead: Complete Phase 1 (Discovery & Asset Inventory) before generating hypotheses. Invoke market-researcher agent. Document all findings.
Why this fails: The research skills provide rigorous experimental design, analysis, and signal detection. hypothesis-testing ensures statistical rigor for quantitative experiments. qualitative-research ensures systematic rigor for qualitative experiments. Skipping them produces weak experiments with ambiguous results.
Reality: Marketing-experimentation is a meta-orchestrator that coordinates multiple experiments. It does NOT design experiments itself. Delegation to appropriate skills (hypothesis-testing or qualitative-research) ensures methodological rigor.
What to do instead: Determine experiment type (quantitative or qualitative) in Phase 4. Invoke hypothesis-testing skill for quantitative experiments. Invoke qualitative-research skill for qualitative experiments. Let the appropriate skill handle all design, execution, and analysis.
Why this fails: Single experiments miss cross-experiment patterns. Some tactics work, others don't. Single-experiment campaigns can't identify which channels/tactics are most effective.
Reality: Marketing-experimentation tests 2-4 hypotheses to reveal strategic insights. Synthesis (Phase 5) identifies patterns across experiments - which tactics work, which don't, and why.
What to do instead: Follow Phase 3 prioritization to select 2-4 hypotheses. Complete all experiments before synthesis. Use Phase 5 to identify patterns.
Why this fails: Batch updates create opportunity for lost data. Multi-day campaigns lose context between conversations. Incomplete tracker leads to missed experiments and confusion.
Reality: The experiment tracker (04-experiment-tracker.md) is the ONLY source of truth that persists across sessions. Update it IMMEDIATELY after status changes.
What to do instead: Update tracker after every status change (Planned → In Progress, In Progress → Complete). Commit tracker updates to git. Read tracker FIRST in every new conversation.
Why this fails: Individual experiment signals don't reveal cross-experiment patterns. "Obvious" interpretations miss confounding factors and alternative explanations.
Reality: Documented synthesis with presenting-data skill ensures intellectual honesty. Visualization reveals patterns. Statistical assessment identifies robust vs uncertain findings.
What to do instead: Always complete Phase 5 (Cross-Experiment Synthesis). Invoke presenting-data skill. Document patterns, visualizations, and signal classification. Get user confirmation.
Why this fails: Phase 6 generates IDEAS, not hypotheses. Ideas need discovery (Phase 1) and hypothesis generation (Phase 2) in new sessions. Skipping these steps leads to untested assumptions and vague experiments.
Reality: Feed-forward cycle: Phase 6 ideas → new marketing-experimentation session → Phase 1 discovery → Phase 2 hypothesis generation → Phase 3-6 complete cycle.
What to do instead: Generate IDEAS in Phase 6. Start NEW marketing-experimentation session with selected idea. Complete Phase 1 and Phase 2 to formalize idea into testable hypotheses.
Why this fails: Manual estimation introduces calculation errors and inconsistency. Mental math is unreliable for multiplication and division.
Reality: Python scripts ensure exact, reproducible results that can be audited and verified. Computational methods eliminate human error.
What to do instead: Use Python scripts (ICE or RICE calculator) from Phase 3 instructions. Update hypothesis data in script. Run script and document exact scores. Copy output table to prioritization document.
Why this fails: Mental synthesis loses details and creates false confidence. Cross-experiment patterns require systematic analysis.
Reality: presenting-data skill handles pattern identification (via interpreting-results), visualization creation (via creating-visualizations), and synthesis documentation. It ensures intellectual honesty and reproducibility.
What to do instead: Always invoke presenting-data skill in Phase 5. Provide aggregate results table. Request pattern analysis and visualizations. Document all findings from presenting-data output.
The marketing-experimentation skill ensures rigorous, evidence-based validation of marketing concepts through structured experimental cycles. This skill orchestrates the complete Build-Measure-Learn loop from concept to data-driven signal.
What this skill ensures:
Validated concepts through market research - market-researcher agent provides current demand signals, competitive landscape analysis, and audience insights before experimentation begins.
Strategic hypothesis generation - 5-10 testable hypotheses spanning multiple tactics (landing pages, ads, email, content) grounded in discovery findings and mapped to experimentation frameworks (Lean Startup, AARRR).
Data-driven prioritization - Computational methods (ICE/RICE Python scripts) ensure exact, reproducible scoring. Selection of 2-4 highest-value hypotheses optimizes resource allocation.
Multi-experiment coordination - Experiment tracker (living document) enables multi-conversation workflows spanning days or weeks. Status tracking (Planned, In Progress, Complete) maintains visibility across all experiments. Supports both quantitative and qualitative experiment types.
Methodological rigor through delegation - hypothesis-testing skill handles quantitative experiment design (statistical analysis, A/B tests, metrics). qualitative-research skill handles qualitative experiment design (interviews, surveys, focus groups, observations, thematic analysis). Marketing-experimentation coordinates multiple tests without duplicating methodology.
Cross-experiment synthesis - presenting-data skill identifies patterns across experiments (what works, what doesn't, what's unclear). Aggregate analysis reveals strategic insights invisible in single experiments.
Clear signal generation - Campaign-level classification (Positive/Negative/Null/Mixed) with strategic recommendations (Scale/Pivot/Refine/Pause) provides actionable guidance for stakeholders.
Systematic iteration - Phase 6 generates experiment IDEAS (not hypotheses) that feed into new marketing-experimentation sessions. Feed-forward cycle maintains rigor through repeated discovery and hypothesis generation.
Multi-conversation persistence - Complete documentation at every phase enables resumption after days or weeks. Experiment tracker serves as coordination hub. All artifacts are git-committable.
Tool-agnostic approach - Focuses on techniques (value proposition testing, targeting strategies, sequence optimization) rather than specific platforms. Applicable across marketing tools and channels.
Key principles: