From datapeeker
Systematic marketing experimentation process - discover concepts, generate hypotheses, coordinate multiple experiments, synthesize results, generate next-iteration ideas through rigorous validation cycles
Install via:

```
npx claudepluginhub tilmon-engineering/claude-skills --plugin datapeeker
```

This skill uses the workspace's default tool permissions.
Use this skill when you need to validate marketing concepts or business ideas through rigorous experimental cycles. This skill orchestrates the complete Build-Measure-Learn cycle from concept to data-driven signal.
When to use this skill:
What this skill does:
What this skill does NOT do:
Integration with existing skills:
- hypothesis-testing for quantitative experiment design and execution (metrics, A/B tests, statistical analysis)
- qualitative-research for qualitative experiment design and execution (interviews, surveys, focus groups, observations)
- interpreting-results to synthesize findings across multiple experiments
- creating-visualizations to communicate aggregate results
- market-researcher agent for concept validation via internet research

Multi-conversation persistence: This skill is designed for campaigns spanning days or weeks. Each phase documents completely enough that new conversations can resume after extended breaks. The experiment tracker (04-experiment-tracker.md) serves as the living coordination hub.
Required skills:
- hypothesis-testing - Quantitative experiment design and execution (invoked for metric-based experiments)
- qualitative-research - Qualitative experiment design and execution (invoked for interviews, surveys, focus groups, observations)
- interpreting-results - Result synthesis and pattern identification (invoked in Phase 5)
- creating-visualizations - Aggregate result visualization (invoked in Phase 5)

Required agents:
- market-researcher - Concept validation via internet research (invoked in Phase 1)

Required knowledge:
Data requirements:
CRITICAL: This is a 6-phase process skill. You MUST complete all phases in order. Use TodoWrite to track progress through each phase.
TodoWrite template:
When starting a marketing-experimentation session, create these todos:
- [ ] Phase 1: Discovery & Asset Inventory
- [ ] Phase 2: Hypothesis Generation
- [ ] Phase 3: Prioritization
- [ ] Phase 4: Experiment Coordination
- [ ] Phase 5: Cross-Experiment Synthesis
- [ ] Phase 6: Iteration Planning
Workspace structure:
All work for a marketing-experimentation session is saved to:
```
analysis/marketing-experimentation/[campaign-name]/
├── 01-discovery.md
├── 02-hypothesis-generation.md
├── 03-prioritization.md
├── 04-experiment-tracker.md
├── 05-synthesis.md
├── 06-iteration-plan.md
└── experiments/
    ├── [experiment-1]/   # hypothesis-testing session
    ├── [experiment-2]/   # hypothesis-testing session
    └── [experiment-3]/   # hypothesis-testing session
```
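This layout can be scaffolded up front. A minimal sketch, where `scaffold_workspace` and its arguments are hypothetical helpers (not part of the skill itself) and the campaign/experiment names are placeholders:

```python
# Sketch: scaffold the campaign workspace layout shown above.
# Function name and arguments are illustrative, not prescribed by this skill.
from pathlib import Path

def scaffold_workspace(campaign: str, experiments: list[str], root: str = "analysis") -> Path:
    """Create the campaign directory, empty phase documents, and experiment folders."""
    base = Path(root) / "marketing-experimentation" / campaign
    base.mkdir(parents=True, exist_ok=True)
    for doc in ["01-discovery.md", "02-hypothesis-generation.md", "03-prioritization.md",
                "04-experiment-tracker.md", "05-synthesis.md", "06-iteration-plan.md"]:
        (base / doc).touch()
    for exp in experiments:
        (base / "experiments" / exp).mkdir(parents=True, exist_ok=True)
    return base
```

For example, `scaffold_workspace("example-campaign", ["experiment-1", "experiment-2"])` creates the full tree under `analysis/`.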
Phase progression rules:
Multi-conversation resumption:
CHECKPOINT: Before proceeding, you MUST have:
Output: `01-discovery.md`
Gather the business concept
Invoke market-researcher agent for concept validation
Dispatch the market-researcher agent with the concept description:
Document agent findings in 01-discovery.md under "Market Research Findings"
Conduct asset inventory
Work with user to inventory existing assets that could be leveraged:
Content Assets:
Campaign Assets:
Audience Assets:
Data Assets:
Define success criteria and validation signals
Work with user to define:
Document known constraints
Capture any constraints that will affect experimentation:
Create 01-discovery.md with: ./templates/01-discovery.md
STOP and get user confirmation
Common Rationalization: "I'll skip discovery and go straight to testing - the concept is obvious" Reality: Discovery surfaces assumptions, constraints, and existing assets that dramatically affect experiment design. Always start with discovery.
Common Rationalization: "I don't need market research - I already know this market" Reality: The market-researcher agent provides current, data-driven validation signals that prevent building experiments around false assumptions. Always validate.
Common Rationalization: "Asset inventory is busywork - I'll figure out what's available as I go" Reality: Existing assets can dramatically reduce experiment cost and time. Inventorying first prevents reinventing wheels and enables building on proven foundations.
CHECKPOINT: Before proceeding, you MUST have:
Output: `02-hypothesis-generation.md`
Generate 5-10 testable hypotheses
For each hypothesis, use this format:
Hypothesis [N]: [Brief statement]
Example hypothesis:
Hypothesis 1: Value proposition clarity drives conversion
Ensure tactic coverage
Verify hypotheses cover multiple marketing tactics:
Acquisition Tactics:
Activation Tactics:
Retention Tactics:
Don't generate 10 hypotheses that all target ads. Aim for diversity across tactics.
Reference experimentation frameworks
Lean Startup Build-Measure-Learn:
AARRR Pirate Metrics:
Map each hypothesis to one or more AARRR stages.
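Tactic and stage coverage can be sanity-checked programmatically before prioritization. In this sketch the hypothesis names, tactics, and AARRR mappings are illustrative placeholders, not values prescribed by the skill:

```python
# Sketch: check that draft hypotheses span multiple tactics and AARRR stages.
# All names and mappings below are illustrative examples.
hypotheses = {
    "H1: Value proposition clarity": {"tactic": "landing page", "aarrr": {"activation"}},
    "H2: Ad targeting refinement": {"tactic": "ads", "aarrr": {"acquisition"}},
    "H3: Email sequence optimization": {"tactic": "email", "aarrr": {"retention", "revenue"}},
}

tactics = {h["tactic"] for h in hypotheses.values()}
stages = set().union(*(h["aarrr"] for h in hypotheses.values()))

# Flag single-tactic concentration before moving on to prioritization
if len(tactics) < 2:
    print("Warning: all hypotheses target one tactic - add diversity")
print("Tactics covered:", sorted(tactics))
print("AARRR stages covered:", sorted(stages))
```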
ICE/RICE Prioritization (used in Phase 3):
Create 02-hypothesis-generation.md with: ./templates/02-hypothesis-generation.md
STOP and get user confirmation
Common Rationalization: "I'll generate hypotheses as I build experiments - more efficient" Reality: Generating hypotheses before prioritization enables strategic selection of highest-impact tests. Generating ad-hoc leads to testing whatever's easiest, not what matters most.
Common Rationalization: "I'll focus all hypotheses on one tactic (ads) since that's what we know" Reality: Tactic diversity reveals which channels work for this concept. Single-tactic testing creates blind spots and missed opportunities.
Common Rationalization: "I'll write vague hypotheses and refine them during experiment design" Reality: Vague hypotheses lead to vague experiments that produce vague results. Specific hypotheses with expected outcomes enable clear signal detection.
Common Rationalization: "More hypotheses = better coverage, I'll generate 20+" Reality: Too many hypotheses dilute focus and create analysis paralysis in prioritization. 5-10 high-quality hypotheses enable strategic selection of 2-4 tests.
CHECKPOINT: Before proceeding, you MUST have:
Output: `03-prioritization.md`
CRITICAL: You MUST use computational methods (Python scripts) to calculate scores. Do NOT estimate or manually calculate scores.
Choose prioritization framework
ICE Framework (simpler, faster):
RICE Framework (more comprehensive):
Choose ICE for speed, RICE for precision when reach varies significantly.
Score each hypothesis using Python script
For ICE Framework:
Create a Python script to compute and sort ICE scores (note: in the standard ICE framework the three factors are multiplied, so higher ease raises priority):

```python
#!/usr/bin/env python3
"""
ICE Score Calculator for Marketing Experimentation
Computes ICE scores: Impact × Confidence × Ease
Sorts hypotheses by score (highest to lowest)
"""

hypotheses = [
    {"id": "H1", "name": "Value proposition clarity drives conversion",
     "impact": 8, "confidence": 7, "ease": 9},
    {"id": "H2", "name": "Ad targeting refinement",
     "impact": 7, "confidence": 6, "ease": 5},
    {"id": "H3", "name": "Email sequence optimization",
     "impact": 6, "confidence": 8, "ease": 8},
    {"id": "H4", "name": "Content marketing expansion",
     "impact": 5, "confidence": 4, "ease": 3},
]

# Calculate ICE scores (each factor is rated 1-10; higher is better)
for h in hypotheses:
    h['ice_score'] = h['impact'] * h['confidence'] * h['ease']

# Sort by ICE score (descending)
sorted_hypotheses = sorted(hypotheses, key=lambda x: x['ice_score'], reverse=True)

# Print results as a markdown table
print("| Hypothesis | Impact | Confidence | Ease | ICE Score | Rank |")
print("|------------|--------|------------|------|-----------|------|")
for rank, h in enumerate(sorted_hypotheses, 1):
    print(f"| {h['id']}: {h['name'][:30]} | {h['impact']} | {h['confidence']} | {h['ease']} | {h['ice_score']} | {rank} |")
```
Usage:
python3 ice_calculator.py
For RICE Framework:
Create a Python script to compute and sort RICE scores:
```python
#!/usr/bin/env python3
"""
RICE Score Calculator for Marketing Experimentation
Computes RICE scores: (Reach × Impact × Confidence) / Effort
Sorts hypotheses by score (highest to lowest)
"""

hypotheses = [
    {"id": "H1", "name": "Value proposition clarity drives conversion",
     "reach": 10000,    # users affected
     "impact": 3,       # 0.25=minimal, 1=low, 2=medium, 3=high, 5=massive
     "confidence": 80,  # percentage (50, 80, 100)
     "effort": 2},      # person-weeks
    {"id": "H2", "name": "Ad targeting refinement",
     "reach": 50000, "impact": 1, "confidence": 50, "effort": 4},
    {"id": "H3", "name": "Email sequence optimization",
     "reach": 5000, "impact": 2, "confidence": 80, "effort": 3},
    {"id": "H4", "name": "Content marketing expansion",
     "reach": 20000, "impact": 1, "confidence": 50, "effort": 8},
]

# Calculate RICE scores
for h in hypotheses:
    confidence_decimal = h['confidence'] / 100  # convert percentage to decimal
    h['rice_score'] = (h['reach'] * h['impact'] * confidence_decimal) / h['effort']

# Sort by RICE score (descending)
sorted_hypotheses = sorted(hypotheses, key=lambda x: x['rice_score'], reverse=True)

# Print results as a markdown table
print("| Hypothesis | Reach | Impact | Confidence | Effort | RICE Score | Rank |")
print("|------------|-------|--------|------------|--------|------------|------|")
for rank, h in enumerate(sorted_hypotheses, 1):
    print(f"| {h['id']}: {h['name'][:30]} | {h['reach']} | {h['impact']} | {h['confidence']}% | {h['effort']}w | {h['rice_score']:.2f} | {rank} |")
```
Usage:
python3 rice_calculator.py
Scoring Guidance:
Impact (1-10 for ICE, 0.25-5 for RICE):
Confidence (1-10 for ICE, 50-100% for RICE):
Ease (1-10 for ICE):
Effort (person-weeks for RICE):
Run scoring script and document results
Update the `hypotheses` list with your actual hypothesis data from Phase 2, run `python3 [ice|rice]_calculator.py`, and document the output in 03-prioritization.md:
## Prioritization Calculation
**Method:** ICE Framework
**Calculation Script:**
```python
[paste full script here]
```

**Results:**

```
[paste output table here]
```
Select 2-4 highest-priority hypotheses
Considerations for selection:
Selection criteria:
Document experiment sequence
Determine execution strategy:
Example sequence plan:
Week 1-2: Launch H1 (landing page) and H3 (email) in parallel
Week 3-4: Analyze H1 and H3 results
Week 5-6: Launch H2 (ads) based on H1 learnings
Week 7-8: Analyze H2 results
Create 03-prioritization.md with: ./templates/03-prioritization.md
STOP and get user confirmation
Common Rationalization: "I'll test all hypotheses - don't want to miss opportunities" Reality: Resource constraints make testing everything impossible. Prioritization ensures highest-value experiments get resources. Unfocused testing produces weak signals across too many fronts.
Common Rationalization: "Scoring is subjective and arbitrary - I'll just pick what feels right" Reality: Scoring frameworks force explicit reasoning about trade-offs. "Feels right" selections optimize for recency bias and personal preference, not business value. Computational methods ensure consistency.
Common Rationalization: "I'll skip prioritization and go straight to easiest test" Reality: Easiest test rarely equals highest value. Prioritization prevents optimizing for ease at the expense of impact.
Common Rationalization: "I'll estimate scores mentally instead of running the script" Reality: Manual estimation introduces calculation errors and inconsistency. Python scripts ensure exact, reproducible results that can be audited and verified.
CHECKPOINT: Before proceeding, you MUST have:
Output: `04-experiment-tracker.md`
CRITICAL: This phase is designed for multi-conversation workflows. The experiment tracker is a LIVING DOCUMENT that you will update throughout experimentation. New conversations should ALWAYS read this file first.
Determine experiment type for each hypothesis
CRITICAL: Before creating the tracker, classify each hypothesis as Quantitative or Qualitative.
Quantitative experiments use hypothesis-testing skill:
Qualitative experiments use qualitative-research skill:
Decision criteria:
Ask: "What do we need to learn?"
Ask: "What data will we collect?"
Mixed methods:
Create experiment tracker
The tracker is your coordination hub for managing multiple experiments over time.
Create 04-experiment-tracker.md with: ./templates/04-experiment-tracker.md
Tracker format:
For each selected hypothesis, create an entry:
### Experiment 1: [Hypothesis Brief Name]
**Status:** [Planned | In Progress | Complete]
**Hypothesis:** [Full hypothesis statement from Phase 2]
**Tactic/Channel:** [landing page | ads | email | etc.]
**Priority Score:** [ICE/RICE score from Phase 3]
**Start Date:** [YYYY-MM-DD or "Not started"]
**Completion Date:** [YYYY-MM-DD or "In progress"]
**Location:** `analysis/marketing-experimentation/[campaign-name]/experiments/[experiment-name]/`
**Signal:** [Positive | Negative | Null | Mixed | "Not analyzed"]
**Key Findings:** [Brief summary when complete, "TBD" otherwise]
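At resumption time, the tracker can be summarized programmatically. A sketch, assuming entries follow the `**Status:**` line format above (the helper name is illustrative):

```python
# Sketch: count experiments by status from 04-experiment-tracker.md
# so resumption options can be offered at the start of a conversation.
import re

def summarize_tracker(markdown: str) -> dict[str, int]:
    """Count tracker entries per status value."""
    statuses = re.findall(r"^\*\*Status:\*\*\s*([A-Za-z ]+)", markdown, flags=re.MULTILINE)
    counts: dict[str, int] = {}
    for status in statuses:
        counts[status.strip()] = counts.get(status.strip(), 0) + 1
    return counts
```

Feeding it the tracker contents yields a status histogram such as `{"Complete": 1, "In Progress": 1, "Planned": 1}`, which maps directly onto the resumption options shown below.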
Invoke appropriate skill for each experiment
For each hypothesis marked "Planned" or "In Progress":
Step 1: Read the hypothesis details from 02-hypothesis-generation.md
Step 2a: If Quantitative experiment, invoke hypothesis-testing skill:
Use hypothesis-testing skill to test: [Hypothesis statement]
Context for hypothesis-testing:
- Session name: [descriptive-name-for-experiment]
- Save location: analysis/marketing-experimentation/[campaign-name]/experiments/[experiment-name]/
- Success criteria: [From Phase 1 discovery]
- Expected outcome: [From Phase 2 hypothesis]
- Metric to measure: [CTR, conversion rate, etc.]
- Data source: [Google Analytics, database, etc.]
Step 2b: If Qualitative experiment, invoke qualitative-research skill:
Use qualitative-research skill to conduct: [Hypothesis statement]
Context for qualitative-research:
- Session name: [descriptive-name-for-experiment]
- Save location: analysis/marketing-experimentation/[campaign-name]/experiments/[experiment-name]/
- Research question: [From Phase 2 hypothesis]
- Collection method: [Interviews | Surveys | Focus Groups | Observations]
- Success criteria: [What insights validate/invalidate hypothesis?]
Step 3: Update experiment tracker:
Step 4a: Let hypothesis-testing skill complete its 5-phase workflow:
Step 4b: Let qualitative-research skill complete its 6-phase workflow:
Step 5: When skill completes, update tracker:
Step 6: Commit tracker updates after each status change
Handle multi-conversation resumption
At the start of EVERY conversation during Phase 4:
Verify 04-experiment-tracker.md exists, then read it.
Example resumption:
I've read the experiment tracker. Current status:
- Experiment 1 (H1: Value prop): Complete, Positive signal
- Experiment 2 (H3: Email sequence): In Progress, currently in hypothesis-testing Phase 3
- Experiment 3 (H2: Ad targeting): Planned, not yet started
What would you like to do?
a) Continue Experiment 2 (in hypothesis-testing Phase 3)
b) Start Experiment 3
c) Move to synthesis (Phase 5) since Experiment 1 is complete
Coordinate parallel vs. sequential experiments
Parallel execution (multiple experiments simultaneously):
Sequential execution (one at a time):
Hybrid execution:
Progress through all experiments
Continue invoking hypothesis-testing (or qualitative-research) and updating the tracker until:
Only when ALL experiments are complete should you proceed to Phase 5.
Common Rationalization: "I'll keep experiment details in my head - the tracker is just busywork" Reality: Multi-day campaigns lose context between conversations. The tracker is the ONLY source of truth that persists across sessions. Without it, you'll re-ask questions and lose progress.
Common Rationalization: "I'll wait until all experiments finish before updating the tracker" Reality: Batch updates create opportunity for lost data. Update the tracker IMMEDIATELY after status changes. Real-time tracking prevents confusion and missed experiments.
Common Rationalization: "I'll design the experiment myself instead of using hypothesis-testing" Reality: hypothesis-testing skill provides rigorous experimental design, statistical analysis, and signal detection. Skipping it produces weak experiments with ambiguous results.
Common Rationalization: "All experiments are done, I don't need to update the tracker before synthesis" Reality: The tracker is your input to Phase 5. Incomplete tracker means incomplete synthesis. Update ALL fields (status, dates, signals, findings) before proceeding.
CHECKPOINT: Before proceeding, you MUST have:
Output: `05-synthesis.md`
CRITICAL: This phase synthesizes results ACROSS multiple experiments. Do NOT proceed until ALL Phase 4 experiments are complete with documented signals.
Verify experiment completion
Read 04-experiment-tracker.md and verify:
If any experiments are incomplete, return to Phase 4 to finish them.
Create aggregate results table
Compile findings from all experiments into a summary table:
Example Aggregate Table (Mixed Quantitative & Qualitative):
| Experiment | Type | Hypothesis | Tactic | Signal | Key Finding | Confidence |
|---|---|---|---|---|---|---|
| E1 | Quant | Value prop clarity | Landing page A/B test | Positive | Conversion rate +18% (p<0.05) | High |
| E2 | Qual | Customer pain points | Discovery interviews | Positive | 8 of 10 cited onboarding complexity | High |
| E3 | Quant | Ad targeting | Ads | Null | CTR +2% (not sig., p=0.12) | Medium |
| E4 | Qual | Ad message resonance | Focus groups | Negative | 6 of 8 found messaging confusing | High |
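Signal distribution can be tallied directly from the aggregate table as an input to pattern identification. The rows below transcribe the example table; in a real campaign you would substitute your own results:

```python
# Sketch: tally signals from the aggregate results table to surface
# cross-experiment patterns. Rows transcribe the example table above.
from collections import Counter

results = [
    {"experiment": "E1", "type": "Quant", "tactic": "Landing page", "signal": "Positive"},
    {"experiment": "E2", "type": "Qual",  "tactic": "Interviews",   "signal": "Positive"},
    {"experiment": "E3", "type": "Quant", "tactic": "Ads",          "signal": "Null"},
    {"experiment": "E4", "type": "Qual",  "tactic": "Focus groups", "signal": "Negative"},
]

signal_counts = Counter(r["signal"] for r in results)
print(dict(signal_counts))  # {'Positive': 2, 'Null': 1, 'Negative': 1}
```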
For Quantitative experiments (hypothesis-testing):
For Qualitative experiments (qualitative-research):
Invoke presenting-data skill for comprehensive synthesis
Use the presenting-data skill to create complete synthesis with visualizations and presentation materials:
Use presenting-data skill to synthesize marketing experimentation results:
Context:
- Campaign: [campaign name]
- Experiments completed: [count]
- Results table: [paste aggregate table]
- Audience: [stakeholders/decision-makers]
- Format: [markdown report | slides | whitepaper]
- Focus: Pattern identification across experiments (what works, what doesn't, what's unclear)
presenting-data skill will handle:
Focus areas for synthesis:
Document patterns and insights
The presenting-data skill will create 05-synthesis.md (or slides/whitepaper) with:
What Worked (Positive Signals):
What Didn't Work (Negative Signals):
What's Unclear (Null/Mixed Signals):
Cross-Experiment Patterns:
Visualizations (created by presenting-data):
Classify overall campaign signal
Based on aggregate analysis (from presenting-data output), classify the campaign:
Positive: Campaign validates concept, proceed to scaling
Negative: Campaign invalidates concept, pivot or abandon
Null: Campaign results inconclusive, needs refinement
Mixed: Some aspects work, some don't, iterate strategically
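One possible decision rule for this classification can be sketched as follows; the thresholds are illustrative assumptions, not prescribed by this skill, and the final call should still weigh effect sizes and confidence:

```python
# Sketch: map per-experiment signals to a campaign-level signal.
# The rule below is an illustrative assumption, not a prescribed threshold.
def classify_campaign(signals: list[str]) -> str:
    """Classify a campaign as Positive, Negative, Null, or Mixed."""
    pos = signals.count("Positive")
    neg = signals.count("Negative")
    if pos == len(signals):
        return "Positive"   # every experiment validated the concept
    if neg == len(signals):
        return "Negative"   # every experiment invalidated the concept
    if pos == 0 and neg == 0:
        return "Null"       # nothing conclusive either way
    return "Mixed"          # some aspects work, some don't

print(classify_campaign(["Positive", "Positive", "Null", "Negative"]))  # prints "Mixed"
```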
Review presenting-data output and finalize synthesis
After presenting-data skill completes:
STOP and get user confirmation
Common Rationalization: "I'll synthesize results mentally - no need to document patterns" Reality: Mental synthesis loses details and creates false confidence. Documented synthesis with presenting-data skill ensures intellectual honesty and identifies confounding factors you'd otherwise miss.
Common Rationalization: "I'll skip synthesis for experiments with clear signals" Reality: Individual experiment signals don't reveal cross-experiment patterns. Synthesis identifies why some tactics work while others don't - the strategic insight that guides iteration.
Common Rationalization: "Visualization is optional - the data speaks for itself" Reality: Tabular data obscures patterns. Visualization reveals signal distribution, effect size clusters, and confidence patterns that inform strategic decisions. presenting-data handles this systematically.
CHECKPOINT: Before proceeding, you MUST have:
Output: `06-iteration-plan.md`
CRITICAL: Phase 6 generates experiment IDEAS, NOT hypotheses. Ideas feed into new marketing-experimentation sessions where Phase 2 formalizes hypotheses. Do NOT skip the discovery and hypothesis generation steps.
Generate 3-7 new experiment ideas
Based on Phase 5 synthesis, generate ideas for next iteration:
Idea Format:
Idea [N]: [Brief descriptive name]
Example Ideas:
Idea 1: Scale value prop landing page to paid ads
Idea 2: Investigate email sequence timing sensitivity
Idea 3: Pivot from broad ad targeting to lookalike audiences
Categorize ideas by strategy
Scale Winners:
Investigate Nulls:
Pivot from Failures:
Explore New:
Document campaign-level signal and strategic recommendation
Campaign Summary:
Strategic Recommendations by Signal:
Positive Signal:
Negative Signal:
Null Signal:
Mixed Signal:
Explain feed-forward pattern
CRITICAL: Ideas from Phase 6 are NOT ready for testing. They MUST go through a new marketing-experimentation session:
Feed-Forward Cycle:
Phase 6 generates IDEAS
↓
Start new marketing-experimentation session with idea
↓
Phase 1: Discovery (validate idea with market-researcher)
↓
Phase 2: Hypothesis Generation (formalize idea into testable hypotheses)
↓
Phase 3-6: Complete full experimental cycle
Why this matters:
Example: Phase 6 generates "Scale value prop to ads" (idea) → New session Phase 1: Market research on ad platform best practices → New session Phase 2: Generate hypotheses like "H1: Simplified value prop in ad headline increases CTR by 10%+" (specific, testable) → Continue with Phase 3-6 to test formal hypotheses
Create 06-iteration-plan.md with: ./templates/06-iteration-plan.md
STOP and review with user
Common Rationalization: "I'll turn ideas directly into experiments - skip the new session" Reality: Ideas need discovery and hypothesis generation. Skipping these steps leads to untested assumptions and vague experiments. Always run ideas through a new marketing-experimentation session.
Common Rationalization: "I'll generate hypotheses in Phase 6 for efficiency" Reality: Phase 6 generates IDEAS, Phase 2 (in a new session) generates hypotheses. Conflating these skips critical validation and formalization steps. Ideas → new session → hypotheses.
Common Rationalization: "Campaign signal is obvious from results, no need to document strategic recommendation" Reality: Documented recommendation provides clear guidance for stakeholders and future sessions. Without it, insights are lost and decisions become ad-hoc.
These are rationalizations that lead to failure. When you catch yourself thinking any of these, STOP and follow the skill process instead.
Why this fails: Discovery surfaces assumptions, constraints, and existing assets that dramatically affect experiment design. "Obvious" concepts often hide critical assumptions that need validation.
Reality: Market-researcher agent provides current, data-driven validation signals. Asset inventory reveals resources that reduce experiment cost and time. Success criteria definition prevents ambiguous results. Always start with discovery.
What to do instead: Complete Phase 1 (Discovery & Asset Inventory) before generating hypotheses. Invoke market-researcher agent. Document all findings.
Why this fails: The research skills provide rigorous experimental design, analysis, and signal detection. hypothesis-testing ensures statistical rigor for quantitative experiments. qualitative-research ensures systematic rigor for qualitative experiments. Skipping them produces weak experiments with ambiguous results.
Reality: Marketing-experimentation is a meta-orchestrator that coordinates multiple experiments. It does NOT design experiments itself. Delegation to appropriate skills (hypothesis-testing or qualitative-research) ensures methodological rigor.
What to do instead: Determine experiment type (quantitative or qualitative) in Phase 4. Invoke hypothesis-testing skill for quantitative experiments. Invoke qualitative-research skill for qualitative experiments. Let the appropriate skill handle all design, execution, and analysis.
Why this fails: Single experiments miss cross-experiment patterns. Some tactics work, others don't. Single-experiment campaigns can't identify which channels/tactics are most effective.
Reality: Marketing-experimentation tests 2-4 hypotheses to reveal strategic insights. Synthesis (Phase 5) identifies patterns across experiments - which tactics work, which don't, and why.
What to do instead: Follow Phase 3 prioritization to select 2-4 hypotheses. Complete all experiments before synthesis. Use Phase 5 to identify patterns.
Why this fails: Batch updates create opportunity for lost data. Multi-day campaigns lose context between conversations. Incomplete tracker leads to missed experiments and confusion.
Reality: The experiment tracker (04-experiment-tracker.md) is the ONLY source of truth that persists across sessions. Update it IMMEDIATELY after status changes.
What to do instead: Update tracker after every status change (Planned → In Progress, In Progress → Complete). Commit tracker updates to git. Read tracker FIRST in every new conversation.
Why this fails: Individual experiment signals don't reveal cross-experiment patterns. "Obvious" interpretations miss confounding factors and alternative explanations.
Reality: Documented synthesis with presenting-data skill ensures intellectual honesty. Visualization reveals patterns. Statistical assessment identifies robust vs uncertain findings.
What to do instead: Always complete Phase 5 (Cross-Experiment Synthesis). Invoke presenting-data skill. Document patterns, visualizations, and signal classification. Get user confirmation.
Why this fails: Phase 6 generates IDEAS, not hypotheses. Ideas need discovery (Phase 1) and hypothesis generation (Phase 2) in new sessions. Skipping these steps leads to untested assumptions and vague experiments.
Reality: Feed-forward cycle: Phase 6 ideas → new marketing-experimentation session → Phase 1 discovery → Phase 2 hypothesis generation → Phase 3-6 complete cycle.
What to do instead: Generate IDEAS in Phase 6. Start NEW marketing-experimentation session with selected idea. Complete Phase 1 and Phase 2 to formalize idea into testable hypotheses.
Why this fails: Manual estimation introduces calculation errors and inconsistency. Mental math is unreliable for multiplication and division.
Reality: Python scripts ensure exact, reproducible results that can be audited and verified. Computational methods eliminate human error.
What to do instead: Use Python scripts (ICE or RICE calculator) from Phase 3 instructions. Update hypothesis data in script. Run script and document exact scores. Copy output table to prioritization document.
Why this fails: Mental synthesis loses details and creates false confidence. Cross-experiment patterns require systematic analysis.
Reality: presenting-data skill handles pattern identification (via interpreting-results), visualization creation (via creating-visualizations), and synthesis documentation. It ensures intellectual honesty and reproducibility.
What to do instead: Always invoke presenting-data skill in Phase 5. Provide aggregate results table. Request pattern analysis and visualizations. Document all findings from presenting-data output.
The marketing-experimentation skill ensures rigorous, evidence-based validation of marketing concepts through structured experimental cycles. This skill orchestrates the complete Build-Measure-Learn loop from concept to data-driven signal.
What this skill ensures:
Validated concepts through market research - market-researcher agent provides current demand signals, competitive landscape analysis, and audience insights before experimentation begins.
Strategic hypothesis generation - 5-10 testable hypotheses spanning multiple tactics (landing pages, ads, email, content) grounded in discovery findings and mapped to experimentation frameworks (Lean Startup, AARRR).
Data-driven prioritization - Computational methods (ICE/RICE Python scripts) ensure exact, reproducible scoring. Selection of 2-4 highest-value hypotheses optimizes resource allocation.
Multi-experiment coordination - Experiment tracker (living document) enables multi-conversation workflows spanning days or weeks. Status tracking (Planned, In Progress, Complete) maintains visibility across all experiments. Supports both quantitative and qualitative experiment types.
Methodological rigor through delegation - hypothesis-testing skill handles quantitative experiment design (statistical analysis, A/B tests, metrics). qualitative-research skill handles qualitative experiment design (interviews, surveys, focus groups, observations, thematic analysis). Marketing-experimentation coordinates multiple tests without duplicating methodology.
Cross-experiment synthesis - presenting-data skill identifies patterns across experiments (what works, what doesn't, what's unclear). Aggregate analysis reveals strategic insights invisible in single experiments.
Clear signal generation - Campaign-level classification (Positive/Negative/Null/Mixed) with strategic recommendations (Scale/Pivot/Refine/Pause) provides actionable guidance for stakeholders.
Systematic iteration - Phase 6 generates experiment IDEAS (not hypotheses) that feed into new marketing-experimentation sessions. Feed-forward cycle maintains rigor through repeated discovery and hypothesis generation.
Multi-conversation persistence - Complete documentation at every phase enables resumption after days or weeks. Experiment tracker serves as coordination hub. All artifacts are git-committable.
Tool-agnostic approach - Focuses on techniques (value proposition testing, targeting strategies, sequence optimization) rather than specific platforms. Applicable across marketing tools and channels.
Key principles: