Component skill for systematic result interpretation with intellectual honesty in DataPeeker analysis sessions
Systematically interprets query results through a 6-phase framework: establishing context, describing patterns objectively, generating alternative explanations, assessing significance, stating conclusions with caveats, and identifying follow-up questions. Use when analyzing query outputs to avoid premature conclusions and ensure intellectual honesty.
/plugin marketplace add tilmon-engineering/claude-skills
/plugin install datapeeker@tilmon-eng-skills

This skill inherits all available tools. When active, it can use any tool Claude has access to.
This component skill guides rigorous, intellectually honest interpretation of query results. Use it when:
- Interpreting query outputs (typically produced via the writing-queries skill)
- Drawing conclusions after data exploration (via the understanding-data skill)

Create a TodoWrite checklist for the 6-step interpretation framework:
Phase 1: Understand Context
Phase 2: Describe Patterns
Phase 3: Generate Alternative Explanations
Phase 4: Assess Significance
Phase 5: State Conclusions with Caveats
Phase 6: Identify Follow-up Questions
Mark each phase as you complete it. Document all interpretations in numbered markdown files.
Goal: Ground interpretation in business and analytical context before looking for patterns.
Before interpreting results, explicitly state:
## Context for Interpretation
**Original Question:** [What we set out to answer]
**Why This Matters:** [Business context and decisions that depend on this answer]
**Hypothesis (if applicable):** [What we expected to find, and why]
**Data Period:** [Time range covered by results]
**Filters Applied:** [Any exclusions or subsets]
**Known Data Limitations:** [Quality issues, missing data, coverage gaps]
Ask yourself:
What was happening during this time period?
How might this affect results?
What context is needed to interpret these numbers?
Document: External factors that might explain or confound results.
Before calling results "good" or "bad", define what you're comparing to:
## Success Criteria
**Comparing to:**
- Historical baseline: [e.g., "Q1 2023 had $500K revenue"]
- Target/goal: [e.g., "Target was 10% growth"]
- Industry benchmark: [e.g., "Industry average conversion is 2.5%"]
- Control group: [e.g., "Comparing treatment to control segment"]
**Threshold for meaningful difference:**
- [e.g., "Need >5% difference to be operationally significant"]
Don't proceed to pattern identification without context.
Goal: Objectively describe what the data shows before explaining why.
Describe results using neutral, factual language:
DO: use neutral, specific statements (e.g., "Friday accounted for 16.2% of weekly sales").
DON'T: use loaded or causal language (e.g., "Fridays performed great because customers love payday").
Categorize what you observe:
Magnitude patterns:
- Absolute values: [e.g., "Total revenue: $1.2M"]
- Relative comparisons: [e.g., "Region A is 3x larger than Region B"]
- Distributions: [e.g., "80% of revenue from 20% of customers"]
Time patterns:
- Trends: [e.g., "Monthly growth averaged 5% over 6 months"]
- Cycles: [e.g., "Weekly pattern peaks mid-week, dips on weekends"]
- Anomalies: [e.g., "March 15 spike to 3x normal daily volume"]
Relationship patterns:
- Correlations: [e.g., "Higher price segments have lower order counts"]
- Segments: [e.g., "B2B customers have 4x higher average order value than B2C"]
- Thresholds: [e.g., "Sharp drop-off in conversion above $100 price point"]
Use specific numbers, not vague terms:
Vague: "Sales increased significantly"
Precise: "Sales increased 23% (1,234 → 1,518 units), a gain of 284 units"

Vague: "Most customers prefer Category A"
Precise: "58% of customers (2,340 of 4,034) purchased from Category A"

Vague: "There's a big difference between segments"
Precise: "Segment 1 average: $127, Segment 2 average: $89, difference of $38 (43%)"
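This precision rule is easy to make habitual with a small formatting helper that always reports absolute values, the delta, and the percentage together (a sketch; the function name is ours, not from any library):

```python
def describe_change(label, before, after):
    """Report a change with absolute values, delta, and percentage."""
    delta = after - before
    pct = 100.0 * delta / before
    return f"{label}: {before:,} -> {after:,} ({delta:+,}, {pct:+.1f}%)"

print(describe_change("Units sold", 1234, 1518))
# Units sold: 1,234 -> 1,518 (+284, +23.0%)
```

Reporting all three numbers at once prevents the common trap of quoting a large percentage that hides a small absolute change, or vice versa.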
Before accepting patterns as real, verify they're not artifacts:
```sql
-- Verify: Is this pattern real or a data issue?

-- Check for row count consistency
SELECT COUNT(*) FROM results;  -- Does this match expectations?

-- Check for NULL inflation
SELECT COUNT(*) - COUNT(column_name) AS null_count FROM results;

-- Check for duplicate records
SELECT COUNT(*) AS total_rows, COUNT(DISTINCT id) AS unique_ids FROM results;

-- Check for incomplete periods
SELECT MIN(date), MAX(date), COUNT(DISTINCT date) AS date_count FROM results;
```
Document: Any data quality issues that might create misleading patterns.
Goal: Consider multiple explanations before committing to one.
This is the most critical phase for intellectual honesty. Premature explanation is the enemy of good analysis.
For each pattern identified, generate at least 3 possible explanations:
## Pattern: Friday has 16.2% of weekly sales vs 12.8% on Sunday
### Possible Explanations:
1. **Consumer behavior:** People shop more on Fridays (payday, preparing for weekend)
- Testable: Do other metrics (sessions, conversion rate) also peak Friday?
2. **Business operations:** We run promotions on Fridays
- Testable: Check promotion calendar, compare promoted vs non-promoted Fridays
3. **Data artifact:** Incomplete weeks in dataset skew day-of-week calculation
- Testable: Count how many of each weekday are in dataset
4. **Seasonality interaction:** Dataset includes holiday weeks where Friday patterns differ
- Testable: Split analysis into holiday vs non-holiday weeks
5. **Geographic mix:** Different time zones make "Friday" broader (Friday in US, Saturday in Asia)
- Testable: Segment by customer timezone if available
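Explanation 3 (incomplete weeks) is cheap to test directly: count how many of each weekday the dataset actually contains. A sketch using only the standard library:

```python
from collections import Counter
from datetime import date, timedelta

def weekday_coverage(start, end):
    """Count occurrences of each weekday in [start, end] inclusive.
    Unequal counts mean raw day-of-week totals are biased by coverage."""
    counts = Counter()
    d = start
    while d <= end:
        counts[d.strftime("%A")] += 1
        d += timedelta(days=1)
    return counts

# A 10-day window inevitably over-represents some weekdays
print(weekday_coverage(date(2024, 1, 1), date(2024, 1, 10)))
```

If the counts are unequal, compare per-day *averages* rather than per-day totals before drawing any day-of-week conclusion.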
For your preferred explanation, actively argue against it:
## Preferred Explanation: Customers shop more on Fridays due to payday
### Why this might be WRONG:
- Many customers have direct deposit on different days
- Weekend shopping (Sat/Sun) should be higher if it's leisure time
- No evidence yet that Friday shoppers are paid-on-Friday workers
- Pattern might be driven by small number of large B2B orders
- Could be specific to this dataset's time period only
### What would convince me this is RIGHT:
- [ ] Friday pattern consistent across multiple months/quarters
- [ ] Segmentation shows pattern strongest for consumer (not B2B) purchases
- [ ] Individual customer purchase history shows Friday preference
- [ ] Pattern persists after removing outlier large orders
- [ ] Industry data confirms Friday shopping peak
Identify factors that might explain the pattern instead of your hypothesis:
Template:
## Potential Confounds
1. **[Confound name]:** [How it could explain the pattern]
- Test: [How to rule this in/out]
Example:
1. **Marketing send schedule:** Email campaigns go out Thursday, driving Friday purchases
- Test: Compare Friday sales on campaign weeks vs non-campaign weeks
2. **Product mix:** High-value products launched mid-dataset period
- Test: Segment analysis into before/after product launch
3. **Measurement error:** Weekend orders processed Monday, suppressing Sunday counts
- Test: Check order_date vs processed_date, validate weekend entries
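The measurement-error confound above can be checked mechanically. A sketch using an in-memory SQLite table (the column names `order_date` and `processed_date` are assumptions about the schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, order_date TEXT, processed_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, "2024-03-09", "2024-03-11"),  # Saturday order, processed Monday
    (2, "2024-03-10", "2024-03-11"),  # Sunday order, processed Monday
    (3, "2024-03-11", "2024-03-11"),  # Monday order, processed same day
])

# Weekend orders whose processing date falls on a weekday would be
# miscounted if the analysis groups by processed_date.
rows = conn.execute("""
    SELECT id
    FROM orders
    WHERE strftime('%w', order_date) IN ('0', '6')         -- placed on a weekend
      AND strftime('%w', processed_date) NOT IN ('0', '6') -- processed on a weekday
""").fetchall()
print(rows)  # orders 1 and 2 would shift out of the weekend
```

A nonzero count here means the day-of-week analysis must group by `order_date`, not `processed_date`.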
Distinguish between:
## Explanation Type
This pattern is likely caused by:
- [ ] Single dominant factor (identify the ONE cause)
- [x] Multiple contributing factors (list all contributors)
If multiple factors:
- Factor 1: [Estimated contribution]
- Factor 2: [Estimated contribution]
- Factor 3: [Estimated contribution]
Testable: Can we quantify each factor's contribution?
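One quick sanity check on a multi-factor story is whether the estimated contributions actually add up to the observed effect (all figures below are illustrative assumptions, not results from the analysis):

```python
# Friday's observed share minus the uniform expectation (16.2% vs 14.3%)
observed_lift = 0.162 - 0.143

# Hypothetical estimated contributions per factor
factors = {"promotions": 0.010, "payday_shopping": 0.006, "b2b_orders": 0.003}
explained = sum(factors.values())
residual = observed_lift - explained  # unexplained remainder

print(f"observed={observed_lift:.3f} explained={explained:.3f}")
```

A large residual signals a missing factor; contributions summing to more than the observed effect signal double-counting.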
Goal: Determine if patterns are meaningful before acting on them.
While we can't run formal statistical tests without additional tools, we can reason about significance:
## Significance Assessment
**Sample size:**
- [e.g., "Based on 10,234 orders over 90 days"]
- [Is this enough data to trust the pattern?]
**Effect size:**
- [e.g., "15% difference between segments"]
- [Is the difference large enough to matter?]
**Consistency:**
- [e.g., "Pattern appears in 8 of 10 months"]
- [Is this stable or fluctuating wildly?]
**Variance:**
- [e.g., "Group A: 100 ± 45, Group B: 150 ± 12"]
- [Do ranges overlap significantly?]
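The variance question can be approximated without formal tests: if the mean ± 1 standard deviation ranges of two groups overlap, treat the difference as weak evidence. A rough sketch using the standard library:

```python
from statistics import mean, stdev

def ranges_overlap(a, b):
    """True if the mean +/- 1 stdev ranges of two samples overlap."""
    lo_a, hi_a = mean(a) - stdev(a), mean(a) + stdev(a)
    lo_b, hi_b = mean(b) - stdev(b), mean(b) + stdev(b)
    return max(lo_a, lo_b) <= min(hi_a, hi_b)

group_a = [55, 100, 145, 60, 140]    # mean 100, wide spread
group_b = [138, 150, 162, 145, 155]  # mean 150, tight spread
print(ranges_overlap(group_a, group_b))  # True: difference may not be meaningful
```

This is a heuristic, not a significance test: non-overlapping one-sigma ranges are suggestive, not conclusive, and overlapping ones call for more data before acting.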
Even if a pattern is statistically real, is it actionable?
Questions to ask:
Is the difference large enough to matter operationally?
- Finding: Segment A has 2% higher conversion than Segment B
- Volume: Segment B is 50x larger
- Practical significance: Optimizing Segment B (even with lower rate) has 25x more impact than Segment A
- Conclusion: Pattern is real but not the priority
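The volume-versus-rate trade-off in the example above is worth making explicit with arithmetic (the segment sizes and the achievable lift are illustrative assumptions):

```python
seg_a = {"visitors": 1_000, "conversion": 0.05}   # small, higher-converting
seg_b = {"visitors": 50_000, "conversion": 0.03}  # 50x larger, lower-converting

lift = 0.01  # assume the same 1-point conversion lift is achievable in either
extra_a = seg_a["visitors"] * lift
extra_b = seg_b["visitors"] * lift
print(extra_a, extra_b, extra_b / extra_a)
```

Under these assumptions the same effort applied to the larger segment yields 50x the incremental conversions, which is why effect size must always be weighed against volume.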
Can we actually act on this finding?
- Finding: Customers in ZIP codes starting with "9" have higher LTV
- Actionability: We can't control customer ZIP codes
- Practical significance: Low - interesting but not actionable
- Alternative: Look for correlated factors we CAN influence
What's the cost/benefit of acting?
- Finding: 3% revenue increase if we extend hours to 10pm
- Cost: Staffing, utilities for extra 2 hours
- Benefit: 3% of $1M = $30K annual revenue increase
- Margin: Estimated $8K net profit after costs
- Assessment: Marginally significant, requires testing
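The cost/benefit arithmetic from the example, spelled out (the $22K cost figure is an assumption chosen to reconcile the stated $30K gross benefit with the $8K net):

```python
annual_revenue = 1_000_000
uplift = 0.03

gross_benefit = annual_revenue * uplift  # $30K gross revenue increase
extra_cost = 22_000                      # assumed staffing + utilities for 2 extra hours
net_profit = gross_benefit - extra_cost  # $8K net, as in the example

print(f"gross=${gross_benefit:,.0f} net=${net_profit:,.0f}")
# gross=$30,000 net=$8,000
```

Note how thin the margin is relative to the gross figure: a modest error in the cost estimate flips the decision, which is why the example calls for testing rather than rollout.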
Without formal confidence intervals, reason about uncertainty:
## Uncertainty Assessment
**Data quality confidence:** [High/Medium/Low]
- [Any known issues with data accuracy?]
**Sample representativeness:** [High/Medium/Low]
- [Does this sample represent the broader population?]
- [Any selection bias in how data was collected?]
**Temporal stability:** [High/Medium/Low]
- [Will this pattern hold next month? Next year?]
- [Dependent on temporary conditions?]
**Overall confidence in pattern:** [High/Medium/Low]
- [Considering all factors, how confident are we this is real?]
Goal: Make clear, hedged claims that accurately represent certainty level.
Match your language to your confidence level:
High confidence: "The data shows...", "X increased 23%"
Medium confidence: "The data suggests...", "X appears to be driven by..."
Low confidence: "One possible interpretation is...", "X may indicate..., though evidence is limited"
Inappropriate hedging: stacked empty qualifiers ("it could possibly maybe suggest...") that convey nothing about your actual level of confidence
Structure conclusions to distinguish facts from interpretations:
## Conclusions
### What We Observed (Facts)
- Friday sales averaged 16.2% of weekly total (vs 14.3% expected if uniform)
- This pattern appeared in 11 of 12 months studied
- Friday average order value ($127) is similar to weekly average ($124)
- Friday transaction count is 18% higher than Sunday (2,340 vs 1,980)
### What We Infer (Interpretations)
- Friday traffic increase (not AOV increase) drives higher sales
- Pattern is stable across most months (except December outlier)
- This is likely a behavioral pattern, not a pricing or promotion effect
### Confidence Level: Medium-High
- Strong evidence for Friday traffic pattern
- Insufficient data on WHY (customer motivation unclear)
- Need to rule out confounds (marketing calendar, staffing changes)
Every conclusion should include what you DON'T know:
## Caveats and Limitations
**What this analysis does NOT tell us:**
- [e.g., "We don't know if Friday shoppers are different people or the same people shopping more frequently"]
- [e.g., "We can't determine causation from this correlation"]
- [e.g., "This dataset doesn't include abandoned carts, only completed purchases"]
**Assumptions we made:**
- [e.g., "Assumed all timestamps are in local timezone"]
- [e.g., "Treated returns as separate from original purchase date"]
**Data quality concerns:**
- [e.g., "First two weeks of January had incomplete data"]
- [e.g., "Product category field was NULL for 5% of records"]
**Generalizability limits:**
- [e.g., "This analysis covers only online sales, not in-store"]
- [e.g., "Time period includes major pandemic shifts, may not represent normal behavior"]
Force yourself to state the implications clearly:
## Implications
**For decision-makers:**
- [What should they DO differently based on this finding?]
- [What should they STOP doing?]
- [What remains uncertain that needs more investigation?]
Example:
**For marketing team:**
- Consider scheduling campaigns for Thursday delivery (to catch Friday traffic)
- Don't assume Friday success will translate to other days
- Test: Run A/B test with campaign timing to validate causal relationship
**For ops team:**
- Current Friday staffing appears adequate (no degradation in service metrics)
- Monitor: Watch for Friday capacity constraints as volume grows
**For analytics team:**
- Investigate: Why do customers shop more on Fridays?
- Build: Day-of-week forecasting model to improve inventory planning
Goal: Turn conclusions into next analytical steps.
Good analysis creates more questions than it answers:
## Follow-up Questions
### Questions to deepen understanding:
1. [Question that drills into WHY pattern exists]
- Data needed: [What data would answer this?]
- Query approach: [How would we query for this?]
2. [Question that tests alternative explanation]
- Data needed: [...]
- Query approach: [...]
### Questions to test generalizability:
3. [Does this pattern hold in different segments?]
- Segment by: [Customer type, product category, region, etc.]
4. [Is this pattern stable over time?]
- Test: [Earlier time periods, recent vs historical]
### Questions to assess actionability:
5. [Can we influence this pattern?]
- Experiment: [What intervention could we test?]
6. [What's the ROI of acting on this finding?]
- Calculate: [Revenue impact, cost, net benefit]
Not all questions are equally valuable:
## Question Prioritization
**High Priority (Do Next):**
- [Questions that directly inform pending decisions]
- [Questions that could refute our main conclusion]
- [Questions that are cheap/fast to answer with existing data]
**Medium Priority (Do Eventually):**
- [Questions that deepen understanding but don't change decisions]
- [Questions that require additional data collection]
**Low Priority (Backlog):**
- [Interesting but not actionable]
- [Questions that would take significant effort for marginal insight]
For high-priority questions, sketch the analysis:
## Proposed Follow-up Analysis
**Question:** Does Friday pattern vary by customer segment?
**Hypothesis:** Business customers drive Friday peak (ordering for next week)
**Data needed:**
- Customer segment field (B2B vs B2C)
- Order data with day-of-week already calculated
**Query approach:**
```sql
-- Compare day-of-week patterns by segment
SELECT
    customer_segment,
    day_of_week,
    COUNT(*) AS order_count,
    SUM(amount) AS revenue,
    ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (PARTITION BY customer_segment), 2) AS pct_of_segment
FROM orders_with_dow
GROUP BY customer_segment, day_of_week
ORDER BY customer_segment, day_of_week;
```
**Expected outcome:** [What each possible result would imply for the hypothesis]
**Decision impact:** [What we would do differently depending on the result]
---
## Documentation Requirements
After completing all 6 phases, create an interpretation summary:
```markdown
## Result Interpretation Summary
### Context
- Question: [What we set out to answer]
- Data: [What we analyzed]
- Time period: [Coverage]
### Key Findings
1. [Finding 1 with supporting numbers]
2. [Finding 2 with supporting numbers]
3. [Finding 3 with supporting numbers]
### Interpretation
[2-3 paragraph narrative explaining what findings mean]
### Confidence Assessment
- Overall confidence: [High/Medium/Low]
- Key uncertainties: [What remains unknown]
- Supporting evidence: [What makes us confident]
- Contradicting evidence: [What makes us uncertain]
### Caveats
- [Limitation 1]
- [Limitation 2]
- [Limitation 3]
### Recommendations
1. [Actionable recommendation with rationale]
2. [Actionable recommendation with rationale]
3. [Further investigation needed]
### Follow-up Questions
- High priority: [Question 1]
- High priority: [Question 2]
- Medium priority: [Question 3]
```
**Pitfall: Confirmation bias**
Problem: Seeing what you expect to see, ignoring contradictory evidence.
Example:
Prevention:
**Pitfall: Correlation mistaken for causation**
Problem: Assuming that because A and B move together, A causes B.
Example:
Prevention:
**Pitfall: Cherry-picking**
Problem: Highlighting patterns that support your story, hiding those that don't.
Example:
Prevention:
**Pitfall: The Texas sharpshooter fallacy**
Problem: Finding patterns in noise, then creating explanations post-hoc.
Example:
Prevention:
**Pitfall: Percentages without base rates**
Problem: Misinterpreting percentages without considering absolute numbers.
Example:
Prevention:
**Pitfall: Simpson's paradox**
Problem: Aggregate trends that reverse when data is segmented.
Example:
Prevention:
**Pitfall: Survivorship bias**
Problem: Analyzing only data that "survived" to be recorded, missing the full picture.
Example:
Prevention:
Re-run portions of this skill when:
Process skills reference this component skill with:
Use the `interpreting-results` component skill to systematically interpret query outputs, ensuring intellectual honesty and avoiding premature conclusions.
This ensures analysts work through every phase of the framework instead of jumping from results to conclusions.
Rigorous interpretation is the difference between data analysis and data-driven storytelling.