Guide the user through building retention analyses and retention curves to understand user behavior over time.
Guide users through building retention analyses to understand user behavior over time. Choose from three analysis types: classic retention curves (API), cohort behavior comparisons (SQL), or time-to-event analysis to measure conversion speed.
/plugin marketplace add jaredmcfarland/mixpanel_data/plugin install mixpanel-data@mixpanel-dataGuide the user through building retention analyses and retention curves to understand user behavior over time.
Verify credentials are configured:
!$(mp auth test 2>&1 || echo "No credentials configured")
If not configured, suggest running /mp-auth first.
Present three analysis types:
Ask which type the user wants to perform.
Classic retention: What % of users return to perform an action?
Born Event: Event that defines cohort membership
$1 if provided, otherwise askExamples:
Return Event: Event indicating user returned
$2 if provided, otherwise askExamples:
Date Range:
Time Unit:
day - Daily cohorts (good for recent data)week - Weekly cohorts (recommended for most analyses)month - Monthly cohorts (good for long-term trends)Retention Intervals:
Filters (optional):
--born-where: Filter cohort entry (e.g., only organic signups)--return-where: Filter return events (e.g., only purchases >$50)Basic retention:
mp query retention \
--born "<born-event>" \
--return "<return-event>" \
--from <from-date> \
--to <to-date> \
--unit <day|week|month> \
--format table
With filters:
mp query retention \
--born "Sign Up" \
--return "Purchase" \
--from 2024-01-01 \
--to 2024-01-31 \
--unit week \
--interval-count 12 \
--born-where 'properties["source"] == "organic"' \
--return-where 'properties["amount"] > 50' \
--format table
Key metrics to highlight:
Day 0 Retention: % who return on the same day/week/month
📊 Day 0: X% - This is your immediate activation rate
Day 1/Week 1 Retention: First return after initial period
📊 Week 1: X% - Critical metric for product stickiness
Long-term Retention: Retention at final interval
📊 Week 11: X% - Shows long-term value
Retention Curve Shape:
Offer to create visualization:
import pandas as pd
import matplotlib.pyplot as plt
# Assuming retention data from query
intervals = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
retention_rates = [100, 45, 38, 35, 32, 30, 28, 27, 26, 25, 24, 24]
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(intervals, retention_rates, marker='o', linewidth=2, markersize=8)
ax.set_xlabel('Weeks Since Cohort Birth')
ax.set_ylabel('Retention Rate (%)')
ax.set_title('User Retention Curve')
ax.grid(True, alpha=0.3)
ax.axhline(y=retention_rates[-1], color='r', linestyle='--', alpha=0.5, label=f'Stabilized at {retention_rates[-1]}%')
plt.legend()
plt.tight_layout()
plt.savefig('retention_curve.png')
print("Retention curve saved to retention_curve.png")
Compare how different cohorts behave over time.
!$(mp inspect tables --format table)
If no tables, suggest /mp-fetch first. Run /mp-inspect tables to explore local data structure.
Cohort definition options:
Time-based cohorts:
Property-based cohorts:
Ask the user which cohort dimension to analyze.
Weekly Cohort Analysis:
WITH user_cohorts AS (
-- Assign each user to their cohort (week of first event)
SELECT
distinct_id,
DATE_TRUNC('week', MIN(event_time)) as cohort_week
FROM <table>
GROUP BY distinct_id
),
cohort_activity AS (
-- Track activity by cohort and activity week
SELECT
c.cohort_week,
DATE_TRUNC('week', e.event_time) as activity_week,
COUNT(DISTINCT e.distinct_id) as active_users
FROM <table> e
JOIN user_cohorts c ON e.distinct_id = c.distinct_id
GROUP BY c.cohort_week, activity_week
),
cohort_sizes AS (
-- Get cohort sizes
SELECT
cohort_week,
COUNT(DISTINCT distinct_id) as cohort_size
FROM user_cohorts
GROUP BY cohort_week
)
SELECT
ca.cohort_week,
ca.activity_week,
cs.cohort_size,
ca.active_users,
ROUND(100.0 * ca.active_users / cs.cohort_size, 2) as retention_rate,
FLOOR(DATEDIFF('day', ca.cohort_week, ca.activity_week) / 7) as weeks_since_cohort
FROM cohort_activity ca
JOIN cohort_sizes cs ON ca.cohort_week = cs.cohort_week
ORDER BY ca.cohort_week, ca.activity_week
Property-based Cohort Comparison:
WITH user_cohorts AS (
SELECT
distinct_id,
properties->>'$.source' as cohort_source,
MIN(event_time) as first_seen
FROM <table>
WHERE event_name = 'Sign Up'
GROUP BY distinct_id, properties->>'$.source'
),
cohort_metrics AS (
SELECT
c.cohort_source,
COUNT(DISTINCT e.distinct_id) as total_users,
AVG(CASE WHEN e.event_name = 'Purchase' THEN 1 ELSE 0 END) as purchase_rate,
AVG(CAST(e.properties->>'$.amount' AS DOUBLE)) as avg_revenue
FROM <table> e
JOIN user_cohorts c ON e.distinct_id = c.distinct_id
GROUP BY c.cohort_source
)
SELECT * FROM cohort_metrics
ORDER BY total_users DESC
Offer to generate a retention heatmap:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Query results transformed to pivot table
# cohort_week on y-axis, weeks_since_cohort on x-axis, retention_rate as values
df = pd.DataFrame({
'cohort_week': [...],
'weeks_since_cohort': [...],
'retention_rate': [...]
})
pivot = df.pivot(index='cohort_week', columns='weeks_since_cohort', values='retention_rate')
fig, ax = plt.subplots(figsize=(15, 8))
sns.heatmap(pivot, annot=True, fmt='.1f', cmap='RdYlGn', ax=ax,
cbar_kws={'label': 'Retention Rate (%)'})
ax.set_title('Cohort Retention Heatmap')
ax.set_xlabel('Weeks Since Cohort Birth')
ax.set_ylabel('Cohort Week')
plt.tight_layout()
plt.savefig('cohort_heatmap.png')
print("Cohort heatmap saved to cohort_heatmap.png")
Analyze how long it takes users to reach milestones.
Starting Event: Where the clock starts (e.g., "Sign Up") Target Event: What we're measuring time to (e.g., "First Purchase")
WITH user_events AS (
SELECT
distinct_id,
MIN(CASE WHEN event_name = '<start-event>' THEN event_time END) as start_time,
MIN(CASE WHEN event_name = '<target-event>' THEN event_time END) as target_time
FROM <table>
GROUP BY distinct_id
),
time_to_event AS (
SELECT
distinct_id,
start_time,
target_time,
DATEDIFF('day', start_time, target_time) as days_to_convert,
DATEDIFF('hour', start_time, target_time) as hours_to_convert
FROM user_events
WHERE start_time IS NOT NULL
AND target_time IS NOT NULL
AND target_time > start_time
)
SELECT
-- Bucket by days
CASE
WHEN days_to_convert = 0 THEN 'Same day'
WHEN days_to_convert <= 1 THEN '1 day'
WHEN days_to_convert <= 7 THEN '2-7 days'
WHEN days_to_convert <= 30 THEN '1-4 weeks'
WHEN days_to_convert <= 90 THEN '1-3 months'
ELSE '3+ months'
END as time_bucket,
COUNT(*) as users,
ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 2) as pct_of_total
FROM time_to_event
GROUP BY time_bucket
ORDER BY MIN(days_to_convert)
Show distribution metrics:
SELECT
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY days_to_convert) as median_days,
PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY days_to_convert) as p90_days,
COUNT(*) as converted_users
FROM time_to_event
✅ Healthy retention detected:
- Week 1 retention > 40%
- Curve flattens after week 4
- Long-term retention > 20%
This indicates: Strong product-market fit
⚠️ Retention concerns detected:
- Steep drop-off in first week
- Week 1 retention < 25%
- Continuous decline without flattening
Recommendations:
- Improve onboarding experience
- Add early activation hooks
- Survey churned users
- Implement retention campaigns
📊 Cohort performance varies:
- [Best cohort]: X% higher retention than average
- [Worst cohort]: Y% lower retention
Investigate:
- What was different about successful cohorts?
- Product changes during that period?
- Marketing campaign differences?
- External factors (seasonality, competition)?
After cohort analysis, suggest:
Deep dive on specific cohort:
/mp-query sql
# Analyze behavior of specific cohort
Segment retention by property:
mp query retention --born "Sign Up" --return "Login" \
--from 2024-01-01 --to 2024-01-31 --on source
Export for stakeholders:
mp query retention [...] --format csv > retention_report.csv
Create dashboard report:
/mp-report cohort
# Generate comprehensive cohort report
"No retention data":
mp inspect events"All retention is 0%":
"Cohort sizes too small":