Help us improve
Share bugs, ideas, or general feedback.
From cohort-analysis
Set up and launch an interactive cohort analysis dashboard. Use when the user wants to analyze engagement retention, revenue/GP cohorts, CAC/LTV payback, or customer retention from CSV, Excel, or database data.
npx claudepluginhub fergana-labs/fergana-plugins --plugin cohort-analysisHow this skill is triggered — by the user, by Claude, or both
Slash command
/cohort-analysis:cohort-analysisThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
You are setting up a cohort analysis dashboard. The app code lives at `${CLAUDE_SKILL_DIR}/../../app/` and data configs go in `${CLAUDE_SKILL_DIR}/../../data/configs/`.
Analyzes user cohorts for retention curves, feature adoption trends, churn patterns, and engagement insights. Generates heatmaps, charts, Python scripts, and research recommendations.
Analyzes customer cohorts by acquisition date, channel, behavior, or revenue tier to track retention curves, LTV, revenue, and engagement from CRM or analytics data.
Use this skill when the user asks about "cohort analysis", "retention cohorts", "how to read cohort data", "analyze my retention", "what does my cohort data say", "cohort retention curves", "D7/D30 retention", "how to improve cohort retention", or has cohort data they want to interpret and act on.
Share bugs, ideas, or general feedback.
You are setting up a cohort analysis dashboard. The app code lives at ${CLAUDE_SKILL_DIR}/../../app/ and data configs go in ${CLAUDE_SKILL_DIR}/../../data/configs/.
Guide the user through data ingestion interactively.
First, ask the user what they're working with:
"What data source do you have? I can work with CSV files, Excel files (.xlsx), or connect directly to a PostgreSQL database."
Scan the working directory for .xlsx, .csv files. List what you find and ask:
Any of these can be "none" — the dashboard only shows tabs for available data.
postgresql://user:pass@host:port/dbnameimport psycopg2
conn = psycopg2.connect(connection_string)
cur = conn.cursor()
cur.execute("SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'")
tables = [r[0] for r in cur.fetchall()]
cur.execute(f"SELECT column_name, data_type FROM information_schema.columns WHERE table_name = '{table}'")
cur.execute(f"SELECT * FROM {table} LIMIT 5")
customer_id, action_month (timestamp), event_type, event_count, revenue, etc.The query runs at server startup and caches the results in memory, so the dashboard is fast after initial load. If the database is large, add WHERE clauses to limit the date range.
For each selected file/sheet (or the query results for DB sources), read the first few rows to understand the schema. Then ask the user to confirm or correct these mappings:
The CAC/LTV payback analysis requires gross profit data. There are three ways to get it — check in this order:
Explicit GP column — If the revenue data has a GP or gross_profit column, map it directly ("gp": "column_name"). This is the most accurate option.
Per-row margin % — If the revenue data has a margin percentage column, map it ("margin_pct": "column_name"). GP is calculated as revenue * margin_pct per row.
Inputs sheet margin (automatic fallback) — If the revenue data has neither a GP column nor a margin % column, but the Inputs sheet has gross margin percentages by month, the tool will automatically apply those margins to compute GP. No extra mapping needed — just make sure the Inputs sheet is configured.
If none of these are available, ask the user:
"The CAC/LTV analysis needs gross profit data. Do you know your approximate gross margin? For example, 'around 70%' or 'it varies — 65% in 2024 and 72% in 2025.' I can apply that to your revenue to estimate GP."
Then either:
margin_pct column to the revenue dataThis is important if the business has any revenue that spans multiple periods. Ask the user:
"Does your revenue data represent cash (revenue recorded when payment is received) or accrual (revenue spread over the service period)? Would you like both views?"
Look for signals that accrual matters:
If the founder wants both views:
data_processor.py that takes the cash data and generates an accrual version based on their rulesIf the data already has separate accrual and cash sheets/tables, just map them in the config ("sheet" for accrual, "cash_sheet" for cash).
If the founder only has one view and doesn't need both, skip this step.
Default: calculate cohort from first appearance. Most datasets won't have an explicit cohort column. Set calculate_cohort: true — the tool derives cohort_month = min(action_date) per customer, floored to start of month (or week in weekly mode).
Only ask the user if you notice a column that looks like it could be a pre-set cohort date (e.g., 'Cohort Month', 'signup_date', 'cohort_week'). In that case, confirm:
"I see a column called '[name]' — is this the cohort date you want to use, or should I calculate cohorts from each customer's first transaction?"
If the data has Excel serial numbers for dates (large numbers like 45322), the tool handles this automatically.
Ask the user for a profile name (e.g., the company name, product name, or dataset label). This allows running multiple analyses side by side.
"What should I call this analysis? (e.g., 'acme', 'q1-2025', 'product-a')"
Write the config to ${CLAUDE_SKILL_DIR}/../../data/configs/{profile_name}.json. If this is the only analysis, also copy it to ${CLAUDE_SKILL_DIR}/../../data/config.json as the default.
Example structure:
{
"engagement": {
"source_type": "file",
"filepath": "engagement_data.xlsx",
"sheet": "Data",
"calculate_cohort": true,
"column_map": {
"customer_id": "user_id",
"action_month": "event_month",
"event_type": "event",
"event_count": "event_count"
}
},
"revenue": {
"source_type": "file",
"filepath": "revenue_data.xlsx",
"sheet": "Data",
"cash_sheet": "Data_Cash",
"calculate_cohort": true,
"column_map": {
"customer_id": "Customer ID",
"revenue": "Total Revenue",
"action_month": "Action Month",
"margin_pct": "% Margin",
"gp": "GP ($)",
"plan": "Plan"
}
},
"inputs": {
"filepath": "revenue_data.xlsx",
"sheet": "Inputs"
}
}
For database connections:
{
"engagement": {
"source_type": "db",
"connection_string": "postgresql://user:pass@host:5432/dbname",
"query": "SELECT u.id AS customer_id, ae.created_at AS action_month, ae.event AS event_type, 1 AS event_count FROM users u JOIN analytics_events ae ON ae.user_id = u.id",
"calculate_cohort": true,
"column_map": {}
}
}
By default the dashboard runs in monthly mode. If the user's data has daily-level timestamps (not pre-aggregated by month) and they want weekly cohort analysis, set "period": "week" in the config:
{
"period": "week",
"engagement": { ... },
"revenue": { ... }
}
When period is "week":
When to use weekly mode:
Do NOT use weekly mode when:
Ask the user: "Your data spans N months with daily timestamps. Would you like monthly or weekly cohorts?"
cd ${CLAUDE_SKILL_DIR}/../..
pip install -r requirements.txt
python app/server.py
Tell the user the dashboard is running at http://localhost:8000 and explain:
If the server is already running, call the reload endpoint instead:
curl -X POST 'http://localhost:8000/api/reload?profile={profile_name}'
Then tell the user to refresh the browser.