Expert PM data analysis across multiple modes — error RCA, cohort retention, funnel drop-off, trend investigation, user segmentation, A/B test evaluation, and exploratory analysis. Use when the analysis type is unclear, when the data needs classification before analysis, or when the question spans more than one analytical framework.
From pm-data-analytics. Install: `npx claudepluginhub jupitermoney/pm-superic-skills --plugin pm-data-analytics`. This skill uses the workspace's default tool permissions.
Analyse any product or engineering dataset with the depth of an experienced data analyst and the judgement of a senior PM. The output is not a data summary — it is a diagnosis with a decision at the end.
1. Classify the data before running any framework. A file with user IDs and timestamps could be an error log, a session log, or a transaction log. Each requires a different analysis. Read the data first, state what it is, and then choose the framework.
2. Numbers anchor every finding. "Retention is declining" is not a finding. "D30 retention dropped from 62% to 48% in the September cohort, driven by users acquired via paid campaigns" is a finding. Every observation must include a specific number.
3. Find one pattern the user did not ask for. The most useful analysis surfaces something the person did not know to ask. Run the requested analysis fully, then look for the adjacent signal.
4. Connect to a decision. Every analysis ends with a recommendation specific enough to assign to a person and measure within 30 days.
5. Flag what the data cannot answer. State the gaps explicitly. "Silence and success look identical in this dataset" is important context. So is "this analysis covers 28 days; a seasonal pattern would require 90+ days to confirm."
When to use: Data is error logs, failure counts, or repeated failures per entity. The question is "why is this breaking" or "how many users are affected."
Step 1 — Parse entities and classify failures. Extract the key entities (user IDs, request IDs, service names) from raw log lines using regex or structured parsing. Do not eyeball — parse programmatically when data volume exceeds 50 rows.
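A minimal sketch of programmatic parsing. The log format and field names here (`user=`, `code=`) are hypothetical; adapt the regex to the actual lines:

```python
import re
from collections import Counter

# Hypothetical log format — adjust the pattern to the real lines:
# "2024-09-03T10:12:44Z ERROR payment-svc user=u_8812 code=TIMEOUT"
LINE_RE = re.compile(
    r"(?P<ts>\S+)\s+ERROR\s+(?P<service>\S+)\s+user=(?P<user>\S+)\s+code=(?P<code>\S+)"
)

def parse_errors(lines):
    """Parse raw log lines into structured records; non-matching lines are skipped."""
    return [m.groupdict() for line in lines if (m := LINE_RE.search(line))]

lines = [
    "2024-09-03T10:12:44Z ERROR payment-svc user=u_8812 code=TIMEOUT",
    "2024-09-03T10:12:45Z INFO payment-svc user=u_8812 ok",
    "2024-09-03T10:13:02Z ERROR payment-svc user=u_4401 code=DECLINED",
]
records = parse_errors(lines)
errors_per_user = Counter(r["user"] for r in records)
```

Named groups keep the extraction self-documenting, and non-matching lines (INFO, malformed rows) drop out instead of polluting the counts.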
Classify every entity into a failure type: transient (errored once or twice, then succeeded), intermittent (errors scattered across the window with successes in between), or chronic (repeated failures across multiple days with no success).
Step 2 — Analyse retry cadence. For chronic entities, calculate the distribution of intervals between consecutive errors. Intervals of seconds suggest automated retries; intervals of hours or days suggest users returning and hitting the same failure again.
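The interval calculation can be sketched as follows, assuming ISO-format timestamps per entity (the timestamps below are made up):

```python
from datetime import datetime

def retry_intervals(timestamps):
    """Seconds between consecutive errors for a single entity."""
    ts = sorted(datetime.fromisoformat(t) for t in timestamps)
    return [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]

# One chronic user's error timestamps (hypothetical values)
gaps = retry_intervals([
    "2024-09-01T10:00:00",
    "2024-09-01T10:00:05",   # 5 s later: looks like an automated retry
    "2024-09-02T10:00:00",   # next day: the user came back and failed again
])
```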
Step 3 — Volume trend. Break total errors by day or week. For each period, count total errors, unique entities erroring, and how many of those entities are new vs. already chronic.
A declining error count with a growing chronic population means the problem is compounding, not resolving.
Step 4 — Spike investigation. For any day with 2x+ normal error volume, determine whether the spike came from one entity or many, whether it coincides with a release or external incident, and whether volume returned to baseline afterwards.
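One way to flag spike days, using the median day as the "normal" baseline (an assumption; a trailing average would also work):

```python
from statistics import median

def find_spikes(daily_counts, factor=2.0):
    """Flag days whose error volume is at least `factor` times the median day."""
    baseline = median(daily_counts.values())
    return {day: n for day, n in daily_counts.items() if n >= factor * baseline}

daily = {"09-01": 40, "09-02": 38, "09-03": 120, "09-04": 41}
spikes = find_spikes(daily)  # 09-03 is roughly 3x the median baseline
```

The median is preferred over the mean here so the spike itself does not inflate the baseline it is compared against.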
Step 5 — Business impact. Count entities (users, requests, transactions) currently blocked from completing a key flow. Estimate conversion impact if the blocked flow is revenue-critical. State what "resolved" looks like and whether the data can confirm resolution vs. abandonment.
Output:
| Cohort | Entities | Errors | % of volume | Avg span | Still active |
|---|---|---|---|---|---|
When to use: Data has user IDs, a signup or activation date, and subsequent activity events. The question is about retention, churn, or long-term engagement.
Step 1 — Define activation. The cohort start event must be meaningful, not just account creation. A user who signed up but never transacted is not an activated user. Confirm the activation event before building cohorts.
Step 2 — Define the retention event. The retention event must reflect genuine engagement. "Logged in" is a weak signal. "Completed a transaction," "used the core feature," or "reached a spending threshold" are stronger.
Step 3 — Build the retention table. Calculate retention at each period (D7, D30, D90 for mobile; Week 1–12 for SaaS; Month 1–6 for financial products). Each cohort is a row; each period is a column; each cell is the % of the cohort still active at that period.
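A minimal sketch of one retention row. This uses unbounded retention (any retention event at or beyond day N), a simplifying assumption; strict windowed retention would check a range around day N instead. All user IDs and dates are made up:

```python
from datetime import date

def retention_pct(cohort, activity, day_offsets):
    """% of a cohort with a retention event at or beyond each day offset.
    cohort:   {user_id: activation_date}
    activity: {user_id: [retention-event dates]}
    """
    row = {}
    for d in day_offsets:
        retained = sum(
            1 for u, start in cohort.items()
            if any((e - start).days >= d for e in activity.get(u, []))
        )
        row[f"D{d}"] = round(100 * retained / len(cohort))
    return row

cohort = {"u1": date(2024, 9, 1), "u2": date(2024, 9, 1)}
events = {"u1": [date(2024, 9, 10), date(2024, 10, 5)], "u2": [date(2024, 9, 5)]}
row = retention_pct(cohort, events, [7, 30])
```

Running this per cohort month produces the rows of the retention table below.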
Step 4 — Identify the patterns. Look for where each curve flattens (the retention floor), cliffs at specific periods, and whether newer cohorts retain better or worse than older ones.
Step 5 — Segment. If the data allows, cut retention by: acquisition channel, onboarding completion tier, first feature used, geography. The segment with the largest retention gap is the hypothesis for the root cause.
Output:
| Cohort | Size | D7 | D30 | D90 | D180 |
|---|---|---|---|---|---|
When to use: Data has sequential steps with a user count or conversion rate per step. The question is where users are dropping off.
Step 1 — Map the funnel explicitly. List each step with its event name and user count. Do not infer steps — confirm each one exists in the data.
Step 2 — Calculate step-by-step and cumulative conversion. Step conversion = users reaching step N / users reaching step N-1. Cumulative conversion = users reaching step N / users entering step 1.
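The two conversion definitions can be sketched as follows (step names and counts are illustrative):

```python
def funnel(steps):
    """steps: ordered list of (name, users). Returns one row per step with
    step conversion, cumulative conversion, and absolute users lost."""
    entered = steps[0][1]
    rows = []
    for i, (name, users) in enumerate(steps):
        prev = steps[i - 1][1] if i else users
        rows.append({
            "step": name,
            "users": users,
            "step_conv": round(users / prev, 3),
            "cum_conv": round(users / entered, 3),
            "dropped": prev - users,
        })
    return rows

rows = funnel([("view", 1000), ("add_to_cart", 400), ("checkout", 320), ("paid", 240)])
# Largest absolute loss is view -> add_to_cart (600 users), so that step is the priority
```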
Step 3 — Identify the primary drop. The single step with the largest absolute user loss is the priority. Relative drop rate matters less than absolute volume lost.
Step 4 — Segment the drop. Cut the funnel by: device type, user segment, acquisition channel, time of day. The drop is rarely uniform — segmentation reveals whether a specific slice is driving the overall number.
Step 5 — Distinguish abandonment from alternative paths. Users who left at step 3 may have taken a different path to completion. Confirm that "drop" actually means "did not convert" before concluding.
Output:
| Step | Users | Step conversion | Cumulative conversion | Drop vs prior step |
|---|---|---|---|---|
When to use: A specific metric is moving in an unexpected direction and the cause is unknown.
Step 1 — Find the inflection point. Plot the metric at daily granularity for at least 30 days. Identify the exact date the trend changed. Aggregate metrics (weekly, monthly) hide the day the break happened.
Step 2 — Decompose. Break the metric into its components. If DAU is declining, split into new users, returning users, and resurrected users. The component driving the change is the starting point for the diagnosis.
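The DAU decomposition can be sketched as follows. The 30-day dormancy threshold for "resurrected" is an assumption, not a fixed rule, and the user IDs and dates are made up:

```python
from datetime import date

def decompose_dau(active_today, first_seen, last_active, today, dormancy_days=30):
    """Split today's active users into new / returning / resurrected.
    last_active holds each user's most recent activity before today."""
    new = {u for u in active_today if first_seen[u] == today}
    resurrected = {
        u for u in active_today - new
        if (today - last_active[u]).days > dormancy_days
    }
    returning = active_today - new - resurrected
    return new, returning, resurrected

today = date(2024, 9, 30)
active = {"a", "b", "c"}
first = {"a": today, "b": date(2024, 1, 5), "c": date(2024, 2, 1)}
last = {"a": today, "b": date(2024, 9, 29), "c": date(2024, 7, 1)}
new, returning, resurrected = decompose_dau(active, first, last, today)
```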
Step 3 — Check for coinciding events. List product releases, campaigns, pricing changes, and external events (holidays, news, competitor moves) within 7 days of the inflection point. Correlation is not causation, but it narrows the hypothesis space.
Step 4 — Segment. Determine whether the trend is broad-based or concentrated. A 10% decline in overall DAU driven entirely by one city or one acquisition channel is a different problem than a broad 10% decline.
Step 5 — Validate with a counter-check. For the leading hypothesis, identify one additional signal that should be true if the hypothesis is correct. If "the new onboarding flow reduced activation" is the hypothesis, then D1 retention for post-launch cohorts should also be lower. Check it.
When to use: Raw user or event data with no prior segmentation. The question is "who are these users" or "what patterns exist in this data."
Step 1 — Calculate RFM or equivalent dimensions. For user data: recency of last activity, frequency of actions in the window, magnitude (spend, session length, transaction size). For event data: frequency of the event type, entity that generated it, time distribution.
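The per-user RFM computation can be sketched as follows, assuming user-level event data of the form `(event_date, amount)` (all values below are made up):

```python
from datetime import date

def rfm(events, today):
    """events: {user: [(event_date, amount), ...]} -> recency/frequency/magnitude."""
    return {
        user: {
            "recency_days": (today - max(d for d, _ in evs)).days,
            "frequency": len(evs),
            "magnitude": sum(a for _, a in evs),
        }
        for user, evs in events.items()
    }

events = {
    "u1": [(date(2024, 9, 28), 120.0), (date(2024, 9, 29), 80.0)],
    "u2": [(date(2024, 6, 1), 15.0)],
}
scores = rfm(events, today=date(2024, 9, 30))
```

These three dimensions per user are the input to the segmentation in Step 2.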
Step 2 — Classify into segments. Use natural breaks in the distribution, not arbitrary percentile cuts. Name each segment by what it represents behaviourally, not by a number.
Common segments that emerge from product data: power users (frequent and recent), core users (regular but moderate), casual users (infrequent), dormant users (no recent activity but not yet churned), and churned users (no activity for an extended window).
Step 3 — Size and trend each segment. What % of total users does each segment represent? How has that % changed over the window? A shrinking power user segment and a growing dormant segment is a retention crisis, even if total MAU is flat.
When to use: Data has a control group and one or more variant groups with an outcome metric per user.
Step 1 — Validate before reading results. Check for sample ratio mismatch (the observed split should match the intended split), confirm assignment was random (control and variant should look similar on pre-experiment metrics), and confirm the test ran its planned duration rather than being peeked at early.
Step 2 — Calculate statistical significance. Use a test appropriate to the metric (e.g. a two-proportion z-test for conversion rates), and report the effect size and confidence interval alongside the p-value, not the p-value alone.
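For a binary conversion metric, a two-sided two-proportion z-test is one standard choice; a sketch using only the standard library (the conversion counts are illustrative):

```python
from math import sqrt, erf

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (binary metric)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF, expressed via erf
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Control: 100/1000 converted; variant: 130/1000 converted (made-up numbers)
z, p = two_proportion_ztest(100, 1000, 130, 1000)
```

For non-binary metrics (revenue per user, session length), a t-test or a bootstrap on the per-user outcome is the usual substitute.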
Step 3 — Check guardrail metrics. A variant that wins on the primary metric but degrades a guardrail (revenue per user, session depth, error rate) is not a true win. Check all guardrails before recommending a ship.
Step 4 — Recommend.
| Outcome | Recommendation |
|---|---|
| Significant positive lift, no guardrail issues | Ship — roll out to 100% |
| Significant positive lift, guardrail concerns | Investigate before shipping |
| Positive trend, not significant | Extend — need more data |
| Flat, not significant | Stop — no meaningful difference |
| Significant negative lift | Do not ship — revert and diagnose |
When to use: Raw data with no specific question, or when the user is not sure what to look for.
Step 1 — Characterise the dataset. Row count, date range, unique entities, key dimensions, null rates per column, ID format consistency.
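A minimal profiling sketch for tabular data loaded as a list of dicts (the rows below are made up); real datasets would add date-range and ID-format checks on top:

```python
def characterise(rows):
    """Quick profile of a list-of-dicts dataset: size, null rates, cardinality."""
    profile = {"rows": len(rows)}
    for col in rows[0]:
        values = [r.get(col) for r in rows]
        profile[col] = {
            "null_rate": round(sum(v is None for v in values) / len(rows), 2),
            "unique": len({v for v in values if v is not None}),
        }
    return profile

rows = [
    {"user": "u1", "event": "login", "amount": None},
    {"user": "u2", "event": "purchase", "amount": 49.0},
    {"user": "u1", "event": "purchase", "amount": 20.0},
]
profile = characterise(rows)
```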
Step 2 — Find the three most interesting patterns. Do not wait to be asked. Surface the patterns that are most actionable or most surprising. Apply the heuristic: what would a senior PM want to know immediately after seeing this data?
Step 3 — Flag anomalies and outliers. Entities or events that are statistical outliers often represent the most important signal (e.g., one user generating 18% of total error volume; one day with 3x normal activity).
Step 4 — Propose follow-up questions. State 2–3 specific questions this dataset is capable of answering, and what additional data would be needed to answer the questions it cannot.
When data is provided as a file: read it programmatically, characterise it (row count, columns, date range), classify the data type per principle 1, then run the matching framework end to end.
When data is not provided: ask for the file or a representative sample, state which framework the question maps to, and list the columns the analysis would need.
Every analysis output must include: a specific number behind every finding, at least one pattern the user did not ask for, a recommendation concrete enough to assign to a person and measure within 30 days, and an explicit statement of what the data cannot answer.