Systematic exploratory data analysis process - discover patterns in unfamiliar data, identify meaningful insights, formulate specific questions for deeper investigation
Systematic exploration of unfamiliar datasets to discover patterns and formulate specific questions for deeper investigation. Use when you need to understand what's in a new dataset or when users ask to "see what's interesting" rather than answer a specific question.
/plugin marketplace add tilmon-engineering/claude-skills/plugin install datapeeker@tilmon-eng-skillsThis skill inherits all available tools. When active, it can use any tool Claude has access to.
templates/01-data-familiarization.mdtemplates/02-temporal-patterns.mdtemplates/03-segmentation-patterns.mdtemplates/04-relationship-patterns.mdtemplates/05-anomaly-investigation.mdtemplates/06-insights.mdtemplates/07-next-questions.mdtemplates/overview-summary.mdThis skill guides you through systematic exploration of unfamiliar datasets when you don't yet have a specific question to answer. Unlike hypothesis-testing (where you test a specific claim) or guided-investigation (where you answer a specific question), exploratory analysis helps you discover what's interesting in the data and identify what questions you should be asking.
Exploratory analysis is appropriate when:
Before using this skill, you MUST:
importing-data skillcleaning-data skill (MANDATORY - never skip)just start-analysis exploratory-analysis <name>)understanding-data - for data profilingwriting-queries - for SQL query constructioninterpreting-results - for result analysiscreating-visualizations - for text-based visualizationsYou MUST use TodoWrite to track progress through all 5 phases. Create todos at the start:
- Phase 1: Data Familiarization - pending
- Phase 2: Pattern Discovery - pending
- Phase 3: Anomaly Investigation - pending
- Phase 4: Insight Generation - pending
- Phase 5: Question Formulation - pending
Update status as you progress. Mark phases complete ONLY after checkpoint verification.
CHECKPOINT: Before proceeding, you MUST have:
01 - data-familiarization.mdCreate analysis/[session-name]/01 - data-familiarization.md with: ./templates/01-data-familiarization.md
Use the understanding-data component skill
Resist premature pattern-hunting
Common Rationalization: "I can see the tables, I don't need detailed familiarization" Reality: Skipping familiarization leads to missing important context about data quality, coverage, and structure.
Common Rationalization: "I'll explore patterns while I familiarize myself with the data" Reality: Mixing familiarization and pattern-hunting creates confusion. Separate concerns.
CHECKPOINT: Before proceeding, you MUST have:
02-temporal-patterns.md, 03-segmentation-patterns.md, 04-relationship-patterns.mdYou MUST explore all three vectors, even if some seem less promising. Surprises often come from unexpected places.
Create analysis/[session-name]/02-temporal-patterns.md with: ./templates/02-temporal-patterns.md
Create analysis/[session-name]/03-segmentation-patterns.md with: ./templates/03-segmentation-patterns.md
Create analysis/[session-name]/04-relationship-patterns.md with: ./templates/04-relationship-patterns.md
Be systematic, not random
Separate observation from interpretation
Use visualizations liberally
creating-visualizations component skillCommon Rationalization: "I found an interesting pattern, I'll just focus on that" Reality: Focusing on first interesting pattern causes you to miss better patterns elsewhere. Complete all three vectors.
Common Rationalization: "This dimension looks boring, I'll skip it" Reality: "Boring" dimensions often hide surprising patterns. Explore systematically.
Common Rationalization: "I'll combine all patterns into one analysis file" Reality: Separate files by vector creates structure and makes findings easy to locate.
CHECKPOINT: Before proceeding, you MUST have:
05 - anomaly-investigation.mdGo back through temporal, segmentation, and relationship explorations and identify:
Create analysis/[session-name]/05 - anomaly-investigation.md with: ./templates/05-anomaly-investigation.md
Distinguish data quality from real patterns
Don't ignore anomalies
Common Rationalization: "That spike is probably a holiday, I'll ignore it" Reality: VERIFY your assumption. "Probably" isn't good enough. Check.
Common Rationalization: "This anomaly is a data quality issue, I'll just exclude it" Reality: Document what you're excluding and why. Future analysts need to know.
Common Rationalization: "I found 20 anomalies, I'll investigate them all" Reality: Focus on the 3-5 most significant. You're exploring, not auditing every data point.
CHECKPOINT: Before proceeding, you MUST have:
06 - insights.mdNot every pattern is an insight. An insight must be:
Create analysis/[session-name]/06 - insights.md with: ./templates/06-insights.md
Be selective
Quantify magnitude
Assess confidence honestly
Common Rationalization: "Every pattern I found is an insight" Reality: Most patterns are noise or expected. Be selective. Insights must clear the bar: actionable, surprising, meaningful.
Common Rationalization: "I'll just state the pattern and let the user figure out if it matters" Reality: Your job is to interpret. Explain WHY the pattern matters and WHAT it suggests.
Common Rationalization: "I'm very confident in this insight" Reality: Exploratory analysis rarely produces high confidence. Be honest about uncertainty and limitations.
CHECKPOINT: Before proceeding, you MUST have:
07 - next-questions.md00 - overview.md with exploration summaryEach insight should suggest 1-2 follow-up questions. Good questions:
Create analysis/[session-name]/07 - next-questions.md with: ./templates/07-next-questions.md
Update analysis/[session-name]/00 - overview.md by adding content from: ./templates/overview-summary.md
Common Rationalization: "I found interesting patterns, I'm done" Reality: Exploration isn't complete until you've formulated what to investigate next. Always end with questions.
Common Rationalization: "I'll just suggest broad areas to explore further" Reality: Be specific. "Investigate customer behavior" is not actionable. "Compare weekend vs weekday sales per operating hour using comparative-analysis skill" is actionable.
Common Rationalization: "I'll list every possible question I can think of" Reality: Focus on the 3-5 highest-value questions. Too many options create decision paralysis.
Why this is wrong: Without understanding data quality, coverage, and structure, your pattern discoveries may be artifacts or noise.
Do instead: Complete Phase 1 fully. Familiarization prevents false discoveries and wasted effort.
Why this is wrong: The most surprising insights often come from places you didn't expect. "Boring" dimensions frequently hide interesting patterns.
Do instead: Explore ALL three vectors (temporal, segmentation, relationship) systematically. Be comprehensive.
Why this is wrong: One insight doesn't exhaust a dataset. You likely missed other valuable patterns.
Do instead: Continue systematic exploration. Aim for 3-5 insights across different dimensions.
Why this is wrong: Most patterns are noise, expected, or immaterial. Calling everything an insight dilutes the valuable discoveries.
Do instead: Apply strict criteria: actionable, surprising, meaningful. Be selective.
Why this is wrong: Your job is interpretation, not just data reporting. Users expect you to identify what matters and why.
Do instead: Assess significance, provide context, explain business implications. Do the analytical thinking.
Why this is wrong: Exploratory analysis generates hypotheses, not confirmations. Patterns need validation through targeted investigation.
Do instead: Be honest about confidence levels. Exploratory findings are typically medium-low confidence until validated.
Why this is wrong: Exploration is the beginning, not the end. The goal is to identify what to investigate deeply.
Do instead: Always complete Phase 5. Convert insights into specific, answerable questions for follow-up.
Why this is wrong: Mixing temporal, segmentation, and relationship analyses creates confusion and makes findings hard to locate.
Do instead: Separate files by exploration vector (02-temporal, 03-segmentation, 04-relationship). Clear structure aids comprehension.
Why this is wrong: Assumptions about anomalies are often wrong. "Probably" isn't good enough. Also, excluding data without documentation creates reproducibility issues.
Do instead: Investigate anomalies in Phase 3. Document what you exclude and why. Verify your assumptions.
Why this is wrong: Too many options create decision paralysis. Not all questions are equally valuable.
Do instead: Prioritize ruthlessly. Focus on 3-5 highest-value questions. Help the user know where to start.
This skill ensures systematic, thorough exploration of unfamiliar datasets by:
Follow this process and you'll discover what's truly interesting in unfamiliar data, avoid random pattern-chasing, and identify high-value questions for targeted investigation.
Use when working with Payload CMS projects (payload.config.ts, collections, fields, hooks, access control, Payload API). Use when debugging validation errors, security issues, relationship queries, transactions, or hook behavior.