Probability, distributions, hypothesis testing, and statistical inference. Use for A/B testing, experimental design, or statistical validation.
Performs statistical analysis for hypothesis testing, A/B experiments, and data validation. Triggers when analyzing experimental results, comparing groups, or validating data distributions with statistical rigor.
/plugin marketplace add pluginagentmarketplace/custom-plugin-ai-data-scientist
/plugin install ai-data-scientist-plugin@pluginagentmarketplace-ai-data-scientist
This skill inherits all available tools. When active, it can use any tool Claude has access to.
assets/ab_test_config.yaml
references/TEST_SELECTION_GUIDE.md
scripts/hypothesis_testing.py
Apply statistical methods to understand data and validate findings.
from scipy import stats
import numpy as np
# Descriptive statistics
data = np.array([1, 2, 3, 4, 5])
print(f"Mean: {np.mean(data)}")
print(f"Std: {np.std(data)}")  # population std (ddof=0); pass ddof=1 for the sample std
# Hypothesis testing
group1 = [23, 25, 27, 29, 31]
group2 = [20, 22, 24, 26, 28]
t_stat, p_value = stats.ttest_ind(group1, group2)
print(f"P-value: {p_value}")
# One-sample: Compare to population mean
stats.ttest_1samp(data, 100)
# Two-sample: Compare two groups
stats.ttest_ind(group1, group2)
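# Paired: Before/after comparison (illustrative measurements on the same subjects)
before = [72, 75, 78, 71, 74]
after = [70, 73, 75, 70, 71]
stats.ttest_rel(before, after)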
from scipy.stats import chi2_contingency
observed = np.array([[10, 20], [15, 25]])
chi2, p_value, dof, expected = chi2_contingency(observed)
group3 = [18, 20, 22, 24, 26]  # illustrative third group for the one-way ANOVA
f_stat, p_value = stats.f_oneway(group1, group2, group3)
from scipy import stats
confidence_level = 0.95
mean = np.mean(data)
se = stats.sem(data)
ci = stats.t.interval(confidence_level, len(data)-1, mean, se)
print(f"95% CI: [{ci[0]:.2f}, {ci[1]:.2f}]")
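The interval returned by `stats.t.interval` is the sample mean plus or minus a critical t value times the standard error. A minimal sketch of that arithmetic, assuming the same five-point sample used above (the helper name `t_confidence_interval` is hypothetical):

```python
import numpy as np
from scipy import stats


def t_confidence_interval(data, confidence_level=0.95):
    """Return (lower, upper) for the mean using the Student t distribution."""
    data = np.asarray(data, dtype=float)
    n = data.size
    mean = data.mean()
    # Standard error of the mean, using the sample standard deviation (ddof=1)
    se = data.std(ddof=1) / np.sqrt(n)
    # Two-sided critical value with n - 1 degrees of freedom
    t_crit = stats.t.ppf((1 + confidence_level) / 2, df=n - 1)
    margin = t_crit * se
    return mean - margin, mean + margin


lower, upper = t_confidence_interval([1, 2, 3, 4, 5])
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

Writing the margin out this way makes it easier to see how the interval narrows as the sample size grows or the confidence level drops.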
# Illustrative paired observations
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 5])
# Pearson (linear)
r, p_value = stats.pearsonr(x, y)
# Spearman (rank-based)
rho, p_value = stats.spearmanr(x, y)
# Normal
x = np.linspace(-3, 3, 100)
pdf = stats.norm.pdf(x, loc=0, scale=1)
# Sampling
samples = np.random.normal(0, 1, 1000)
# Test normality
stat, p_value = stats.shapiro(data)
def ab_test(control, treatment, alpha=0.05):
    """
    Run an A/B test with a two-sample t-test.
    Returns: dict with 'significant' (bool), 'p_value' (float),
    and 'improvement' (relative change in means, formatted as a percentage)
    """
    t_stat, p_value = stats.ttest_ind(control, treatment)
    significant = p_value < alpha
    improvement = (np.mean(treatment) - np.mean(control)) / np.mean(control) * 100
    return {
        'significant': significant,
        'p_value': p_value,
        'improvement': f"{improvement:.2f}%"
    }
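When the A/B metric is a binary outcome such as conversion, the chi-square test of independence shown earlier can be applied to the 2x2 table of counts instead of a t-test on means. A hedged sketch follows; the function name `ab_test_conversions` and the counts are hypothetical, chosen only for illustration:

```python
import numpy as np
from scipy.stats import chi2_contingency


def ab_test_conversions(control_conv, control_n, treatment_conv, treatment_n, alpha=0.05):
    """Chi-square test on a 2x2 table of converted vs. not-converted counts."""
    observed = np.array([
        [control_conv, control_n - control_conv],
        [treatment_conv, treatment_n - treatment_conv],
    ])
    # For 2x2 tables scipy applies Yates' continuity correction by default
    chi2, p_value, dof, expected = chi2_contingency(observed)
    control_rate = control_conv / control_n
    treatment_rate = treatment_conv / treatment_n
    return {
        "significant": bool(p_value < alpha),
        "p_value": float(p_value),
        "dof": int(dof),
        "relative_lift_pct": (treatment_rate - control_rate) / control_rate * 100,
    }


# Hypothetical counts: 120/1000 conversions in control, 150/1000 in treatment
result = ab_test_conversions(120, 1000, 150, 1000)
print(result)
```

Returning plain `bool` and `float` values keeps the result easy to serialize or log.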
P-value < 0.05: Reject null hypothesis (statistically significant)
P-value >= 0.05: Fail to reject null (not significant)
Problem: Non-normal data for t-test
# Check normality first (repeat for each group being compared)
stat, p = stats.shapiro(data)
if p < 0.05:
    # Use non-parametric alternative
    stat, p = stats.mannwhitneyu(group1, group2)  # Instead of ttest_ind
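The check-then-choose pattern above can be wrapped in a small helper that tests both groups before picking a method. This is only a sketch: the function name `compare_two_groups` is hypothetical, and Shapiro-Wilk has limited power on very small samples, so its verdict should be read with caution.

```python
from scipy import stats


def compare_two_groups(group_a, group_b, alpha=0.05):
    """Pick a two-sample test based on a Shapiro-Wilk check of each group."""
    _, p_a = stats.shapiro(group_a)
    _, p_b = stats.shapiro(group_b)
    if p_a < alpha or p_b < alpha:
        # At least one group looks non-normal: fall back to the rank-based test
        test_name = "mann_whitney_u"
        stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
    else:
        test_name = "t_test"
        stat, p_value = stats.ttest_ind(group_a, group_b)
    return test_name, float(stat), float(p_value)


group1 = [23, 25, 27, 29, 31]
group2 = [20, 22, 24, 26, 28]
print(compare_two_groups(group1, group2))
```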
Problem: Multiple comparisons inflating false positives
from statsmodels.stats.multitest import multipletests
# Apply Bonferroni correction
p_values = [0.01, 0.03, 0.04, 0.02, 0.06]
rejected, p_adjusted, _, _ = multipletests(p_values, method='bonferroni')
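Bonferroni is simple enough to check by hand: each p-value is multiplied by the number of tests and capped at 1. A minimal NumPy sketch of that rule, meant as an illustration and not as a replacement for `multipletests` (the function name `bonferroni_adjust` is hypothetical):

```python
import numpy as np


def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: min(p * m, 1) for m tests."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    return np.minimum(p * m, 1.0)


p_values = [0.01, 0.03, 0.04, 0.02, 0.06]
p_adjusted = bonferroni_adjust(p_values)
print(p_adjusted)
```

Comparing the adjusted values against the original alpha is equivalent to comparing the raw p-values against alpha divided by the number of tests.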
Problem: Underpowered study (sample too small)
from statsmodels.stats.power import TTestIndPower
# Calculate required sample size
power_analysis = TTestIndPower()
sample_size = power_analysis.solve_power(
    effect_size=0.5,  # Medium effect (Cohen's d)
    power=0.8,        # 80% power
    alpha=0.05        # 5% significance
)
print(f"Required n per group: {sample_size:.0f}")
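As a sanity check on the number reported by `solve_power`, the per-group sample size can be approximated from normal quantiles: n ≈ 2 * ((z_(1 - alpha/2) + z_power) / d)^2, where d is Cohen's d. The sketch below uses only the standard library; the function name `approx_n_per_group` is hypothetical, and because it ignores the t-distribution correction it tends to come out slightly smaller than the statsmodels result.

```python
import math
from statistics import NormalDist


def approx_n_per_group(effect_size, power=0.8, alpha=0.05):
    """Normal-approximation sample size per group for a two-sided, two-sample test."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)  # round up to a whole participant


print(approx_n_per_group(0.5))
```

Halving the effect size roughly quadruples the required sample, which is why small expected effects call for much larger experiments.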
Problem: Heterogeneous variances
# Check with Levene's test
stat, p = stats.levene(group1, group2)
if p < 0.05:
    # Use Welch's t-test (equal_var=False; scipy's default is equal_var=True)
    t, p = stats.ttest_ind(group1, group2, equal_var=False)
Problem: Outliers affecting results
from scipy.stats import zscore
# Detect outliers (|z| > 3)
z_scores = np.abs(zscore(data))
clean_data = data[z_scores < 3]
# Or use robust statistics
median = np.median(data)
mad = np.median(np.abs(data - median)) # Median Absolute Deviation
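The median and MAD can also drive outlier detection directly through a modified z-score. This matters for small samples: with the population standard deviation (ddof=0), no point among n values can have |z| above sqrt(n - 1), so the |z| > 3 rule cannot flag anything when n is 10 or fewer. A sketch under stated assumptions: the function name `mad_outlier_mask`, the sample values, and the commonly used 3.5 cutoff are illustrative choices, not part of the original skill.

```python
import numpy as np


def mad_outlier_mask(data, threshold=3.5):
    """Flag points whose modified z-score exceeds the threshold."""
    data = np.asarray(data, dtype=float)
    median = np.median(data)
    mad = np.median(np.abs(data - median))
    if mad == 0:
        # At least half the values equal the median; the score is undefined,
        # so this sketch flags nothing rather than dividing by zero
        return np.zeros(data.shape, dtype=bool)
    # 0.6745 rescales the MAD to be comparable to a standard deviation
    # for normally distributed data
    modified_z = 0.6745 * (data - median) / mad
    return np.abs(modified_z) > threshold


values = np.array([10, 12, 11, 13, 12, 11, 95])  # hypothetical sample
mask = mad_outlier_mask(values)
print(values[mask])
```

On this sample the classic z-score of the extreme value stays below 3, while the MAD-based score flags it.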