From research-writing-assistant
Guides statistical test selection for group comparisons and relationships, provides Python code for running tests and computing effect sizes with scipy and pingouin, and covers APA reporting and common pitfalls.
npx claudepluginhub norman-bury/research-writing-skill
This skill uses the workspace's default tool permissions.
This skill provides guidance on selecting, running, and reporting statistical analyses in academic papers.
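Comparing two groups:

| Data characteristics | Recommended test |
|---|---|
| Independent, continuous, normal | Independent-samples t-test |
| Independent, continuous, non-normal | Mann-Whitney U test |
| Paired, continuous, normal | Paired-samples t-test |
| Paired, continuous, non-normal | Wilcoxon signed-rank test |
| Binary outcome | Chi-square test or Fisher's exact test |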
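The decision rules in the table can be sketched as a small helper built on scipy. This is a minimal sketch, not part of the original skill: the function name `compare_two_groups` and the 0.05 cutoff for the Shapiro-Wilk screen are assumptions, and the binary-outcome row is left out.

```python
import numpy as np
from scipy import stats


def compare_two_groups(x, y, paired=False, alpha=0.05):
    """Hypothetical helper: pick a two-group test from the table and run it."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if paired:
        # For paired designs the normality assumption concerns the differences
        if stats.shapiro(x - y).pvalue > alpha:
            name, res = "paired t-test", stats.ttest_rel(x, y)
        else:
            name, res = "Wilcoxon signed-rank", stats.wilcoxon(x, y)
    else:
        both_normal = (stats.shapiro(x).pvalue > alpha
                       and stats.shapiro(y).pvalue > alpha)
        if both_normal:
            name, res = "independent t-test", stats.ttest_ind(x, y)
        else:
            name, res = "Mann-Whitney U", stats.mannwhitneyu(
                x, y, alternative="two-sided")
    return name, float(res.statistic), float(res.pvalue)


if __name__ == "__main__":
    # Illustrative scores only
    a = [75, 82, 68, 79, 85, 72, 88, 76]
    b = [65, 70, 62, 68, 75, 60, 72, 66]
    name, stat, p = compare_two_groups(a, b)
    print(f"{name}: statistic = {stat:.2f}, p = {p:.4f}")
```

Screening normality with a significance test is itself a judgment call; with small samples the test has little power, so a plot of the data is a useful complement.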
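Comparing three or more groups:

| Data characteristics | Recommended test |
|---|---|
| Independent, continuous, normal | One-way ANOVA |
| Independent, continuous, non-normal | Kruskal-Wallis test |
| Repeated measures, continuous, normal | Repeated-measures ANOVA |
| Repeated measures, continuous, non-normal | Friedman test |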
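Three of the four tests in this table are available directly in scipy; a rough sketch follows. The scores are made up for illustration. Repeated-measures ANOVA is not in `scipy.stats`, so it is omitted here.

```python
from scipy import stats

# Hypothetical scores for three conditions (illustration only)
cond_a = [78, 82, 75, 80, 85, 79]
cond_b = [70, 74, 68, 72, 77, 71]
cond_c = [65, 69, 72, 66, 70, 68]

# Independent groups, normal data: one-way ANOVA
anova = stats.f_oneway(cond_a, cond_b, cond_c)

# Independent groups, non-normal data: Kruskal-Wallis
kruskal = stats.kruskal(cond_a, cond_b, cond_c)

# Repeated measures, non-normal data: Friedman.
# Position i in each list is treated as the same subject.
friedman = stats.friedmanchisquare(cond_a, cond_b, cond_c)

print(f"ANOVA: F = {anova.statistic:.2f}, p = {anova.pvalue:.4f}")
print(f"Kruskal-Wallis: H = {kruskal.statistic:.2f}, p = {kruskal.pvalue:.4f}")
print(f"Friedman: chi2 = {friedman.statistic:.2f}, p = {friedman.pvalue:.4f}")
```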
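| Analysis goal | Recommended method |
|---|---|
| Relationship between two continuous variables | Pearson correlation (normal) or Spearman correlation (non-normal) |
| Continuous outcome with predictors | Linear regression |
| Binary outcome with predictors | Logistic regression |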
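For the first two rows, scipy covers both correlations and simple (one-predictor) linear regression. The sketch below uses made-up study-hours data; the variable names are illustrative assumptions. Logistic regression needs another library and is not shown.

```python
import numpy as np
from scipy import stats

# Hypothetical study hours and exam scores (illustration only)
hours = np.array([2, 4, 5, 7, 8, 10, 11, 13])
score = np.array([55, 60, 62, 70, 71, 80, 79, 90])

# Pearson assumes roughly normal, linearly related variables
r, p_r = stats.pearsonr(hours, score)

# Spearman works on ranks, so it tolerates non-normal data
rho, p_rho = stats.spearmanr(hours, score)

# Simple linear regression of score on hours
fit = stats.linregress(hours, score)

print(f"Pearson r = {r:.2f}, p = {p_r:.4f}")
print(f"Spearman rho = {rho:.2f}, p = {p_rho:.4f}")
print(f"score = {fit.intercept:.2f} + {fit.slope:.2f} * hours, "
      f"R^2 = {fit.rvalue ** 2:.2f}")
```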
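from scipy import stats

# Shapiro-Wilk normality test; `data` is a 1-D sample.
# Best suited to n < 5000: SciPy warns that the p-value may be inaccurate above that.
stat, p_value = stats.shapiro(data)
print(f"Shapiro-Wilk test: W={stat:.4f}, p={p_value:.4f}")
if p_value > 0.05:
    print("Normality not rejected; parametric tests are reasonable")
else:
    print("Data deviate from normality; consider a nonparametric test")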
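from scipy import stats

# Levene's test for equal variances (SciPy centers on the median by default)
stat, p_value = stats.levene(group1, group2)
print(f"Levene's test: F={stat:.4f}, p={p_value:.4f}")
if p_value > 0.05:
    print("Homogeneity of variance not rejected")
else:
    print("Variances differ; use Welch's t-test")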
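The fallback to Welch's t-test named in the last branch can be sketched with scipy's `ttest_ind` and its `equal_var` flag. The wrapper name `t_test_with_variance_check` is a hypothetical helper, not part of the original skill.

```python
from scipy import stats


def t_test_with_variance_check(group1, group2, alpha=0.05):
    """Hypothetical helper: run Student's or Welch's t-test based on Levene's test."""
    _, p_levene = stats.levene(group1, group2)
    equal_var = p_levene > alpha
    # equal_var=False switches ttest_ind to Welch's unequal-variance test
    res = stats.ttest_ind(group1, group2, equal_var=equal_var)
    kind = "Student" if equal_var else "Welch"
    return kind, float(res.statistic), float(res.pvalue)


if __name__ == "__main__":
    # Illustrative scores only
    g1 = [75, 82, 68, 79, 85, 72, 88, 76]
    g2 = [65, 70, 62, 68, 75, 60, 72, 66]
    kind, t, p = t_test_with_variance_check(g1, g2)
    print(f"{kind} t-test: t = {t:.2f}, p = {p:.4f}")
```

Some analysts skip the preliminary test and use Welch's version by default, since it loses little when variances happen to be equal.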
| Test | Effect size | Small | Medium | Large |
|---|---|---|---|---|
| t-test | Cohen's d | 0.20 | 0.50 | 0.80 |
| ANOVA | η²_p | 0.01 | 0.06 | 0.14 |
| Correlation | r | 0.10 | 0.30 | 0.50 |
| Regression | R² | 0.02 | 0.13 | 0.26 |
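To make the first row concrete, Cohen's d for two independent groups is the mean difference divided by the pooled standard deviation. The sketch below computes it with numpy and labels the result using the table's cutoffs. Both function names are hypothetical, and the "negligible" label for values under 0.20 is an assumption that the table does not define.

```python
import numpy as np


def cohens_d(x, y):
    """Cohen's d for two independent groups, using the pooled sample SD."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1)
                  + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)


def label_d(d):
    """Map |d| onto the benchmark labels from the table."""
    size = abs(d)
    if size < 0.20:
        return "negligible"  # assumed label; not in the table
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


if __name__ == "__main__":
    # Illustrative scores only
    a = [75, 82, 68, 79, 85, 72, 88, 76]
    b = [65, 70, 62, 68, 75, 60, 72, 66]
    d = cohens_d(a, b)
    print(f"Cohen's d = {d:.2f} ({label_d(d)})")
```

These benchmarks are conventions rather than rules, so field-specific norms should take priority where they exist.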
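import pingouin as pg

# pg.ttest reports Cohen's d alongside the test statistic.
# Column names follow pingouin releases that use 'cohen-d'; check result.columns if yours differs.
result = pg.ttest(group1, group2)
d = result['cohen-d'].values[0]
print(f"Cohen's d = {d:.2f}")

# pg.anova reports partial eta squared in the 'np2' column (df is a long-format DataFrame)
aov = pg.anova(dv='score', between='group', data=df)
eta_p2 = aov['np2'].values[0]
print(f"Partial η² = {eta_p2:.3f}")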
t-test report:
Group A (n = 48, M = 75.2, SD = 8.5) scored significantly higher than Group B
(n = 52, M = 68.3, SD = 9.2), t(98) = 3.82, p < .001,
d = 0.77, 95% CI [0.36, 1.18].
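The statistics string in a report like this follows a fixed pattern: test statistic with degrees of freedom, p-value without a leading zero, then the effect size. A small formatter can reproduce it. The function names `apa_p` and `apa_t` are hypothetical, and the sketch covers only the t-test case.

```python
def apa_p(p):
    """Format a p-value in APA style: no leading zero, floor at p < .001."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def apa_t(t, df, p, d):
    """Assemble the statistics portion of an APA t-test report."""
    return f"t({df}) = {t:.2f}, {apa_p(p)}, d = {d:.2f}"


if __name__ == "__main__":
    # Values taken from the report above; 0.0002 stands in for any p below .001
    print(apa_t(3.82, 98, 0.0002, 0.77))
```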
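ANOVA report:
A one-way ANOVA showed a significant main effect of treatment condition on test scores,
F(2, 147) = 8.45, p < .001, η²_p = .10. Post hoc comparisons using
Tukey's HSD indicated that condition A (M = 78.2, SD = 7.3) scored significantly
higher than condition B (M = 71.5, SD = 8.1, p = .002).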
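The quantities in an ANOVA report of this kind (F with its degrees of freedom, eta squared, and Tukey HSD p-values) can be obtained with numpy and scipy alone, as sketched below. The data are made up. `scipy.stats.tukey_hsd` requires a reasonably recent SciPy. In a one-way design, partial η² equals η².

```python
import numpy as np
from scipy import stats

# Hypothetical scores for three conditions (illustration only)
samples = [
    np.array([78, 82, 75, 80, 85, 79], dtype=float),  # condition A
    np.array([70, 74, 68, 72, 77, 71], dtype=float),  # condition B
    np.array([65, 69, 72, 66, 70, 68], dtype=float),  # condition C
]

f_res = stats.f_oneway(*samples)

# Eta squared = SS_between / SS_total
all_values = np.concatenate(samples)
grand_mean = all_values.mean()
ss_between = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)
ss_total = ((all_values - grand_mean) ** 2).sum()
eta_sq = ss_between / ss_total

df_between = len(samples) - 1
df_within = len(all_values) - len(samples)

# Pairwise post hoc comparisons; pvalue[i, j] compares sample i with sample j
tukey = stats.tukey_hsd(*samples)

print(f"F({df_between}, {df_within}) = {f_res.statistic:.2f}, "
      f"p = {f_res.pvalue:.4f}, eta^2 = {eta_sq:.2f}")
print(f"Tukey HSD, A vs B: p = {tukey.pvalue[0, 1]:.3f}")
```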
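Regression report:
A multiple linear regression predicting exam scores was significant overall,
F(3, 146) = 45.2, p < .001, R² = .48. Study time
(β = .35, p < .001) and prior GPA (β = .28, p < .001)
were significant predictors.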
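The standardized coefficients (β), R², and overall F in a regression report can be sketched with plain numpy by z-scoring every variable before fitting. The data below are simulated with a fixed seed purely for illustration, and the two-predictor model is an assumption; it does not reproduce the numbers in the report above.

```python
import numpy as np

# Simulated data (illustration only): two predictors plus noise
rng = np.random.default_rng(0)
n = 150
hours = rng.normal(10, 3, n)
gpa = rng.normal(3.0, 0.4, n)
score = 50 + 1.5 * hours + 8 * gpa + rng.normal(0, 5, n)


def zscore(a):
    """Standardize to mean 0 and sample SD 1."""
    return (a - a.mean()) / a.std(ddof=1)


# With every variable standardized the intercept is zero,
# so the least-squares coefficients are the standardized betas
X = np.column_stack([zscore(hours), zscore(gpa)])
y = zscore(score)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ beta
r2 = 1 - (residuals ** 2).sum() / (y ** 2).sum()

# Overall F test: k predictors, n - k - 1 residual degrees of freedom
k = X.shape[1]
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))

print(f"beta_hours = {beta[0]:.2f}, beta_gpa = {beta[1]:.2f}")
print(f"F({k}, {n - k - 1}) = {f_stat:.1f}, R^2 = {r2:.2f}")
```

Per-coefficient p-values need standard errors as well, which a dedicated regression library reports directly.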
import numpy as np
import pingouin as pg
from scipy import stats

# Data
group_a = np.array([75, 82, 68, 79, 85, 72, 88, 76])
group_b = np.array([65, 70, 62, 68, 75, 60, 72, 66])

# 1. Descriptive statistics (ddof=1 gives the sample SD that APA reports expect)
print(f"Group A: M={group_a.mean():.2f}, SD={group_a.std(ddof=1):.2f}")
print(f"Group B: M={group_b.mean():.2f}, SD={group_b.std(ddof=1):.2f}")

# 2. Normality check
_, p_a = stats.shapiro(group_a)
_, p_b = stats.shapiro(group_b)
print(f"Normality: Group A p={p_a:.3f}, Group B p={p_b:.3f}")

# 3. t-test (column names as in pingouin releases that use 'p-val' and 'cohen-d')
result = pg.ttest(group_a, group_b)
print(f"t = {result['T'].values[0]:.2f}")
print(f"p = {result['p-val'].values[0]:.4f}")
print(f"Cohen's d = {result['cohen-d'].values[0]:.2f}")
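APA-style reports usually pair the test result with a confidence interval. As a possible extension of the example, the sketch below computes a 95% CI for the mean difference from the pooled standard error, using only numpy and scipy. It assumes equal variances, matching a Student's t-test.

```python
import numpy as np
from scipy import stats

group_a = np.array([75, 82, 68, 79, 85, 72, 88, 76])
group_b = np.array([65, 70, 62, 68, 75, 60, 72, 66])

n_a, n_b = len(group_a), len(group_b)
diff = group_a.mean() - group_b.mean()
dof = n_a + n_b - 2

# Pooled variance and standard error of the mean difference
pooled_var = ((n_a - 1) * group_a.var(ddof=1)
              + (n_b - 1) * group_b.var(ddof=1)) / dof
se = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))

# Two-sided 95% interval from the t distribution
t_crit = stats.t.ppf(0.975, dof)
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se

print(f"M_diff = {diff:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```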