From great-econometrics
Comprehensive results analysis for empirical research: generate publication-quality descriptive statistics and balance tables, interpret regression coefficients with economic magnitude and effect sizes, assess identification assumption diagnostics, and produce structured results memos. Use when asked to create summary statistics, Table 1, balance tests, interpret results, assess economic significance, or write results narratives.
npx claudepluginhub zhouziyue233/great-econometrics --plugin econometrics

This skill uses the workspace's default tool permissions.
This skill is an integrated results-analysis tool, supporting the complete workflow from raw-data exploration (EDA) through to the interpretation of regression results.
Outputs: results-memo.md, plus analysis tables in dual .tex/.csv format.

Invocation: called at /data Step 3, or after /code Phase 6 completes; this skill typically receives its results and context from /data or /code.

Input parameters:
| Parameter | Description | Example |
|---|---|---|
| clean_data_path | Path to the cleaned dataset | data/clean/china_trade_*.parquet |
| identification_strategy | Identification strategy | DiD / RDD / IV / Panel FE |
| Y_var | Outcome variable | log_gdp_growth |
| D_var | Treatment variable | policy_dummy |
| Z_var | Instrumental variable (if applicable) | tariff_rate_1990 |
| control_vars | List of control variables | ["log_gdp", "population"] |
| id_var | Panel unit identifier | province_code |
| time_var | Time variable | year |
| treatment_timing | Policy implementation date (DiD only) | 2003 |
| cutoff_value | Cutoff threshold (RDD only) | 50.0 |
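As an illustration, these parameters could be bundled and validated before the analysis runs. The dict below is a hypothetical sketch: names mirror the table, but the values and the `validate_params` helper are examples, not part of the skill's actual interface.

```python
# Hypothetical parameter bundle; names mirror the table above, values are examples.
params = {
    "clean_data_path": "data/clean/example.parquet",
    "identification_strategy": "DiD",
    "Y_var": "log_gdp_growth",
    "D_var": "policy_dummy",
    "Z_var": None,                  # only needed for IV
    "control_vars": ["log_gdp", "population"],
    "id_var": "province_code",
    "time_var": "year",
    "treatment_timing": 2003,       # DiD only
    "cutoff_value": None,           # RDD only
}

# Minimal strategy-specific validation sketch.
REQUIRED_BY_STRATEGY = {"DiD": ["treatment_timing"], "RDD": ["cutoff_value"], "IV": ["Z_var"]}

def validate_params(p):
    missing = [k for k in REQUIRED_BY_STRATEGY.get(p["identification_strategy"], [])
               if p.get(k) is None]
    if missing:
        raise ValueError(f"Missing for {p['identification_strategy']}: {missing}")
    return True
```

A DiD run without `treatment_timing`, or an RDD run without `cutoff_value`, would fail fast instead of producing empty diagnostics later.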
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
import os
os.makedirs("tables", exist_ok=True)
os.makedirs("figures", exist_ok=True)
mpl.rcParams.update({
"figure.dpi": 300,
"font.size": 11,
"axes.spines.top": False,
"axes.spines.right": False,
"axes.grid": True,
"grid.alpha": 0.3,
})
df = pd.read_parquet(clean_data_path)
print(f"Sample: {len(df):,} rows × {df.shape[1]} columns")
Generate full-sample descriptive statistics with dual-format output (.tex for the paper, .csv for verification).
# ─────────────────────────────────────────────
# Table 1 – Descriptive statistics
# ─────────────────────────────────────────────
analysis_vars = [Y_var, D_var] + ([Z_var] if Z_var else []) + control_vars
analysis_vars = [v for v in analysis_vars if v in df.columns]
stats_dict = {}
for v in analysis_vars:
s = df[v].dropna()
stats_dict[v] = {
"N": len(s),
"Mean": s.mean(),
"SD": s.std(),
"P25": s.quantile(0.25),
"Median": s.median(),
"P75": s.quantile(0.75),
"Min": s.min(),
"Max": s.max(),
}
table1 = pd.DataFrame(stats_dict).T.round(3)
table1.to_csv("tables/table1_descriptive.csv")
print("✅ Table 1 data saved (.csv)")
# LaTeX formatting is handled centrally by the `table` skill (invoked in the /plot phase)
Trigger condition: a treatment variable D_var exists.
Key principle: balance tests must be restricted to the pre-treatment sample; never run them on the full sample.
# ─────────────────────────────────────────────
# Table 2 – Treatment/control balance tests
# ─────────────────────────────────────────────
from scipy import stats as scipy_stats
# Subset to the pre-treatment sample
if treatment_timing:
df_pre = df[df[time_var] < treatment_timing].copy()
print(f"Pre-treatment sample: {len(df_pre):,} rows")
else:
df_pre = df.copy()
treated_group = df_pre[df_pre[D_var] == 1]
control_group = df_pre[df_pre[D_var] == 0]
balance_rows = {}
for v in control_vars:
if v not in df_pre.columns:
continue
t_vals = treated_group[v].dropna()
c_vals = control_group[v].dropna()
mean_t = t_vals.mean()
mean_c = c_vals.mean()
diff = mean_t - mean_c
t_stat, p_val = scipy_stats.ttest_ind(t_vals, c_vals, equal_var=False)
norm_diff = diff / np.sqrt((t_vals.var() + c_vals.var()) / 2)  # Imbens-Rubin normalized difference
balance_rows[v] = {
"Treatment Mean": round(mean_t, 3),
"Control Mean": round(mean_c, 3),
"Difference": round(diff, 3),
"t-stat": round(t_stat, 2),
"p-value": round(p_val, 3),
"Norm. Diff.": round(norm_diff, 3),
"Balanced": "✅" if abs(norm_diff) < 0.25 else "⚠️",
}
table2 = pd.DataFrame(balance_rows).T
table2.to_csv("tables/table2_balance.csv")
print("✅ Table 2 data saved (.csv)")
# LaTeX formatting is handled centrally by the `table` skill (invoked in the /plot phase)
unbalanced = table2[table2["Balanced"] == "⚠️"].index.tolist()
if unbalanced:
print(f"⚠️ Variables with normalized difference > 0.25: {unbalanced}")
Based on identification_strategy, the matching diagnostic plots are triggered automatically.
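One way to sketch this automatic dispatch is a strategy-to-function table; the function names below are illustrative placeholders standing in for the plot blocks of this section, not names defined by the skill.

```python
# Illustrative dispatch table; each placeholder stands in for a diagnostic block.
def did_pretrend_plot(df):
    return "DiD: parallel-trend plot"

def rdd_density_plot(df):
    return "RDD: running-variable distribution + cutoff density"

def iv_first_stage_plot(df):
    return "IV: first-stage and reduced-form scatter"

def panel_fe_variance_plot(df):
    return "Panel FE: within/between variance decomposition"

DIAGNOSTICS = {
    "DiD": did_pretrend_plot,
    "RDD": rdd_density_plot,
    "IV": iv_first_stage_plot,
    "Panel FE": panel_fe_variance_plot,
}

def run_diagnostics(identification_strategy, df):
    try:
        return DIAGNOSTICS[identification_strategy](df)
    except KeyError:
        raise ValueError(f"No diagnostics registered for: {identification_strategy}")
```

An unknown strategy raises immediately rather than silently skipping the diagnostics.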
# ─────────────────────────────────────────────
# DiD – Visual check of pre-treatment parallel trends
# ─────────────────────────────────────────────
trend = (df.groupby([time_var, D_var])[Y_var]
.mean()
.reset_index()
.rename(columns={D_var: "group"}))
fig, ax = plt.subplots(figsize=(8, 4.5))
for g, label, color, ls in [(1, "Treatment", "#C0392B", "-"),
(0, "Control", "#2C3E50", "--")]:
sub = trend[trend["group"] == g]
ax.plot(sub[time_var], sub[Y_var], color=color, ls=ls,
lw=2, marker="o", ms=4, label=label)
if treatment_timing:
ax.axvline(treatment_timing, color="gray", ls=":", lw=1.5,
label=f"Policy ({treatment_timing})")
ax.set_xlabel("Year")
ax.set_ylabel(f"Mean {Y_var}")
ax.set_title("Treatment vs Control: Pre-treatment Trend")
ax.legend()
plt.tight_layout()
plt.savefig("figures/eda_did_trend.png", dpi=300)
plt.close()
# Compare pre-treatment trend slopes
pre = df[df[time_var] < treatment_timing] if treatment_timing else df
for g, label in [(1, "Treatment"), (0, "Control")]:
sub = pre[pre[D_var] == g].groupby(time_var)[Y_var].mean()
if len(sub) >= 2:
slope = np.polyfit(sub.index, sub.values, 1)[0]
print(f"  {label}: annual trend slope = {slope:.4f}")
# ─────────────────────────────────────────────
# RDD – Running-variable distribution + density check near the cutoff
# ─────────────────────────────────────────────
# running_centered = running variable minus cutoff_value (assumed created upstream)
running_centered = df["running_centered"]
fig, axes = plt.subplots(1, 2, figsize=(11, 4.5))
bandwidth_plot = running_centered.std() * 3
mask = running_centered.abs() <= bandwidth_plot
# Left panel: histogram
ax = axes[0]
ax.hist(running_centered[mask & (running_centered < 0)],
bins=30, color="#2C3E50", alpha=0.7, label="Below cutoff")
ax.hist(running_centered[mask & (running_centered >= 0)],
bins=30, color="#C0392B", alpha=0.7, label="Above cutoff")
ax.axvline(0, color="black", lw=1.5, ls="--")
ax.set_xlabel("Running Variable (centered)")
ax.set_ylabel("Frequency")
ax.set_title("Distribution of Running Variable")
ax.legend()
# Right panel: scatter of Y against the running variable, with separate fits on each side
ax = axes[1]
x = running_centered[mask]
y = df.loc[mask, Y_var]
ax.scatter(x, y, alpha=0.2, s=8, color="steelblue")
for side_mask, color in [(x < 0, "#2C3E50"), (x >= 0, "#C0392B")]:
if side_mask.sum() > 5:
coef = np.polyfit(x[side_mask], y[side_mask], 1)
x_line = np.linspace(x[side_mask].min(), x[side_mask].max(), 100)
ax.plot(x_line, np.polyval(coef, x_line), color=color, lw=2)
ax.axvline(0, color="black", lw=1.5, ls="--")
ax.set_xlabel("Running Variable (centered)")
ax.set_ylabel(Y_var)
ax.set_title("Y vs Running Variable Near Cutoff")
plt.suptitle("RDD Diagnostic Plots", fontsize=13, fontweight="bold")
plt.tight_layout()
plt.savefig("figures/eda_rdd.png", dpi=300)
plt.close()
n_above = (df["above_cutoff"] == 1).sum()
n_below = (df["above_cutoff"] == 0).sum()
print(f"Above cutoff: {n_above}; below: {n_below} (ratio {n_above/n_below:.2f})")
# ─────────────────────────────────────────────
# IV – Z vs D scatter + preliminary first-stage check
# ─────────────────────────────────────────────
fig, axes = plt.subplots(1, 2, figsize=(11, 4.5))
ax = axes[0]
ax.scatter(df[Z_var], df[D_var], alpha=0.3, s=10, color="steelblue")
sub = df[[Z_var, D_var]].dropna()  # drop rows jointly so x and y stay aligned
coef = np.polyfit(sub[Z_var], sub[D_var], 1)
x_line = np.linspace(df[Z_var].min(), df[Z_var].max(), 100)
ax.plot(x_line, np.polyval(coef, x_line), color="#C0392B", lw=2)
ax.set_xlabel(f"Instrument: {Z_var}")
ax.set_ylabel(f"Treatment: {D_var}")
ax.set_title("First Stage: Z vs D")
corr = df[[Z_var, D_var]].corr().iloc[0, 1]
ax.text(0.05, 0.92, f"Ï = {corr:.3f}", transform=ax.transAxes,
fontsize=11, color="#C0392B")
ax = axes[1]
ax.scatter(df[Z_var], df[Y_var], alpha=0.3, s=10, color="#2C3E50")
corr_zy = df[[Z_var, Y_var]].corr().iloc[0, 1]
ax.set_xlabel(f"Instrument: {Z_var}")
ax.set_ylabel(f"Outcome: {Y_var}")
ax.set_title("Reduced Form: Z vs Y")
ax.text(0.05, 0.92, f"Ï = {corr_zy:.3f}", transform=ax.transAxes,
fontsize=11, color="#2C3E50")
plt.suptitle("IV Diagnostic Plots", fontsize=13, fontweight="bold")
plt.tight_layout()
plt.savefig("figures/eda_iv.png", dpi=300)
plt.close()
if abs(corr) < 0.1:
print(f"⚠️ Z-D correlation is only {corr:.3f}; risk of first-stage F < 10")
# ─────────────────────────────────────────────
# Panel FE – Within vs between variance decomposition
# ─────────────────────────────────────────────
fig, axes = plt.subplots(1, 2, figsize=(11, 4.5))
for ax, var, title in [(axes[0], Y_var, "Outcome Y"),
(axes[1], D_var, "Treatment D")]:
unit_means = df.groupby(id_var)[var].mean()
df_temp = df.copy()
df_temp["unit_mean"] = df_temp.groupby(id_var)[var].transform("mean")
df_temp["within"] = df_temp[var] - df_temp["unit_mean"]
var_between = unit_means.var()
var_within = df_temp["within"].var()
total = var_between + var_within
bars = ax.bar(["Between", "Within"],
[var_between / total * 100, var_within / total * 100],
color=["#2C3E50", "#C0392B"], alpha=0.8, width=0.5)
ax.bar_label(bars, fmt="%.1f%%", padding=3)
ax.set_ylabel("Share of Total Variance (%)")
ax.set_title(f"Variance Decomposition: {title}")
ax.set_ylim(0, 110)
if var_within / total < 0.1:
ax.text(0.5, 0.6, "â ïž Low within\nFE weak",
transform=ax.transAxes, ha="center", color="#C0392B", fontsize=9)
plt.suptitle("Panel FE: Within vs Between Variance", fontsize=12, fontweight="bold")
plt.tight_layout()
plt.savefig("figures/eda_panel_fe.png", dpi=300)
plt.close()
miss = df.isnull().mean().sort_values(ascending=False)
miss = miss[miss > 0]
miss_report = pd.DataFrame({
"N_Missing": df.isnull().sum()[miss.index],
"Pct_Missing": (miss * 100).round(2),
})
miss_report.to_csv("tables/missing_report.csv")
if not miss_report.empty:
print("=== Missing-value report ===")
print(miss_report.to_string())
key_vars = [v for v in [Y_var, D_var, Z_var] if v and v in miss_report.index]
if key_vars:
print("\n⚠️ Key variables with missing values:")
for v in key_vars:
pct = miss_report.loc[v, "Pct_Missing"]
print(f"  {v}: {pct:.1f}%")
Read, in order:
model-spec.md                    # main equation, identification-assumption status
tables/table_main.csv            # main regression results
tables/table1_descriptive.csv    # descriptive statistics
data-report.md                   # sample information
literature-review-report.md      # prior-literature estimates (optional)
results = pd.read_csv("tables/table_main.csv", index_col=0)
beta = results.loc[D_var, "coef"]       # point estimate
se = results.loc[D_var, "se"]           # standard error
p_value = results.loc[D_var, "pvalue"]  # p-value
ci_lo = results.loc[D_var, "ci_low"]    # 95% CI lower bound
ci_hi = results.loc[D_var, "ci_high"]   # 95% CI upper bound
n_obs = results.loc["N", "coef"]
print(f"β̂ = {beta:.4f} (SE = {se:.4f}, p = {p_value:.3f})")
print(f"95% CI: [{ci_lo:.4f}, {ci_hi:.4f}]")
# Read descriptive statistics for magnitude conversion
desc = pd.read_csv("tables/table1_descriptive.csv", index_col=0)
Y_mean = desc.loc[Y_var, "Mean"]
Y_sd = desc.loc[Y_var, "SD"]
D_mean = desc.loc[D_var, "Mean"]
Magnitude interpretation is the core step of results analysis.
Based on the variable transformations, select the matching interpretation framework:
| Y transform | D transform | Coefficient meaning | Suggested phrasing |
|---|---|---|---|
| Level | Dummy | Absolute change in Y when D=1 | "[Treatment] increases Y by β units" |
| log Y | Dummy | Percent change in Y when D=1 | "[Treatment] increases Y by ≈100β%" |
| log Y | log D | Elasticity | "A 1% increase in D changes Y by β%" |
| Standardized z-score | Any | In units of the SD of Y | "Effect size = β SD" |
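This framework can be sketched as a small helper. A minimal sketch, with assumed transform labels; note that for a dummy D in a log-Y model the exact percent change is 100·(exp(β) − 1), which the ≈100β% rule in the table approximates for small β:

```python
import math

# Sketch: choose an interpretation template from (Y transform, D transform).
def interpret_coefficient(beta, y_transform, d_transform):
    if y_transform == "level" and d_transform == "dummy":
        return f"Treatment changes Y by {beta:.3f} units"
    if y_transform == "log" and d_transform == "dummy":
        # Exact percent change for a dummy regressor in a log-Y model.
        return f"Treatment changes Y by {100 * (math.exp(beta) - 1):.1f}%"
    if y_transform == "log" and d_transform == "log":
        return f"A 1% increase in D changes Y by {beta:.3f}%"
    if y_transform == "zscore":
        return f"Effect size = {beta:.3f} SD of Y"
    return "No template for this combination"
```

For example, β = 0.05 with log Y and a dummy D yields a 5.1% effect, close to but not identical to the 5% approximation.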
# Effect size (Cohen's d analogue)
effect_size_sd = abs(beta) / Y_sd
print(f"Effect size: {effect_size_sd:.3f} σ_Y")
if effect_size_sd < 0.1:
magnitude = "minimal (< 0.1σ)"
elif effect_size_sd < 0.3:
magnitude = "small (0.1–0.3σ)"
elif effect_size_sd < 0.5:
magnitude = "moderate (0.3–0.5σ)"
else:
magnitude = "large (> 0.5σ)"
print(f"Magnitude: {magnitude}")
# Percent relative to the mean
pct_of_mean = beta / Y_mean * 100
print(f"Effect relative to the mean of Y: {pct_of_mean:.1f}%")
Both must be discussed together; never draw conclusions from the p-value alone.
def sig_stars(p):
if p < 0.01: return "***"
if p < 0.05: return "**"
if p < 0.10: return "*"
return "(n.s.)"
stars = sig_stars(p_value)
print(f"β̂ = {beta:.4f}{stars}, p = {p_value:.3f}")
print(f"95% CI: [{ci_lo:.4f}, {ci_hi:.4f}]")
# Evidence interpretation
if p_value < 0.05 and effect_size_sd < 0.05:
print("⚠️ Statistically significant but economically negligible (large-sample effect)")
elif p_value >= 0.05 and (ci_hi - ci_lo) > 0.2:
print("⚠️ Insufficient precision; a substantive effect cannot be ruled out")
elif p_value >= 0.05 and (ci_hi - ci_lo) <= 0.1:
print("✅ Sufficient precision; large effects can reasonably be ruled out")
Economic-significance framework. Economically significant if all of:
1. Effect size ≥ 0.1 σ_Y
2. Effect relative to the mean ≥ 5%
3. Confidence interval entirely of one sign
4. Consistent with IV/OLS estimates
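The four criteria can be sketched as a checklist function. Thresholds are taken from the list above; the cross-estimator check is passed in as a flag, since it requires judgment rather than arithmetic, and the function name is this sketch's own invention:

```python
def economic_significance_checks(beta, y_sd, y_mean, ci_low, ci_high,
                                 consistent_across_estimators=True):
    """Return (verdict, per-criterion results) for the four-point framework."""
    checks = {
        "effect size >= 0.1 sd":        abs(beta) / y_sd >= 0.1,
        "relative to mean >= 5%":       abs(beta / y_mean) >= 0.05,
        "CI entirely one sign":         ci_low > 0 or ci_high < 0,
        "consistent across estimators": consistent_across_estimators,
    }
    return all(checks.values()), checks
```

Returning the per-criterion dict, not just the verdict, lets the memo report which criterion failed.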
Extract estimates from comparable studies in literature-review-report.md:
# Literature estimates summary table
literature_comparison = {
"[Author, Year]": {
"strategy": "[identification strategy]",
"sample": "[region/period]",
"beta": "[estimate]",
"comment": "this paper higher/lower/consistent"
}
}
# Sources-of-difference checklist
print("""
Sources-of-difference checklist
☐ Different identification strategy (OLS vs. IV/DiD) → direction of attenuation bias
☐ Sample differences (country/period) → heterogeneity
☐ Different variable definitions → measurement-error attenuation
☐ LATE vs. ATE differences
☐ Short-run vs. long-run effects
☐ Differences in policy intensity
""")
print(f"""
Marginal-contribution positioning:
Compared with [literature], this paper adopts [a more credible identification strategy];
the estimate is {beta:.4f}, [above/below/similar to] the median estimate in the literature.
The difference mainly stems from [reason].
""")
Fold the status of each identification assumption in model-spec.md into the results interpretation:
credibility_assessment = """
Identification credibility assessment
──────────────────────────────────────────
Strategy: [strategy name]
Target parameter: [ATE / ATT / LATE]
Assumption status:
✅ [Assumption 1]: [evidence it holds]
⚠️ [Assumption 2]: [signs of concern and responses]
❌ [Assumption 3]: [implications of failure]
Overall credibility: high / medium / low
──────────────────────────────────────────
"""
# Credibility-level definitions
credibility_levels = {
"high": "All core assumptions ✅ → causal language may be used",
"medium": "Core assumptions include ⚠️ but no ❌ → cautious language",
"low": "Some assumption is ❌ → state limitations explicitly; downgrade to correlational language"
}
# Quick reference: core assumption per strategy
core_assumptions = {
"DiD": "parallel trends",
"RDD": "no manipulation of the running variable",
"IV": "exclusion restriction",
"Panel FE": "strict exogeneity",
"Synthetic Control": "good pre-treatment fit"
}
late_interpretation = """
LATE interpretation
────────────────────────────────────────
This paper estimates the treatment effect for compliers, i.e.:
"units whose treatment status changes because of the [instrument/cutoff]"
Compliers may not represent:
☐ Always-takers
☐ Never-takers
☐ Other groups outside the sample
Policy implication: the results apply directly to [the marginal group the policy could affect]
────────────────────────────────────────
"""
representativeness_check = """
Sample representativeness checks
- Geographic scope: do the results apply only to a specific country/region?
- Time period: does the effect generalize beyond this political-economic environment?
- Industry/group: is there selective entry into the sample?
- Data limitations: direction of measurement-error effects
"""
residual_endogeneity = """
Residual endogeneity checks
☐ Concurrent policies/events (confounders)
☐ Sample selection
☐ Measurement error
☐ SUTVA violations (spillovers across treated units)
☐ General-equilibrium effects (feedback from large-scale policies)
"""
results-memo.md integrates all Part 2 analyses and is written to the working directory:
# Results Memo
**Project:** [research question in one sentence]
**Version:** v1.0
**Date:** [YYYY-MM-DD]
**Equation:** model-spec.md, equation [eq:label]
---
## 1. Core Estimates
| | Main spec | With controls | Full sample |
|--|--------|-----------|--------|
| β̂ (D_var) | [value]*** | [value]*** | [value]** |
| SE | ([value]) | ([value]) | ([value]) |
| N | [N] | [N] | [N] |
**Main coefficient interpretation:**
[Per Step 2 format: absolute magnitude, effect size, percent of the mean]
## 2. Statistical vs. Economic Significance
- **Statistical significance:** [significant/marginal/not significant], p = [value]
- **95% CI:** [[lo], [hi]]
- **Effect size:** [value] σ_Y ([minimal/small/moderate/large])
- **Relative to the mean:** [value]% of the mean of Y
- **Economic significance:** [one-sentence conclusion]
## 3. Comparison with Prior Literature
[Table + 2-3 sentences on sources of difference and marginal contribution]
## 4. Identification Credibility Assessment
Overall credibility: **[high/medium/low]**
[List each assumption and its status]
Recommended concluding language: [causal / cautious / correlational]
## 5. External Validity and Limitations
[LATE scope + sample representativeness + residual endogeneity concerns]
## 6. Phase 8 Robustness-Check Priorities
1. **[Highest priority]**: [check name] (reason: [...])
2. **[Second priority]**: [check name] (reason: [...])
3. **[Optional]**: [check name] (reason: [...])
## 7. Writing Recommendations
- **Core sentence for the Results section**: [phrasing usable verbatim in the paper, with coefficient and confidence interval]
- **Language strength:** [causal / cautiously causal / correlational]
- **Caveats the Discussion must raise**: [identification limitations]
| ❌ Common mistake | ✅ Correct practice |
|---|---|
| Balance tests on the full sample | Use only the pre-treatment sample |
| Reporting only p-values | Also report normalized differences and effect sizes |
| EDA plots without the policy date marked | DiD trend plots must mark treatment_timing |
| Saving tables as .tex only | Dual-format output (.tex + .csv) |
| Calling an insignificant main coefficient "no effect" | A narrow CI around zero is itself an important finding |
| Burying a coefficient whose sign contradicts theory | Check coding direction, omitted variables, sample selection; record honestly |
| Missing-value report lists counts only | Give a preliminary MCAR/MAR/MNAR judgment for key variables |
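For the last row, one rough heuristic (a sketch of this document's advice, not a formal test): if an indicator for missingness in a key variable correlates with observed covariates, MCAR is implausible. This cannot separate MAR from MNAR, which requires substantive reasoning; the helper name is this sketch's own.

```python
import pandas as pd

def missingness_correlations(df, var, covariates):
    """Correlate the missingness indicator of `var` with numeric covariates."""
    indicator = df[var].isna().astype(float)
    return {
        c: indicator.corr(df[c])
        for c in covariates
        if c in df.columns and pd.api.types.is_numeric_dtype(df[c])
    }
```

Large absolute correlations suggest "not MCAR"; near-zero correlations are consistent with MCAR but do not prove it.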
pip install pandas numpy matplotlib scipy
- /data: parent command that invokes this skill (Phase 4 Step 3)
- /data-pipeline: previous stage; provides this skill's input data
- /model & /code (Phase 5-6): run the regressions and produce table_main.csv
- /robustness (Phase 8): uses the results-memo.md produced by this skill to guide robustness checks
- /write (Phase 9): directly uses the result tables and results-memo.md produced by this skill