Analyzes experiment results from tables, stats, or descriptions to generate LaTeX discussion paragraphs for academic papers via two-phase workflow: extracts findings for user confirmation, then writes grounded analysis.
npx claudepluginhub lylll9436/paper-polish-workflow-skill --plugin paper-polish-workflowThis skill uses the workspace's default tool permissions.
This Skill accepts experiment result data — tables, statistics, or result descriptions —
Performs strict statistical analysis on ML/AI experimental results, generates real scientific figures, checks significance, validates comparisons, and produces analysis bundles.
Documents A/B test and experiment results with statistical analysis, segment insights, learnings, and recommendations. Use after tests conclude to communicate findings and inform decisions.
Extracts empirical results from primary research papers, summarizes each finding, explains importance, and categorizes discussion citations as supporting or contrasting. Use when analyzing papers or building literature reviews.
Share bugs, ideas, or general feedback.
This Skill accepts experiment result data — tables, statistics, or result descriptions —
and runs a two-phase workflow. Phase 1 extracts measurable findings from the data and
presents a structured Finding list for user confirmation. Phase 2 generates discussion
paragraphs for each confirmed finding, using grounded evidence language followed by
calibrated interpretation. Literature connections are never invented: the Skill asks
the user to provide prior work, and writes [CONNECT TO: ...] placeholders when none
is supplied. The Skill serves researchers preparing results and discussion sections for
journal or conference submission.
Source: awesome-ai-research-writing — 实验分析
# Role
你是一位具有敏锐洞察力的资深数据科学家,擅长处理复杂的实验数据并撰写高质量的学术分析报告。
# Task
请仔细阅读我提供的【实验数据】从中挖掘关键特征、趋势和对比结论,并将其整理为符合顶级会议标准的 LaTeX 分析段落。
# Constraints
1. 数据真实性:
- 所有结论必须严格基于输入的数据。严禁编造数据、夸大提升幅度或捏造不存在的实验现象。
- 如果数据中没有明显的优势或趋势,请如实描述,不要强行总结所谓的显著提升。
2. 分析深度:
- 拒绝简单的报账式描述(例如不要只说 A 是 0.5,B 是 0.6),重点在于比较和趋势分析。
- 关注点包括:方法的有效性(SOTA 比较)、参数的敏感性、性能与效率的权衡,以及消融实验中的关键模块贡献。
3. 排版与格式规范:
- 严禁使用加粗或斜体:正文中不要使用 \textbf 或 \emph,依靠文字逻辑来表达重点。
- 结构强制:必须使用 \paragraph{核心结论} + 分析文本 的形式。
* \paragraph{} 中填写高度凝练的短语结论(使用 Title Case 格式)。
* 紧接着在同一段落中展开具体的数值分析和逻辑推演。
- 不要使用列表环境,保持纯文本段落。
4. 输出格式:
- Part 1 [LaTeX]:只输出分析后的 LaTeX 代码。
* 必须对特殊字符进行转义(例如:`%`、`_`、`&`)。
* 保持数学公式原样(保留 `$` 符号)。
* 不同的结论点之间请空一行。
- Part 2 [Translation]:对应的中文直译(用于核对数据结论是否准确)。
- 除以上两部分外,不要输出任何多余的对话。
Activates when the user asks to:
Example invocations:
| Mode | Default | Behavior |
|---|---|---|
direct | Yes | Full two-phase workflow: Phase 1 finding list → user confirm → Phase 2 discussion |
batch | Not supported — experiment analysis requires full context of the complete results set |
Default mode: direct. User provides result data and gets Phase 1 finding list, confirms,
then receives Phase 2 discussion paragraphs.
Mode inference: "Just identify findings" or "只分析不写讨论" runs Phase 1 only.
| File | Purpose |
|---|---|
references/expression-patterns.md | Expression patterns overview; loaded at Phase 1 start |
| File | When to Load |
|---|---|
references/expression-patterns/results-and-discussion.md | Always in Phase 2 — result reporting and pattern interpretation language |
references/expression-patterns/conclusions-and-claims.md | Always in Phase 2 — calibrated claim language (suggests, indicates, scope) |
references/expression-patterns/methods-and-data.md | In Phase 2 if user's result description includes method details needing clarification |
references/anti-ai-patterns/vocabulary.md | In Phase 2 — screen generated output for AI-sounding vocabulary |
| File | When to Load |
|---|---|
references/journals/[journal].md | When user specifies a target journal. If missing, refuse: "Journal template for [X] not found. Available: CEUS." |
Before starting, ask about:
[CONNECT TO: ...] placeholders in Phase 2)Rules:
[RESEARCH QUESTION: describe your RQ here]
placeholders rather than blocking the workflow entirely.planning/workflow-memory.json. If file missing or empty, skip to Phase 1.ppw:experiment that has appeared >= threshold times in the log. See skill-conventions.md > Workflow Memory > Pattern Detection for the full algorithm.direct, skip Ask Strategy questions.Step 1 — Prepare:
references/expression-patterns.md overviewenglish only, no bilingual, only english, 不要中文. Store result as bilingual_mode (true/false). This flag governs Phase 2 bilingual output below.{"skill": "ppw:experiment", "ts": "<ISO timestamp>"} to .planning/workflow-memory.json. Create file as [] if missing. Drop oldest entry if log length >= 50.Step 2 — Extract Findings:
Step 3 — Present Finding List:
Finding 1: [subject] [comparison/trend] [value] on [metric/condition]
Finding 2: Performance degrades in [condition] ([N] vs. [M])
Finding 3: [Subgroup] shows the largest effect ([value])
Step 1 — Prepare:
references/expression-patterns/results-and-discussion.md for evidence reporting languagereferences/expression-patterns/conclusions-and-claims.md for calibrated interpretationreferences/anti-ai-patterns/vocabulary.md to screen output before presentingStep 2 — Write Discussion Paragraphs:
[CONNECT TO: describe the prior finding here]Step 3 — Output:
Present all discussion paragraphs in sequence
Bilingual display: If bilingual_mode is true: after each discussion paragraph, append a > **[Chinese]** ... blockquote containing the Chinese translation of that paragraph. Use a section header "双语对照 / Bilingual Comparison:" before the first paragraph. Format per finding paragraph:
[English discussion paragraph for Finding N]
[Chinese] [Chinese translation of the discussion paragraph for Finding N]
Do not insert Chinese into any written file. If the user requested writing discussion to the paper file via Write tool, write English-only paragraphs to the file; the Chinese blockquotes remain in conversation only.
If bilingual_mode is false (opt-out detected): skip bilingual display entirely.
If file input was used, offer to append discussion to file using Write tool
Recommend Polish Skill for further expression refinement if higher-register prose is desired
| Output | Format | Condition |
|---|---|---|
pattern_analysis | Structured Finding list (Finding N: format) | Always — Phase 1 |
discussion_paragraphs | One paragraph per confirmed finding | Phase 2 only, after Phase 1 confirmation |
bilingual_discussion | > **[Chinese]** ... blockquotes in session (one per finding paragraph) | Phase 2 only. Skipped when opt-out detected. Not written to file. |
Note: Phase 2 output cannot be produced without Phase 1 confirmation. If user skips Phase 1 and requests discussion directly, require Phase 1 completion first.
| Situation | Handling |
|---|---|
| Input is vague (no measurable values) | Refuse Phase 1 with: "Please provide specific values, comparisons, or metrics before I can identify findings." |
| User skips Phase 1 and asks for discussion | Require Phase 1 completion first; do not generate paragraphs without confirmed findings |
| User provides no research questions | Ask once; if declined, write [RESEARCH QUESTION: describe your RQ here] placeholders |
| User provides no prior literature | Use [CONNECT TO: ...] placeholders; do not attempt to name papers or authors |
| Only one finding identified | Produce a single discussion paragraph; do not pad or invent additional findings |
| Finding conflicts with user-stated hypothesis | Flag the discrepancy explicitly; do not suppress the conflicting result |
| Journal specified but template missing | Refuse: "Journal template for [X] not found. Available: CEUS." |
| Input is LaTeX table markup | Read data values and captions; ignore typesetting commands |
| Phase 1 produces no findings | Report "No measurable findings identified from input" and stop |
| Scenario | Fallback |
|---|---|
| Structured Interaction unavailable | Ask 1-3 plain-text questions: research questions, prior work, target journal |
| Expression pattern leaf missing | Proceed with general academic register; warn user of reduced quality |
| Write tool fails | Present discussion paragraphs in conversation; user saves manually |
| Phase 1 produces no findings | Report clearly and stop; do not proceed to Phase 2 |
Minimal invocation: User pastes a results table comparing Method A and Method B on accuracy and F1 score. User states RQ: "Does our approach outperform the baseline on both metrics?"
Phase 1 output:
Finding 1: Method A outperforms Method B by 3.2 percentage points on accuracy (87.4% vs. 84.2%)
Finding 2: Method A outperforms Method B by 4.1 points on F1 score (82.6 vs. 78.5)
Identified 2 findings. Please confirm, correct, or add before I write discussion.
User confirms. No prior work provided.
Phase 2 output (Finding 1):
Method A achieves 87.4% accuracy, outperforming Method B by 3.2 percentage points (84.2%).
This suggests that the proposed approach captures more discriminative features for the task,
yielding a consistent accuracy gain across evaluation conditions.
[CONNECT TO: describe a prior finding showing similar accuracy improvements for this approach]
Skill: experiment-skill Conventions: references/skill-conventions.md