Skill

workflow-research

Runs a research workflow with baseline measurement, failure analysis, web research, and strategy generation for metric-driven optimization. Use when the project has research_target configured.

developer-tools

automation

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/factory:workflow-research <project_path>

User invocable

Model invocation disabled

Inline context

Default effort

Argument hint: <project_path>

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

The user wants: **$ARGUMENTS**

SKILL.md

138 lines · ~1.4k tokens

Stats

Language: Python

Stars: 40

Forks: 13

Maintenance: Excellent

Last Commit: Jun 25, 2026

Research Workflow

The user wants: $ARGUMENTS

Step: Baseline

factory agent evaluator --task "Run eval and report results." --project "$PROJECT_PATH" --timeout 300

Phase 1: Failure Analyst

factory agent failure_analyst --task "Analyze research run results. Read run artifacts at .factory/research/runs/. Read research target config from .factory/config.json. Classify failures by type and severity. Compute failure distribution. Suggest interventions within mutable surfaces only. Write to .factory/strategy/failure_analysis.md.
Read: .factory/experiments/baseline.json
Write output to: .factory/strategy/failure_analysis.md" --project "$PROJECT_PATH" --timeout 600
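
The "compute failure distribution" part of the analyst task could look roughly like the sketch below. The record schema (a `type` and `severity` key per failure) is an assumption made for illustration; the actual layout of the run artifacts under .factory/research/runs/ is not described here.

```python
from collections import Counter


def failure_distribution(failures):
    """Return {failure_type: share_of_total}, most frequent type first."""
    counts = Counter(f["type"] for f in failures)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders by descending count, so the dominant mode comes first
    return {kind: n / total for kind, n in counts.most_common()}


# Hypothetical records; real run artifacts may use a different schema.
sample = [
    {"type": "timeout", "severity": "high"},
    {"type": "timeout", "severity": "low"},
    {"type": "wrong_answer", "severity": "high"},
    {"type": "timeout", "severity": "medium"},
]
dist = failure_distribution(sample)
```

The first key of the result is the dominant failure mode, which is what the researcher phase targets.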

Phase 2: Researcher

factory agent researcher --task "Failure-targeted research. Read failure analysis at .factory/strategy/failure_analysis.md. Search the web for solutions to the dominant failure modes. Check .factory/archive/ for prior knowledge on these patterns. Write findings to .factory/strategy/research-local.md.
Read: .factory/strategy/failure_analysis.md
Write output to: .factory/strategy/research-local.md" --project "$PROJECT_PATH" --timeout 600

CEO Review — Research

Apply the CEO Review Gate protocol:

Read the agent output for the preceding step
Read artifacts: .factory/strategy/research-local.md
Assess: Are observations grounded in data? Did web research surface useful patterns? Any blind spots in the analysis?
Write verdict to .factory/reviews/ceo-verdict-research.md
PROCEED → continue to next step
REDIRECT → re-invoke the preceding agent with corrections (max 2)
ABORT → log failure and skip to archival

On RELOOP: return to researcher (max 3 iterations)
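
The review gate above can be read as a small loop: run the agent, review its output, and re-invoke with corrections on REDIRECT, at most twice. The following is a minimal sketch of that reading, not the factory implementation. `run_agent` and `review` are hypothetical callables, and treating an exhausted redirect budget as ABORT is an assumption, since the protocol does not say what happens after the second redirect.

```python
from typing import Callable, Tuple

MAX_REDIRECTS = 2  # "max 2" in the gate protocol


def run_gate(
    run_agent: Callable[[str], str],
    review: Callable[[str], Tuple[str, str]],
) -> str:
    """Run an agent under a review gate and return the final verdict.

    run_agent(corrections) returns the agent output.
    review(output) returns (verdict, corrections).
    """
    corrections = ""
    redirects = 0
    while True:
        output = run_agent(corrections)
        verdict, corrections = review(output)
        if verdict in ("PROCEED", "ABORT"):
            return verdict
        if verdict != "REDIRECT":
            raise ValueError(f"unknown verdict: {verdict!r}")
        if redirects == MAX_REDIRECTS:
            # Assumption: running out of redirects is treated as ABORT.
            return "ABORT"
        redirects += 1
```

With this shape, an agent is invoked at most three times per gate: once initially and once for each of the two allowed redirects.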

Phase 3: Strategist

factory agent strategist --task "Generate research hypotheses targeting dominant failure modes. Each hypothesis must improve over the previous baseline score. Each hypothesis must name specific files from mutable_surfaces to modify. Hypotheses MUST NOT modify files in fixed_surfaces. Prioritize by expected impact on the target metric. Write 1-3 hypotheses to .factory/strategy/current.md.
Read: .factory/strategy/failure_analysis.md, .factory/strategy/research-local.md
Write output to: .factory/strategy/current.md" --project "$PROJECT_PATH" --timeout 600
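
The surface constraints in the strategist task (name files from mutable_surfaces, never touch fixed_surfaces) lend themselves to a mechanical check. The sketch below assumes both settings are lists of glob patterns, which is a guess about the config format; `check_hypothesis_files` is a hypothetical helper, not part of the factory CLI.

```python
from fnmatch import fnmatchcase


def check_hypothesis_files(files, mutable_surfaces, fixed_surfaces):
    """Return a list of problems; an empty list means every file is allowed."""
    problems = []
    for path in files:
        if any(fnmatchcase(path, pat) for pat in fixed_surfaces):
            problems.append(f"{path}: in fixed_surfaces")
        elif not any(fnmatchcase(path, pat) for pat in mutable_surfaces):
            problems.append(f"{path}: not in mutable_surfaces")
    return problems


# Hypothetical patterns and paths, for illustration only.
problems = check_hypothesis_files(
    ["src/model.py", "eval/run.py", "README.md"],
    mutable_surfaces=["src/*.py"],
    fixed_surfaces=["eval/*"],
)
```

Fixed surfaces are checked first so that a file matching both lists is still rejected.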

CEO Review — Strategy

Apply the CEO Review Gate protocol:

Read the agent output for the preceding step
Read artifacts: .factory/strategy/current.md
Assess: HARD GATE. Check: specific enough to implement? Scoped to one PR? Expected eval impact realistic? Follows FEEC priority? Not redundant with reverted experiment? At least one growth hypothesis? Backlog convergence? Write PLAN APPROVED with approved hypotheses in priority order.
Write verdict to .factory/reviews/ceo-verdict-strategy.md
PROCEED → continue to next step
REDIRECT → re-invoke the preceding agent with corrections (max 2)
ABORT → log failure and skip to archival

On RELOOP: return to strategist (max 3 iterations)

Step: Begin

factory begin "$PROJECT_PATH" --hypothesis "Implement hypothesis"

Phase 4: Builder

factory agent builder --task "Implement the current hypothesis from .factory/strategy/current.md. Read CLAUDE.md and factory.md. Read the CEO strategy approval. Implement exactly what the hypothesis describes. Run tests. Commit and open a draft PR.
Read: .factory/strategy/current.md
Write output to: .factory/reviews/builder-latest.md" --project "$PROJECT_PATH" --timeout 600

CEO Review — Build

Apply the CEO Review Gate protocol:

Read the agent output for the preceding step
Read artifacts: .factory/reviews/builder-latest.md
Assess: Read builder output and PR diff. Does work match the hypothesis? No scope creep? Tests included? REDIRECT if off-scope.
Write verdict to .factory/reviews/ceo-verdict-build.md
PROCEED → continue to next step
REDIRECT → re-invoke the preceding agent with corrections (max 2)
ABORT → log failure and skip to archival

On RELOOP: return to builder (max 3 iterations)

Step: Evaluator

factory agent evaluator --task "Run eval and report results." --project "$PROJECT_PATH" --timeout 300

Gate — Precheck (Automated)

factory precheck "$PROJECT_PATH" --score-before 0 --score-after 0

Step: Finalize

factory finalize "$PROJECT_PATH" --id 1 --verdict keep --hypothesis 'hypothesis'

Phase 5: Archivist

factory agent archivist --task "Archive experiment results and learnings.
Read: .factory/experiments/verdict.json
Write output to: .factory/archive/experiment.md" --project "$PROJECT_PATH" --timeout 300 --model haiku &

(fire-and-forget — CEO continues immediately)
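
If the workflow is driven from Python rather than a shell, the trailing `&` has a rough equivalent in `subprocess.Popen`, which starts the process and returns without waiting. This is only a sketch of the fire-and-forget pattern; the command below is a stand-in, not the real archivist call.

```python
import subprocess
import sys


def fire_and_forget(cmd):
    """Start cmd without waiting for it; return the Popen handle."""
    return subprocess.Popen(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


# Stand-in command; in the workflow this would be the
# `factory agent archivist ...` invocation shown above.
proc = fire_and_forget([sys.executable, "-c", "print('archived')"])
```

Keeping the returned handle lets the caller check `proc.poll()` later if it ever needs to know whether archival finished.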

Gate — Plateau Gate (Automated)

python3 -c "import pathlib; tsv = pathlib.Path('$PROJECT_PATH/.factory/results.tsv'); rows = [l.split(chr(9)) for l in tsv.read_text().strip().splitlines()[1:] if l.strip()] if tsv.exists() else []; scores = [float(r[2]) for r in rows if len(r) > 2 and r[2]]; improved = len(scores) < 2 or scores[-1] > scores[-2]; print('RELOOP' if improved else 'PROCEED')"

On RELOOP: return to baseline (max 3 iterations)
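
The plateau gate one-liner is dense, so here is the same logic unrolled into two small functions. It reads the third tab-separated column of each data row as the score and reloops while the latest score beats the previous one, or while there are fewer than two scores. The function names and the header names in the example are illustrative assumptions; only the column position comes from the one-liner.

```python
def parse_scores(tsv_text):
    """Extract the third column of each non-empty data row as a float."""
    scores = []
    for row in tsv_text.strip().splitlines()[1:]:  # skip the header row
        if not row.strip():
            continue
        fields = row.split("\t")
        if len(fields) > 2 and fields[2]:
            scores.append(float(fields[2]))
    return scores


def plateau_verdict(scores):
    """RELOOP while scores are still improving, PROCEED once they plateau."""
    improved = len(scores) < 2 or scores[-1] > scores[-2]
    return "RELOOP" if improved else "PROCEED"


# Hypothetical results.tsv content; the header names are made up.
example = "id\thypothesis\tscore\n1\ta\t0.5\n2\tb\t0.7\n"
verdict = plateau_verdict(parse_scores(example))
```

Note that a missing or empty results file yields no scores, which this logic treats as "still improving", so the workflow returns to the baseline step.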
