Skill

workflow-improve

Improves existing projects through systematic experimentation: study, research, hypothesis generation, build/eval loop, and archival. Triggered by 'improve X' or 'make X better'.

developer-tools

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/factory:workflow-improve <project_path> [--focus <target>]

User invocable

Model invocation disabled

Inline context

Default effort

Argument hint: <project_path> [--focus <target>]

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

The user wants: **$ARGUMENTS**

SKILL.md

127 lines · ~1.3k tokens

Stats

Language: Python

Stars: 40

Forks: 13

Maintenance: Excellent

Last Commit: Jun 25, 2026

Improve Workflow

The user wants: $ARGUMENTS

Phase 1: Observe

Run a local study to gather observations:

factory study "$PROJECT_PATH"

The command writes observations to .factory/strategy/observations.md.

Phase 2: Researcher

factory agent researcher --task "Deep research for the project. Read observations at .factory/strategy/observations.md. Analyze codebase structure, eval scores, and experiment history. Search the web for best practices relevant to weak dimensions. Check .factory/archive/ for prior knowledge. Write findings to .factory/strategy/research-local.md.
Read: .factory/strategy/observations.md
Write output to: .factory/strategy/research-local.md" --project "$PROJECT_PATH" --timeout 600
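The agent invocations in this workflow share one shape: a role, a task string, `--project`, and `--timeout` (plus `--model` for the archivist). As a minimal sketch, assuming the `factory` CLI is driven from Python, a small helper could assemble the argument list. `build_agent_command` is a hypothetical name, not part of the factory tool, and it uses only flags that appear in this document.

```python
import shlex
from typing import List, Optional


def build_agent_command(role: str, task: str, project_path: str,
                        timeout: int = 600,
                        model: Optional[str] = None) -> List[str]:
    """Assemble the argv for `factory agent <role>` (hypothetical helper)."""
    argv = ["factory", "agent", role,
            "--task", task,
            "--project", project_path,
            "--timeout", str(timeout)]
    if model is not None:
        # --model is only shown on the archivist call in this document.
        argv += ["--model", model]
    return argv


cmd = build_agent_command("researcher",
                          "Deep research for the project.",
                          "/path/to/project")
# shlex.join renders the list as a single shell-safe command line.
print(shlex.join(cmd))
```

Passing the list form to `subprocess.run` avoids shell quoting problems with multi-line task strings like the ones used here.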

CEO Review — Research

Apply the CEO Review Gate protocol:

Read the agent output for the preceding step
Read artifacts: .factory/strategy/research-local.md
Assess: Are observations grounded in data? Did web research surface useful patterns? Any blind spots in the analysis?
Write verdict to .factory/reviews/ceo-verdict-research.md
PROCEED → continue to next step
REDIRECT → re-invoke the preceding agent with corrections (max 2)
ABORT → log failure and skip to archival

On RELOOP: return to researcher (max 3 iterations)
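The review gate can be read as a bounded retry loop. The sketch below is one possible interpretation, not the skill's actual implementation: `run_agent` and `review` are hypothetical callables standing in for the agent call and the CEO review, and the choice to return ABORT once the redirect budget is spent is an assumption, since the protocol does not say what happens then. RELOOP handling is left out because the document does not define that verdict.

```python
from typing import Callable

MAX_REDIRECTS = 2  # "max 2" from the REDIRECT rule


def run_with_gate(run_agent: Callable[[str], str],
                  review: Callable[[str], str]) -> str:
    """Run an agent, review it, and re-invoke on REDIRECT up to MAX_REDIRECTS times.

    run_agent(corrections) returns the agent output; review(output) returns
    "PROCEED", "REDIRECT", or "ABORT". Both callables are hypothetical stand-ins.
    """
    corrections = ""
    redirects = 0
    while True:
        output = run_agent(corrections)
        verdict = review(output)
        if verdict in ("PROCEED", "ABORT"):
            return verdict
        if verdict != "REDIRECT":
            raise ValueError(f"unknown verdict: {verdict!r}")
        if redirects == MAX_REDIRECTS:
            # Assumption: an exhausted redirect budget is treated as ABORT.
            return "ABORT"
        redirects += 1
        corrections = f"Address review feedback (redirect {redirects})"


# One redirect followed by approval.
verdicts = iter(["REDIRECT", "PROCEED"])
print(run_with_gate(lambda corrections: "draft", lambda output: next(verdicts)))
```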

Phase 3: Strategist

factory agent strategist --task "Generate prioritized hypotheses. Read the backlog at .factory/strategy/backlog.md — clear as many items as possible. Read Hypothesis Budget from observations for constraints. Read CEO research review at .factory/reviews/ceo-verdict-research.md. Each hypothesis must be specific, scoped to one PR, tied to observations, with expected impact on eval dimensions. Tag backlog items with **Backlog item:** and new items with **New:**. Write to .factory/strategy/current.md.
Read: .factory/strategy/observations.md, .factory/strategy/research-local.md
Write output to: .factory/strategy/current.md" --project "$PROJECT_PATH" --timeout 600
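The strategist prompt asks for hypotheses tagged with `**Backlog item:**` or `**New:**`. The layout of `.factory/strategy/current.md` is not specified here, so the following is only a sketch under the assumption that each tag and its hypothesis sit on a single line. `group_hypotheses` is a hypothetical helper and the sample hypotheses are invented for illustration.

```python
import re
from typing import Dict, List

# Matches the two tags named in the strategist prompt.
TAG_PATTERN = re.compile(r"\*\*(Backlog item|New):\*\*\s*(.*)")


def group_hypotheses(markdown: str) -> Dict[str, List[str]]:
    """Group tagged hypothesis lines by tag; untagged lines are ignored."""
    groups: Dict[str, List[str]] = {"Backlog item": [], "New": []}
    for line in markdown.splitlines():
        match = TAG_PATTERN.search(line)
        if match:
            groups[match.group(1)].append(match.group(2).strip())
    return groups


# Invented sample content, not taken from a real current.md.
sample = """\
1. **Backlog item:** Cache parsed config between runs
2. **New:** Add retry with backoff to the fetch step
"""
print(group_hypotheses(sample))
```

A count of the two groups gives a quick signal for the "Backlog convergence?" check in the strategy review.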

CEO Review — Strategy

Apply the CEO Review Gate protocol:

Read the agent output for the preceding step
Read artifacts: .factory/strategy/current.md
Assess: HARD GATE. Check: specific enough to implement? Scoped to one PR? Expected eval impact realistic? Follows FEEC priority? Not redundant with reverted experiment? At least one growth hypothesis? Backlog convergence? Write PLAN APPROVED with approved hypotheses in priority order.
Write verdict to .factory/reviews/ceo-verdict-strategy.md
PROCEED → continue to next step
REDIRECT → re-invoke the preceding agent with corrections (max 2)
ABORT → log failure and skip to archival

On RELOOP: return to strategist (max 3 iterations)

Step: Begin

factory begin "$PROJECT_PATH" --hypothesis "Implement hypothesis"

Phase 4: Builder

factory agent builder --task "Implement the current hypothesis from .factory/strategy/current.md. Read CLAUDE.md and factory.md. Read the CEO strategy approval. Implement exactly what the hypothesis describes. Run tests. Commit and open a draft PR.
Read: .factory/strategy/current.md
Write output to: .factory/reviews/builder-latest.md" --project "$PROJECT_PATH" --timeout 600

CEO Review — Build

Apply the CEO Review Gate protocol:

Read the agent output for the preceding step
Read artifacts: .factory/reviews/builder-latest.md
Assess: Read builder output and PR diff. Does work match the hypothesis? No scope creep? Tests included? REDIRECT if off-scope.
Write verdict to .factory/reviews/ceo-verdict-build.md
PROCEED → continue to next step
REDIRECT → re-invoke the preceding agent with corrections (max 2)
ABORT → log failure and skip to archival

On RELOOP: return to builder (max 3 iterations)

Phase 5: Evaluator

factory agent evaluator --task "Run eval: factory eval $PROJECT_PATH. Capture composite score. Report delta from baseline. Interpret dimension changes.
Read: .factory/reviews/builder-latest.md
Write output to: .factory/reviews/evaluator-latest.md" --project "$PROJECT_PATH" --timeout 600
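The evaluator is asked to report the delta from baseline and interpret dimension changes. The output format of `factory eval` is not shown in this document, so the sketch below assumes scores are available as plain name-to-number mappings. The dimension names and values are hypothetical.

```python
from typing import Dict


def dimension_deltas(baseline: Dict[str, float],
                     current: Dict[str, float]) -> Dict[str, float]:
    """Return current minus baseline for every dimension present in both runs."""
    return {name: round(current[name] - baseline[name], 4)
            for name in baseline if name in current}


# Hypothetical dimension names and scores, for illustration only.
baseline = {"tests": 0.50, "docs": 0.75}
current = {"tests": 0.75, "docs": 0.50}

deltas = dimension_deltas(baseline, current)
# Dimensions that got worse are the ones worth calling out in the report.
regressions = sorted(name for name, delta in deltas.items() if delta < 0)
print(deltas, regressions)
```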

Gate — Precheck (Automated)

factory precheck "$PROJECT_PATH" --score-before 0 --score-after 0

Step: Finalize

factory finalize "$PROJECT_PATH" --id 1 --verdict keep --hypothesis 'hypothesis'

Phase 6: Archivist

factory agent archivist --task "Archive experiment results and learnings.
Read: .factory/experiments/verdict.json
Write output to: .factory/archive/experiment.md" --project "$PROJECT_PATH" --timeout 300 --model haiku &

(fire-and-forget — CEO continues immediately)
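The trailing `&` runs the archivist in the background so the caller does not block. If the workflow were driven from Python instead of a shell, a rough equivalent is to start the process with `subprocess.Popen` and not wait on it. This is a sketch under that assumption: `fire_and_forget` is a hypothetical helper, and a short Python child process stands in for the archivist command so the example runs without the factory CLI.

```python
import subprocess
import sys
from typing import List


def fire_and_forget(argv: List[str]) -> subprocess.Popen:
    """Start a child process and return immediately without waiting for it."""
    return subprocess.Popen(argv,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)


# Stand-in for the archivist command so the sketch runs without the factory CLI.
child = fire_and_forget([sys.executable, "-c", "import time; time.sleep(0.2)"])
print("caller continues; child still running:", child.poll() is None)

# Only here so the example exits cleanly; a true fire-and-forget caller would skip this.
child.wait()
```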
