From claude-swe-workflows
Conducts retrospective analysis on completed projects, incidents, decisions, or time periods by separating observations from recollections, applying reflection lenses, and synthesizing updated mental models.
`npx claudepluginhub chrisallenlane/claude-swe-workflows --plugin claude-swe-workflows`

This skill uses the workspace's default tool permissions.
Extracts learnings from something that already happened. Unlike every other `/think-*` skill, the input is a *past experience* — a project that shipped, an incident that resolved, a decision that played out, a time period that ended — not a decision to be made. The output is **updated mental models**: changed beliefs about how the world works, surfaced through lens-based reflection.
This skill produces no tangible artifacts. It is a consultant, not an implementer. No code, no tickets, no commits. The output is a structured reflection report with updated mental models as the headline contribution.
Judge (you, running this skill): scopes the experience, elicits and separates observations from recollections, selects lenses, spawns the reflectors, and synthesizes the final report.
Reflectors: Each receives a specific reflection lens (what-worked-vs-got-lucky, what-didn't, what-surprised, system-rewards-vs-intent, decisions-that-aged, what-to-tell-past-self, patterns-that-recur) and extracts learnings through that lens in isolation.
Establish what is being reflected on, concretely. Vague scope produces vague reflection.
Probe for:
Produce a written brief of the experience and its boundaries. Reflectors operate on this brief.
This is the most failure-prone step and has enforced structure. Memory is reconstructive; it drifts toward coherent stories. The git log does not drift. The metric did not rewrite itself.
Elicit from the user, in three distinct buckets:
Actively solicit external sources. Unlike other /think-* skills, /think-reflect benefits from loading records the user points to:
Push back on smuggled recollections. If the user says "the launch went well," that's a judgment, not an observation. Ask: what actually happened? what was measured? what did people say at the time? Separate the judgment from the record.
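As a rough sketch of the separation this step enforces, the record below keeps observations, recollections, and external sources in distinct buckets. The field names are assumptions for illustration, not the skill's actual schema.

```python
# Illustration only: keep observations, recollections, and external sources
# in separate buckets so judgments cannot be smuggled in as facts.
# Field names are assumptions for this sketch, not taken from the skill.
from dataclasses import dataclass, field

@dataclass
class ExperienceRecord:
    brief: str                                               # written scope of the experience
    observations: list[str] = field(default_factory=list)    # verifiable: git log, metrics, dated notes
    recollections: list[str] = field(default_factory=list)   # memories and judgments, labeled as such
    external_sources: list[str] = field(default_factory=list)  # records the user pointed to

    def add_statement(self, text: str, verifiable: bool) -> None:
        # "The launch went well" is a judgment, so it lands in recollections
        # unless the user can point to a record that backs it.
        (self.observations if verifiable else self.recollections).append(text)
```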
Select 3-6 lenses from the palette based on what the experience affords.
Available lenses:
Selection heuristics:
Drop lenses that don't fit. A solo-contributor reflection has no system rewarding anything. A routine experience may have nothing surprising. Forcing an unfit lens produces noise.
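The two examples above can be read as a simple filter. A minimal sketch, assuming boolean attributes (`solo_contributor`, `routine`) that the judge would establish while scoping; in practice this is a judgment call, not a lookup:

```python
# Illustrative only: encode the two drop-heuristics named above as a filter.
def applicable_lenses(all_lenses: list[str], solo_contributor: bool, routine: bool) -> list[str]:
    lenses = list(all_lenses)
    if solo_contributor:
        # No team or incentive system in play, so nothing is being "rewarded".
        lenses = [lens for lens in lenses if lens != "system-rewards-vs-intent"]
    if routine:
        # A routine experience may have nothing genuinely surprising to mine.
        lenses = [lens for lens in lenses if lens != "what-surprised"]
    return lenses
```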
Spawn one THK - Reflector agent per chosen lens, in parallel. Each receives:
No cross-talk between reflectors. NGT (nominal group technique) principle — independent reflection first, synthesis second.
Collect all reflections.
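Structurally this is a fan-out/fan-in: one isolated reflector per lens, then a single collection point. A hedged sketch, where `run_reflector` stands in for whatever subagent dispatch mechanism is actually available:

```python
# Sketch of the fan-out/fan-in structure: one reflector per lens, run
# independently with no shared state, results collected for synthesis.
from concurrent.futures import ThreadPoolExecutor

def run_reflector(lens: str, brief: str, observations: str, recollections: str) -> dict:
    """Placeholder: each reflector sees only the brief, the record, and its single lens."""
    prompt = (
        f"Reflect on the experience below through the '{lens}' lens only.\n\n"
        f"Brief:\n{brief}\n\nObservations:\n{observations}\n\nRecollections:\n{recollections}"
    )
    # ... dispatch `prompt` to a fresh subagent and return its structured reflection
    return {"lens": lens, "learnings": [], "candidate_model_updates": []}

def reflect_in_parallel(chosen_lenses: list[str], brief: str,
                        observations: str, recollections: str) -> list[dict]:
    # Independent reflection first, synthesis second; no cross-talk between reflectors.
    with ThreadPoolExecutor(max_workers=max(1, len(chosen_lenses))) as pool:
        futures = [
            pool.submit(run_reflector, lens, brief, observations, recollections)
            for lens in chosen_lenses
        ]
        return [f.result() for f in futures]
```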
Combine the isolated reflections into a coherent report. Synthesis differs from other /think-* skills because the headline output is updated mental models, not standouts or findings.
5a. Cluster learnings across lenses. Multiple lenses may surface the same underlying learning from different angles (e.g., a "process win" from what-worked-vs-got-lucky may connect to a "decision that aged well" from decisions-that-aged). Merge and preserve lens attribution.
5b. Extract updated mental models as first-class output. Each reflector may have flagged candidate model updates. Collect them, dedupe, and promote them to the top of the report. Format: "We believed X. This experience suggests Y. The updated belief is Z."
5c. Distinguish process wins from luck. Whenever the orchestrator sees a positive outcome described, verify the attribution. Luck mistaken for process is dangerous — it reinforces bad processes and sets up future failure. Label ambiguous attributions explicitly.
5d. Note observation/recollection gaps. Where reflectors flagged disagreement between observation and recollection, surface it. The gap is itself a learning (memory drifts toward coherent narratives).
5e. Identify recurring patterns. One-off learnings are datapoints; recurring patterns are beliefs worth defending against.
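To make 5a and 5b concrete, here is a small sketch of a model-update record and a naive merge that preserves lens attribution. The fields mirror the report format below; everything else is an assumption for illustration:

```python
# Illustrative only: a mental-model update in the "believed X / suggests Y /
# updated belief Z" form, plus a naive cluster-and-merge keyed on the area
# of belief that keeps track of which lenses surfaced each update.
from dataclasses import dataclass, field

@dataclass
class ModelUpdate:
    area: str                 # area of belief, e.g. "test suite reliability"
    previously: str           # the old mental model (X)
    experience_suggests: str  # what this experience indicates (Y)
    updated_belief: str       # the new or refined model (Z)
    confidence: str           # "high" or "moderate"
    lenses: list[str] = field(default_factory=list)  # lenses that surfaced it

def merge_updates(candidates: list[ModelUpdate]) -> list[ModelUpdate]:
    """Cluster candidate updates by area of belief, preserving lens attribution."""
    merged: dict[str, ModelUpdate] = {}
    for c in candidates:
        key = c.area.strip().lower()
        if key in merged:
            merged[key].lenses.extend(l for l in c.lenses if l not in merged[key].lenses)
        else:
            merged[key] = c
    return list(merged.values())
```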
Final report format:
## Reflection Report
**Experience:** [one-line scope]
**Lenses applied:** [list]
### Updated Mental Models
[HEADLINE SECTION. Models that should change based on this experience.
Each update in the form: "We believed X. This experience suggests Y.
The updated belief is Z." These are the calibration updates the user
should take forward — they are the skill's real contribution.]
1. **[Area of belief]**
- Previously: [the old mental model]
- Experience suggests: [what this experience indicates]
- Updated belief: [the new or refined mental model]
- Confidence in update: [high / moderate — honest about how well-supported this update is by the evidence]
2. [next update...]
### What Happened (Ground Truth)
[Observation-based summary of the experience. Where recollections differ
from observations, note the divergence.]
### What Worked — and Why
[Positive outcomes, with attribution made explicit. Each labeled:]
- **[outcome] — Process win:** [why attributable to what we did]
- **[outcome] — Lucky:** [why NOT attributable to process; the method doesn't generalize]
- **[outcome] — Mixed:** [process contributed but didn't guarantee]
### What Didn't Work
[Blameless failure-mode analysis. What broke and what conditions allowed it.]
### Decisions in Retrospect
[If decisions-that-aged was a lens: quality grid — good/fortunate/unfortunate/bad
for each decision reviewed. Separates decision quality from outcome quality.]
### What Surprised Us
[Unexpected observations, with the contradicted belief and suggested
replacement — often these connect directly to the Updated Mental Models
section above.]
### System Rewards vs. Intent
[If system-rewards-vs-intent was a lens: Goodhart gaps found. What was
intended vs. what was actually rewarded.]
### Advice to Past-Self
[Forward-applicable advice derived from the experience — actionable
signals that past-self could have acted on at the time they would have
been received.]
### Recurring Patterns
[Connections to prior experiences, if any. One-off vs. repeating.]
### Gaps in the Record
[Things that were not captured and that future retrospectives would
benefit from having. Often generates a small "capture more data next
time" list.]
### Suggested Next Steps
- To act on the updated mental models: these are the user's to internalize; no further skill invocation needed
- To design interventions based on failure modes: `/think-brainstorm`
- To diagnose a specific recurring failure mode: `/think-diagnose`
This skill is one-shot. If the user wants to reflect on a different experience, they re-invoke with that new experience. If they want to go deeper on a specific finding, they use the appropriate downstream skill (/think-diagnose, /think-brainstorm, /think-scrutinize).
Good fit:
Poor fit:
- /think-deliberate
- /think-diagnose
- /bug-fix or /bug-hunt
- /think-brainstorm (reflection informs brainstorming but is a distinct step)

Rule of thumb: If you're asking "what did I learn?" — /think-reflect. If you're asking "what should I do?" — a different skill.
/think-reflect is structurally different from the other /think-* skills: its input is a past experience, not a decision. The fact-finding phase is substantial because the observation/recollection distinction matters. The output foregrounds mental-model updates rather than options, critiques, or reframings.
Natural follow-ups:
- /think-brainstorm to generate interventions
- /think-diagnose to understand its cause
- /think-scrutinize to stress-test

Reflection fits at any cadence. After major projects, after incidents, quarterly, annually, or opportunistically when a significant experience ends. The discipline — not the frequency — is what matters.
Retrospectives are universally skipped or done as ritual theater. A tidy document gets produced; nobody's beliefs update; the next project runs the same way. This is the failure mode /think-reflect exists to avoid.
The value of reflection is updated mental models, not a findings document. A model update is useful even when it's small: "I used to think our test suite was reliable; this experience suggests it's reliable for CRUD changes but not integration changes" is a real calibration that changes future behavior. A findings report that updates no beliefs has taught nothing.
The enforced observation-vs-recollection split is the other discipline. Memory reconstructs coherent narratives; observations don't. When they disagree, prefer the observation — and note the disagreement. The gap between what happened and what we remember is itself a learning about how we perceive our own experience.
Luck and process must stay separate. A good outcome from a bad process reinforces the bad process. A bad outcome from a good process looks like process failure. Attributing honestly — even when uncomfortable — is the foundation of all the other learnings.