From nw
Structures blameless post-mortems with incident timelines, impact assessment, root cause analysis, response evaluation, action items, and lessons learned. Useful after production incidents or outages.
npx claudepluginhub nwave-ai/nwave --plugin nwThis skill uses the workspace's default tool permissions.
- **Blameless**: focus on systems/processes, not individuals. People make reasonable decisions given available info.
Writes blameless postmortems with root cause analysis, timelines, executive summaries, and action items for incident reviews and response improvement.
Guides writing blameless postmortems for SEV1/SEV2 incidents using templates, timelines, root cause analysis, and action items to foster learning.
Generates blameless incident postmortems with timeline, root cause, contributing factors, impact summary, and action items. Use for outage reports, RCAs, P1/P2 reviews.
Share bugs, ideas, or general feedback.
# Post-Mortem: [Incident Title]
**Date**: [incident date]
**Duration**: [start to resolution]
**Severity**: [P0-P3]
**Author**: [analyst]
## Summary
[2-3 sentence overview: what happened, impact, resolution]
## Timeline
| Time | Event | Source |
|------|-------|--------|
| HH:MM | [event] | [log/metric/report] |
## Impact
- Users affected: [number/percentage]
- Duration of impact: [time]
- Business impact: [quantified if possible]
- Systems affected: [list]
## Root Cause Analysis
[5 Whys analysis with evidence at each level]
## Detection and Response
- Time to detect: [duration] -- [how detected]
- Time to respond: [duration] -- [first action]
- Time to mitigate: [duration] -- [mitigation applied]
- Time to resolve: [duration] -- [permanent fix]
## What Went Well
- [positive observations about detection, response, recovery]
## What Could Be Improved
- [areas where detection, response, recovery fell short]
## Action Items
| ID | Action | Owner | Priority | Due Date |
|----|--------|-------|----------|----------|
| 1 | [specific action] | [team/person] | [P0-P3] | [date] |
## Lessons Learned
- [key takeaways for the organization]
Events chronological with verified timestamps | gaps >5 min noted/explained | decision points identified with available info | causal relationships noted
Detected by monitoring or users? | Duration onset-to-detection? | Existing alerts relevant? Missing?
Right team at right time? | Procedures followed? | Communication clear to stakeholders?
Mitigation effective? | Rollback considered/viable? | Duration mitigation-to-permanent-fix?
Document root causes as reusable patterns | update runbooks | share in retrospectives
Update monitoring/alerting per detection gaps | revise deployment per rollback effectiveness | strengthen testing for failure scenario
Every item has owner + due date | track in standups/sprint reviews | verify effectiveness post-deployment