From engineering
Runs an incident response workflow: triage severity and roles, draft communications, track mitigation, and generate a blameless postmortem from alerts or status updates.
npx claudepluginhub anthropics/knowledge-work-plugins --plugin engineering
Slash command
/engineering:incident-response
> If you see unfamiliar placeholders or need to check which tools are connected, see [CONNECTORS.md](../../CONNECTORS.md).
Manages active production incidents through detection, triage, mitigation, communication, and resolution with structured roles and severity levels. Triggers on outage, P0/P1, downtime, on-call, service down.
Guides structured incident response from detection through post-mortem, including severity classification and RCA templates.
Manages incident lifecycle with three modes: new (triage, communicate, mitigate), update (status update), and postmortem (blameless RCA report).
Manage an incident from detection through postmortem.
/incident-response $ARGUMENTS
/incident-response new [description] # Start a new incident
/incident-response update [status] # Post a status update
/incident-response postmortem # Generate postmortem from incident data
If no mode is specified, ask what phase the incident is in.
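The mode selection above can be sketched in a few lines of Python. This is only an illustration of the dispatch rule, not how the skill itself is implemented; `Invocation` and `parse_arguments` are hypothetical names, and the sketch assumes the mode is the first whitespace-separated word of `$ARGUMENTS`.

```python
from dataclasses import dataclass
from typing import Optional

# The three modes listed in the usage lines above.
MODES = ("new", "update", "postmortem")


@dataclass
class Invocation:
    mode: Optional[str]  # None means: ask what phase the incident is in
    detail: str          # free-text description or status, possibly empty


def parse_arguments(arguments: str) -> Invocation:
    """Split the raw argument string into a mode and the remaining text."""
    parts = arguments.strip().split(maxsplit=1)
    if parts and parts[0].lower() in MODES:
        detail = parts[1] if len(parts) > 1 else ""
        return Invocation(mode=parts[0].lower(), detail=detail)
    # No recognised mode: keep the text so it can inform the follow-up question.
    return Invocation(mode=None, detail=arguments.strip())
```

When `mode` comes back as `None`, the caller would fall through to the "ask what phase the incident is in" behaviour rather than guessing.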
┌─────────────────────────────────────────────────────────────────┐
│ INCIDENT RESPONSE │
├─────────────────────────────────────────────────────────────────┤
│ Phase 1: TRIAGE │
│ ✓ Assess severity (SEV1-4) │
│ ✓ Identify affected systems and users │
│ ✓ Assign roles (IC, comms, responders) │
│ │
│ Phase 2: COMMUNICATE │
│ ✓ Draft internal status update │
│ ✓ Draft customer communication (if needed) │
│ ✓ Set up war room and cadence │
│ │
│ Phase 3: MITIGATE │
│ ✓ Document mitigation steps taken │
│ ✓ Track timeline of events │
│ ✓ Confirm resolution │
│ │
│ Phase 4: POSTMORTEM │
│ ✓ Blameless postmortem document │
│ ✓ Timeline reconstruction │
│ ✓ Root cause analysis (5 whys) │
│ ✓ Action items with owners │
└─────────────────────────────────────────────────────────────────┘
| Level | Criteria | Response Time |
|---|---|---|
| SEV1 | Service down, all users affected | Immediate, all-hands |
| SEV2 | Major feature degraded, many users affected | Within 15 min |
| SEV3 | Minor feature issue, some users affected | Within 1 hour |
| SEV4 | Cosmetic or low-impact issue | Next business day |
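As a rough illustration, the severity table can be encoded as a lookup plus a small decision rule. The table itself gives no numeric thresholds, so the inputs here (`service_down` and an `affected` bucket of `"all"`, `"many"`, `"some"`, or `"few"`) and the tie-breaking order are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeverityLevel:
    name: str
    criteria: str
    response_time: str


# Rows copied from the severity table above.
SEVERITY_TABLE = {
    1: SeverityLevel("SEV1", "Service down, all users affected", "Immediate, all-hands"),
    2: SeverityLevel("SEV2", "Major feature degraded, many users affected", "Within 15 min"),
    3: SeverityLevel("SEV3", "Minor feature issue, some users affected", "Within 1 hour"),
    4: SeverityLevel("SEV4", "Cosmetic or low-impact issue", "Next business day"),
}


def classify(service_down: bool, affected: str) -> SeverityLevel:
    """Pick a severity level from two coarse inputs (hypothetical rule)."""
    if affected not in ("all", "many", "some", "few"):
        raise ValueError(f"unknown affected bucket: {affected!r}")
    if service_down and affected == "all":
        return SEVERITY_TABLE[1]
    if affected in ("all", "many"):
        return SEVERITY_TABLE[2]
    if affected == "some":
        return SEVERITY_TABLE[3]
    return SEVERITY_TABLE[4]
```

In practice the incident commander's judgment should override a mechanical rule like this; the sketch only shows how the table's rows relate to one another.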
Provide clear, factual updates at a regular cadence. Each update should state what is happening, who is affected, what we are doing, and when the next update will arrive.
## Incident Update: [Title]
**Severity:** SEV[1-4] | **Status:** Investigating | Identified | Monitoring | Resolved
**Impact:** [Who/what is affected]
**Last Updated:** [Timestamp]
### Current Status
[What we know now]
### Actions Taken
- [Action 1]
- [Action 2]
### Next Steps
- [What's happening next and ETA]
### Timeline
| Time | Event |
|------|-------|
| [HH:MM] | [Event] |
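A minimal sketch of filling in the status update template programmatically might look like the following. The `StatusUpdate` dataclass and `render_status_update` function are hypothetical names introduced for illustration; only the headings, field labels, and the four status values come from the template above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Status values taken from the template's Status line.
STATUSES = ("Investigating", "Identified", "Monitoring", "Resolved")


@dataclass
class StatusUpdate:
    title: str
    severity: int          # 1-4, rendered as SEV1..SEV4
    status: str            # one of STATUSES
    impact: str
    last_updated: str      # timestamp, already formatted by the caller
    current_status: str
    actions_taken: List[str] = field(default_factory=list)
    next_steps: List[str] = field(default_factory=list)
    timeline: List[Tuple[str, str]] = field(default_factory=list)  # (time, event)


def render_status_update(u: StatusUpdate) -> str:
    """Render the update as markdown in the template's section order."""
    if u.severity not in (1, 2, 3, 4):
        raise ValueError("severity must be 1-4")
    if u.status not in STATUSES:
        raise ValueError(f"status must be one of {STATUSES}")
    lines = [
        f"## Incident Update: {u.title}",
        f"**Severity:** SEV{u.severity} | **Status:** {u.status}",
        f"**Impact:** {u.impact}",
        f"**Last Updated:** {u.last_updated}",
        "### Current Status",
        u.current_status,
        "### Actions Taken",
    ]
    lines += [f"- {action}" for action in u.actions_taken]
    lines.append("### Next Steps")
    lines += [f"- {step}" for step in u.next_steps]
    lines += ["### Timeline", "| Time | Event |", "|------|-------|"]
    lines += [f"| {time} | {event} |" for time, event in u.timeline]
    return "\n".join(lines)
```

Validating the severity and status up front keeps a malformed update from reaching a status channel.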
## Postmortem: [Incident Title]
**Date:** [Date] | **Duration:** [X hours] | **Severity:** SEV[X]
**Authors:** [Names] | **Status:** Draft
### Summary
[2-3 sentence plain-language summary]
### Impact
- [Users affected]
- [Duration of impact]
- [Business impact if quantifiable]
### Timeline
| Time (UTC) | Event |
|------------|-------|
| [HH:MM] | [Event] |
### Root Cause
[Detailed explanation of what caused the incident]
### 5 Whys
1. Why did [symptom]? → [Because...]
2. Why did [cause 1]? → [Because...]
3. Why did [cause 2]? → [Because...]
4. Why did [cause 3]? → [Because...]
5. Why did [cause 4]? → [Root cause]
### What Went Well
- [Things that worked]
### What Went Poorly
- [Things that didn't work]
### Action Items
| Action | Owner | Priority | Due Date |
|--------|-------|----------|----------|
| [Action] | [Person] | P0/P1/P2 | [Date] |
### Lessons Learned
[Key takeaways for the team]
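The 5 Whys section of the postmortem template chains each answer into the next question. A small sketch of that chaining, assuming the symptom and each successive cause are supplied as plain strings (`render_five_whys` is a hypothetical helper, not part of the skill):

```python
from typing import List


def render_five_whys(symptom: str, causes: List[str]) -> List[str]:
    """Build the numbered 'Why did X? → Y' lines.

    Each cause answers the previous question and becomes the subject of
    the next one; the last entry in `causes` is treated as the root cause.
    """
    if not causes:
        raise ValueError("at least one cause is required")
    lines = []
    subject = symptom
    for number, cause in enumerate(causes, start=1):
        lines.append(f"{number}. Why did {subject}? → {cause}")
        subject = cause
    return lines
```

The template uses five steps, but the helper accepts any number, since a real analysis may reach the root cause sooner or need to go deeper.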
If ~~monitoring is connected:
If ~~incident management is connected:
If ~~chat is connected: