From sdlc-cross-role
Transforms an incident into systemic improvements: postmortem (engineer) → root cause (QA) → security review (security) → process improvement (tech lead) → backlog items (PM).
npx claudepluginhub sethdford/claude-skills --plugin sdlc-cross-role
How this skill is triggered (by the user, by Claude, or both):
Slash command
/sdlc-cross-role:incident-to-improvement
The summary Claude sees in its skill listing (used to decide when to auto-load this skill):
Transforms an incident into systemic improvements: postmortem (engineer) → root cause (QA) → security review (security) → process improvement (tech lead) → backlog items (PM).
An incident is a temporary failure. An improvement is a permanent change that makes that failure less likely or less severe in the future. The difference between teams that learn from incidents and teams that repeat them is whether they systematically convert incidents into improvements.
The incident-to-improvement cycle:
ISO/IEC 12207 Reference:
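1. Postmortem (Engineer). Output: Postmortem narrative (what happened, when, how it was resolved)
2. Root Cause Analysis (QA + Engineer). Output: Root cause analysis (the actual causes, not just the symptom)
3. Security Review (Security). Output: Security impact assessment and remediation steps
4. Process Improvements (Tech Lead). Output: Process improvement recommendations with priority and effort
5. Backlog Items (PM). Output: Backlog items with owners, priority, effort estimates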
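The hand-off order above can be sketched as a small data structure. This is only an illustration, assuming the five stages run strictly in sequence; the names `Stage`, `STAGES`, and `next_stage` are hypothetical and not part of the skill itself.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Stage:
    name: str    # stage of the cycle
    role: str    # role that owns the stage
    output: str  # artifact the stage must produce


# Order, roles, and outputs are taken from the text above.
STAGES = (
    Stage("postmortem", "engineer", "Postmortem narrative"),
    Stage("root cause", "QA", "Root cause analysis"),
    Stage("security review", "security", "Security impact assessment and remediation steps"),
    Stage("process improvement", "tech lead", "Process improvement recommendations"),
    Stage("backlog items", "PM", "Backlog items with owners, priority, effort estimates"),
)


def next_stage(completed: Iterable[str]) -> Optional[Stage]:
    """Return the first stage whose output has not been produced yet."""
    done = set(completed)
    for stage in STAGES:
        if stage.name not in done:
            return stage
    return None  # every stage has produced its output


pending = next_stage({"postmortem", "root cause"})
print(pending.role if pending else "cycle complete")
```

Keeping the stages in one ordered tuple makes it easy to see which role is holding the incident at any point, so that no stage (the security review in particular) is skipped silently.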
Incident Report: [Incident ID / Name]
Date: [When it occurred]
Duration: [How long it lasted]
Customer Impact: [Who was affected, how, for how long]
Timeline:
[Time 1] - What happened / What was deployed / What config changed
[Time 2] - Alert fired / Customer reported / System degraded
[Time 3] - Investigation started
[Time 4] - Root cause identified
[Time 5] - Fix deployed / Rollback executed
[Time 6] - System stable
Postmortem (Engineer):
Detection method: [alert / customer report / monitoring]
Resolution: [code change / config change / rollback]
Steps taken to resolve: [list]
Root Cause Analysis (QA + Engineer):
Root causes: [list]
Testing gap: [What test would have caught this?]
Similar patterns: [Related incidents / bugs]
Monitoring gap: [What alert would have detected this faster?]
Security Review:
Data compromise: [Yes / No / Unknown] - Details if applicable
Exploitability: [Yes / No] - Could an attacker deliberately trigger or exploit this failure?
Verification needed: [Audit logs, forensics, customer notification]
Regulatory notification: [Required / Not required]
Process Improvements:
Improvements: [list]
Priority: [Critical / High / Medium]
Effort: [estimate in hours]
Owner: [engineer / QA / tech lead]
Backlog Items:
[ ] [Item 1] - [Description] - [Owner] - [Effort] - [Priority]
[ ] [Item 2] - [Description] - [Owner] - [Effort] - [Priority]
[ ] [Item 3] - [Description] - [Owner] - [Effort] - [Priority]
Sign-Off:
Engineer: ________________ Date: ________
QA: ________________ Date: ________
Security: ________________ Date: ________
Tech Lead: ________________ Date: ________
PM: ________________ Date: ________
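Once the Timeline section of the template is filled in, the Duration field and the detection delay can be derived from it rather than estimated. The following is a minimal sketch, assuming [Time 1], [Time 2], and [Time 6] are recorded as timestamps; the function name `timeline_metrics` and the sample times are made up for illustration.

```python
from datetime import datetime, timedelta


def timeline_metrics(triggered: datetime, detected: datetime, stable: datetime) -> dict:
    """Derive durations from three timeline entries.

    triggered: [Time 1], the deploy or config change that started the incident
    detected:  [Time 2], the alert or customer report
    stable:    [Time 6], when the system was stable again
    """
    if not (triggered <= detected <= stable):
        raise ValueError("timeline entries must be in chronological order")
    return {
        "time_to_detect": detected - triggered,
        "time_to_recover": stable - detected,
        "duration": stable - triggered,
    }


# Illustrative timestamps only, not taken from a real incident.
metrics = timeline_metrics(
    triggered=datetime(2024, 1, 15, 14, 0),
    detected=datetime(2024, 1, 15, 14, 12),
    stable=datetime(2024, 1, 15, 15, 5),
)
for name, value in metrics.items():
    print(f"{name}: {int(value.total_seconds() // 60)} min")
```

A long time to detect relative to the total duration points back at the "Monitoring gap" field in the root cause analysis.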
Common pitfalls:
Postmortem without root cause
Improvements that don't match the root cause
Not closing the loop
Skipping the security review
Treating incidents as individual events
Not including QA in root cause analysis
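Two of the items above, improvements that don't match the root cause and not closing the loop, can be checked mechanically if each backlog item records which root cause it addresses. The sketch below assumes that convention; `BacklogItem`, its `addresses` field, and `review_gaps` are hypothetical names, and the sample data is invented.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BacklogItem:
    description: str
    owner: str
    addresses: str      # the root cause this item is meant to fix (assumed field)
    done: bool = False  # set to True once the improvement has shipped


def review_gaps(root_causes: List[str], items: List[BacklogItem]) -> Dict[str, List[str]]:
    """Report root causes with no backlog item, unmatched items, and open items."""
    covered = {item.addresses for item in items}
    return {
        "root_causes_without_items": [rc for rc in root_causes if rc not in covered],
        "items_without_root_cause": [i.description for i in items if i.addresses not in root_causes],
        "open_items": [i.description for i in items if not i.done],
    }


# Invented example data.
causes = ["missing integration test", "no alert on queue depth"]
backlog = [
    BacklogItem("Add integration test for checkout", "QA", "missing integration test", done=True),
    BacklogItem("Refactor logging module", "engineer", "log noise"),
]
for label, entries in review_gaps(causes, backlog).items():
    print(label, entries)
```

An empty result in all three lists is a reasonable precondition for sign-off; the remaining items in the list still need human judgment.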