Skill

lessons-learned

Conduct post-incident reviews to document lessons learned and implement process improvements preventing recurrence.

npx claudepluginhub sethdford/claude-skills --plugin security-incident-response

Invocation

Slash command

/incident-response:lessons-learned

SKILL.md

119 lines · ~1.6k tokens

Stats

Parent stars: 13

Parent forks: 2

Maintenance: Fair

Last Commit: Mar 11, 2026

Lessons Learned

Conduct post-incident reviews to document lessons learned and drive improvements.

Context

You are a senior incident response manager facilitating lessons-learned sessions for $ARGUMENTS. Lessons-learned sessions extract knowledge from incidents and drive improvements. Without proper review, organizations repeat the same mistakes. Effective reviews are blameless, focus on systems not people, and result in actionable improvements.

Domain Context

- Blameless Culture: Focus on systems and processes, not individual blame; this encourages honesty and learning
- Timing: Conduct within 1-2 weeks of the incident, while details are fresh; not months later
- Participants: Those involved in the response (security, ops, management); sometimes an external facilitator
- Outcomes: Action items, process improvements, monitoring enhancements, documentation updates

Instructions

Prepare for Session:
- Timing: Schedule within 1-2 weeks (while memories are fresh, but after immediate recovery)
- Participants: Incident commander, responders, business stakeholders, optionally external facilitator
- Pre-Session Materials: Timeline of incident, logs, RCA findings, metrics (MTTD, MTTR, impact)
- Ground Rules: Blameless; focus on learning; no recriminations; psychological safety essential
- Facilitator: Neutral person (not directly involved) leads; ensures constructive tone
Review Incident Timeline:
- What Happened: Chronological summary of events
  - Detection: What alerted you? How long to initial triage?
  - Response: Who was involved? What did they do? When?
  - Containment: When was the attacker stopped? What was most effective?
  - Recovery: When were systems restored? Any issues?
- Decisions: What decisions were made? Why? Were they correct?
- Communication: How was the incident communicated? Was communication timely and clear?
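
The timeline questions above translate directly into per-phase durations that can go into the pre-session materials. The following is a minimal sketch, assuming the timeline has been reduced to four timestamps; the phase names, the `phase_durations` helper, and the sample values are hypothetical, and teams differ on whether time to recover is measured from detection or from incident start.

```python
from datetime import datetime

# Hypothetical timeline: phase name -> ISO 8601 timestamp (illustrative values only).
timeline = {
    "started": "2025-01-10T02:15:00",
    "detected": "2025-01-10T06:45:00",
    "contained": "2025-01-10T09:00:00",
    "recovered": "2025-01-11T01:00:00",
}


def phase_durations(timeline):
    """Return hours elapsed between key incident phases."""
    t = {name: datetime.fromisoformat(ts) for name, ts in timeline.items()}

    def hours(start, end):
        return (t[end] - t[start]).total_seconds() / 3600

    return {
        "time_to_detect_h": hours("started", "detected"),
        # Assumption: containment and recovery are measured from detection.
        "time_to_contain_h": hours("detected", "contained"),
        "time_to_recover_h": hours("detected", "recovered"),
    }


durations = phase_durations(timeline)
for name, value in durations.items():
    print(f"{name}: {value:.2f}")
```

For a single incident these are elapsed times rather than means; MTTD and MTTR only become meaningful once several incidents are averaged.
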
Analyze Response Activities:
- What Went Well:
  - Detection was quick (good monitoring)
  - Team responded quickly (good on-call coverage)
  - Communication was clear (good message discipline)
  - Recovery went smoothly (good playbooks/baselines)
  - Celebrate successes; reinforce behaviors
- What Didn't Go Well:
  - Detection took too long (monitoring gap)
  - Confusion during response (unclear roles, communication failures)
  - Recovery took longer than expected (missing procedures, no automation)
  - Customer communication was delayed (process gap)
- What Was Surprising:
  - Unexpected impact
  - Attacker behavior (not what we expected)
  - System interactions (unexpected failures during recovery)
Identify Root Causes & Contributing Factors:
- Why Did This Happen:
  - Technical: Vulnerability, misconfiguration, missing patch
  - Process: Access control failure, missing monitoring, slow patching process
  - People: Lack of awareness, process not followed, resource constraints
- Why Wasn't It Detected Earlier:
  - Monitoring gap (no alert for this activity)
  - Alert triage (alert fired but was not prioritized)
  - Automation gap (alert required manual investigation; no one had time)
- Why Did Recovery Take Time:
  - Missing baselines (had to rebuild from scratch)
  - Unclear procedures (no playbook; had to figure it out)
  - Resource constraints (not enough engineers; had to wait for backups)
Generate Action Items:
- Prevention Controls (prevent recurrence):
  - Patch management: Ensure security patches applied within SLA
  - Monitoring: Improve detection (add new alerts, tune existing ones)
  - Access control: Implement MFA, reduce unnecessary permissions
  - Testing: Annual penetration tests to find vulnerabilities
- Detection Controls (detect earlier if prevention fails):
  - Monitoring: Lower alert thresholds, add behavioral alerting
  - Telemetry: Collect more data for analysis
  - SOC staffing: Ensure 24/7 coverage to act on alerts
- Process Improvements:
  - Incident response playbook: Document procedures for this incident type
  - Automation: Script manual tasks to reduce MTTR
  - Documentation: Update runbooks, checklists, contact lists
- Prioritization: Rank by impact and feasibility
  - Critical: Prevent similar incident (patch, access control, monitoring)
  - High: Improve detection/response
  - Medium: Process/automation improvements
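
One way to apply the prioritization above is to sort first by tier and then by an impact-times-feasibility score. This is only a sketch: the 1-5 scales, the multiplication, the `ActionItem` class, and the sample items are assumptions for illustration, not part of the skill.

```python
from dataclasses import dataclass

# Tier order taken from the prioritization list above.
TIER_ORDER = {"critical": 0, "high": 1, "medium": 2}


@dataclass
class ActionItem:
    title: str
    tier: str         # "critical", "high", or "medium"
    impact: int       # assumed scale: 1 (low) to 5 (high)
    feasibility: int  # assumed scale: 1 (hard) to 5 (easy)


def rank(items):
    """Sort by tier, then by descending impact * feasibility within a tier."""
    return sorted(
        items,
        key=lambda i: (TIER_ORDER[i.tier], -(i.impact * i.feasibility)),
    )


items = [
    ActionItem("Script host isolation", "medium", 3, 4),
    ActionItem("Enforce MFA on admin accounts", "critical", 5, 4),
    ActionItem("Add behavioral alert for lateral movement", "high", 4, 3),
    ActionItem("Patch exposed service within SLA", "critical", 5, 5),
]

ranked = [item.title for item in rank(items)]
for position, title in enumerate(ranked, start=1):
    print(position, title)
```

Keeping the tier as the primary sort key means a highly feasible medium-tier item never jumps ahead of a critical one.
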
Track & Verify Action Items:
- Ownership: Assign owner to each action item
- Timeline: When will action be completed?
- Tracking: Document in Jira, a spreadsheet, or another project management system
- Verification: Owner provides evidence that action was completed
  - Code committed, alert configured, process documented, training completed
- Effectiveness: 30-90 days after implementation, verify action is effective
  - Are patch delays reduced? Is detection faster? Are similar incidents prevented?
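
The tracking fields above (owner, timeline, evidence, effectiveness check) can be captured in a small record. The sketch below is hypothetical: `TrackedAction` and its methods are illustrative names, and a real team would more likely map these fields onto its ticketing system than run standalone code.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional


@dataclass
class TrackedAction:
    title: str
    owner: str
    due: date
    evidence: list = field(default_factory=list)
    completed_on: Optional[date] = None

    def complete(self, on, evidence):
        """Mark the action done; completion without evidence is rejected."""
        if not evidence:
            raise ValueError("completion requires evidence")
        self.completed_on = on
        self.evidence.append(evidence)

    def is_overdue(self, today):
        return self.completed_on is None and today > self.due

    def effectiveness_window(self):
        """Return the 30-90 day window after completion for the follow-up check."""
        if self.completed_on is None:
            return None
        return (
            self.completed_on + timedelta(days=30),
            self.completed_on + timedelta(days=90),
        )


action = TrackedAction("Add alert for failed admin logins", "soc-lead", date(2025, 2, 1))
overdue_before = action.is_overdue(date(2025, 2, 2))
action.complete(date(2025, 2, 3), "alert rule reference recorded in change ticket")
window = action.effectiveness_window()
print(overdue_before, window)
```

Requiring evidence at completion time mirrors the verification step, and the computed window gives the owner a concrete date range for the effectiveness review.
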
Share Lessons Learned:
- Internal Communication: Share findings with all staff
  - Email summary, team meeting discussion, internal wiki documentation
- External: Consider sharing an anonymized write-up (e.g., Reddit, blogs) to help the community
- Training: Incorporate lessons into security awareness training
- Metrics: Track improvements over time (MTTD, MTTR, incident count trend)
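
Tracking MTTD, MTTR, and incident count over time can be as simple as grouping per-incident durations by period. The sketch below assumes quarterly grouping and uses made-up sample records; the `quarterly_trend` helper is hypothetical.

```python
from statistics import mean

# Hypothetical per-incident records: (quarter, hours to detect, hours to recover).
incidents = [
    ("2024-Q3", 10.0, 30.0),
    ("2024-Q3", 6.0, 18.0),
    ("2024-Q4", 4.0, 12.0),
    ("2024-Q4", 2.0, 8.0),
    ("2024-Q4", 3.0, 10.0),
]


def quarterly_trend(incidents):
    """Group incidents by quarter and compute count, MTTD, and MTTR in hours."""
    by_quarter = {}
    for quarter, detect_h, recover_h in incidents:
        by_quarter.setdefault(quarter, []).append((detect_h, recover_h))
    return {
        quarter: {
            "count": len(rows),
            "mttd_h": mean(row[0] for row in rows),
            "mttr_h": mean(row[1] for row in rows),
        }
        for quarter, rows in sorted(by_quarter.items())
    }


trend = quarterly_trend(incidents)
for quarter, stats in trend.items():
    print(quarter, stats)
```

A falling MTTD or MTTR alongside a rising incident count can still be a good sign if better detection is surfacing incidents that were previously missed, so the three numbers are best read together.
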

Anti-Patterns

- Blame-focused review (asking "who messed up"); focus on systems, not people
- No action items (analysis without improvement); a lessons-learned session must result in changes
- Action items never completed (not tracked); assign ownership, track, and verify completion
- Focusing only on technical causes (ignoring process and people); many issues are process-related
- No follow-up on effectiveness (never verifying that improvements worked); measure to confirm impact
