From agent-almanac
Investigates root causes of compliance deviations and develops CAPAs using 5-Why, fishbone, and fault tree analysis. Use for audit findings, validated system incidents, regulatory observations, and data integrity issues.
`npx claudepluginhub pjt222/agent-almanac`

This skill uses the workspace's default tool permissions.
---
Conduct a structured root cause investigation and develop effective corrective and preventive actions for compliance deviations.
# Root Cause Investigation
## Document ID: RCA-[CAPA-ID]
## CAPA Reference: CAPA-[YYYY]-[NNN]
### 1. Trigger
| Field | Value |
|-------|-------|
| Source | [Audit finding / Deviation / Inspection observation / Monitoring alert] |
| Reference | [Finding ID, deviation ID, or observation number] |
| System | [Affected system name and version] |
| Date discovered | [YYYY-MM-DD] |
| Severity | [Critical / Major / Minor] |
| Investigator | [Name, Title] |
| Investigation deadline | [Date — per severity: Critical 15 days, Major 30 days, Minor 60 days] |
### 2. Problem Statement
[Objective, factual description of what happened, what should have happened, and the gap between the two. No blame, no assumptions.]
### 3. Immediate Containment (if required)
| Action | Owner | Completed |
|--------|-------|-----------|
| [e.g., Restrict system access pending investigation] | [Name] | [Date] |
| [e.g., Quarantine affected batch records] | [Name] | [Date] |
| [e.g., Implement manual workaround] | [Name] | [Date] |
**Expected:** Investigation initiated with clear problem statement and containment actions within 24 hours for critical findings.

**On failure:** If containment cannot be implemented immediately, escalate to QA Director and document the risk of delayed containment.
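
The severity-based deadlines in the Trigger table (Critical 15 days, Major 30 days, Minor 60 days) can be derived from the discovery date. The sketch below is illustrative only: the function name `investigation_deadline` is hypothetical, and it assumes the limits are calendar days counted from the discovery date, which the template does not specify.

```python
from datetime import date, timedelta

# Day limits taken from the Trigger table.
# Assumption: calendar days, counted from the date of discovery.
DEADLINE_DAYS = {"Critical": 15, "Major": 30, "Minor": 60}


def investigation_deadline(discovered: date, severity: str) -> date:
    """Return the investigation deadline for a finding of the given severity."""
    if severity not in DEADLINE_DAYS:
        raise ValueError(f"Unknown severity: {severity!r}")
    return discovered + timedelta(days=DEADLINE_DAYS[severity])


print(investigation_deadline(date(2024, 3, 1), "Critical"))  # 2024-03-16
```

If your quality system counts working days instead, the lookup table stays the same and only the date arithmetic changes.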
Choose the method based on problem complexity:
### Investigation Method Selection
| Method | Best For | Complexity | Output |
|--------|----------|-----------|--------|
| **5-Why Analysis** | Single-cause problems, straightforward failures | Low | Linear cause chain |
| **Fishbone (Ishikawa)** | Multi-factor problems, process failures | Medium | Cause-and-effect diagram |
| **Fault Tree Analysis** | System failures, safety-critical events | High | Boolean logic tree |
**Selected method:** [5-Why / Fishbone / Fault Tree / Combination]
**Rationale:** [Why this method is appropriate for this problem]
**Expected:** The selected method matches the problem complexity: don't use a fault tree for a simple procedural error, and don't use 5-Why for a complex systemic failure.

**On failure:** If the first method does not reach a convincing root cause, apply a second method. Convergence across methods strengthens the conclusion.
### 5-Why Analysis
| Level | Question | Answer | Evidence |
|-------|----------|--------|----------|
| Why 1 | Why did [the problem] occur? | [Immediate cause] | [Evidence reference] |
| Why 2 | Why did [immediate cause] occur? | [Contributing factor] | [Evidence reference] |
| Why 3 | Why did [contributing factor] occur? | [Deeper cause] | [Evidence reference] |
| Why 4 | Why did [deeper cause] occur? | [Systemic cause] | [Evidence reference] |
| Why 5 | Why did [systemic cause] occur? | [Root cause] | [Evidence reference] |
**Root cause:** [Clear statement of the fundamental cause]
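
Each row of the 5-Why table needs an evidence reference before the chain can support a root cause. One way to check this is sketched below. The `WhyStep` class, the `unsupported_levels` helper, and the sample chain are all hypothetical illustrations, not part of the template.

```python
from dataclasses import dataclass


@dataclass
class WhyStep:
    question: str
    answer: str
    evidence: str  # evidence reference; empty string if none recorded yet


def unsupported_levels(chain: list[WhyStep]) -> list[int]:
    """Return the 1-based Why levels that have no evidence reference."""
    return [
        level
        for level, step in enumerate(chain, start=1)
        if not step.evidence.strip()
    ]


# Illustrative chain only; the answers and references are made up.
chain = [
    WhyStep("Why was the audit trail disabled?",
            "An administrator switched it off", "configuration log extract"),
    WhyStep("Why could an administrator switch it off?",
            "The admin role includes that permission", ""),
]

print(unsupported_levels(chain))  # [2]
```

A non-empty result means the chain still rests on an assumption at those levels, so the final answer should not yet be recorded as the root cause.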
### Fishbone Analysis
Analyse causes across six standard categories:
| Category | Potential Causes | Confirmed? | Evidence |
|----------|-----------------|------------|----------|
| **People** | Inadequate training, unfamiliarity with SOP, staffing shortage | [Y/N] | [Ref] |
| **Process** | SOP unclear, missing step, wrong sequence | [Y/N] | [Ref] |
| **Technology** | System misconfiguration, software bug, interface failure | [Y/N] | [Ref] |
| **Materials** | Incorrect input data, wrong version of reference document | [Y/N] | [Ref] |
| **Measurement** | Wrong metric, inadequate monitoring, missed threshold | [Y/N] | [Ref] |
| **Environment** | Organisational change, regulatory change, resource constraints | [Y/N] | [Ref] |
**Contributing causes:** [List confirmed causes]
**Root cause(s):** [The fundamental cause(s) — may be more than one]
### Fault Tree Analysis
**Top event:** [The undesired event]
Level 1 (OR gate — any of these could cause the top event):
├── [Cause A]
│ Level 2 (AND gate — both needed):
│ ├── [Sub-cause A1]
│ └── [Sub-cause A2]
├── [Cause B]
│ Level 2 (OR gate):
│ ├── [Sub-cause B1]
│ └── [Sub-cause B2]
└── [Cause C]
**Minimal cut sets:** [Smallest combinations of events that cause the top event]
**Root cause(s):** [Fundamental failures identified in the tree]
**Expected:** Root cause analysis reaches the fundamental cause (not just the symptom) with supporting evidence for each step.

**On failure:** If the analysis produces only symptoms ("user made an error"), push deeper. Ask: "Why was the user able to make that error? What control should have prevented it?"
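
For the fault tree, the minimal cut sets can be worked out mechanically from the gate structure. The sketch below uses a simple top-down expansion and assumes a hypothetical tree encoding (nested tuples of `("AND" | "OR", [children])` with strings for basic events). It mirrors the placeholder tree in the template, reading Cause A as requiring both A1 and A2, and Cause B as requiring either B1 or B2.

```python
from itertools import product


def cut_sets(node):
    """Return all cut sets of a fault tree node as a set of frozensets.

    A node is either a basic-event name (str) or a tuple
    (gate, children) where gate is "AND" or "OR".
    """
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any cut set of any child triggers an OR gate.
        return set().union(*child_sets)
    if gate == "AND":
        # An AND gate needs one cut set from every child at the same time.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"Unknown gate: {gate!r}")


def minimal_cut_sets(node):
    """Drop every cut set that strictly contains another cut set."""
    sets = cut_sets(node)
    return {s for s in sets if not any(other < s for other in sets)}


# Placeholder tree from the template: top event = A OR B OR C.
tree = ("OR", [("AND", ["A1", "A2"]), ("OR", ["B1", "B2"]), "C"])

print(sorted(sorted(s) for s in minimal_cut_sets(tree)))
# [['A1', 'A2'], ['B1'], ['B2'], ['C']]
```

Single-event cut sets such as `['B1']` are single points of failure and are usually the first candidates for corrective action. The full expansion grows quickly on large trees, so this approach suits the small trees typical of a compliance investigation.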
Distinguish clearly between correction, corrective action, and preventive action:
### CAPA Plan
| Category | Definition | Action | Owner | Deadline |
|----------|-----------|--------|-------|----------|
| **Correction** | Fix the immediate problem | [e.g., Re-enable audit trail for batch module] | [Name] | [Date] |
| **Corrective Action** | Eliminate the root cause | [e.g., Remove admin ability to disable audit trail; require change control for all audit trail configuration changes] | [Name] | [Date] |
| **Preventive Action** | Prevent recurrence in other areas | [e.g., Audit all systems for audit trail disable capability; add monitoring alert for audit trail configuration changes] | [Name] | [Date] |
### CAPA Details
**CAPA-[YYYY]-[NNN]-CA1: [Corrective Action Title]**
- **Root cause addressed:** [Specific root cause identified in the root cause analysis]
- **Action description:** [Detailed description of what will be done]
- **Success criteria:** [Measurable outcome that proves the action worked]
- **Verification method:** [How effectiveness will be checked]
- **Verification date:** [When effectiveness will be verified — typically 3-6 months after implementation]
**CAPA-[YYYY]-[NNN]-PA1: [Preventive Action Title]**
- **Risk addressed:** [What recurrence or spread this prevents]
- **Action description:** [Detailed description]
- **Success criteria:** [Measurable outcome]
- **Verification method:** [How effectiveness will be checked]
- **Verification date:** [Date]
**Expected:** Every CAPA action traces to a specific root cause, has measurable success criteria, and includes an effectiveness verification plan.

**On failure:** If success criteria are vague ("improve compliance"), rewrite them to be specific and measurable ("zero audit trail configuration changes outside change control for 6 consecutive months").
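
A measurable criterion such as "zero audit trail configuration changes outside change control for 6 consecutive months" can be evaluated directly against monthly counts. The following is a minimal sketch. The function name `zero_for_consecutive_months` is hypothetical, and it assumes the counts are ordered oldest to newest with one entry per month.

```python
def zero_for_consecutive_months(monthly_counts: list[int], months: int = 6) -> bool:
    """True if the most recent `months` entries are all zero.

    Returns False when fewer than `months` entries are available,
    because the criterion cannot be demonstrated yet.
    """
    if len(monthly_counts) < months:
        return False
    return all(count == 0 for count in monthly_counts[-months:])


# Illustrative counts only: unauthorised changes per month, oldest first.
print(zero_for_consecutive_months([2, 1, 0, 0, 0, 0, 0, 0]))  # True
print(zero_for_consecutive_months([0, 0, 0, 0, 0, 1]))        # False
print(zero_for_consecutive_months([0, 0, 0]))                 # False
```

A criterion that cannot be reduced to a check of this kind is probably still too vague to verify.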
After CAPA implementation, verify that the actions actually worked:
### Effectiveness Verification
**CAPA-[YYYY]-[NNN] — Verification Record**
| CAPA Action | Verification Date | Method | Evidence | Result |
|-------------|------------------|--------|----------|--------|
| CA1: [Action] | [Date] | [Method: audit, sampling, metric review] | [Evidence reference] | [Effective / Not Effective] |
| PA1: [Action] | [Date] | [Method] | [Evidence reference] | [Effective / Not Effective] |
### Effectiveness Criteria Check
- [ ] The original problem has not recurred since CAPA implementation
- [ ] The corrective action eliminated the root cause (evidence: [reference])
- [ ] The preventive action has been applied to similar systems/processes
- [ ] No new issues were introduced by the CAPA actions
### CAPA Closure
| Field | Value |
|-------|-------|
| Closure decision | [Closed — Effective / Closed — Not Effective / Extended] |
| Closed by | [Name, Title] |
| Closure date | [YYYY-MM-DD] |
| Next review | [If recurring, when to re-check] |
**Expected:** Effectiveness verification demonstrates that the root cause was actually eliminated, not just that the action was completed.

**On failure:** If verification shows the CAPA was not effective, reopen the investigation and develop revised actions. Do not close an ineffective CAPA.
### CAPA Trend Analysis
| Period | Total CAPAs | By Source | Top 3 Root Cause Categories | Recurring? |
|--------|------------|-----------|---------------------------|------------|
| Q1 20XX | [N] | Audit: [n], Deviation: [n], Monitoring: [n] | [Cat1], [Cat2], [Cat3] | [Y/N] |
| Q2 20XX | [N] | Audit: [n], Deviation: [n], Monitoring: [n] | [Cat1], [Cat2], [Cat3] | [Y/N] |
### Systemic Issues
| Issue | Frequency | Systems Affected | Recommended Action |
|-------|-----------|-----------------|-------------------|
| [e.g., Training gaps] | [N occurrences in 12 months] | [Systems] | [Systemic programme improvement] |
**Expected:** Trend analysis identifies systemic issues that individual CAPAs miss.

**On failure:** If trending reveals recurring root causes despite CAPAs, the CAPAs are treating symptoms. Escalate to management review for systemic intervention.
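
The trend table can be filled in from a flat list of closed CAPAs. The sketch below shows one possible approach. The record fields (`period`, `source`, `root_cause_category`), the sample records, and the rule that a category is "recurring" when it appears in at least two distinct periods are all assumptions made for illustration.

```python
from collections import Counter

# Illustrative records only; not real CAPA data.
capas = [
    {"period": "Q1", "source": "Audit", "root_cause_category": "Training"},
    {"period": "Q1", "source": "Deviation", "root_cause_category": "Process"},
    {"period": "Q2", "source": "Monitoring", "root_cause_category": "Training"},
    {"period": "Q2", "source": "Audit", "root_cause_category": "Technology"},
    {"period": "Q2", "source": "Audit", "root_cause_category": "Training"},
]


def counts_by_source(records, period):
    """Count CAPAs per source for one reporting period."""
    return Counter(r["source"] for r in records if r["period"] == period)


def top_categories(records, n=3):
    """Return the n most frequent root cause categories with their counts."""
    return Counter(r["root_cause_category"] for r in records).most_common(n)


def recurring_categories(records, min_periods=2):
    """Categories seen as a root cause in at least `min_periods` distinct periods."""
    periods = {}
    for r in records:
        periods.setdefault(r["root_cause_category"], set()).add(r["period"])
    return sorted(cat for cat, seen in periods.items() if len(seen) >= min_periods)


print(counts_by_source(capas, "Q2")["Audit"])  # 2
print(top_categories(capas)[0])                # ('Training', 3)
print(recurring_categories(capas))             # ['Training']
```

A category returned by `recurring_categories` is a candidate row for the Systemic Issues table.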
- `conduct-gxp-audit`: audits generate findings that require CAPAs
- `monitor-data-integrity`: monitoring detects anomalies that trigger investigations
- `manage-change-control`: CAPA-driven changes go through change control
- `prepare-inspection-readiness`: open and overdue CAPAs are top inspection targets
- `design-training-program`: when root cause is training-related, improve the training programme