This skill should be used when the user asks to "perform root cause analysis", "investigate production issue", "analyze incident", "find root cause", "debug production error", "trace the cause", or mentions investigating production problems, alerts, or outages. Provides systematic RCA methodology and investigation workflows.
/plugin marketplace add evangelosmeklis/thufir
/plugin install thufir@thufir
This skill inherits all available tools. When active, it can use any tool Claude has access to.
examples/rca-report-template.md
references/investigation-checklist.md
references/rca-patterns.md
Root cause analysis (RCA) is a systematic investigation process to identify the underlying cause of production incidents, errors, and outages. This skill provides structured methodologies for conducting effective RCA that goes beyond surface-level symptoms to find actionable root causes.
Apply this skill when:
Establish a clear timeline of events:
Create a visual timeline connecting:
Distinguish between symptoms and root causes:
Symptoms are observable effects:
Root causes are underlying reasons:
Always trace from symptoms to root causes by asking "why?" repeatedly.
Ask "why?" five times to drill down from symptom to root cause:
Example:
Root cause: Configuration change in commit abc123 reduced pool size inappropriately.
Base conclusions on evidence:
Avoid speculation—validate hypotheses with data.
Collect the triggering incident data:
Determine:
Construct chronological timeline:
Identify relevant code:
Focus on:
Use git to find recent changes to relevant code:
Prioritize commits made shortly before incident started.
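As a minimal sketch of that prioritization, the helper below builds a `git log` invocation limited to the hours before the incident. The function name, the 24-hour lookback, and the incident timestamp are illustrative assumptions, not part of the skill; `--since`, `--until`, and `--format` are standard `git log` options and filter on committer date.

```python
from datetime import datetime, timedelta, timezone


def recent_commits_command(incident_start, paths, lookback_hours=24):
    """Build a `git log` command for commits touching `paths` in the
    `lookback_hours` before `incident_start` (hypothetical helper)."""
    since = incident_start - timedelta(hours=lookback_hours)
    return [
        "git", "log",
        f"--since={since.isoformat()}",
        f"--until={incident_start.isoformat()}",
        "--format=%h %cI %an %s",  # short hash, commit date, author, subject
        "--", *paths,
    ]


# Hypothetical incident start time, used only for illustration.
incident_start = datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc)
cmd = recent_commits_command(incident_start, ["path/to/file.js"])
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` inside a repository. Widen `lookback_hours` if nothing suspicious shows up in the first window.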
Connect code changes to incident timeline:
Look for temporal correlation between changes and symptoms.
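One way to make temporal correlation concrete is to rank changes by how soon before the incident they landed. The sketch below assumes change events are already collected as `(timestamp, label)` pairs. The six-hour window, the timestamps, and the event labels are hypothetical; only the commit id `abc123` is borrowed from the Five Whys example.

```python
from datetime import datetime, timedelta


def changes_before_incident(changes, incident_start, window=timedelta(hours=6)):
    """Return (lead_time, label) pairs for changes that landed within
    `window` before the incident, closest to the incident first."""
    candidates = []
    for when, label in changes:
        lead = incident_start - when
        if timedelta(0) <= lead <= window:
            candidates.append((lead, label))
    return sorted(candidates)


# Hypothetical events for illustration only.
incident_start = datetime(2024, 1, 15, 14, 30)
changes = [
    (datetime(2024, 1, 14, 9, 0), "commit def456: update README"),
    (datetime(2024, 1, 15, 14, 5), "commit abc123: change pool size"),
    (datetime(2024, 1, 15, 14, 10), "deploy of abc123 to production"),
    (datetime(2024, 1, 15, 15, 0), "rollback"),
]

suspects = changes_before_incident(changes, incident_start)
for lead, label in suspects:
    print(f"{lead} before incident: {label}")
```

A short lead time is only a hint. Each candidate still has to be validated against logs and metrics before it is named as the root cause.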
Synthesize findings to pinpoint root cause:
Ensure root cause is:
Validate the identified root cause:
Create RCA report including:
See examples/rca-report-template.md for report structure.
When analyzing error messages:
Example:
Error: ConnectionPoolExhausted: Could not acquire connection within timeout
Search for: ConnectionPoolExhausted or Could not acquire connection
Find: Connection pool configuration and usage
Trace: Recent changes to pool size or connection usage patterns
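The search step in this example can be sketched as a small helper that splits an error line into its type and the leading words of its message. The `Error: Type: message` layout and the four-word cutoff are assumptions made for illustration; real error formats vary, so treat this as a starting point.

```python
def search_terms(error_line, max_words=4):
    """Derive code-search terms from an error line shaped like
    'Error: <Type>: <message>' (assumed layout, hypothetical helper)."""
    parts = [p.strip() for p in error_line.split(":")]
    if parts and parts[0].lower() == "error":
        parts = parts[1:]  # drop the generic "Error" prefix
    terms = []
    if parts and parts[0]:
        terms.append(parts[0])  # the error type, e.g. an exception class
    message = ": ".join(parts[1:])
    if message:
        # Keep only the leading words; trailing details are often interpolated.
        terms.append(" ".join(message.split()[:max_words]))
    return terms


line = "Error: ConnectionPoolExhausted: Could not acquire connection within timeout"
terms = search_terms(line)
print(terms)
```

Each term can then be fed to a code search such as `git grep -n "<term>"` to locate where the error is raised and where the pool is configured.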
Git blame identifies when lines were last changed:
git blame path/to/file.js
Focus on:
Cross-reference blame timestamps with incident timeline.
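The cross-referencing step might look like the sketch below, which reads the `author-time` fields from `git blame --line-porcelain` output and flags commits authored shortly before the incident. The sample text is a trimmed, hand-written imitation of porcelain output; the hashes, source lines, incident time, and six-hour window are hypothetical.

```python
from datetime import datetime, timedelta, timezone

HEX = set("0123456789abcdef")


def parse_blame_times(porcelain):
    """Map commit hash -> author time (UTC) from `git blame --line-porcelain` text."""
    times = {}
    current = None
    for line in porcelain.splitlines():
        if line.startswith("\t"):
            continue  # tab-prefixed lines are the blamed source lines
        fields = line.split(" ")
        if len(fields[0]) == 40 and set(fields[0]) <= HEX:
            current = fields[0]  # header line: <hash> <orig> <final> [<count>]
        elif fields[0] == "author-time" and current:
            times[current] = datetime.fromtimestamp(int(fields[1]), tz=timezone.utc)
    return times


def changed_near(times, incident_start, window=timedelta(hours=6)):
    """Return hashes whose author time falls within `window` before the incident."""
    return [h for h, t in times.items()
            if timedelta(0) <= incident_start - t <= window]


# Hand-written sample resembling porcelain output (hypothetical content).
sample = (
    "a" * 40 + " 10 10 1\n"
    "author Example Dev\n"
    "author-time 1705327500\n"
    "\tconst poolSize = 5;\n"
    + "b" * 40 + " 11 11 1\n"
    "author Example Dev\n"
    "author-time 1700000000\n"
    "\tconst timeoutMs = 3000;\n"
)

times = parse_blame_times(sample)
incident_start = datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc)
suspects = changed_near(times, incident_start)
print(suspects)
```

In practice the input would come from running `git blame --line-porcelain path/to/file.js` and capturing its standard output.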
Look for metric patterns indicating root cause:
Compare metrics before, during, and after incident.
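A simple way to run that comparison is to bucket samples by incident window and compare the averages. The sketch below assumes samples arrive as `(minute_offset, value)` pairs. The latency numbers and window bounds are made up for illustration.

```python
from statistics import mean


def summarize_windows(samples, start, end):
    """Average a metric before, during, and after the incident window
    [start, end]; returns None for a window with no samples."""
    buckets = {"before": [], "during": [], "after": []}
    for t, value in samples:
        if t < start:
            buckets["before"].append(value)
        elif t <= end:
            buckets["during"].append(value)
        else:
            buckets["after"].append(value)
    return {name: (mean(vals) if vals else None) for name, vals in buckets.items()}


# Hypothetical latency samples in ms, keyed by minutes since the first sample.
samples = [(0, 100), (5, 110), (10, 900), (15, 1100), (20, 120), (25, 100)]
summary = summarize_windows(samples, start=10, end=15)
print(summary)
```

A large gap between the "during" average and the other two supports the incident boundaries in the timeline. A metric that never returns to its "before" level suggests the fix was incomplete.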
Consider dependencies that could cause issues:
Check dependency health metrics and status pages.
This skill works in conjunction with:
The RCA agent orchestrates these skills to perform end-to-end investigation.
For detailed patterns and advanced techniques:
references/rca-patterns.md - Common incident patterns and solutions
references/investigation-checklist.md - Step-by-step investigation checklist
Working examples in examples/:
rca-report-template.md - Standard RCA report format
Five Whys: Ask "why?" five times to find root cause
Timeline: Map when issue started, what changed, when detected
Evidence: Metrics + Logs + Code + Config changes
Root Cause: Specific, actionable, validated cause (not symptom)
Report: Summary, timeline, root cause, evidence, fix
Apply this systematic methodology to transform vague production issues into clear, actionable root causes supported by evidence.