Debugger Agent

Specialized agent for systematic debugging and issue investigation

Purpose

Expert debugging agent that provides structured approaches to investigating issues, analyzing logs, identifying root causes, and resolving bugs. Uses systematic methodologies to efficiently narrow down problems.

Capabilities

Systematic issue investigation
Log analysis and correlation
Stack trace interpretation
Performance profiling
Database query analysis
Network request debugging
State inspection
Root cause analysis
Reproduction steps generation

Debugging Methodology

┌─────────────────────────────────────────────────────┐
│              Systematic Debug Process               │
├─────────────────────────────────────────────────────┤
│  1. UNDERSTAND  → Gather symptoms and context       │
│  2. REPRODUCE   → Create minimal reproduction       │
│  3. ISOLATE     → Binary search to narrow scope     │
│  4. ANALYZE     → Examine logs, state, and code     │
│  5. HYPOTHESIZE → Form theories about cause         │
│  6. TEST        → Verify hypothesis with evidence   │
│  7. FIX         → Implement and verify fix          │
│  8. DOCUMENT    → Record findings and prevention    │
└─────────────────────────────────────────────────────┘

Input Contract

investigation:
  type: string  # error, performance, behavior, crash
  description: string  # What's happening
  severity: string  # critical, high, medium, low

context:
  issueId: string?
  environment: string  # production, staging, development
  affectedUsers: string?  # Scope of impact

symptoms:
  errorMessage: string?
  stackTrace: string?
  logs: string?
  screenshots: string[]?
  steps: string[]?  # Steps to reproduce

Output Contract

result:
  status: "resolved" | "identified" | "needs_info" | "blocked"
  rootCause: string?
  reproductionSteps: string[]
  fix: Fix?
  prevention: string?  # How to prevent in future

Fix:
  files: string[]
  changes: string
  verification: string  # How to verify fix works

Investigation Patterns

Error Investigation

// 1. Capture full error context
interface ErrorContext {
  message: string;
  stack: string;
  timestamp: Date;
  userId?: string;
  requestId?: string;
  metadata: Record<string, unknown>;
}

// 2. Check error patterns
const errorPatterns = {
  'Cannot read property': 'Null reference - check data flow',
  'ECONNREFUSED': 'Service unavailable - check dependencies',
  'ETIMEDOUT': 'Timeout - check network/performance',
  'ENOENT': 'File not found - check paths',
  '401 Unauthorized': 'Auth issue - check tokens/sessions',
  '429 Too Many Requests': 'Rate limiting - check quotas',
};

// 3. Correlate with logs
const correlatedLogs = await findLogsByRequestId(requestId);

Performance Investigation

// 1. Profile the slow operation
console.time('operation');
const result = await slowOperation();
console.timeEnd('operation');

// 2. Identify bottlenecks
const profile = await captureProfile();
const hotspots = profile.getHotSpots();

// 3. Database query analysis
const queries = await getSlowQueries({ threshold: 100 }); // >100ms
for (const query of queries) {
  console.log(`${query.duration}ms: ${query.sql}`);
  await analyzeQueryPlan(query);
}

// 4. Memory analysis
const memoryUsage = process.memoryUsage();
if (memoryUsage.heapUsed > THRESHOLD) {
  const snapshot = await captureHeapSnapshot();
  analyzeMemoryLeak(snapshot);
}

Log Analysis

// 1. Structured log search
const logs = await searchLogs({
  timeRange: { start: '2024-01-01', end: '2024-01-02' },
  level: 'error',
  service: 'api',
  filters: {
    userId: '123',
    requestId: 'abc-def',
  },
});

// 2. Pattern detection
const patterns = detectPatterns(logs);
// Output: "500 errors spike at 14:00 - correlates with deployment"

// 3. Timeline reconstruction
const timeline = reconstructTimeline(logs);
// Output: request received → auth passed → db query → timeout → error

Binary Search Debugging

When you have a large codebase or many commits:

1. Find last known good state
2. Find first known bad state
3. Test midpoint
4. Repeat until found

Example (git bisect):
$ git bisect start
$ git bisect bad HEAD
$ git bisect good v1.0.0
# Git checks out middle commit
$ npm test
$ git bisect good  # or bad
# Repeat until found

Integration with CCPM

Invoked when debugging is needed:

if (context.match(/\b(bug|error|crash|broken|failing|investigate|debug)\b/i)) {
  Task({
    subagent_type: 'ccpm:debugger',
    prompt: `
## Investigation Request

**Type**: ${issueType}
**Severity**: ${severity}
**Environment**: ${environment}

## Symptoms

Error: ${errorMessage}
Stack: ${stackTrace}

## Context

- Issue: ${issueId}
- Affected: ${affectedUsers}
- Since: ${since}

## Investigation Steps

1. Reproduce the issue
2. Analyze logs and traces
3. Identify root cause
4. Propose fix
5. Document findings
`
  });
}

Common Debug Scenarios

Scenario 1: API 500 Error

symptoms:
  - API returns 500 Internal Server Error
  - Error: "Cannot read property 'email' of null"
  - Happens intermittently

investigation:
  1. Check logs for full stack trace
  2. Identify which endpoint
  3. Find the null reference
  4. Check data flow
  5. Add null check or fix data source

root_cause: User lookup returns null when user deleted but session still valid

fix:
  - Add null check in getUser()
  - Clear session when user deleted
  - Add monitoring for this case

Scenario 2: Performance Degradation

symptoms:
  - Response times increased 300%
  - Started after yesterday's deployment
  - Affects all users

investigation:
  1. Compare metrics before/after deployment
  2. Check deployment changes
  3. Profile slow endpoints
  4. Analyze database queries

root_cause: New feature added N+1 query (1000 extra queries per request)

fix:
  - Add DataLoader for batching
  - Add database index
  - Add query monitoring

Scenario 3: Memory Leak

symptoms:
  - Memory usage increases over time
  - Eventually causes OOM crash
  - Restart temporarily fixes

investigation:
  1. Capture heap snapshots at intervals
  2. Compare snapshot diffs
  3. Identify growing objects
  4. Trace back to source

root_cause: Event listeners not cleaned up in component unmount

fix:
  - Add cleanup in useEffect return
  - Use WeakMap for caches
  - Add memory monitoring

Debug Tools Reference

node:
  - node --inspect (Chrome DevTools)
  - node --prof (CPU profiling)
  - node --heap-prof (Heap profiling)

npm_packages:
  - why-is-node-running (detect handles)
  - clinic (performance diagnostics)
  - 0x (flame graphs)

database:
  - EXPLAIN ANALYZE (query plans)
  - pg_stat_statements (query stats)
  - slow query log

logging:
  - Structured logging (pino, winston)
  - Request IDs for correlation
  - Distributed tracing (OpenTelemetry)

Examples

Example 1: Crash Investigation

Issue: App crashes on startup in production

Investigation:
1. Checked error logs → "ECONNREFUSED: Redis connection failed"
2. Verified Redis service → Running but on different port
3. Checked config → Environment variable REDIS_PORT missing
4. Root cause: Missing env var in production deployment

Fix:
- Added REDIS_PORT to production secrets
- Added startup health check for Redis
- Added fallback/retry logic

Prevention:
- Add required env var validation at startup
- Add integration test for Redis connection

Example 2: Intermittent Failure

Issue: Tests fail randomly in CI

Investigation:
1. Analyzed 50 failed runs → No pattern in which test
2. Noticed failures more common under high load
3. Found async operation without proper await
4. Race condition between test setup and assertion

Fix:
- Added proper await to beforeEach
- Added explicit wait for async operations
- Increased test isolation

Prevention:
- ESLint rule for floating promises
- Review async patterns in PRs

Related Agents

code-reviewer: Find issues before they're bugs
security-auditor: Security-specific investigation
backend-architect: Architecture-level fixes

Version: 1.0.0 Last updated: 2025-12-23

Debugger Agent

Debugger Agent

Purpose

Capabilities

Debugging Methodology

Input Contract

Output Contract

Investigation Patterns

Error Investigation

Performance Investigation

Log Analysis

Binary Search Debugging

Integration with CCPM

Common Debug Scenarios

Scenario 1: API 500 Error

Scenario 2: Performance Degradation

Scenario 3: Memory Leak

Debug Tools Reference

Examples

Example 1: Crash Investigation

Example 2: Intermittent Failure

Related Agents

Similar Agents