Performance profiling -- CPU, memory, concurrency, benchmarking with statistical rigor.
Triggers: /godmode:perf, "profile this", "find bottleneck"

# Node.js CPU profile
node --cpu-prof --cpu-prof-dir=./profiles app.js
# Go CPU profile
go tool pprof http://localhost:6060/debug/pprof/profile
# Python memory (tracing must start inside the target process)
python -X tracemalloc app.py
Target: <application/function>
Symptom: <slow response, high CPU, OOM, hang>
Language: <Node.js|Python|Go|Rust|Java>
For each hotspot found:
Function: <name> at <file>:<line>
CPU share: <N>% of total
Root cause: <why slow>
Remediation: <optimized code>
Expected improvement: <N>% reduction
IF hotspot >10% CPU: must address. IF hotspot <5% CPU: diminishing returns.
Tools: Node.js (--heap-prof), Python (tracemalloc), Go (pprof), Rust (DHAT), Java (JFR+JMC).
Pattern: baseline -> load -> analyze growth.
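The baseline -> load -> analyze pattern can be sketched with Python's stdlib tracemalloc; `handle_request` and `store` are hypothetical stand-ins for the real workload:

```python
import tracemalloc

def handle_request(store):
    # Hypothetical workload that retains data on every request.
    store.append("x" * 10_000)

tracemalloc.start()
baseline = tracemalloc.take_snapshot()          # 1. baseline

store = []
for _ in range(1_000):                          # 2. load
    handle_request(store)

loaded = tracemalloc.take_snapshot()
growth = loaded.compare_to(baseline, "lineno")  # 3. analyze growth
for stat in growth[:3]:
    print(stat)  # top allocation sites by net growth, with file:line
```

If the top entries keep growing as load repeats, that is the linear-growth signal the rule below treats as a confirmed leak.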
For each finding:
Type: LEAK | EXCESSIVE ALLOCATION | UNBOUNDED CACHE
Source: <file>:<line>
Growth rate: <MB per hour/request>
Retention chain: <root -> ... -> leaked object>
IF growth is linear under load: confirmed leak. IF cache grows without bound: add max size + eviction.
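The "max size + eviction" fix for an unbounded cache can be sketched as a stdlib-only LRU (`BoundedCache` is an illustrative name, not a library API):

```python
from collections import OrderedDict

class BoundedCache:
    """LRU cache with a hard size cap: insert beyond max_size evicts oldest."""
    def __init__(self, max_size=1024):
        self.max_size = max_size
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key in self._data:
            self._data.move_to_end(key)      # mark as recently used
            return self._data[key]
        return default

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)   # evict least recently used

cache = BoundedCache(max_size=2)
cache.put("a", 1); cache.put("b", 2); cache.put("c", 3)  # "a" is evicted
```

In production code, functools.lru_cache(maxsize=N) gives the same bound for pure-function memoization.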
# Go race detector
go test -race ./...
# C/C++ thread sanitizer
gcc -g -fsanitize=thread -o app app.c && ./app
Common patterns: counter without atomic, lazy init without sync, collection modified during iteration, read-modify-write without transaction.
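The "counter without atomic" pattern and its fix, in a minimal Python sketch (the unsafe `value += 1` is a read-modify-write that can lose updates under threads):

```python
import threading

class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def incr_unsafe(self):
        self.value += 1      # read-modify-write: two threads can interleave

    def incr_safe(self):
        with self._lock:     # lock makes the read-modify-write atomic
            self.value += 1

c = Counter()
threads = [threading.Thread(target=lambda: [c.incr_safe() for _ in range(10_000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(c.value)  # 40000 with the lock; incr_unsafe can print less
```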
For each finding:
Type: RACE | DEADLOCK | LIVELOCK | STARVATION
Severity: CRITICAL | HIGH | MEDIUM
Code path: <file>:<line>
Reproducibility: always | under load | intermittent
Deadlock prevention: consistent lock ordering, try-lock with timeout, channels over shared memory.
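Consistent lock ordering can be sketched as follows (`Account` and `transfer` are hypothetical names; ordering by `id()` is one arbitrary-but-global total order):

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Always acquire the lock with the smaller id() first, so concurrent
    # A->B and B->A transfers agree on lock order and cannot deadlock.
    first, second = sorted((src, dst), key=id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

a, b = Account(100), Account(0)
transfer(a, b, 30)
transfer(b, a, 10)
```

Any stable key works (account number, memory address) as long as every code path uses the same one.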
Protocol:
1. Dedicated hardware, disable frequency scaling
2. Warm-up iterations + measurement iterations
3. Report: mean, median, p95, p99, stddev, 95% CI
4. CV > 10% = unstable environment
5. Comparison: interleaved runs, Welch's t-test; p < 0.05 = statistically significant
IF benchmark variance >10%: fix environment first. NEVER benchmark on shared CI runners for final results. NEVER skip warm-up for JIT languages (Node, Java).
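The protocol's statistics can be sketched with the stdlib only (the normal-approximation CI assumes a reasonably large sample; converting the Welch t statistic to a p-value needs a t-table or scipy):

```python
import math
import statistics

def summarize(samples):
    """Mean, median, p95, p99, stddev, ~95% CI, and CV for one benchmark run."""
    n = len(samples)
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    s = sorted(samples)
    pct = lambda p: s[min(n - 1, math.ceil(p * n) - 1)]
    half = 1.96 * stdev / math.sqrt(n)  # ~95% CI half-width (large-n approx.)
    return {"mean": mean, "median": statistics.median(samples),
            "p95": pct(0.95), "p99": pct(0.99), "stddev": stdev,
            "ci95": (mean - half, mean + half), "cv": stdev / mean}

def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two unpaired samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df
```

The CV check maps directly to the rule above: reject the run if `summarize(samples)["cv"] > 0.10`.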
CPU: top 3 hotspots with % and file:line
Memory: leaks found, growth rate, peak RSS
Concurrency: races, deadlock risks, contention
Benchmarks: values with 95% CI
Priority fixes: ordered by impact
Append to .godmode/perf-results.tsv:
timestamp module finding_type location severity metric_before metric_after status
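Appending a row in that column order can be sketched as (`append_result` is a hypothetical helper):

```python
import datetime
from pathlib import Path

def append_result(path, module, finding_type, location, severity,
                  metric_before, metric_after, status):
    """Append one tab-separated row; creates .godmode/ and the file if missing."""
    row = [datetime.datetime.now(datetime.timezone.utc).isoformat(),
           module, finding_type, location, severity,
           str(metric_before), str(metric_after), status]
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    with p.open("a") as f:
        f.write("\t".join(row) + "\n")
```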
KEEP if: improvement is statistically significant AND no regression AND tests pass.
DISCARD if: no significant improvement OR regression.
Never keep optimization without measured evidence.
STOP when FIRST of:
- All hotspots >10% CPU addressed
- Memory stable under sustained load
- Zero races detected by sanitizer
- 3 consecutive fixes < 5% improvement
On failure: git reset --hard HEAD~1. Never pause.
| Failure | Action |
|---|---|
| No clear hotspot | Check sampling rate, profile under load |
| Optimization regresses | Measure all metrics, revert if needed |
| Noisy benchmarks | Pin CPU, disable turbo, dedicated HW |