Measure and compare code performance through systematic benchmarking, establishing baselines, and tracking performance over time.
## Benchmark Types

### Micro-benchmarks

| Focus | Use Case | Duration |
|---|---|---|
| Function | Algorithm comparison | Milliseconds |
| Operation | I/O, serialization | Milliseconds |
| Memory | Allocation patterns | Per-operation |

### Macro-benchmarks

| Focus | Use Case | Duration |
|---|---|---|
| API Endpoint | Request/response cycle | Seconds |
| Workflow | End-to-end process | Seconds-minutes |
| Batch Job | Data processing | Minutes-hours |

### Load Tests

| Type | Pattern | Purpose |
|---|---|---|
| Stress | Ramp to failure | Find limits |
| Soak | Sustained load | Stability |
| Spike | Sudden increase | Resilience |
| Breakpoint | Incremental increase | Capacity |

## Benchmark Best Practices
### Isolation
- Warm up JIT/cache before measuring
- Isolate from other processes
- Control for GC interference
- Use consistent environment
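
The warm-up point above can be illustrated with a small timing loop. This is a minimal sketch, not part of the skill itself: the `measure` helper, its parameter names, and the injectable `Clock` are all hypothetical, and it assumes a runtime that exposes the global `performance.now()` (recent Node.js, Deno, Bun, or a browser).

```typescript
// Hypothetical helper: names and defaults are illustrative only.
type Clock = () => number; // returns a timestamp in milliseconds

async function measure(
  target: () => Promise<void> | void,
  iterations: number,
  warmupIterations: number,
  now: Clock = () => performance.now(),
): Promise<number[]> {
  // Warm-up runs give the JIT and caches a chance to settle.
  // Their timings are deliberately discarded.
  for (let i = 0; i < warmupIterations; i++) {
    await target();
  }

  // Measured runs: record the elapsed time of each call separately
  // so percentiles can be computed later.
  const samples: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = now();
    await target();
    samples.push(now() - start);
  }
  return samples;
}
```

Passing the clock as a parameter is a design choice that keeps the loop testable with a fake timer; in real runs the default is enough.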
### Statistical Rigor
- Run sufficient iterations (n > 100 for microbenchmarks)
- Report percentiles, not just averages
- Calculate standard deviation
- Detect and handle outliers
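
As a rough illustration of these points, the sketch below summarizes a list of timings. It assumes the nearest-rank percentile method, the sample standard deviation (n - 1 denominator), and Tukey's 1.5 × IQR rule for outliers; these are common conventions chosen for the example, not ones the skill prescribes. The `Summary`, `percentile`, and `summarize` names are hypothetical.

```typescript
interface Summary {
  n: number;
  mean: number;
  stdDev: number;
  p50: number;
  p95: number;
  p99: number;
  outliers: number[];
}

// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) throw new Error("no samples");
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

function summarize(samples: number[]): Summary {
  const sorted = [...samples].sort((a, b) => a - b);
  const n = sorted.length;
  const mean = sorted.reduce((sum, x) => sum + x, 0) / n;

  // Sample standard deviation; defined as 0 for a single sample.
  const variance =
    n > 1 ? sorted.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (n - 1) : 0;

  // Tukey fences: flag values more than 1.5 * IQR outside the quartiles.
  const q1 = percentile(sorted, 25);
  const q3 = percentile(sorted, 75);
  const iqr = q3 - q1;
  const outliers = sorted.filter(
    (x) => x < q1 - 1.5 * iqr || x > q3 + 1.5 * iqr,
  );

  return {
    n,
    mean,
    stdDev: Math.sqrt(variance),
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    outliers,
  };
}
```

The sketch only flags outliers and leaves the decision to drop or keep them to the caller, since silently discarding slow samples can hide real tail latency.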
### Reproducibility
- Document environment precisely
- Use deterministic inputs
- Version control benchmarks
- Automate execution
```typescript
// Benchmark specification
interface BenchmarkSpec {
  name: string;
  description: string;

  // Setup
  setup: () => Promise<void>;
  teardown: () => Promise<void>;

  // Target
  target: () => Promise<void>;

  // Configuration
  iterations: number;
  warmupIterations: number;
  timeout: number;

  // Assertions
  assertions: {
    maxMean: number;
    maxP95: number;
    maxP99: number;
    maxMemory: number;
  };
}
```
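
One way the `assertions` block of the specification might be evaluated is sketched below. The `Assertions` shape mirrors the fields above; the `Measured` type and the `checkAssertions` function are hypothetical, and the units (milliseconds for timings, megabytes for memory) are an assumption based on the sample report rather than something the interface states.

```typescript
// Mirrors BenchmarkSpec["assertions"]; units are assumed (ms and MB).
interface Assertions {
  maxMean: number;
  maxP95: number;
  maxP99: number;
  maxMemory: number;
}

// Hypothetical shape for the measured values of one benchmark run.
interface Measured {
  mean: number;
  p95: number;
  p99: number;
  peakMemory: number;
}

// Returns one message per violated limit; an empty array means PASS.
function checkAssertions(measured: Measured, limits: Assertions): string[] {
  const failures: string[] = [];
  const check = (label: string, actual: number, max: number): void => {
    // Strict "less than", matching the "Mean < 20ms" style of the report.
    if (!(actual < max)) {
      failures.push(`${label}: ${actual} is not below ${max}`);
    }
  };

  check("mean", measured.mean, limits.maxMean);
  check("p95", measured.p95, limits.maxP95);
  check("p99", measured.p99, limits.maxP99);
  check("peakMemory", measured.peakMemory, limits.maxMemory);
  return failures;
}

// Values taken from the processOrder example in the sample report.
const failures = checkAssertions(
  { mean: 12.3, p95: 18.2, p99: 24.1, peakMemory: 4.8 },
  { maxMean: 20, maxP95: 25, maxP99: 50, maxMemory: 10 },
);
console.log(failures.length === 0 ? "PASS" : failures.join("\n"));
```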
## Benchmark Report
**Suite:** {{suite_name}}
**Date:** {{date}}
**Environment:** {{environment}}
**Commit:** {{commit_hash}}
### Environment Details
| Property | Value |
|----------|-------|
| CPU | {{cpu_model}} |
| Cores | {{cpu_cores}} |
| Memory | {{total_memory}} |
| OS | {{os_version}} |
| Runtime | {{runtime_version}} |
---
### Results Summary
| Benchmark | Mean | Median | P95 | P99 | Status |
|-----------|------|--------|-----|-----|--------|
| processOrder | 12.3ms | 11.8ms | 18.2ms | 24.1ms | ✅ PASS |
| searchProducts | 45.6ms | 42.1ms | 78.3ms | 112.4ms | ⚠️ WARN |
| generateReport | 234ms | 228ms | 312ms | 456ms | ❌ FAIL |
---
### Detailed Results
#### processOrder
**Configuration:**
- Iterations: 10,000
- Warmup: 1,000
- Timeout: 5,000ms
**Timing Distribution:**
```
0-10ms  |████████████████████████████████████ 72%
10-20ms |████████████ 24%
20-30ms |██ 3%
30-40ms |▏ 0.8%
40ms+   |▏ 0.2%
```
**Statistics:**
| Metric | Value |
|--------|-------|
| Mean | 12.3ms |
| Median | 11.8ms |
| Std Dev | 4.2ms |
| Min | 8.1ms |
| Max | 67.2ms |
| P50 | 11.8ms |
| P90 | 16.4ms |
| P95 | 18.2ms |
| P99 | 24.1ms |
| P99.9 | 45.3ms |
**Memory:**
| Metric | Value |
|--------|-------|
| Avg Allocation | 1.2 MB |
| Peak | 4.8 MB |
| GC Count | 12 |
| GC Time | 34ms |
**Assertions:**
- [x] Mean < 20ms (12.3ms)
- [x] P95 < 25ms (18.2ms)
- [x] P99 < 50ms (24.1ms)
- [x] Memory < 10MB (4.8MB)
---
### Comparison with Baseline
| Benchmark | Current | Baseline | Change | Status |
|-----------|---------|----------|--------|--------|
| processOrder | 12.3ms | 11.8ms | +4.2% | ✅ OK |
| searchProducts | 45.6ms | 38.2ms | +19.4% | ⚠️ REGRESS |
| generateReport | 234ms | 198ms | +18.2% | ❌ REGRESS |
### Regression Alert: searchProducts
**Current:** 45.6ms (P95: 78.3ms)
**Baseline:** 38.2ms (P95: 52.1ms)
**Regression:** +19.4%
**Likely Causes:**
1. Recent change to database query (commit abc123)
2. New validation logic added (commit def456)
**Recommended Action:** Investigate commits since baseline

---
### Trend Analysis
*[Chart: searchProducts mean response time (ms) over the last 10 runs; y-axis 35-50 ms, x-axis run number]*
**Trend:** Degrading since run #6 (commit xyz789)
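
The Change column in the baseline comparison can be derived with a simple relative difference. The sketch below is illustrative only: the function names are hypothetical, and the 5% and 20% thresholds are assumptions chosen for the example, not the rule that produced the Status column in the sample report.

```typescript
type Status = "OK" | "WARN" | "FAIL";

// Hypothetical thresholds in percent; tune them to the noise level
// of your own benchmarks.
const WARN_THRESHOLD_PCT = 5;
const FAIL_THRESHOLD_PCT = 20;

// Relative change of the current value against the baseline, in percent.
// Positive values mean the current run is slower.
function percentChange(current: number, baseline: number): number {
  if (baseline <= 0) throw new Error("baseline must be positive");
  return ((current - baseline) / baseline) * 100;
}

function classify(changePct: number): Status {
  if (changePct >= FAIL_THRESHOLD_PCT) return "FAIL";
  if (changePct >= WARN_THRESHOLD_PCT) return "WARN";
  return "OK";
}

// Mean timings (ms) for processOrder from the comparison table.
const change = percentChange(12.3, 11.8);
console.log(`${change.toFixed(1)}% -> ${classify(change)}`); // prints "4.2% -> OK"
```

A fixed percentage threshold is the simplest option; noisy benchmarks may need a comparison that also accounts for the standard deviation of both runs.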
```yaml
# Load test specification
load_test:
  target: "https://api.example.com"

  scenarios:
    - name: "Normal Load"
      duration: "5m"
      vus: 100

    - name: "Peak Load"
      duration: "2m"
      vus: 500

    - name: "Stress Test"
      duration: "10m"
      stages:
        - { duration: "2m", target: 100 }
        - { duration: "3m", target: 500 }
        - { duration: "3m", target: 1000 }
        - { duration: "2m", target: 0 }

  thresholds:
    http_req_duration: ["p(95) < 500"]
    http_req_failed: ["rate < 0.01"]
```
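
The threshold strings in this specification resemble k6-style expressions (`aggregate operator limit`). As a rough sketch of how such an expression could be checked against collected metrics, the hypothetical `evaluateThreshold` function below parses a single comparison. It is an illustration under that assumption, not how any particular load-testing tool implements thresholds.

```typescript
// Aggregated values for one metric, keyed by aggregate name,
// e.g. { "p(95)": 423 } for http_req_duration.
type Aggregates = Record<string, number>;

function evaluateThreshold(expr: string, aggregates: Aggregates): boolean {
  // Accepts forms such as "p(95) < 500" or "rate<0.01".
  const match = /^\s*([^\s<>=]+)\s*(<=|>=|<|>)\s*(-?\d+(?:\.\d+)?)\s*$/.exec(expr);
  if (!match) throw new Error(`Unsupported threshold expression: ${expr}`);

  const name = match[1];
  const op = match[2];
  const limit = Number(match[3]);
  if (!(name in aggregates)) throw new Error(`No value for aggregate: ${name}`);
  const actual = aggregates[name];

  switch (op) {
    case "<":
      return actual < limit;
    case "<=":
      return actual <= limit;
    case ">":
      return actual > limit;
    default:
      return actual >= limit; // only ">=" remains after the regex match
  }
}

// Values from the sample load test results: P95 of 423ms, error rate of 0.3%.
console.log(evaluateThreshold("p(95) < 500", { "p(95)": 423 })); // true
console.log(evaluateThreshold("rate < 0.01", { rate: 0.003 })); // true
```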
## Load Test Results
**Test:** Peak Load Simulation
**Duration:** 5 minutes
**Virtual Users:** 500 concurrent
### Summary
| Metric | Value | Threshold | Status |
|--------|-------|-----------|--------|
| Requests | 145,234 | - | - |
| RPS | 484 | - | - |
| Avg Response | 187ms | - | - |
| P95 Response | 423ms | <500ms | ✅ PASS |
| Error Rate | 0.3% | <1% | ✅ PASS |
### Response Time Distribution
| Percentile | Time |
|------------|------|
| P50 | 156ms |
| P75 | 234ms |
| P90 | 367ms |
| P95 | 423ms |
| P99 | 687ms |
### Resource Utilization
| Resource | Avg | Peak |
|----------|-----|------|
| CPU | 67% | 89% |
| Memory | 72% | 81% |
| DB Connections | 45/100 | 78/100 |
### Bottleneck Analysis
- Database connection pool approaching limit at peak
- CPU headroom acceptable
- Memory stable, no leaks detected
```yaml
name: Performance Benchmarks

on:
  pull_request:
  push:
    branches: [main]

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Run Benchmarks
        run: npm run benchmark

      - name: Compare with Baseline
        run: npm run benchmark:compare

      - name: Comment PR with Results
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v6
        with:
          script: |
            // Post benchmark results as PR comment
```
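
The comment step in this workflow needs a Markdown body to post. A minimal sketch of how that body could be built is shown below; the `ComparisonRow` shape and `renderComment` function are hypothetical, and the workflow's `benchmark:compare` script is assumed to make such rows available. Inside `actions/github-script`, the resulting string would typically be passed as `body` to `github.rest.issues.createComment`.

```typescript
// Hypothetical shape for one compared benchmark; timings in milliseconds.
interface ComparisonRow {
  name: string;
  currentMs: number;
  baselineMs: number;
}

// Builds a Markdown table in the same layout as the baseline
// comparison in the sample report, minus the Status column.
function renderComment(rows: ComparisonRow[]): string {
  const lines: string[] = [
    "### Benchmark Results",
    "",
    "| Benchmark | Current | Baseline | Change |",
    "|-----------|---------|----------|--------|",
  ];

  for (const row of rows) {
    const change = ((row.currentMs - row.baselineMs) / row.baselineMs) * 100;
    // Negative numbers already carry their sign; add "+" for the rest.
    const sign = change >= 0 ? "+" : "";
    lines.push(
      `| ${row.name} | ${row.currentMs}ms | ${row.baselineMs}ms | ${sign}${change.toFixed(1)}% |`,
    );
  }
  return lines.join("\n");
}

console.log(
  renderComment([{ name: "processOrder", currentMs: 12.3, baselineMs: 11.8 }]),
);
```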

- performance-profiling - Identify what to optimize
- performance-optimization - Apply optimizations
- testing-execution - Test execution
- Atlassian:createJiraIssue - Performance bug tracking
- Atlassian:createConfluencePage - Benchmark documentation