Run API load tests with k6, Artillery, or Gatling to measure performance...
Executes comprehensive load tests to measure API performance and identify bottlenecks.
Install:
/plugin marketplace add jeremylongshore/claude-code-plugins-plus-skills
/plugin install api-migration-tool@claude-code-plugins-plus
Execute comprehensive load tests to measure API performance, identify bottlenecks, and validate scalability under realistic traffic patterns.
This command supports multiple load testing tools to accommodate different testing scenarios and team preferences:
Alternative approaches considered:
USE WHEN:
DON'T USE WHEN:
Required:
Recommended:
Install Tools:
# k6 (recommended for most use cases)
brew install k6 # macOS
sudo apt-get install k6 # Ubuntu (k6 is not in the default repos; add the Grafana k6 apt repository first)
# Artillery
npm install -g artillery
# Gatling
wget https://repo1.maven.org/maven2/io/gatling/highcharts/gatling-charts-highcharts-bundle/3.9.5/gatling-charts-highcharts-bundle-3.9.5-bundle.zip
unzip gatling-charts-highcharts-bundle-3.9.5-bundle.zip
Establish clear performance targets before running tests:
Document expected behavior under different load levels:
Create test scripts matching realistic user behavior patterns:
k6 test script (load-test.js):
import http from 'k6/http';
import { check, sleep } from 'k6';
export const options = {
stages: [
{ duration: '2m', target: 100 }, // Ramp-up
{ duration: '5m', target: 100 }, // Sustained load
{ duration: '2m', target: 200 }, // Scale up
{ duration: '5m', target: 200 }, // Sustained peak
{ duration: '2m', target: 0 }, // Ramp-down
],
thresholds: {
http_req_duration: ['p(95)<200', 'p(99)<500'],
http_req_failed: ['rate<0.01'],
},
};
export default function () {
const res = http.get('https://api.example.com/v1/products');
check(res, {
'status is 200': (r) => r.status === 200,
'response time < 200ms': (r) => r.timings.duration < 200,
});
sleep(1);
}
Artillery config (artillery.yml):
config:
target: 'https://api.example.com'
phases:
- duration: 60
arrivalRate: 10
name: "Warm up"
- duration: 300
arrivalRate: 50
name: "Sustained load"
- duration: 120
arrivalRate: 100
name: "Peak load"
processor: "./flows.js"
scenarios:
- name: "Product browsing flow"
flow:
- get:
url: "/v1/products"
capture:
- json: "$.products[0].id"
as: "productId"
- get:
url: "/v1/products/{{ productId }}"
- think: 3
Run tests with appropriate parameters and monitor system resources:
# k6 test execution with custom parameters
k6 run load-test.js \
--vus 100 \
--duration 10m \
--out json=results.json \
--summary-export=summary.json
# Artillery with real-time reporting
artillery run artillery.yml \
--output report.json
# Gatling test execution
./gatling.sh -s com.example.LoadTest \
-rf results/
Monitor system metrics during execution:
Review metrics to identify performance bottlenecks:
Response Time Analysis:
# k6 summary shows percentile distribution
http_req_duration..............: avg=156ms p(95)=289ms p(99)=456ms
http_req_failed................: 0.12% (12 failures / 10000 requests)
http_reqs......................: 10000 166.67/s
vus............................: 100 min=0 max=100
Key metrics to examine:
Create actionable reports with findings and optimization suggestions:
Performance Report Structure:
# Load Test Results - 2025-10-11
## Test Configuration
- Duration: 10 minutes
- Virtual Users: 100
- Target: https://api.example.com/v1/products
## Results Summary
- Total Requests: 10,000
- Success Rate: 99.88%
- Avg Response Time: 156ms
- p95 Response Time: 289ms
- Throughput: 16.67 RPS (10,000 requests over 10 minutes)
## Findings
1. Database query optimization needed (p99 spikes to 456ms)
2. Connection pool exhausted at 150 concurrent users
3. Memory leak detected after 8 minutes
## Recommendations
1. Add database indexes on product_id and category
2. Increase connection pool from 20 to 50
3. Fix memory leak in image processing service
The command generates structured performance reports:
Console Output:
Running load test with k6...
execution: local
script: load-test.js
output: json (results.json)
scenarios: (100.00%) 1 scenario, 200 max VUs, 16m30s max duration
data_received..................: 48 MB 80 kB/s
data_sent......................: 2.4 MB 4.0 kB/s
http_req_blocked...............: avg=1.23ms p(95)=3.45ms p(99)=8.91ms
http_req_connecting............: avg=856µs p(95)=2.34ms p(99)=5.67ms
http_req_duration..............: avg=156.78ms p(95)=289.45ms p(99)=456.12ms
http_req_failed................: 0.12%
http_req_receiving.............: avg=234µs p(95)=567µs p(99)=1.23ms
http_req_sending...............: avg=123µs p(95)=345µs p(99)=789µs
http_req_tls_handshaking.......: avg=0s p(95)=0s p(99)=0s
http_req_waiting...............: avg=156.42ms p(95)=288.89ms p(99)=455.34ms
http_reqs......................: 10000 166.67/s
iteration_duration.............: avg=1.16s p(95)=1.29s p(99)=1.46s
iterations.....................: 10000 166.67/s
vus............................: 100 min=0 max=200
vus_max........................: 200 min=200 max=200
JSON Report:
{
"metrics": {
"http_req_duration": {
"avg": 156.78,
"p95": 289.45,
"p99": 456.12
},
"http_req_failed": 0.0012,
"http_reqs": {
"count": 10000,
"rate": 166.67
}
},
"root_group": {
"checks": {
"status is 200": {
"passes": 9988,
"fails": 12
}
}
}
}
Test a REST API endpoint with gradual ramp-up and threshold validation:
// basic-load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';
// Custom metrics
const errorRate = new Rate('errors');
export const options = {
// Ramp-up pattern: 0 -> 50 -> 100 -> 50 -> 0
stages: [
{ duration: '1m', target: 50 }, // Ramp-up to 50 users
{ duration: '3m', target: 50 }, // Stay at 50 users
{ duration: '1m', target: 100 }, // Spike to 100 users
{ duration: '3m', target: 100 }, // Stay at 100 users
{ duration: '1m', target: 50 }, // Scale down to 50
{ duration: '1m', target: 0 }, // Ramp-down to 0
],
// Performance thresholds (test fails if exceeded)
thresholds: {
'http_req_duration': ['p(95)<300', 'p(99)<500'],
'http_req_failed': ['rate<0.01'], // Less than 1% errors
'errors': ['rate<0.1'],
},
};
export default function () {
// Test parameters
const baseUrl = 'https://api.example.com';
const params = {
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${__ENV.API_TOKEN}`,
},
};
// API request
const res = http.get(`${baseUrl}/v1/products?limit=20`, params);
// Validation checks
const checkRes = check(res, {
'status is 200': (r) => r.status === 200,
'response time < 300ms': (r) => r.timings.duration < 300,
'has products': (r) => r.json('products').length > 0,
'valid JSON': (r) => {
try {
JSON.parse(r.body);
return true;
} catch (e) {
return false;
}
},
});
// Track custom error metric
errorRate.add(!checkRes);
// Simulate user think time
sleep(Math.random() * 3 + 1); // 1-4 seconds
}
// Teardown function (runs once at end)
export function teardown(data) {
console.log('Load test completed');
}
Run command:
# Set API token and execute
export API_TOKEN="your-token-here"
k6 run basic-load-test.js \
--out json=results.json \
--summary-export=summary.json
# Generate an HTML report: k6 has no built-in HTML output; a common
# approach is the handleSummary() hook in the script with a community
# reporter (e.g. benc-uk/k6-reporter), which writes report.html at test end
Test API breaking point with gradual load increase until failure:
# stress-test.yml
config:
target: 'https://api.example.com'
phases:
# Gradual ramp-up to find breaking point
- duration: 60
arrivalRate: 10
name: "Phase 1: Baseline (10 RPS)"
- duration: 60
arrivalRate: 50
name: "Phase 2: Moderate (50 RPS)"
- duration: 60
arrivalRate: 100
name: "Phase 3: High (100 RPS)"
- duration: 60
arrivalRate: 200
name: "Phase 4: Stress (200 RPS)"
- duration: 60
arrivalRate: 400
name: "Phase 5: Breaking point (400 RPS)"
# Environment variables
variables:
api_token: "{{ $processEnvironment.API_TOKEN }}"
# HTTP settings
http:
timeout: 10
pool: 50
# Custom plugins
plugins:
expect: {}
metrics-by-endpoint: {}
# Success criteria
ensure:
p95: 500
p99: 1000
maxErrorRate: 1
# Test scenarios
scenarios:
- name: "Product CRUD operations"
weight: 70
flow:
# List products
- get:
url: "/v1/products"
headers:
Authorization: "Bearer {{ api_token }}"
expect:
- statusCode: 200
- contentType: json
- hasProperty: products
capture:
- json: "$.products[0].id"
as: "productId"
# Get product details
- get:
url: "/v1/products/{{ productId }}"
headers:
Authorization: "Bearer {{ api_token }}"
expect:
- statusCode: 200
- hasProperty: id
# Think time (user reading)
- think: 2
# Search products
- get:
url: "/v1/products/search?q=laptop"
headers:
Authorization: "Bearer {{ api_token }}"
expect:
- statusCode: 200
- name: "User authentication flow"
weight: 20
flow:
- post:
url: "/v1/auth/login"
json:
email: "test@example.com"
password: "password123"
expect:
- statusCode: 200
- hasProperty: token
capture:
- json: "$.token"
as: "userToken"
- get:
url: "/v1/users/me"
headers:
Authorization: "Bearer {{ userToken }}"
expect:
- statusCode: 200
- name: "Shopping cart operations"
weight: 10
flow:
- post:
url: "/v1/cart/items"
headers:
Authorization: "Bearer {{ api_token }}"
json:
productId: "{{ productId }}"
quantity: 1
expect:
- statusCode: 201
- get:
url: "/v1/cart"
headers:
Authorization: "Bearer {{ api_token }}"
expect:
- statusCode: 200
- hasProperty: items
Run with custom processor:
// flows.js - Custom logic for Artillery
module.exports = {
// Before request hook
setAuthToken: function(requestParams, context, ee, next) {
requestParams.headers = requestParams.headers || {};
requestParams.headers['X-Request-ID'] = `req-${Date.now()}-${Math.random()}`;
return next();
},
// After response hook
logResponse: function(requestParams, response, context, ee, next) {
if (response.statusCode >= 400) {
console.log(`Error: ${response.statusCode} - ${requestParams.url}`);
}
return next();
},
// Custom function to generate dynamic data
generateTestData: function(context, events, done) {
context.vars.userId = `user-${Math.floor(Math.random() * 10000)}`;
context.vars.timestamp = new Date().toISOString();
return done();
}
};
Execute stress test:
# Run with environment variable
API_TOKEN="your-token" artillery run stress-test.yml \
--output stress-results.json
# Generate HTML report
artillery report stress-results.json \
--output stress-report.html
# Run with config overrides (Artillery's --overrides takes a JSON string)
artillery run stress-test.yml \
  --overrides '{"config": {"phases": [{"duration": 30, "arrivalRate": 20}]}}'
Enterprise-grade load test with complex scenarios and detailed reporting:
// LoadSimulation.scala
package com.example.loadtest
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._
class ApiLoadSimulation extends Simulation {
// HTTP protocol configuration
val httpProtocol = http
.baseUrl("https://api.example.com")
.acceptHeader("application/json")
.authorizationHeader("Bearer ${accessToken}")
.userAgentHeader("Gatling Load Test")
.shareConnections
// Feeders for test data
val userFeeder = csv("users.csv").circular
val productFeeder = csv("products.csv").random
// Custom headers
val sentHeaders = Map(
"X-Request-ID" -> "${requestId}",
"X-Client-Version" -> "1.0.0"
)
// Scenario 1: Browse products
val browseProducts = scenario("Browse Products")
.feed(userFeeder)
.exec(session => session.set("requestId", java.util.UUID.randomUUID.toString))
.exec(
http("List Products")
.get("/v1/products")
.headers(sentHeaders)
.check(status.is(200))
.check(jsonPath("$.products[*].id").findAll.saveAs("productIds"))
)
.pause(2, 5)
.exec(
http("Get Product Details")
.get("/v1/products/${productIds.random()}")
.check(status.is(200))
.check(jsonPath("$.id").exists)
.check(jsonPath("$.price").ofType[Double].saveAs("price"))
)
.pause(1, 3)
// Scenario 2: Search and filter
val searchProducts = scenario("Search Products")
.exec(session => session.set("requestId", java.util.UUID.randomUUID.toString))
.exec(
http("Search Products")
.get("/v1/products/search")
.queryParam("q", "laptop")
.queryParam("minPrice", "500")
.queryParam("maxPrice", "2000")
.headers(sentHeaders)
.check(status.is(200))
.check(jsonPath("$.total").ofType[Int].gt(0))
)
.pause(2, 4)
.exec(
http("Apply Filters")
.get("/v1/products/search")
.queryParam("q", "laptop")
.queryParam("brand", "Dell")
.queryParam("sort", "price")
.check(status.is(200))
)
// Scenario 3: Checkout flow
val checkout = scenario("Checkout Flow")
.feed(userFeeder)
.feed(productFeeder)
.exec(session => session.set("requestId", java.util.UUID.randomUUID.toString))
.exec(
http("Add to Cart")
.post("/v1/cart/items")
.headers(sentHeaders)
.body(StringBody("""{"productId": "${productId}", "quantity": 1}"""))
.asJson
.check(status.is(201))
.check(jsonPath("$.cartId").saveAs("cartId"))
)
.pause(1, 2)
.exec(
http("Get Cart")
.get("/v1/cart/${cartId}")
.check(status.is(200))
.check(jsonPath("$.total").ofType[Double].saveAs("total"))
)
.pause(2, 4)
.exec(
http("Create Order")
.post("/v1/orders")
.body(StringBody("""{"cartId": "${cartId}", "paymentMethod": "credit_card"}"""))
.asJson
.check(status.in(200, 201))
.check(jsonPath("$.orderId").saveAs("orderId"))
)
.exec(
http("Get Order Status")
.get("/v1/orders/${orderId}")
.check(status.is(200))
.check(jsonPath("$.status").is("pending"))
)
// Load profile: Realistic production traffic pattern
setUp(
// 70% users browse products
browseProducts.inject(
rampUsersPerSec(1) to 50 during (2 minutes),
constantUsersPerSec(50) during (5 minutes),
rampUsersPerSec(50) to 100 during (3 minutes),
constantUsersPerSec(100) during (5 minutes),
rampUsersPerSec(100) to 0 during (2 minutes)
).protocols(httpProtocol),
// 20% users search
searchProducts.inject(
rampUsersPerSec(1) to 15 during (2 minutes),
constantUsersPerSec(15) during (10 minutes),
rampUsersPerSec(15) to 0 during (2 minutes)
).protocols(httpProtocol),
// 10% users complete checkout
checkout.inject(
rampUsersPerSec(1) to 10 during (3 minutes),
constantUsersPerSec(10) during (10 minutes),
rampUsersPerSec(10) to 0 during (2 minutes)
).protocols(httpProtocol)
).protocols(httpProtocol)
.assertions(
global.responseTime.max.lt(2000),
global.responseTime.percentile3.lt(500),
global.successfulRequests.percent.gt(99)
)
}
Supporting data files:
users.csv:
userId,accessToken
user-001,eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
user-002,eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
user-003,eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
products.csv:
productId,category
prod-001,electronics
prod-002,clothing
prod-003,books
Run Gatling simulation:
# Using Gatling Maven plugin
mvn gatling:test -Dgatling.simulationClass=com.example.loadtest.ApiLoadSimulation
# Using standalone Gatling
./gatling.sh -s com.example.loadtest.ApiLoadSimulation \
-rf results/
# Generate report only (from previous run)
./gatling.sh -ro results/apisimulation-20251011143022456
Gatling configuration (gatling.conf):
gatling {
core {
outputDirectoryBaseName = "api-load-test"
runDescription = "Production load simulation"
encoding = "utf-8"
simulationClass = ""
}
charting {
indicators {
lowerBound = 100 # Lower bound for response time (ms)
higherBound = 500 # Higher bound for response time (ms)
percentile1 = 50 # First percentile
percentile2 = 75 # Second percentile
percentile3 = 95 # Third percentile
percentile4 = 99 # Fourth percentile
}
}
http {
ahc {
pooledConnectionIdleTimeout = 60000
readTimeout = 60000
requestTimeout = 60000
connectionTimeout = 30000
maxConnections = 200
maxConnectionsPerHost = 50
}
}
data {
writers = [console, file]
}
}
Common errors and solutions:
Connection Refused:
Error: connect ECONNREFUSED 127.0.0.1:8080
Solution: Verify API is running and accessible. Check network connectivity and firewall rules.
Timeout Errors:
http_req_failed: 45.2% (4520 failures / 10000 requests)
Solution: Increase timeout values or reduce concurrent users. API may be overwhelmed.
SSL/TLS Errors:
Error: x509: certificate signed by unknown authority
Solution: Add insecureSkipTLSVerify: true or configure proper CA certificates.
Rate Limiting:
HTTP 429 Too Many Requests
Solution: Reduce request rate or increase rate limits on API server. Add backoff logic.
Memory Exhaustion:
JavaScript heap out of memory
Solution: For Artillery (a Node.js tool), raise the heap limit: NODE_OPTIONS=--max-old-space-size=4096 artillery run test.yml. k6 is written in Go, so instead set discardResponseBodies: true in options or reduce VUs.
Authentication Failures:
HTTP 401 Unauthorized
Solution: Verify API tokens are valid and not expired. Check authorization headers.
k6 Options:
--vus N # Number of virtual users (default: 1)
--duration Xm # Test duration (e.g., 10m, 30s)
--iterations N # Total iterations across all VUs
--stage "Xm:N" # Add load stage (duration:target)
--rps N # Max requests per second
--max-redirects N # Max HTTP redirects (default: 10)
--batch N # Max parallel batch requests
--batch-per-host N # Max parallel requests per host
--http-debug # Enable HTTP debug logging
--no-connection-reuse # Disable HTTP keep-alive
--throw # Throw errors on failed HTTP requests
--summary-trend-stats # Custom summary stats (e.g., "avg,p(95),p(99)")
--out json=file.json # Export results to JSON
--out influxdb=http://... # Export to InfluxDB
--out statsd # Export to StatsD
Artillery Options:
--target URL # Override target URL
--output FILE # Save results to JSON file
--overrides JSON # Override parts of the config with a JSON string
--variables JSON # Set scenario variables as a JSON string
--config FILE # Load config from a separate file
--environment ENV # Select environment from config
--solo # Run with a single virtual user
--quiet # Suppress output
--plugins # List installed plugins
--dotenv FILE # Load environment from .env file
Gatling Options:
-s CLASS # Simulation class to run
-rf FOLDER # Results folder
-rd DESC # Run description
-nr # No reports generation
-ro FOLDER # Generate reports only
Related Commands:
/api-mock-server - Create mock API for testing without backend
/api-monitoring-dashboard - Set up real-time monitoring during load tests
/api-cache-manager - Configure caching to improve performance under load
/api-rate-limiter - Implement rate limiting to protect APIs
/deployment-pipeline-orchestrator - Integrate load tests into CI/CD pipeline
/kubernetes-deployment-creator - Configure autoscaling based on load test findings
Load API tokens from a secrets manager rather than hardcoding them:
export API_TOKEN=$(vault read -field=token secret/api)
Troubleshooting:
Inconsistent Results:
Symptoms: Response times vary by > 50% between identical test runs
Diagnosis:
Unexpected Throughput Ceiling:
Symptoms: API handling only 100 RPS despite 20% CPU usage
Diagnosis:
Error Rate Grows with Load:
Symptoms: 0.1% errors at 100 RPS, 5% errors at 500 RPS
Diagnosis:
Raise the file descriptor limit on the load generator if sockets are exhausted: ulimit -n 65536
Memory Leak in API:
Symptoms: Memory usage grows continuously without stabilizing
Diagnosis:
Load Generator Out of Memory:
Symptoms: k6/Artillery process terminated with OOM error
Diagnosis:
For Artillery, raise the Node.js heap limit (NODE_OPTIONS=--max-old-space-size=8192) or reduce output volume with --quiet or --summary-export only