Performance Testing
Performance Testing Types
Load Testing
Purpose: Verify system performance under expected load
- Simulates expected user traffic and data volume
- Identifies performance bottlenecks under normal conditions
- Establishes performance baselines
- Validates SLA compliance
Key Metrics:
- Response time (average, median, p95, p99)
- Throughput (requests per second, transactions per second)
- Error rate
- Resource utilization (CPU, memory, disk, network)
Stress Testing
Purpose: Identify system breaking points
- Exceeds expected load to find limits (see the ramp sketch after this subsection)
- Tests system recovery after failure
- Identifies failure modes and error handling
- Validates graceful degradation
Key Metrics:
- Maximum concurrent users before failure
- Maximum throughput before failure
- Time to recover after load reduction
- Error patterns and failure modes
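The step-wise ramp can be automated. Below is a minimal Python sketch that raises concurrency until the error rate crosses a threshold; the endpoint, step sizes, and 5% error budget are illustrative placeholders, not values from any particular tool.

# Stress-ramp sketch (illustrative): step up concurrency until the
# error rate crosses a threshold. Endpoint and limits are placeholders.
import concurrent.futures
import urllib.error
import urllib.request

TARGET = "https://example.com/api/users"  # placeholder endpoint

def hit(url):
    """Return True if the request succeeds within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, TimeoutError):
        return False

def find_breaking_point(max_workers=512, step=32,
                        requests_per_step=200, max_error_rate=0.05):
    """Return the last concurrency level that stayed under the error budget."""
    for workers in range(step, max_workers + 1, step):
        with concurrent.futures.ThreadPoolExecutor(workers) as pool:
            results = list(pool.map(hit, [TARGET] * requests_per_step))
        error_rate = 1 - sum(results) / len(results)
        print(f"{workers} workers -> error rate {error_rate:.1%}")
        if error_rate > max_error_rate:
            return workers - step  # previous step was the last stable level
    return max_workers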
Spike Testing
Purpose: Verify the system handles sudden traffic increases
- Simulates sudden traffic spikes (e.g., flash sales, viral content)
- Tests system elasticity and auto-scaling
- Validates queuing and throttling mechanisms (see the token-bucket sketch after this subsection)
- Identifies race conditions under load
Key Metrics:
- Response time during spike
- Error rate during spike
- Time to stabilize after spike
- Queue depth and processing time
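One throttling mechanism such a test exercises is a token bucket. The sketch below is a generic illustration, not tied to any framework; the refill rate and burst capacity are invented numbers.

# Token-bucket throttle sketch (generic illustration)
import time

class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # request should be queued or rejected (e.g., HTTP 429)

bucket = TokenBucket(rate=100, capacity=200)  # 100 req/s, bursts up to 200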
Soak Testing
Purpose: Verify stability over extended periods
- Runs sustained load for hours or days
- Identifies memory leaks and resource exhaustion (see the trend-check sketch after this subsection)
- Tests database connection pool stability
- Validates garbage collection efficiency
Key Metrics:
- Memory usage over time
- Response time trends
- Error rate over time
- Resource utilization trends
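A simple way to turn "memory usage over time" into a pass/fail signal is to fit a linear trend to periodic samples. A sketch follows (Python 3.10+ for statistics.linear_regression; the 10 MB/hour budget is an arbitrary example):

# Soak-test leak check sketch: flag a persistent upward memory trend.
from statistics import linear_regression  # Python 3.10+

def leak_suspected(samples_mb, interval_s, max_slope_mb_per_hour=10.0):
    """Fit memory samples against elapsed hours and compare the slope."""
    times_h = [i * interval_s / 3600 for i in range(len(samples_mb))]
    slope, _intercept = linear_regression(times_h, samples_mb)
    return slope > max_slope_mb_per_hour

# Hourly RSS samples from a hypothetical 6-hour soak run (~30 MB/h growth)
print(leak_suspected([512, 540, 571, 602, 633, 660], interval_s=3600))  # True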
Volume Testing
Purpose: Test with large data volumes
- Tests performance with realistic data sizes
- Identifies database query performance issues (see the index-check sketch after this subsection)
- Tests file system and storage performance
- Validates data migration performance
Key Metrics:
- Query execution time with large datasets
- Index usage and effectiveness
- Storage I/O performance
- Data processing throughput
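Index usage can be verified directly against a loaded database. Below is a sketch using Python's standard sqlite3 module; the schema, row count, and index name are invented for illustration.

# Volume-test index check sketch using the stdlib sqlite3 module.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    ((f"user{i}@example.com",) for i in range(100_000)),  # bulk test data
)
conn.execute("CREATE INDEX idx_users_email ON users(email)")

# EXPLAIN QUERY PLAN shows whether the planner actually uses the index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?",
    ("user50000@example.com",),
).fetchall()
print(plan)  # expect a 'SEARCH ... USING INDEX idx_users_email' entry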
Performance Testing Tools
JMeter
Best for: Load and stress testing
- Open source, Java-based
- Supports multiple protocols (HTTP, JDBC, JMS, etc.)
- Distributed testing support
- Extensive plugin ecosystem
- GUI and CLI modes
<?xml version="1.0" encoding="UTF-8"?>
<!-- JMeter Test Plan Example (abridged; a real .jmx file carries many more attributes) -->
<jmeterTestPlan>
  <hashTree>
    <TestPlan guiclass="TestPlanGui">
      <stringProp name="TestPlan.comments">Load Test</stringProp>
    </TestPlan>
    <hashTree>
      <ThreadGroup guiclass="ThreadGroupGui">
        <stringProp name="ThreadGroup.num_threads">100</stringProp>
        <stringProp name="ThreadGroup.ramp_time">10</stringProp>
        <stringProp name="ThreadGroup.duration">60</stringProp>
      </ThreadGroup>
      <hashTree>
        <HTTPSamplerProxy guiclass="HttpTestSampleGui">
          <stringProp name="HTTPSampler.domain">example.com</stringProp>
          <stringProp name="HTTPSampler.path">/api/users</stringProp>
        </HTTPSamplerProxy>
      </hashTree>
    </hashTree>
  </hashTree>
</jmeterTestPlan>
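For CI runs, execute the plan in non-GUI mode, e.g. jmeter -n -t plan.jmx -l results.jtl -e -o report/, which runs headless, logs the samples, and generates the HTML dashboard afterwards.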
Gatling
Best for: High-performance load testing
- Scala-based DSL for test scenarios
- High performance, low resource usage
- Real-time metrics and reporting
- Good for continuous integration
- Supports HTTP, WebSocket, JMS
// Gatling Example
import scala.concurrent.duration._
import io.gatling.core.Predef._
import io.gatling.http.Predef._

class LoadTest extends Simulation {
  val httpProtocol = http.baseUrl("https://example.com")

  val scn = scenario("User Journey")
    .exec(http("Get Users").get("/api/users"))
    .pause(1)
    .exec(http("Get User").get("/api/users/1"))

  setUp(
    scn.inject(
      rampUsers(100).during(10.seconds),         // ramp to 100 users over 10s
      constantUsersPerSec(50).during(60.seconds) // then steady arrivals
    )
  ).protocols(httpProtocol)
}
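Simulations are typically launched from the bundle's gatling.sh/gatling.bat scripts or via the Maven, Gradle, and sbt plugins, which makes them straightforward to wire into CI.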
k6
Best for: Developer-friendly performance testing
- JavaScript-based, easy to learn
- Modern CLI and cloud integration
- Good for CI/CD pipelines
- Supports HTTP/1.1, HTTP/2, WebSocket
- Grafana integration for visualization
// k6 Example
import http from 'k6/http';
import { check, sleep } from 'k6';

export let options = {
  stages: [
    { duration: '10s', target: 100 }, // ramp up to 100 virtual users
    { duration: '60s', target: 100 }, // hold steady load
    { duration: '10s', target: 0 },   // ramp down
  ],
};

export default function () {
  let res = http.get('https://example.com/api/users');
  check(res, {
    'status was 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}
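Run the script with k6 run script.js. Pass/fail criteria can also live in the script via options.thresholds (for example, http_req_duration: ['p(95)<500']), so a CI job fails automatically when a latency budget is exceeded.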
Locust
Best for: Scriptable load testing in Python
- Python-based, easy to write tests
- Web UI for real-time monitoring
- Distributed testing support
- Good for complex user scenarios
- Event-based architecture
# Locust Example
from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    wait_time = between(1, 3)  # pause 1-3 seconds between tasks

    @task
    def get_users(self):
        self.client.get("/api/users")

    @task(2)  # weight 2: picked twice as often as get_users
    def get_user(self):
        self.client.get("/api/users/1")
Key Performance Metrics
Response Time
- Average: Mean response time across all requests
- Median: Middle value, less affected by outliers
- p95: 95th percentile, 95% of requests complete within this time
- p99: 99th percentile, 99% of requests complete within this time
- Min/Max: Fastest and slowest response times
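A minimal sketch of deriving these statistics from raw response times with Python's standard library (the function name and millisecond units are illustrative):

# Response-time summary sketch; times_ms is a list of raw samples.
from statistics import mean, median, quantiles

def summarize(times_ms):
    pct = quantiles(times_ms, n=100)  # pct[k-1] is the k-th percentile
    return {
        "avg": mean(times_ms),
        "median": median(times_ms),  # i.e., p50
        "p95": pct[94],
        "p99": pct[98],
        "min": min(times_ms),
        "max": max(times_ms),
    }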
Throughput
- Requests Per Second (RPS): Number of requests handled per second
- Transactions Per Second (TPS): Number of business transactions per second
- Concurrent Users: Number of simultaneous users
- Hits Per Second: Number of HTTP requests per second
Error Rate
- HTTP Error Rate: Percentage of HTTP errors (4xx, 5xx)
- Application Error Rate: Percentage of application-level errors
- Timeout Rate: Percentage of requests that timed out
- Connection Error Rate: Percentage of connection failures
Resource Utilization
- CPU Usage: Processor utilization percentage
- Memory Usage: RAM consumption and availability
- Disk I/O: Read/write operations and latency
- Network I/O: Bandwidth utilization and latency
- Database Connections: Active and idle connection counts
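These counters can be sampled alongside the load generator. A sketch using the third-party psutil package (pip install psutil); the interval and sample count are arbitrary:

# Resource sampler sketch using psutil (third-party; pip install psutil).
import time
import psutil

def sample(samples=60, interval_s=5.0):
    for _ in range(samples):
        cpu = psutil.cpu_percent(interval=None)  # % since last call
        mem = psutil.virtual_memory().percent    # % of RAM in use
        disk = psutil.disk_io_counters()         # cumulative read/write bytes
        net = psutil.net_io_counters()           # cumulative sent/recv bytes
        print(f"cpu={cpu}% mem={mem}% "
              f"disk_read={disk.read_bytes} net_sent={net.bytes_sent}")
        time.sleep(interval_s)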
Performance Profiling
Application Profiling
- CPU Profiling: Identify CPU-intensive methods
- Memory Profiling: Detect memory leaks and allocation patterns
- Thread Profiling: Identify thread contention and deadlocks
- Database Profiling: Analyze query performance and execution plans
Tools
- Java: JProfiler, VisualVM, YourKit
- Node.js: built-in --prof and DevTools inspector profiling, Clinic.js
- Python: cProfile, py-spy
- Go: pprof
- .NET: dotTrace, Visual Studio Profiler
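As one concrete instance of the tools above, CPU profiling with Python's built-in cProfile and pstats (the workload function is a stand-in):

# CPU profiling example with the stdlib cProfile/pstats modules.
import cProfile
import pstats

def hot_path():  # stand-in for the code under investigation
    return sum(i * i for i in range(1_000_000))

profiler = cProfile.Profile()
profiler.enable()
hot_path()
profiler.disable()

# Print the 10 functions with the highest cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)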
Bottleneck Identification
- Database: Slow queries, missing indexes, N+1 queries (contrasted in the sketch after this list)
- Network: Latency, bandwidth limitations, connection pooling
- Application: Inefficient algorithms, excessive object creation
- External Services: Third-party API latency, rate limiting
- Caching: Cache misses, stale data, cache stampede
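To make the N+1 pattern concrete, the sketch below contrasts it with a single JOIN using sqlite3; the schema and data are invented for illustration.

# N+1 query sketch: per-row lookups vs. one JOIN (sqlite3, toy schema).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY,
                        author_id INTEGER REFERENCES authors(id),
                        title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO posts VALUES (1, 1, 'First'), (2, 2, 'Second');
""")

# N+1: one query for the list, then one extra query per row.
authors = conn.execute("SELECT id, name FROM authors").fetchall()
for author_id, _name in authors:
    conn.execute("SELECT title FROM posts WHERE author_id = ?", (author_id,))

# Better: a single JOIN returns the same data in one round trip.
rows = conn.execute("""
    SELECT a.name, p.title
    FROM authors a JOIN posts p ON p.author_id = a.id
""").fetchall()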
Performance Baselines and SLAs
Establishing Baselines
- Run tests in production-like environment
- Collect metrics over multiple runs
- Account for normal variability
- Document test conditions and data
- Store baselines in version control
SLA Definitions
- Response Time SLAs: Maximum acceptable response times
- Availability SLAs: Minimum uptime requirements (e.g., 99.9%)
- Throughput SLAs: Minimum requests per second
- Error Rate SLAs: Maximum acceptable error rate
Example SLAs
API Response Times:
- p50 < 200ms
- p95 < 500ms
- p99 < 1000ms
Availability: 99.9% (8.76 hours downtime/year)
Error Rate: < 0.1%
Throughput: ≥ 1000 RPS
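A small sketch of enforcing these numbers automatically, assuming a summary dict like the one in the response-time example earlier (median standing in for p50):

# SLA gate sketch; thresholds mirror the example SLAs above (in ms).
SLA = {"median": 200, "p95": 500, "p99": 1000}

def sla_violations(summary):
    """Return human-readable messages for every breached threshold."""
    return [
        f"{metric}: {summary[metric]:.0f}ms exceeds {limit}ms"
        for metric, limit in SLA.items()
        if summary.get(metric, 0) > limit
    ]

# e.g., fail the CI job when violations are found:
# assert not sla_violations(summarize(times_ms))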
Cloud-Based Performance Testing
Cloud Testing Benefits
- Scalable infrastructure on demand
- Geographic distribution
- Realistic load simulation
- Pay-as-you-go pricing
- Integration with cloud services
Cloud Testing Platforms
- AWS: EC2, Lambda, Fargate for distributed testing
- Google Cloud: Compute Engine, Cloud Functions
- Azure: Virtual Machines, Azure Functions
- Managed Services: BlazeMeter, LoadRunner Cloud, k6 Cloud
Cloud Testing Best Practices
- Use multiple regions for geographic testing
- Leverage auto-scaling for flexible load
- Monitor cloud costs during testing
- Clean up resources after testing
- Use cloud-native monitoring and logging
Performance Test Planning
Test Scenarios
- Define realistic user journeys
- Identify critical paths
- Include happy path and edge cases
- Account for different user types
- Consider peak and off-peak patterns
Load Models
- Constant Load: Steady user count over time
- Ramp-up Load: Gradually increase users
- Spike Load: Sudden increase in users
- Step Load: Incremental increases with plateaus
- Random Load: Variable user patterns
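These models can be expressed as simple generators yielding a target user count per second, which a custom driver could consume; the durations and counts below are arbitrary.

# Load-model sketches: each generator yields a target user count per second.
import random

def constant(users, seconds):
    for _ in range(seconds):
        yield users

def ramp_up(target, seconds):
    for t in range(1, seconds + 1):
        yield target * t // seconds  # linear ramp from ~0 to target

def step(step_size, plateau_s, steps):
    for s in range(1, steps + 1):
        for _ in range(plateau_s):
            yield step_size * s      # plateau at each increment

def spike(base, peak, seconds, spike_at, spike_len):
    for t in range(seconds):
        yield peak if spike_at <= t < spike_at + spike_len else base

def random_load(low, high, seconds):
    for _ in range(seconds):
        yield random.randint(low, high)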
Test Data
- Use realistic data volumes
- Include edge cases and boundary values
- Account for data distribution
- Refresh data between test runs
- Consider data privacy and security
Environment Setup
- Mirror production configuration
- Use production-like data
- Monitor system resources
- Isolate test environment
- Document environment differences