estimation-techniques

Back-of-envelope calculations for system design. Use when estimating QPS, storage, bandwidth, or latency for capacity planning. Includes latency numbers every programmer should know and common estimation patterns.

From systems-design
Install

Run in your terminal:

npx claudepluginhub melodic-software/claude-code-plugins --plugin systems-design
Tool Access

This skill is limited to using the following tools:

Read, Glob, Grep
Supporting Assets
references/latency-numbers.md
Skill Content

Estimation Techniques

This skill provides frameworks for back-of-envelope calculations essential for system design and capacity planning.

When to Use This Skill

Keywords: back-of-envelope, estimation, QPS, storage calculation, bandwidth, latency, capacity planning, scale estimation

Use this skill when:

  • Estimating system capacity requirements
  • Calculating storage needs for a feature
  • Determining bandwidth requirements
  • Sizing infrastructure for expected load
  • Justifying architectural decisions with numbers
  • Preparing for system design interviews

Core Principle

Estimation is not about precision; it's about order of magnitude.

Getting within 10x is usually good enough for architectural decisions. The goal is to identify if you need:

  • 1 server or 100 servers
  • 1 GB or 1 TB of storage
  • 10 ms or 1 second latency

Essential Numbers to Know

Powers of 2

| Power | Value             | Approximate       |
|-------|-------------------|-------------------|
| 2^10  | 1,024             | ~1 Thousand (KB)  |
| 2^20  | 1,048,576         | ~1 Million (MB)   |
| 2^30  | 1,073,741,824     | ~1 Billion (GB)   |
| 2^40  | 1,099,511,627,776 | ~1 Trillion (TB)  |

Time Conversions

| Unit    | Seconds            | Useful For           |
|---------|--------------------|----------------------|
| 1 minute | 60                | Short operations     |
| 1 hour  | 3,600              | Batch jobs           |
| 1 day   | 86,400 (~100K)     | Daily aggregations   |
| 1 month | 2,592,000 (~2.5M)  | Monthly calculations |
| 1 year  | 31,536,000 (~30M)  | Annual projections   |

Availability Targets

| Availability        | Downtime/Year | Downtime/Month | Downtime/Day |
|---------------------|---------------|----------------|--------------|
| 99% (two 9s)        | 3.65 days     | 7.31 hours     | 14.4 min     |
| 99.9% (three 9s)    | 8.76 hours    | 43.8 min       | 1.44 min     |
| 99.99% (four 9s)    | 52.6 min      | 4.38 min       | 8.64 sec     |
| 99.999% (five 9s)   | 5.26 min      | 26.3 sec       | 864 ms       |
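The downtime figures above follow directly from the availability percentage; a minimal Python sketch (function name is illustrative) makes the conversion explicit:

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000

def downtime_per_year_seconds(availability: float) -> float:
    """Allowed downtime per year, in seconds, for a given availability (e.g. 0.999)."""
    return (1 - availability) * SECONDS_PER_YEAR

three_nines_hours = downtime_per_year_seconds(0.999) / 3600   # 8.76 hours
four_nines_minutes = downtime_per_year_seconds(0.9999) / 60   # ~52.6 minutes
```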

Latency Numbers Every Programmer Should Know

See full reference: references/latency-numbers.md

Quick reference:

| Operation                      | Latency | Relative     |
|--------------------------------|---------|--------------|
| L1 cache reference             | 0.5 ns  | 1x           |
| L2 cache reference             | 7 ns    | 14x          |
| Main memory reference          | 100 ns  | 200x         |
| SSD random read                | 16 us   | 32,000x      |
| Round trip same datacenter     | 0.5 ms  | 1,000,000x   |
| HDD seek                       | 2 ms    | 4,000,000x   |
| Round trip CA to Netherlands   | 150 ms  | 300,000,000x |

Estimation Patterns

Pattern 1: QPS (Queries Per Second)

Formula

QPS = (Number of Users) x (Actions per User per Day) / (Seconds per Day)

Example: Twitter-like service

Given:
- 300 million monthly active users
- 50% are daily active = 150M DAU
- Average user reads 20 tweets/day

QPS = 150M * 20 / 86,400
    = 3 billion / 100,000
    = 30,000 QPS

Peak load (typically 2-3x average):
Peak QPS = 30,000 * 3 = 90,000 QPS
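The steps above reduce to a one-line formula; here is a minimal sketch in Python (function name and peak factor default are illustrative):

```python
def estimate_qps(dau: int, actions_per_day: float, peak_factor: float = 3.0):
    """Average and peak QPS, using ~100K seconds/day for round numbers."""
    avg = dau * actions_per_day / 100_000
    return avg, avg * peak_factor

# Twitter-like read load: 150M DAU, 20 reads/day
avg, peak = estimate_qps(150_000_000, 20)  # avg = 30,000 QPS, peak = 90,000 QPS
```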

Pattern 2: Storage Estimation

Formula

Storage = (Number of Items) x (Size per Item) x (Replication Factor) x (Time Period)

Example: Photo storage service

Given:
- 100 million users
- 10% upload daily = 10M uploads/day
- Average photo size = 2 MB
- Keep 5 years of data
- Replication factor = 3

Daily storage = 10M * 2 MB = 20 TB
Yearly storage = 20 TB * 365 = 7.3 PB
5-year storage = 7.3 * 5 = 36.5 PB
With replication = 36.5 * 3 = ~110 PB
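The same chain of multiplications can be checked with a short Python sketch (function name and signature are illustrative):

```python
def estimate_storage_tb(items_per_day: int, size_mb: float,
                        years: int, replication: int = 3) -> float:
    """Total storage in TB over a retention window, including replicas."""
    daily_tb = items_per_day * size_mb / 1_000_000  # MB -> TB
    return daily_tb * 365 * years * replication

total = estimate_storage_tb(10_000_000, 2, years=5)  # 109,500 TB, ~110 PB
```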

Pattern 3: Bandwidth Estimation

Formula

Bandwidth = (QPS) x (Request Size or Response Size)

Example: Video streaming service

Given:
- 1 million concurrent viewers
- Average bitrate = 5 Mbps
- Peak hours: 8 PM - 11 PM

Bandwidth = 1M * 5 Mbps = 5 Tbps

With 20% overhead: ~6 Tbps

CDN egress cost (rough, at $0.02/GB):
6 Tbps / 8 bits per byte = 750 GB/s
750 GB/s * 3 hours * 3,600 sec/hour * $0.02/GB ≈ $160K per peak window, every day
A massive recurring cost (hence why Netflix built its own CDN)
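The egress bill implied above can be sanity-checked in a few lines of Python (a rough sketch; the function name and the $0.02/GB default are illustrative):

```python
def peak_egress_cost(concurrent_viewers: int, mbps_per_stream: float,
                     hours: float, usd_per_gb: float = 0.02) -> float:
    """Rough CDN egress cost in USD for one peak window."""
    gb_per_sec = concurrent_viewers * mbps_per_stream / 8 / 1_000  # Mbps -> GB/s
    return gb_per_sec * hours * 3_600 * usd_per_gb

cost = peak_egress_cost(1_000_000, 5, hours=3)  # ~$135K per window at 5 Tbps
```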

Pattern 4: Cache Size Estimation

Formula

Cache Size = (QPS) x (Cache TTL) x (Response Size) x (Unique Ratio)

Example: API response cache

Given:
- 10,000 QPS
- Cache TTL = 5 minutes = 300 seconds
- Average response = 10 KB
- 20% of requests are unique

Cache entries = 10,000 * 300 * 0.20 = 600,000 entries
Cache size = 600,000 * 10 KB = 6 GB

With overhead (keys, metadata): ~10 GB
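As a minimal sketch of the formula above (function name and the 1.6x overhead default are illustrative assumptions):

```python
def estimate_cache_gb(qps: int, ttl_seconds: int, response_kb: float,
                      unique_ratio: float, overhead: float = 1.6) -> float:
    """Cache footprint in GB; `overhead` covers keys and metadata."""
    entries = qps * ttl_seconds * unique_ratio
    return entries * response_kb / 1_000_000 * overhead  # KB -> GB

size = estimate_cache_gb(10_000, 300, 10, 0.20)  # ~9.6 GB, call it 10 GB
```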

Pattern 5: Database Sizing

Formula

DB Size = (Number of Rows) x (Row Size) x (Index Overhead) x (Replication)

Example: User profile database

Given:
- 500 million users
- Average profile = 1 KB (name, email, settings, etc.)
- Index overhead = 30%
- Primary + 2 replicas = 3x

Data size = 500M * 1 KB = 500 GB
With indexes = 500 GB * 1.3 = 650 GB
With replication = 650 GB * 3 = ~2 TB

Memory for hot data (20%): ~400 GB
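The database-sizing arithmetic can be sketched the same way (function name and defaults are illustrative):

```python
def estimate_db_tb(rows: int, row_kb: float,
                   index_overhead: float = 1.3, replicas: int = 3) -> float:
    """Total database footprint in TB across all replicas."""
    raw_gb = rows * row_kb / 1_000_000  # KB -> GB
    return raw_gb * index_overhead * replicas / 1_000  # GB -> TB

total_tb = estimate_db_tb(500_000_000, 1)    # ~1.95 TB, call it 2 TB
hot_memory_gb = total_tb * 1_000 * 0.20      # ~20% hot data in RAM, ~390 GB
```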

Common Estimation Scenarios

Scenario 1: URL Shortener

Requirements:
- 100M new URLs/month
- 10:1 read:write ratio

Writes:
- 100M / (30 * 24 * 3600) = ~40 writes/second
- Peak: ~100 writes/second

Reads:
- 40 * 10 = 400 reads/second
- Peak: ~1000 reads/second

Storage (5 years):
- 100M URLs/month * 60 months = 6 billion URLs
- Average URL = 100 bytes (short) + 500 bytes (long) = 600 bytes
- 6B * 600 bytes = 3.6 TB
- With indexes and overhead: ~5 TB
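The whole URL-shortener scenario fits in a few lines of Python (variable names are illustrative; 600 B/URL is the assumption from above):

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000, ~2.6M

urls_per_month = 100_000_000
writes_per_sec = urls_per_month / SECONDS_PER_MONTH  # ~40/s
reads_per_sec = writes_per_sec * 10                  # 10:1 read:write, ~400/s
storage_tb = urls_per_month * 60 * 600 / 1e12        # 5 years at 600 B/URL = 3.6 TB
```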

Scenario 2: Chat Application

Requirements:
- 10M daily active users
- Average 50 messages sent/day
- Average 200 messages received/day

Message throughput:
- Sends: 10M * 50 / 86,400 = ~6,000 messages/second
- Peak: ~20,000 messages/second

Connections:
- Each user maintains 1-3 connections (phone, laptop, tablet)
- Peak concurrent: 10M * 0.1 (10% online) * 2 = 2M connections

Storage (1 year):
- 10M users * 50 msgs/day * 365 days = 182B messages/year
- Average message = 200 bytes
- 182B * 200 bytes = 36.4 TB/year
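The chat-application numbers above can be reproduced in one short script (variable names are illustrative; 10% online and 2 devices per user are the assumptions from the scenario):

```python
dau = 10_000_000
sends_per_sec = dau * 50 / 86_400                   # ~5,800/s average
peak_sends = sends_per_sec * 3                      # ~17,000/s at peak
concurrent = dau * 0.10 * 2                         # 10% online, ~2 devices: 2M
storage_tb_per_year = dau * 50 * 365 * 200 / 1e12   # ~36.5 TB/year at 200 B/msg
```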

Scenario 3: Video Streaming

Requirements:
- 100M monthly active users
- 30% watch daily = 30M DAU
- Average 1 hour/day viewing

Concurrent viewers (peak):
- 30M DAU / 24 hours * 3 (peak factor) = ~4M concurrent

Bandwidth:
- Average stream: 5 Mbps
- 4M * 5 Mbps = 20 Tbps peak bandwidth

Storage (library of 10K titles):
- Average video = 2 hours
- Multiple qualities: 480p (1GB), 720p (3GB), 1080p (5GB), 4K (20GB)
- Per title: ~30 GB
- Library: 10K * 30 GB = 300 TB

Estimation Tips

Round Aggressively

Instead of:      Use:
86,400 seconds   ~100,000 (10^5)
2.5 million      ~3 million
7.3 petabytes    ~10 petabytes

Use Orders of Magnitude

Think in powers of 10:

  • Thousands (10^3)
  • Millions (10^6)
  • Billions (10^9)
  • Trillions (10^12)

State Your Assumptions

Always verbalize:

  • "I'm assuming 10% of users are active at peak"
  • "I'm estimating average message size at 200 bytes"
  • "I'm using a 3x replication factor"

Sanity Check Results

After calculating, ask:

  • "Does this make sense?"
  • "Is this in the right order of magnitude?"
  • "What would change if my assumption is off by 10x?"

Common Mistakes

Mistake 1: Ignoring Peak vs Average

Problem

Sizing for average load.

Average QPS: 10,000
Peak QPS: 30,000 (often 2-3x average)

If you size for 10,000, you'll fail at peak.

Mistake 2: Forgetting Replication

Problem

Calculating raw storage without copies.

Data: 1 TB
With 3 replicas: 3 TB
With backups: 4-5 TB

Mistake 3: Not Accounting for Growth

Problem

Sizing for current, not future.

Current users: 10M
Expected growth: 50%/year
Year 3: 10M * 1.5^3 = 34M users

Size for at least 2x current to avoid near-term issues.
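Compound growth is easy to underestimate; a minimal sketch (function name is illustrative):

```python
def users_after_growth(current: int, annual_growth: float, years: int) -> float:
    """Project a user base forward with compound annual growth."""
    return current * (1 + annual_growth) ** years

projected = users_after_growth(10_000_000, 0.50, 3)  # 33,750,000, ~34M
```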

Mistake 4: Over-Precision

Problem

Calculating to 3 decimal places.

Bad: "We need exactly 3,456,789 IOPS"
Good: "We need roughly 3-4 million IOPS"

Quick Reference Calculations

| Need           | Formula                        |
|----------------|--------------------------------|
| QPS            | users * actions/day / 86,400   |
| Storage/day    | items/day * size/item          |
| Bandwidth      | QPS * response_size            |
| Cache hit rate | 1 - (DB_QPS / total_QPS)       |
| Servers needed | QPS / QPS_per_server           |
| Shards needed  | data_size / max_shard_size     |

Related Skills

  • design-interview-methodology - Overall interview framework
  • quality-attributes-taxonomy - NFR definitions (scalability, performance)
  • database-scaling - Database capacity planning (Phase 3)
  • caching-strategies - Cache sizing and hit rates (Phase 3)

Related Commands

  • /sd:estimate <scenario> - Calculate capacity interactively

Related Agents

  • capacity-planner - Guided estimation with calculations

References

  • references/latency-numbers.md - Complete latency reference table

Version History

  • v1.0.0 (2025-12-26): Initial release

Last Updated

Date: 2025-12-26
Model: claude-opus-4-5-20251101
