Skill

m10-performance

Guides performance optimization decisions: measuring bottlenecks, choosing algorithms/data structures, reducing allocations, parallelizing, and improving cache efficiency. References tools like perf, flamegraph, criterion.

Rust

performance

Popularity

Stars

1,167

Forks

100

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/rust-skills:m10-performance

Not user invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

> **Layer 2: Design Choices**

Supporting Files

patterns/optimization-guide.md

SKILL.md

158 lines · ~1k tokens

Stats

Language: Shell

Maintenance: Excellent

Last Commit: May 24, 2026

Performance Optimization

Layer 2: Design Choices

Core Question

What's the bottleneck, and is optimization worth it?

Before optimizing:

- Have you measured? (Don't guess)
- What's the acceptable performance?
- Will optimization add complexity?

Performance Decision → Implementation

| Goal | Design Choice | Implementation |
|------|---------------|----------------|
| Reduce allocations | Pre-allocate, reuse | `with_capacity`, object pools |
| Improve cache | Contiguous data | `Vec`, `SmallVec` |
| Parallelize | Data parallelism | `rayon`, threads |
| Avoid copies | Zero-copy | References, `Cow<T>` |
| Reduce indirection | Inline data | `smallvec`, arrays |
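
The "Avoid copies" row can be illustrated with a minimal sketch using `std::borrow::Cow`. The helper name `expand_tabs` is hypothetical and exists only for this example; the idea is that the function allocates only when it actually has to change the input.

```rust
use std::borrow::Cow;

/// Replaces tabs with four spaces, allocating only when the input contains a tab.
/// `expand_tabs` is a hypothetical helper used for illustration.
fn expand_tabs(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        // Slow path: the content changes, so a new String is required.
        Cow::Owned(input.replace('\t', "    "))
    } else {
        // Fast path: return the original slice, with no allocation or copy.
        Cow::Borrowed(input)
    }
}

fn main() {
    let clean = expand_tabs("no tabs here");
    let fixed = expand_tabs("a\tb");
    println!("{clean} (borrowed: {})", matches!(clean, Cow::Borrowed(_)));
    println!("{fixed} (borrowed: {})", matches!(fixed, Cow::Borrowed(_)));
}
```

Callers can treat the result as a `&str` through deref and only pay for an owned `String` on the inputs that needed rewriting.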

Thinking Prompt

Before optimizing:

1. Have you measured?
   - Profile first → flamegraph, perf
   - Benchmark → criterion, cargo bench
   - Identify actual hotspots
2. What's the priority?
   - Algorithm (10x-1000x improvement)
   - Data structure (2x-10x)
   - Allocation (2x-5x)
   - Cache (1.5x-3x)
3. What's the trade-off?
   - Complexity vs speed
   - Memory vs CPU
   - Latency vs throughput
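
As a rough illustration of "measure first", here is a std-only timing sketch. It is not a substitute for `criterion`, which handles warm-up and statistics; `time_it` and `sum_squares` are hypothetical names made up for this example, and the build is assumed to be optimized (`--release`).

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` `iters` times and returns the total elapsed wall-clock time.
/// `time_it` is a hypothetical helper, not part of any library.
fn time_it<F: FnMut() -> u64>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from discarding the unused result.
        black_box(f());
    }
    start.elapsed()
}

/// Example workload: sum of squares below `n`.
fn sum_squares(n: u64) -> u64 {
    (0..n).map(|x| x * x).sum()
}

fn main() {
    // black_box on the input stops the compiler from precomputing the answer.
    let elapsed = time_it(1_000, || sum_squares(black_box(10_000)));
    println!("1000 runs took {elapsed:?}");
}
```

A single wall-clock number like this is noisy; it is mainly useful for a quick sanity check before setting up a proper benchmark or profiler run.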

Trace Up ↑

To domain constraints (Layer 3):

"How fast does this need to be?"
    ↑ Ask: What's the performance SLA?
    ↑ Check: domain-* (latency requirements)
    ↑ Check: Business requirements (acceptable response time)

| Question | Trace To | Ask |
|----------|----------|-----|
| Latency requirements | domain-* | What's acceptable response time? |
| Throughput needs | domain-* | How many requests per second? |
| Memory constraints | domain-* | What's the memory budget? |

Trace Down ↓

To implementation (Layer 1):

"Need to reduce allocations"
    ↓ m01-ownership: Use references, avoid clone
    ↓ m02-resource: Pre-allocate with_capacity

"Need to parallelize"
    ↓ m07-concurrency: Choose rayon or threads
    ↓ m07-concurrency: Consider async for I/O-bound

"Need cache efficiency"
    ↓ Data layout: Prefer Vec over HashMap when possible
    ↓ Access patterns: Sequential over random access
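
For the "rayon or threads" choice, a minimal threads-only sketch using `std::thread::scope` might look like the following. `parallel_sum` and its `workers` parameter are hypothetical; with `rayon` the same idea is typically expressed through its parallel iterators instead.

```rust
use std::thread;

/// Sums `data` by splitting it into contiguous chunks, one scoped thread per chunk.
/// `parallel_sum` and the `workers` parameter are hypothetical.
fn parallel_sum(data: &[u64], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Ceiling division so every element lands in a chunk; at least 1 so chunks() is valid.
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        // Each thread borrows its chunk; the scope guarantees the borrow outlives the thread.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum::<u64>()
    })
}

fn main() {
    let data: Vec<u64> = (1..=1000).collect();
    println!("{}", parallel_sum(&data, 4));
}
```

Contiguous chunks also keep each thread's access pattern sequential, which lines up with the cache-efficiency advice above. For small inputs the thread spawn cost will usually outweigh any gain.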

Quick Reference

| Tool | Purpose |
|------|---------|
| `cargo bench` | Micro-benchmarks |
| `criterion` | Statistical benchmarks |
| `perf` / `flamegraph` | CPU profiling |
| `heaptrack` | Allocation tracking |
| `valgrind --tool=cachegrind` | Cache analysis |

Optimization Priority

Typical gains, as rough rules of thumb that vary by workload:

1. Algorithm choice     (10x - 1000x)
2. Data structure       (2x - 10x)
3. Allocation reduction (2x - 5x)
4. Cache optimization   (1.5x - 3x)
5. SIMD/Parallelism     (2x - 8x)
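
To make the first item concrete, here is a small sketch of the same task solved with a quadratic algorithm and with an expected linear-time one. Both function names are hypothetical; the actual speedup depends on input size and is not measured here.

```rust
use std::collections::HashSet;

/// O(n^2): compares every pair of elements.
fn has_duplicate_quadratic(items: &[u32]) -> bool {
    for i in 0..items.len() {
        for j in (i + 1)..items.len() {
            if items[i] == items[j] {
                return true;
            }
        }
    }
    false
}

/// Expected O(n): a single pass that records seen values in a hash set.
fn has_duplicate_hashed(items: &[u32]) -> bool {
    let mut seen = HashSet::with_capacity(items.len());
    // insert returns false when the value was already present.
    items.iter().any(|x| !seen.insert(*x))
}

fn main() {
    let items = [3, 1, 4, 1, 5];
    println!("{}", has_duplicate_quadratic(&items));
    println!("{}", has_duplicate_hashed(&items));
}
```

For very small slices the quadratic version can still win because it allocates nothing, which is why measuring comes before choosing.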

Common Techniques

| Technique | When | How |
|-----------|------|-----|
| Pre-allocation | Known size | `Vec::with_capacity(n)` |
| Avoid cloning | Hot paths | Use references or `Cow<T>` |
| Batch operations | Many small ops | Collect then process |
| SmallVec | Usually small | `smallvec::SmallVec<[T; N]>` |
| Inline buffers | Fixed-size data | Arrays over Vec |
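
A short sketch of the pre-allocation row, plus buffer reuse across iterations, using only the standard library. The functions `squares` and `count_matches` are hypothetical examples, not part of the skill.

```rust
/// Collects squares with a single up-front allocation.
fn squares(n: usize) -> Vec<u64> {
    // The final length is known, so reserve it once instead of growing repeatedly.
    let mut out = Vec::with_capacity(n);
    for i in 0..n as u64 {
        out.push(i * i);
    }
    out
}

/// Counts lines containing `needle` case-insensitively, reusing one scratch buffer.
/// `needle` is assumed to be lowercase already.
fn count_matches(lines: &[&str], needle: &str) -> usize {
    let mut buf = String::new();
    let mut count = 0;
    for line in lines {
        // clear() drops the contents but keeps the allocated capacity for the next line.
        buf.clear();
        for c in line.chars() {
            buf.extend(c.to_lowercase());
        }
        if buf.contains(needle) {
            count += 1;
        }
    }
    count
}

fn main() {
    println!("{:?}", squares(4));
    println!("{}", count_matches(&["Hello World", "nothing", "WORLD"], "world"));
}
```

Allocating the scratch `String` once outside the loop avoids one heap allocation per line, which tends to matter only when the loop is actually hot.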

Common Mistakes

| Mistake | Why Wrong | Better |
|---------|-----------|--------|
| Optimize without profiling | Wrong target | Profile first |
| Benchmark in debug mode | Meaningless | Always `--release` (`cargo bench` optimizes by default) |
| Use LinkedList | Cache unfriendly | `Vec` or `VecDeque` |
| Hidden `.clone()` | Unnecessary allocs | Use references |
| Premature optimization | Wasted effort | Make it work first |
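
The hidden `.clone()` row often comes from function signatures that take ownership when they only need to read. A minimal sketch, with hypothetical function names:

```rust
/// Takes ownership, so callers that still need the data must clone it first.
fn total_len_owned(words: Vec<String>) -> usize {
    words.iter().map(|w| w.len()).sum()
}

/// Borrows a slice: no clone at the call site, and it accepts Vecs, arrays, and sub-slices.
fn total_len_borrowed(words: &[String]) -> usize {
    words.iter().map(String::len).sum()
}

fn main() {
    let words = vec!["alpha".to_string(), "beta".to_string()];
    // Forced clone: allocates a new Vec and a new String per element.
    let a = total_len_owned(words.clone());
    // No allocation: only a pointer and a length are passed.
    let b = total_len_borrowed(&words);
    assert_eq!(a, b);
    println!("{b}");
}
```

Changing the parameter from `Vec<String>` to `&[String]` removes the clone without touching the function body's logic.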

Anti-Patterns

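| Anti-Pattern | Why Bad | Better |
|--------------|---------|--------|
| Clone to avoid lifetimes | Performance cost | Proper ownership |
| Box everything | Indirection cost | Stack when possible |
| HashMap for small sets | Overhead | Vec with linear search |
| String concat in loop | Repeated reallocation; O(n^2) when each step rebuilds the string, e.g. `s = format!("{s}{x}")` | `String::with_capacity`, then `push_str` or `write!` |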
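
The last two rows can be sketched as follows. `join_ids` and `lookup` are hypothetical helpers, and the 4-bytes-per-id capacity guess is an assumption chosen only for illustration.

```rust
use std::fmt::Write;

/// Joins ids as "1,2,3" by appending into one pre-sized buffer.
fn join_ids(ids: &[u32]) -> String {
    // Rough guess of 4 bytes per id; this is only a capacity hint, not a limit.
    let mut out = String::with_capacity(ids.len() * 4);
    for (i, id) in ids.iter().enumerate() {
        if i > 0 {
            out.push(',');
        }
        // write! appends in place instead of building a temporary String per id.
        write!(out, "{id}").expect("writing to a String cannot fail");
    }
    out
}

/// Small-set lookup: a linear scan over a contiguous slice of (key, value) pairs.
fn lookup(table: &[(&str, u32)], key: &str) -> Option<u32> {
    table.iter().find(|(k, _)| *k == key).map(|(_, v)| *v)
}

fn main() {
    println!("{}", join_ids(&[1, 2, 3]));
    let table = [("red", 1), ("green", 2), ("blue", 3)];
    println!("{:?}", lookup(&table, "green"));
}
```

The linear scan tends to pay off only for a handful of entries; where the crossover with `HashMap` lies depends on key type and size, so it is worth benchmarking rather than assuming.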

Related Skills

| When | See |
|------|-----|
| Reducing clones | m01-ownership |
| Concurrency options | m07-concurrency |
| Smart pointer choice | m02-resource |
| Domain requirements | domain-* |
