Optimizes Go code performance using patterns like strconv over fmt, avoiding repeated string-to-byte conversions, specifying slice/map capacities, and passing values. Includes benchmarking script with benchstat support.
npx claudepluginhub cxuu/golang-skills --plugin golang-skills
- **`scripts/bench-compare.sh`** — Runs Go benchmarks N times with optional baseline comparison via benchstat. Supports saving results for future comparison. Run `bash scripts/bench-compare.sh --help` for options.
Performance-specific guidelines apply only to the hot path. Don't optimize prematurely; focus these patterns where they matter most.
When converting primitives to/from strings, strconv is faster than fmt:
s := strconv.Itoa(rand.Int()) // ~2x faster than fmt.Sprint()
| Approach | Speed | Allocations |
|---|---|---|
| fmt.Sprint | 143 ns/op | 2 allocs/op |
| strconv.Itoa | 64.2 ns/op | 1 allocs/op |
Read references/STRING-OPTIMIZATION.md when choosing between strconv and fmt for type conversions, or for the full conversion table.
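As a minimal sketch of the swap described above, the hypothetical helper `formatIDs` converts integers with `strconv.Itoa` and checks that each result matches what `fmt.Sprint` would have produced. The helper name and the sample values are illustrative assumptions, not part of the skill.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatIDs is a hypothetical helper: it converts each int to its decimal
// string with strconv.Itoa rather than fmt.Sprint.
func formatIDs(ids []int) []string {
	out := make([]string, 0, len(ids))
	for _, id := range ids {
		out = append(out, strconv.Itoa(id))
	}
	return out
}

func main() {
	ids := []int{7, 42, -3}
	for i, s := range formatIDs(ids) {
		// For ints, fmt.Sprint yields the same text, so the swap preserves behavior.
		fmt.Println(s, s == fmt.Sprint(ids[i]))
	}
}
```

The same idea extends to other primitives through functions such as `strconv.FormatInt`, `strconv.FormatBool`, and `strconv.Atoi`.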
Convert a fixed string to []byte once outside the loop:
data := []byte("Hello world")
for i := 0; i < b.N; i++ {
w.Write(data) // ~7x faster than []byte("...") each iteration
}
Read references/STRING-OPTIMIZATION.md when optimizing repeated byte conversions in hot loops.
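Outside a benchmark, the same pattern might look like the sketch below. `writeGreeting` and `repeatGreeting` are hypothetical names chosen for illustration; the point is only that the `[]byte` conversion is hoisted out of the loop.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// writeGreeting is a hypothetical helper that writes a fixed message n times.
// The string-to-[]byte conversion happens once, before the loop.
func writeGreeting(w io.Writer, n int) error {
	data := []byte("Hello world\n") // converted once, reused on every iteration
	for i := 0; i < n; i++ {
		if _, err := w.Write(data); err != nil {
			return err
		}
	}
	return nil
}

// repeatGreeting collects the output in memory so it is easy to inspect.
func repeatGreeting(n int) (string, error) {
	var buf bytes.Buffer
	if err := writeGreeting(&buf, n); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := repeatGreeting(3)
	if err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Print(out)
}
```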
Specify container capacity where possible to allocate memory up front. This minimizes subsequent allocations from copying and resizing as elements are added.
Provide capacity hints when initializing maps with make():
m := make(map[string]os.DirEntry, len(files))
Note: Unlike slices, map capacity hints do not guarantee complete preemptive allocation—they approximate the number of hashmap buckets required.
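A small sketch of a map built with a size hint follows. `indexByName` and its inputs are assumptions made for illustration; any map whose final size is known up front can be initialized the same way.

```go
package main

import "fmt"

// indexByName is a hypothetical helper that maps each name to its position.
// len(names) is passed as a size hint so the map can be sized up front.
func indexByName(names []string) map[string]int {
	idx := make(map[string]int, len(names))
	for i, name := range names {
		idx[name] = i
	}
	return idx
}

func main() {
	idx := indexByName([]string{"a.go", "b.go", "c.go"})
	fmt.Println(len(idx), idx["b.go"]) // prints: 3 1
}
```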
Provide capacity hints when initializing slices with make(), particularly when appending:
data := make([]int, 0, size)
Unlike maps, slice capacity is not a hint: make() allocates enough memory for the requested capacity, so subsequent append() operations incur zero allocations until that capacity is reached.
| Approach | Time (100M iterations) |
|---|---|
| No capacity | 2.48s |
| With capacity | 0.21s |
The capacity version is ~12x faster due to zero reallocations during append.
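The preallocation pattern can be sketched as follows. `squares` is a hypothetical function used only to show that the capacity set by `make()` is still in place after all appends, meaning the slice never had to grow.

```go
package main

import "fmt"

// squares is a hypothetical helper that fills a slice whose final length is
// known in advance, so the capacity is reserved once with make().
func squares(n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, i*i) // stays within capacity, so no reallocation
	}
	return out
}

func main() {
	s := squares(5)
	fmt.Println(s, len(s), cap(s)) // prints: [0 1 4 9 16] 5 5
}
```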
Don't pass pointers as function arguments just to save a few bytes. If a function refers to its argument x only as *x throughout, then the argument shouldn't be a pointer.
func process(s string) { // not *string — strings are small fixed-size headers
fmt.Println(s)
}
Common pass-by-value types: string, io.Reader, small structs.
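To illustrate with a small struct, the sketch below passes a hypothetical `point` type by value. The type and both functions are assumptions for illustration; they show that a read-only function needs no pointer and that returning a modified copy leaves the caller's value untouched.

```go
package main

import "fmt"

// point is a hypothetical small struct; copying it is cheap.
type point struct{ X, Y int }

// describe only reads its arguments, so it takes them by value.
func describe(label string, p point) string {
	return fmt.Sprintf("%s=(%d,%d)", label, p.X, p.Y)
}

// shift returns a moved copy; the caller's point is left unchanged.
func shift(p point, dx int) point {
	p.X += dx
	return p
}

func main() {
	p := point{X: 1, Y: 2}
	q := shift(p, 3)
	fmt.Println(describe("p", p), describe("q", q)) // prints: p=(1,2) q=(4,2)
}
```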
Exceptions:
Choose the right strategy based on complexity:
| Method | Best For |
|---|---|
| + | Few strings, simple concat |
| fmt.Sprintf | Formatted output with mixed types |
| strings.Builder | Loop/piecemeal construction |
| strings.Join | Joining a slice |
| Backtick literal | Constant multi-line text |
Read references/STRING-OPTIMIZATION.md when choosing a string concatenation strategy, using strings.Builder in loops, or deciding between fmt.Sprintf and manual concatenation.
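The sketch below shows three of the strategies from the table side by side. All function names are hypothetical and chosen for illustration: `+` for a couple of strings, `strings.Join` for a slice, and `strings.Builder` for piecemeal construction in a loop.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fullName concatenates a few strings; plain + is enough here.
func fullName(first, last string) string {
	return first + " " + last
}

// csvLine joins an existing slice with a separator.
func csvLine(fields []string) string {
	return strings.Join(fields, ",")
}

// numberedList builds its result piece by piece in a loop, so it uses
// strings.Builder instead of repeated += on a string.
func numberedList(items []string) string {
	var sb strings.Builder
	for i, item := range items {
		sb.WriteString(strconv.Itoa(i + 1))
		sb.WriteString(". ")
		sb.WriteString(item)
		sb.WriteByte('\n')
	}
	return sb.String()
}

func main() {
	fmt.Println(fullName("Ada", "Lovelace"))
	fmt.Println(csvLine([]string{"a", "b", "c"}))
	fmt.Print(numberedList([]string{"measure", "optimize"}))
}
```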
Always measure before and after optimizing. Use Go's built-in benchmark framework and profiling tools.
go test -bench=. -benchmem -count=10 ./...
Read references/BENCHMARKS.md when writing benchmarks, comparing results with benchstat, profiling with pprof, or interpreting benchmark output.
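For a quick measurement outside `go test`, the standard library's `testing.Benchmark` can run a benchmark function from an ordinary program. The sketch below compares `fmt.Sprint` with `strconv.Itoa`; the driver name `runBenchmarks` and the sample value are assumptions, and in a real project these loops would normally live in a `_test.go` file as `BenchmarkXxx` functions so the command above picks them up.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// sink keeps results alive so the calls under test are not discarded.
var sink string

// runBenchmarks is a hypothetical driver that measures both conversions.
func runBenchmarks() (sprint, itoa testing.BenchmarkResult) {
	n := 1234567890
	sprint = testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = fmt.Sprint(n)
		}
	})
	itoa = testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = strconv.Itoa(n)
		}
	})
	return sprint, itoa
}

func main() {
	sprint, itoa := runBenchmarks()
	// String reports iterations and ns/op; MemString reports B/op and allocs/op.
	fmt.Println("fmt.Sprint  ", sprint.String(), sprint.MemString())
	fmt.Println("strconv.Itoa", itoa.String(), itoa.MemString())
}
```

Absolute numbers depend on the machine and Go version, so compare runs on the same hardware rather than against the figures quoted here.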
Validation: After applying optimizations, run `bash scripts/bench-compare.sh` to measure the actual impact. Only keep optimizations with a measurable improvement.
| Pattern | Bad | Good | Improvement |
|---|---|---|---|
| Int to string | fmt.Sprint(n) | strconv.Itoa(n) | ~2x faster |
| Repeated []byte | []byte("str") in loop | Convert once outside | ~7x faster |
| Map initialization | make(map[K]V) | make(map[K]V, size) | Fewer allocs |
| Slice initialization | make([]T, 0) | make([]T, 0, cap) | ~12x faster |
| Small fixed-size args | *string, *io.Reader | string, io.Reader | No indirection |
| Simple string join | s1 + " " + s2 | (already good) | Use + for few strings |
| Loop string build | Repeated += | strings.Builder | O(n) vs O(n²) |