From java
Review Java performance and concurrency decisions with evidence-driven profiling, classify bottlenecks as CPU, I/O, contention, or allocation, evaluate virtual-thread fit, and recommend the smallest measured change. Use when the user asks to optimize Java performance, analyze concurrency design, use virtual threads, reduce allocation pressure, profile Java code, or needs guidance on Java performance and concurrency tradeoffs.
npx claudepluginhub ririnto/sinon --plugin javaThis skill uses the workspace's default tool permissions.
Review Java performance and concurrency decisions with emphasis on evidence, workload shape, and modern JVM capabilities. The common case is not heroic optimization; it is identifying whether the bottleneck is CPU, blocking I/O, contention, or allocation churn, then making the smallest measured change that fits that shape.
Mandates invoking relevant skills via tools before any response in coding sessions. Covers access, priorities, and adaptations for Claude Code, Copilot CLI, Gemini CLI.
Share bugs, ideas, or general feedback.
Review Java performance and concurrency decisions with emphasis on evidence, workload shape, and modern JVM capabilities. The common case is not heroic optimization; it is identifying whether the bottleneck is CPU, blocking I/O, contention, or allocation churn, then making the smallest measured change that fits that shape.
ScopedValue as a version-sensitive alternative to broad ThreadLocal usage when immutable request context is the real problem; it is preview on Java 21-24 and finalized in Java 25.Start with a profiler-ready JVM launch shape:
java -XX:StartFlightRecording=duration=60s,filename=profile.jfr,settings=profile -jar app.jar
Use when: you need a first real profile for latency, throughput, allocation, or lock analysis.
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.annotations.Measurement;
@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
@Warmup(iterations = 3, time = 1)
@Measurement(iterations = 5, time = 1)
public class ParserBenchmark {
private final byte[] payload = "key=value,name=test".getBytes();
@Benchmark
public Result parse() {
return Parser.parse(payload);
}
}
Maven dependency:
<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-core</artifactId>
<version>${verifiedVersion}</version>
</dependency>
<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-generator-annprocess</artifactId>
<version>${verifiedVersion}</version>
<scope>provided</scope>
</dependency>
Gradle Groovy DSL:
def jmhVersion = "${verifiedVersion}"
dependencies {
implementation "org.openjdk.jmh:jmh-core:$jmhVersion"
annotationProcessor "org.openjdk.jmh:jmh-generator-annprocess:$jmhVersion"
}
Gradle Kotlin DSL:
val jmhVersion = "${verifiedVersion}"
dependencies {
implementation("org.openjdk.jmh:jmh-core:$jmhVersion")
annotationProcessor("org.openjdk.jmh:jmh-generator-annprocess:$jmhVersion")
}
| Collector | Flag | Best for | Avoid when |
|---|---|---|---|
| G1 (default) | -XX:+UseG1GC | General-purpose, balanced latency/throughput | Sub-millisecond pause requirements |
| ZGC | -XX:+UseZGC | Low-latency, large heaps (JDK 15+ production-ready) | Very small heaps, JDK 11 |
| Shenandoah | -XX:+UseShenandoahGC | Low-latency alternative to ZGC | Not available in all JDK distributions |
| Parallel | -XX:+UseParallelGC | Batch/throughput-focused, max throughput | Latency-sensitive services |
Rule: do not change the collector without allocation and pause evidence from JFR or GC logs.
workload: request-per-thread service with blocking network or disk calls
evidence:
- flame graph or JFR shows waiting time dominating compute time
- thread dump shows many blocked or parked request threads
baseline: JDK 21+
first_change:
- isolate the blocking call path
- confirm the bottleneck is waiting rather than CPU saturation
- evaluate virtual threads only after the blocking path is confirmed
not_first:
- generic pool-size increases
- GC tuning before the waiting path is measured
import java.io.IOException;
import java.net.http.HttpResponse;
import java.net.http.HttpResponse.BodyHandlers;
Response load(UserId id) throws IOException {
HttpResponse<String> response = httpClient.send(requestFor(id), BodyHandlers.ofString());
return mapper.readValue(response.body(), Response.class);
}
workload: allocation-heavy request or batch path
evidence:
- JFR or heap profile shows high allocation rate in one hot path
- GC pauses or CPU time track object churn rather than lock contention
top_allocators:
- parser
- serializer
- intermediate collections
first_change:
- reduce transient object creation in the hot path
- collapse unnecessary intermediate materialization
- re-measure before discussing collector tuning
not_first:
- collector swaps without allocation evidence
- broad object pooling in ordinary code
import java.util.List;
List<Result> parse(List<String> lines) {
return lines.stream()
.map(line -> line.trim())
.filter(line -> !line.isEmpty())
.map(line -> line.split(","))
.map(parts -> new Result(parts[0], Integer.parseInt(parts[1])))
.toList();
}
(JDK 21+)import java.util.concurrent.Executors;
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
executor.submit(() -> service.handle(request));
}
| Primitive | Use when | Avoid when |
|---|---|---|
synchronized | Simple mutual exclusion, low contention | Virtual threads on JDK 21-23 (pinning risk) |
ReentrantLock | Need tryLock(), timed lock, or interruptible lock | Simple cases where synchronized suffices |
ReadWriteLock | Many readers, few writers | Write-heavy or low-contention workloads |
AtomicInteger / AtomicReference | Single-variable atomic updates | Multi-variable compound actions |
Semaphore | Limiting concurrent access to a resource | Simple mutual exclusion (use synchronized or ReentrantLock) |
CountDownLatch | One-time event waiting for N tasks to complete | Repeated reset needed (use CyclicBarrier) |
workload: CPU-bound or mixed CPU plus blocking path
evidence:
- flame graph dominated by parsing, serialization, crypto, or business computation
- carrier-thread or pinned-thread warnings appear in tracing or JFR
baseline: JDK 21-23 or JDK 24+
why_virtual_threads_are_not_first:
- thread model does not remove CPU saturation
- pre-JDK-24 synchronized pinning guidance still matters on 21-23
- JDK 24+ removes synchronized-driven pinning, but native or JNI pinning can remain
first_change:
- optimize the hot computation path or blocking primitive causing pinning
- then re-evaluate concurrency model changes
not_first:
- blanket migration to virtual threads
- lock-free rewrites without evidence
synchronized Result parse(byte[] payload) {
return parser.parse(payload);
}
On out-of-memory:
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heap.hprof -jar app.jar
On demand:
jcmd <pid> GC.heap_dump /tmp/heap.hprof
Analyze with Eclipse MAT or jhsdb:
jhsdb hat dump --heap-dump-file /tmp/heap.hprof
synchronized pinning, state whether the target runtime is Java 21-23 or Java 24+ before giving version-specific advice.Return:
| If the blocker is... | Open... |
|---|---|
| virtual-thread fit, limits, ScopedValue usage, pinning diagnosis | virtual-threads.md |
| additional profiling commands, JFR attach patterns, evidence interpretation heuristics | performance-patterns.md |