From dotnet-skills
Diagnosing .NET performance issues. dotnet-counters, dotnet-trace, dotnet-dump, flame graphs.
Diagnostic tool guidance for investigating .NET performance problems. Covers real-time metric monitoring with dotnet-counters, event tracing and flame graph generation with dotnet-trace, and memory dump capture and analysis with dotnet-dump. Focuses on interpreting profiling data (reading flame graphs, analyzing heap dumps, correlating GC metrics) rather than just invoking tools.
Version assumptions: .NET SDK 8.0+ baseline. All three diagnostic tools (dotnet-counters, dotnet-trace, dotnet-dump) are distributed as .NET global tools -- install each once with dotnet tool install --global <tool-name>; they are not bundled with the SDK.
Out of scope: OpenTelemetry metrics collection and distributed tracing setup -- see [skill:dotnet-observability]. Microbenchmarking setup (BenchmarkDotNet) is owned by this epic's companion skill -- see [skill:dotnet-benchmarkdotnet]. Performance architecture patterns (Span<T>, ArrayPool, sealed devirtualization) are owned by this epic's companion skill -- see [skill:dotnet-performance-patterns]. Continuous benchmark regression detection in CI -- see [skill:dotnet-ci-benchmarking]. Architecture patterns (caching, resilience) -- see [skill:dotnet-architecture-patterns].
Cross-references: [skill:dotnet-observability] for GC/threadpool metrics interpretation and OpenTelemetry correlation, [skill:dotnet-benchmarkdotnet] for structured benchmarking after profiling identifies hot paths, [skill:dotnet-performance-patterns] for optimization patterns to apply based on profiling results.
dotnet-counters provides real-time monitoring of .NET runtime metrics without modifying application code. Use it as a first-pass triage tool to identify whether a performance problem is CPU-bound, memory-bound, or I/O-bound before reaching for heavier instrumentation.
# List running .NET processes
dotnet-counters ps
# Monitor default runtime counters for a process
dotnet-counters monitor --process-id <PID>
# Monitor with a specific refresh interval (seconds)
dotnet-counters monitor --process-id <PID> --refresh-interval 2
| Provider | Counters | What It Tells You |
|---|---|---|
| System.Runtime | CPU usage, GC heap size, Gen 0/1/2 collections, threadpool queue length, exception count | Overall runtime health |
| Microsoft.AspNetCore.Hosting | Request rate, request duration, active requests | HTTP request throughput and latency |
| Microsoft.AspNetCore.Http.Connections | Connection duration, current connections | WebSocket/SignalR connection load |
| System.Net.Http | Requests started/failed, active requests, connection pool size | Outbound HTTP client behavior |
| System.Net.Sockets | Bytes sent/received, datagrams, connections | Network I/O volume |
# Monitor runtime and ASP.NET counters together
dotnet-counters monitor --process-id <PID> \
--counters System.Runtime,Microsoft.AspNetCore.Hosting
# Monitor only GC-related counters
dotnet-counters monitor --process-id <PID> \
--counters System.Runtime[gc-heap-size,gen-0-gc-count,gen-1-gc-count,gen-2-gc-count]
Applications can publish custom counters for domain-specific metrics:
using System.Diagnostics.Tracing;
[EventSource(Name = "MyApp.Orders")]
public sealed class OrderMetrics : EventSource
{
public static readonly OrderMetrics Instance = new();
private EventCounter? _orderProcessingTime;
private IncrementingEventCounter? _ordersProcessed;
private OrderMetrics()
{
_orderProcessingTime = new EventCounter("order-processing-time", this)
{
DisplayName = "Order Processing Time (ms)",
DisplayUnits = "ms"
};
_ordersProcessed = new IncrementingEventCounter("orders-processed", this)
{
DisplayName = "Orders Processed",
DisplayRateTimeScale = TimeSpan.FromSeconds(1)
};
}
public void RecordProcessingTime(double milliseconds)
=> _orderProcessingTime?.WriteMetric(milliseconds);
public void RecordOrderProcessed()
=> _ordersProcessed?.Increment();
protected override void Dispose(bool disposing)
{
_orderProcessingTime?.Dispose();
_ordersProcessed?.Dispose();
base.Dispose(disposing);
}
}
Monitor custom counters:
dotnet-counters monitor --process-id <PID> --counters MyApp.Orders
Use counter values to direct further investigation. See [skill:dotnet-observability] for correlating these runtime metrics with OpenTelemetry traces:
| Symptom | Counter Evidence | Next Step |
|---|---|---|
| High CPU usage | cpu-usage > 80%, threadpool-queue-length low | CPU profiling with dotnet-trace |
| Memory growth | gc-heap-size increasing, frequent Gen 2 GC | Memory dump with dotnet-dump |
| Thread starvation | threadpool-queue-length growing, threadpool-thread-count at max | Check for sync-over-async or blocking calls |
| Request latency | request-duration high, active-requests normal | Trace individual requests with dotnet-trace |
| GC pauses | High gen-2-gc-count, time-in-gc > 10% | Allocation profiling with dotnet-trace gc-collect |
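The thread-starvation row deserves a concrete illustration, since it is the symptom most often misread as a CPU problem. A minimal sketch (hypothetical ReportService type) of the sync-over-async anti-pattern that produces a growing threadpool-queue-length:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical service illustrating sync-over-async thread starvation.
public class ReportService
{
    // Anti-pattern: .Result blocks a threadpool thread until the task
    // completes; under load, blocked threads exhaust the pool and
    // threadpool-queue-length grows.
    public string GetReport() => FetchReportAsync().Result;

    // Fix: stay async end-to-end so the thread returns to the pool
    // while the awaited I/O is in flight.
    public Task<string> GetReportAsync() => FetchReportAsync();

    private async Task<string> FetchReportAsync()
    {
        await Task.Delay(10); // stands in for real I/O
        return "report";
    }
}
```

In a dump, the `threads` and `clrstack` commands (covered below) show the blocked stacks parked inside `Task.Wait`/`.Result` frames.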
# Export to CSV for analysis
dotnet-counters collect --process-id <PID> \
--format csv \
--output counters.csv \
--counters System.Runtime
# Export to JSON for programmatic consumption
dotnet-counters collect --process-id <PID> \
--format json \
--output counters.json
dotnet-trace captures detailed event traces from a running .NET process. Traces can be analyzed as flame graphs to identify CPU hot paths, or configured for allocation tracking to find GC pressure sources.
CPU sampling records stack frames at a fixed interval to build a statistical profile of where the application spends time:
# Collect a CPU sampling trace (default profile)
dotnet-trace collect --process-id <PID> --duration 00:00:30
# Collect with the cpu-sampling profile (explicit)
dotnet-trace collect --process-id <PID> \
--profile cpu-sampling \
--output cpu-trace.nettrace
| Approach | Overhead | Best For | Tool |
|---|---|---|---|
| CPU sampling | Low (~2-5%) | Finding CPU hot paths in production | dotnet-trace --profile cpu-sampling |
| Instrumentation | High (10-50%+) | Exact call counts, method entry/exit timing | Rider/VS profiler, PerfView |
CPU sampling is safe for production use due to low overhead. Use it as the default approach. Reserve instrumentation for development environments where exact call counts matter.
Trace files (.nettrace) must be converted to a flame graph format for visual analysis:
Using Speedscope (browser-based, recommended):
# Convert to Speedscope format
dotnet-trace convert cpu-trace.nettrace --format Speedscope
# Opens cpu-trace.speedscope.json -- load at https://www.speedscope.app/
Using PerfView (Windows, deep .NET integration):
# Convert to Chromium trace format (also viewable in chrome://tracing)
dotnet-trace convert cpu-trace.nettrace --format Chromium
Flame graphs display call stacks where:
- Each box is a stack frame; in the classic orientation callers sit below their callees (some viewers, including Speedscope, invert this)
- A frame's width is proportional to the time spent in it and its children across all samples
- The x-axis orders merged stacks alphabetically -- it does not represent the passage of time

Analysis workflow:
1. Find the widest frames -- they account for the most CPU time
2. Follow a wide frame toward its leaves to see where the time is actually spent
3. Decide whether a wide leaf is expected work (e.g., serialization) or a surprise (e.g., LINQ overhead, lock contention)

Common patterns in .NET flame graphs:
| Pattern | Likely Cause | Investigation |
|---|---|---|
| Wide System.Linq frames | LINQ-heavy hot path with delegate overhead | Replace with foreach loops or Span-based processing |
| Wide JIT_New / gc_heap::allocate | Excessive allocations triggering GC | Allocation profiling with --profile gc-collect |
| Wide Monitor.Enter / SpinLock | Lock contention | Review synchronization strategy |
| Wide System.Text.RegularExpressions | Regex backtracking | Use RegexOptions.NonBacktracking or compile regex |
| Deep async state machine frames | Async overhead in tight loops | Consider sync path for CPU-bound work |
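As an illustration of the System.Linq pattern, a hedged sketch (hypothetical Order type) of rewriting a LINQ-heavy hot path as a plain loop:

```csharp
using System.Collections.Generic;
using System.Linq;

public record Order(bool IsPaid, decimal Amount);

public static class Totals
{
    // Shows up as wide System.Linq frames: delegate invocations per
    // element plus iterator overhead.
    public static decimal TotalLinq(List<Order> orders) =>
        orders.Where(o => o.IsPaid).Sum(o => o.Amount);

    // Equivalent foreach rewrite: no delegates, no LINQ iterators, and
    // List<T>'s struct enumerator avoids allocation.
    public static decimal TotalLoop(List<Order> orders)
    {
        decimal total = 0;
        foreach (var o in orders)
            if (o.IsPaid) total += o.Amount;
        return total;
    }
}
```

Only rewrite paths that profiling actually flagged -- LINQ is fine outside hot loops.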
The gc-collect profile captures allocation events to identify what code paths allocate the most memory:
# Collect allocation data
dotnet-trace collect --process-id <PID> \
--profile gc-collect \
--duration 00:00:30 \
--output alloc-trace.nettrace
This produces a trace that shows:
- Which types are being allocated and in what volume (sampled allocation events)
- The call stacks responsible for each sampled allocation
- GC start/stop events for correlating allocation bursts with collections
Correlate allocation data with GC counter evidence from dotnet-counters. If gen-2-gc-count is high, the allocation trace shows which code paths produce long-lived objects that survive to Gen 2. See [skill:dotnet-performance-patterns] for zero-allocation patterns to apply once hot allocation sites are identified.
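As an example of what an allocation trace tends to surface, a sketch (hypothetical Joiner helper) of a classic string-building hot path and its low-allocation rewrite:

```csharp
using System.Collections.Generic;
using System.Text;

public static class Joiner
{
    // Each += allocates a brand-new string; in a hot path this shows up
    // as heavy System.String allocation and a rising gen-0-gc-count.
    public static string JoinConcat(IEnumerable<string> parts)
    {
        var result = "";
        foreach (var p in parts) result += p + ",";
        return result;
    }

    // One growing buffer instead of O(n) intermediate strings.
    public static string JoinBuilder(IEnumerable<string> parts)
    {
        var sb = new StringBuilder();
        foreach (var p in parts) sb.Append(p).Append(',');
        return sb.ToString();
    }
}
```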
Target specific event providers for focused tracing:
# Trace specific providers with keywords and verbosity
dotnet-trace collect --process-id <PID> \
--providers "Microsoft-Diagnostics-DiagnosticSource:::FilterAndPayloadSpecs=[AS]System.Net.Http"
# Trace EF Core queries (useful with [skill:dotnet-efcore-patterns])
dotnet-trace collect --process-id <PID> \
--providers Microsoft.EntityFrameworkCore
# Trace ASP.NET Core request processing
dotnet-trace collect --process-id <PID> \
--providers Microsoft.AspNetCore
| Format | Extension | Viewer | Cross-Platform |
|---|---|---|---|
| NetTrace | .nettrace | PerfView, VS, dotnet-trace convert | Yes (capture); Windows (PerfView) |
| Speedscope | .speedscope.json | https://www.speedscope.app/ | Yes |
| Chromium | .chromium.json | Chrome DevTools (chrome://tracing) | Yes |
dotnet-dump captures and analyzes process memory dumps. Use it to investigate memory leaks, large object heap fragmentation, and object reference chains. Unlike dotnet-trace, dumps capture a point-in-time snapshot of the entire managed heap.
# Capture a full heap dump
dotnet-dump collect --process-id <PID> --output app-dump.dmp
# Capture a minimal dump (faster, smaller, but less detail)
dotnet-dump collect --process-id <PID> --type Mini --output app-mini.dmp
When to capture:
- Memory grows steadily without leveling off (confirmed by a rising gc-heap-size counter)
- A leak is suspected -- capture one dump as a baseline and a second after reproducing the growth

Open the dump in the interactive analyzer:
dotnet-dump analyze app-dump.dmp
Lists objects on the managed heap grouped by type, sorted by total size:
> dumpheap -stat
Statistics:
MT Count TotalSize Class Name
00007fff2c6a4320 125 4,000 System.String[]
00007fff2c6a1230 8,432 269,824 System.String
00007fff2c7b5640 2,100 504,000 MyApp.Models.OrderEntity
00007fff2c6a0988 15,230 1,218,400 System.Byte[]
Analysis approach:
- Scan the largest TotalSize entries first -- the types dominating the heap appear at the bottom of the sorted output
- High System.Byte[] counts often indicate unbounded buffering or stream handling issues

Filter by type:
> dumpheap -type MyApp.Models.OrderEntity
> dumpheap -type System.Byte[] -min 85000
The -min 85000 filter shows Large Object Heap entries (objects >= 85,000 bytes that cause Gen 2 GC pressure).
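The 85,000-byte threshold can be verified directly in code: the runtime reports LOH objects as generation 2 via GC.GetGeneration. A quick sketch:

```csharp
using System;

// Arrays at or above 85,000 bytes bypass the small-object heap and are
// allocated directly on the LOH, which the runtime reports as gen 2.
var small = new byte[80_000];
var large = new byte[85_000];

Console.WriteLine(GC.GetGeneration(small)); // 0 -- freshly allocated small object
Console.WriteLine(GC.GetGeneration(large)); // 2 -- LOH is part of gen 2
```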
Traces the reference chain from a GC root to a specific object, explaining why it is not collected:
> gcroot 00007fff3c4a2100
HandleTable:
00007fff3c010010 (strong handle)
-> 00007fff3c3a1000 MyApp.Services.CacheService
-> 00007fff3c3a1020 System.Collections.Generic.Dictionary`2
-> 00007fff3c4a2100 MyApp.Models.OrderEntity
Found 1 unique root(s).
Common root types and their meaning:
| Root Type | Meaning | Likely Issue |
|---|---|---|
| strong handle | Static field or GC handle | Static collection growing without eviction |
| pinned handle | Pinned for native interop | Buffer pinned longer than needed |
| async state machine | Captured in async closure | Long-running async operation holding references |
| finalizer queue | Waiting for finalizer thread | Finalizer backlog blocking collection |
| threadpool | Referenced from thread-local storage | Thread-static cache without cleanup |
Shows objects waiting for finalization, which delays their collection by at least one GC cycle:
> finalizequeue
SyncBlocks to be cleaned up: 0
Free-Threaded Interfaces to be released: 0
MTA Interfaces to be released: 0
STA Interfaces to be released: 0
----------------------------------
generation 0 has 12 finalizable objects
generation 1 has 45 finalizable objects
generation 2 has 230 finalizable objects
Ready for finalization 8 objects
Key indicators:
- A growing "Ready for finalization" count means the finalizer thread is not keeping up
- Types with a finalizer (~Destructor()) without IDisposable.Dispose() being called are the primary cause

| Command | Purpose | When to Use |
|---|---|---|
| dumpobj <address> | Display field values of a specific object | Inspect object state after finding it with dumpheap |
| dumparray <address> | Display array contents | Investigate large arrays found in heap stats |
| eeheap -gc | Show GC heap segment layout | Investigate LOH fragmentation |
| gcwhere <address> | Show which GC generation holds an object | Determine if an object is pinned or in LOH |
| dumpmt <MT> | Display method table details | Investigate type metadata |
| threads | List all managed threads with stack traces | Identify deadlocks or blocking |
| clrstack | Display managed call stack for current thread | Correlate thread state with heap data |
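When finalizequeue shows a backlog, the usual remedy is the dispose pattern: calling Dispose with GC.SuppressFinalize keeps instances out of the finalizer queue entirely. A minimal sketch (hypothetical NativeBuffer type):

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class NativeBuffer : IDisposable
{
    private IntPtr _handle = Marshal.AllocHGlobal(1024);

    public void Dispose()
    {
        if (_handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_handle);
            _handle = IntPtr.Zero;
        }
        // Disposed instances never enter the finalizer queue.
        GC.SuppressFinalize(this);
    }

    // Safety net only -- if this runs, Dispose was never called, and the
    // object survived at least one extra GC cycle on the finalizer queue.
    ~NativeBuffer() => Dispose();
}
```

The zero-check makes Dispose idempotent, which the finalizer relies on.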
Compare two dumps captured at different times to confirm a leak:
1. Compare dumpheap -stat output between the two dumps -- look for types whose count or total size grew significantly
2. Run gcroot on instances of the growing type to find the retention chain

# Tip: save dumpheap output for comparison
# In dump 1:
> dumpheap -stat > /tmp/heap-before.txt
# In dump 2:
> dumpheap -stat > /tmp/heap-after.txt
# Compare externally:
# diff /tmp/heap-before.txt /tmp/heap-after.txt
Use the diagnostic tools in a structured investigation workflow:
1. dotnet-counters (triage)
├── CPU high? → dotnet-trace --profile cpu-sampling
│ → Convert to flame graph (Speedscope)
│ → Identify hot methods
├── Memory growing? → dotnet-dump collect
│ → dumpheap -stat (find large/numerous types)
│ → gcroot (find retention chains)
│ → Fix retention + verify with second dump
├── GC pressure? → dotnet-trace --profile gc-collect
│ → Identify allocation hot paths
│ → Apply zero-alloc patterns [skill:dotnet-performance-patterns]
└── Thread starvation? → dotnet-dump analyze
→ threads (list all managed threads)
→ clrstack (check for blocking calls)
After profiling identifies the bottleneck, use [skill:dotnet-benchmarkdotnet] to create targeted benchmarks that quantify the improvement from fixes.
- Reading raw .nettrace event logs is impractical. Use dotnet-trace convert --format Speedscope and open in https://www.speedscope.app/ for visual analysis.
- Use dumpheap -min 85000 to find LOH objects -- objects >= 85,000 bytes go to the Large Object Heap, which is only collected in Gen 2 GC. Large LOH counts indicate potential fragmentation.