From dotnet-skills
Designing log pipelines. Aggregation, structured queries, sampling, PII scrubbing, correlation.
npx claudepluginhub wshaddix/dotnet-skillsThis skill uses the workspace's default tool permissions.
Searches, retrieves, and installs Agent Skills from prompts.chat registry using MCP tools like search_skills and get_skill. Activates for finding skills, browsing catalogs, or extending Claude.
Searches prompts.chat for AI prompt templates by keyword or category, retrieves by ID with variable handling, and improves prompts via AI. Use for discovering or enhancing prompts.
Compares coding agents like Claude Code and Aider on custom YAML-defined codebase tasks using git worktrees, measuring pass rate, cost, time, and consistency.
Log pipeline design and operations for .NET distributed systems. Covers log aggregation architecture (ELK, Seq, Grafana Loki), structured query patterns for each platform, log sampling and volume management strategies, PII scrubbing and destructuring policies, and cross-service correlation beyond single-service log scopes. This skill addresses what happens after log emission -- the pipeline, query, and operations layer.
Out of scope: Log emission mechanics (Serilog/NLog/MEL configuration, source-generated LoggerMessage, enrichers, single-service log scopes, sink registration, OTel logging export) -- see [skill:dotnet-observability]. Application configuration and options pattern -- see [skill:dotnet-csharp-configuration]. Distributed tracing setup and trace context propagation -- see [skill:dotnet-observability].
Cross-references: [skill:dotnet-observability] for log emission, Serilog/MEL configuration, and OpenTelemetry logging export, [skill:dotnet-csharp-configuration] for appsettings.json configuration patterns used in log pipeline setup.
| Platform | Ingest | Storage | Query | Best for |
|---|---|---|---|---|
| ELK (Elasticsearch, Logstash, Kibana) | Logstash / Filebeat | Elasticsearch | KQL in Kibana | Large-scale, flexible schema, full-text search |
| Seq | HTTP API / Serilog sink | Built-in | Seq signal expressions | .NET-native, developer-friendly, structured queries |
| Grafana Loki | Promtail / OTel Collector | Loki (label-indexed) | LogQL | Cost-effective, Grafana ecosystem, label-based queries |
| Azure Monitor | OTel Collector / Application Insights SDK | Log Analytics workspace | KQL (Kusto) | Azure-native, integrated alerting, cost management |
Pattern 1: OTel Collector as central router
App (OTLP) --> OTel Collector --> Elasticsearch / Loki / Azure Monitor
|
+--> Sampling / filtering / PII scrub
The OpenTelemetry Collector acts as a vendor-neutral log router. Applications emit logs via OTLP; the collector handles filtering, sampling, enrichment, and routing to one or more backends. This decouples applications from backend choice.
# otel-collector-config.yaml
receivers:
otlp:
protocols:
grpc:
endpoint: "0.0.0.0:4317"
http:
endpoint: "0.0.0.0:4318"
processors:
batch:
timeout: 5s
send_batch_size: 1024
filter:
logs:
exclude:
match_type: strict
bodies:
- "Health check endpoint hit"
exporters:
elasticsearch:
endpoints: ["https://es-cluster:9200"]
logs_index: "app-logs"
loki:
endpoint: "http://loki:3100/loki/api/v1/push"
service:
pipelines:
logs:
receivers: [otlp]
processors: [batch, filter]
exporters: [elasticsearch, loki]
Pattern 2: Direct sink (smaller deployments)
App (Serilog) --> Seq / Elasticsearch sink
For smaller systems or development environments, Serilog sinks write directly to the aggregation platform. This avoids the OTel Collector but couples the application to the backend.
For .NET application-side OTLP log export configuration (builder.Logging.AddOpenTelemetry()), see [skill:dotnet-observability]. The OTLP endpoint is configured via environment variables (OTEL_EXPORTER_OTLP_ENDPOINT), keeping application code backend-agnostic.
Structured logs store each property as a queryable field. The query syntax differs by platform but the concepts are consistent: filter by property name, value, severity, and time range.
# Find errors for a specific order
level: "Error" AND OrderId: "abc-123"
# Find slow operations (custom Duration property)
Duration > 5000 AND ServiceName: "order-api"
# Wildcard on message template
message: "Failed to process*" AND NOT level: "Debug"
# Time-scoped with correlation
TraceId: "0af7651916cd43dd8448eb211c80319c" AND @timestamp >= "2025-01-15T10:00:00"
# Find errors for a specific order
@Level = 'Error' and OrderId = 'abc-123'
# Find slow operations
Duration > 5000 and Application = 'order-api'
# Free-text search combined with structured filter
@Message like '%timeout%' and @Level in ['Warning', 'Error']
# Correlation across services
TraceId = '0af7651916cd43dd8448eb211c80319c'
Seq signals are saved queries that trigger alerts. Define signals for recurring patterns (e.g., "Payment failures > 10/min") and attach notification channels.
# Filter by labels then regex on log line
{service_name="order-api"} |= "Error" | json | OrderId="abc-123"
# Structured field extraction and filtering
{service_name="order-api"} | json | Duration > 5000
# Count errors per service over time (for dashboards)
sum(rate({service_name=~".+"} |= "Error" [5m])) by (service_name)
// Find errors for a specific order
traces
| where severityLevel >= 3
| where customDimensions.OrderId == "abc-123"
| order by timestamp desc
// Slow operations
traces
| where toint(customDimensions.Duration) > 5000
| where cloud_RoleName == "order-api"
// Cross-service correlation
union traces, exceptions
| where operation_Id == "0af7651916cd43dd8448eb211c80319c"
| order by timestamp asc
High-throughput systems can generate millions of log events per minute. Without sampling, storage costs and query performance degrade rapidly.
| Strategy | How it works | Use when |
|---|---|---|
| Head-based | Decide to sample before processing | Consistent per-request; simple to implement |
| Tail-based | Decide to sample after processing | Keep all errors/slow requests, drop routine logs |
| Level-based | Sample by severity | Always keep Warning+, sample Debug/Info |
| Dynamic | Adjust rate based on volume | Handle traffic spikes without config changes |
The filter processor in the OTel Collector drops log records at the pipeline level before they reach exporters. Use it to exclude noisy low-severity logs and reduce storage volume.
Note: The tail_sampling processor operates on traces (spans), not logs. For log volume management, use the filter and transform processors instead.
processors:
filter:
logs:
exclude:
match_type: regexp
# Drop Debug and Trace logs at the collector level
severity_texts: ["DEBUG", "TRACE"]
exclude:
match_type: strict
# Exclude health check noise
bodies:
- "Health check endpoint hit"
transform:
log_statements:
- context: log
conditions:
# Keep all Warning+ logs unconditionally
- severity_number >= SEVERITY_NUMBER_WARN
statements: []
// Serilog.Expressions package for conditional log filtering
builder.Host.UseSerilog((context, loggerConfiguration) =>
{
loggerConfiguration
.ReadFrom.Configuration(context.Configuration)
// Drop health check logs entirely
.Filter.ByExcluding("RequestPath = '/health/ready'")
// Sample Debug logs at 10%
.Filter.ByExcluding(
"@Level = 'Debug' and Hash(@i) % 10 != 0");
});
Key packages:
<PackageReference Include="Serilog.Expressions" Version="5.*" />
Logs must not contain personally identifiable information (PII) in production. GDPR, HIPAA, and SOC 2 require that sensitive data is masked or excluded from log storage.
// Enricher that masks known-sensitive properties on every log event
public sealed class PiiMaskingEnricher : ILogEventEnricher
{
private static readonly HashSet<string> s_sensitiveKeys = new(
StringComparer.OrdinalIgnoreCase)
{
"Email", "PhoneNumber", "IpAddress",
"CreditCard", "SSN", "Password"
};
public void Enrich(LogEvent logEvent, ILogEventPropertyFactory factory)
{
var propertiesToMask = logEvent.Properties
.Where(p => s_sensitiveKeys.Contains(p.Key))
.Select(p => p.Key)
.ToList();
foreach (var key in propertiesToMask)
{
logEvent.AddOrUpdateProperty(
factory.CreateProperty(key, "***REDACTED***"));
}
}
}
// Registration
loggerConfiguration.Enrich.With<PiiMaskingEnricher>();
processors:
attributes:
actions:
# Mask email addresses using regex
- key: user.email
action: update
value: "***@redacted.com"
# Remove sensitive attributes entirely
- key: http.request.header.authorization
action: delete
- key: user.password
action: delete
In distributed systems, a single user request may traverse multiple services. Correlation enables tracing a request across all services and reconstructing the full event timeline.
The primary correlation mechanism is the W3C traceparent header, which propagates automatically through HttpClient when OpenTelemetry instrumentation is configured (see [skill:dotnet-observability]). All log events emitted within a traced request include TraceId and SpanId properties.
// Query all logs for a distributed operation across services
// In Seq:
TraceId = '0af7651916cd43dd8448eb211c80319c'
// In Kibana:
TraceId: "0af7651916cd43dd8448eb211c80319c"
// In Azure Monitor:
traces | where operation_Id == "0af7651916cd43dd8448eb211c80319c"
When trace context is insufficient (e.g., async workflows spanning message queues, batch jobs, or external system callbacks), add custom correlation IDs:
// Propagate a business correlation ID through Serilog LogContext
public sealed class CorrelationIdMiddleware(RequestDelegate next)
{
private const string CorrelationHeader = "X-Correlation-Id";
public async Task InvokeAsync(HttpContext context)
{
var correlationId = context.Request.Headers[CorrelationHeader]
.FirstOrDefault() ?? Guid.NewGuid().ToString("N");
context.Response.Headers[CorrelationHeader] = correlationId;
using (LogContext.PushProperty("CorrelationId", correlationId))
{
await next(context);
}
}
}
// Registration
app.UseMiddleware<CorrelationIdMiddleware>();
For asynchronous messaging (Azure Service Bus, RabbitMQ), propagate correlation through message properties:
// Producer -- attach correlation to message
var message = new ServiceBusMessage(payload)
{
CorrelationId = Activity.Current?.TraceId.ToString()
?? Guid.NewGuid().ToString("N"),
ApplicationProperties =
{
["BusinessCorrelationId"] = orderId.ToString()
}
};
// Consumer -- restore correlation in log scope
processor.ProcessMessageAsync += async args =>
{
using var scope = logger.BeginScope(new Dictionary<string, object>
{
["CorrelationId"] = args.Message.CorrelationId,
["BusinessCorrelationId"] =
args.Message.ApplicationProperties["BusinessCorrelationId"]
});
logger.LogInformation("Processing message {MessageId}", args.Message.MessageId);
await ProcessAsync(args.Message, args.CancellationToken);
};
| Practice | Rationale |
|---|---|
Always include TraceId in log output | Enables log-to-trace joins in observability platforms |
Use CorrelationId for business flows | Survives async gaps where trace context resets |
| Store correlation IDs in message headers | Enables end-to-end tracing through queues |
| Include correlation in error responses | Enables support teams to look up the full trace |
Use Serilog LogContext.PushProperty or MEL BeginScope | Automatically attaches to all log events in scope |
OTEL_EXPORTER_OTLP_ENDPOINT) or configuration so the same image works across environments.