Monitoring and observability patterns for Prometheus metrics, Grafana dashboards, Langfuse LLM tracing, and drift detection. Use when adding logging, metrics, distributed tracing, LLM cost tracking, or quality drift monitoring.
Install with `/plugin marketplace add yonatangross/orchestkit`, then `/plugin install orkl@orchestkit`.

This skill inherits all available tools. When active, it can use any tool Claude has access to.
- checklists/langfuse-setup-checklist.md
- checklists/monitoring-implementation-checklist.md
- examples/orchestkit-langfuse-traces.md
- examples/orchestkit-monitoring-dashboard.md
- metadata.json
- references/agent-observability.md
- references/alerting-dashboards.md
- references/alerting-strategies.md
- references/annotation-queues.md
- references/cost-tracking.md
- references/dashboards.md
- references/distributed-tracing.md
- references/embedding-drift.md
- references/evaluation-scores.md
- references/ewma-baselines.md
- references/experiments-api.md
- references/framework-integrations.md
- references/langfuse-evidently-integration.md
- references/logging-patterns.md
- references/metrics-collection.md

Comprehensive patterns for infrastructure monitoring, LLM observability, and quality drift detection. Each category has individual rule files in rules/ loaded on-demand.

| Category | Rules | Impact | When to Use |
|---|---|---|---|
| Infrastructure Monitoring | 3 | CRITICAL | Prometheus metrics, Grafana dashboards, alerting rules |
| LLM Observability | 3 | HIGH | Langfuse tracing, cost tracking, evaluation scoring |
| Drift Detection | 3 | HIGH | Statistical drift, quality regression, drift alerting |
| Silent Failures | 3 | HIGH | Tool skipping, quality degradation, loop/token spike alerting |
Total: 12 rules across 4 categories

```python
# Prometheus metrics with RED method
from prometheus_client import Counter, Histogram

http_requests = Counter('http_requests_total', 'Total requests', ['method', 'endpoint', 'status'])
http_duration = Histogram('http_request_duration_seconds', 'Request latency',
                          buckets=[0.01, 0.05, 0.1, 0.5, 1, 2, 5])
```

```python
# Langfuse LLM tracing
from langfuse import observe, get_client

@observe()
async def analyze_content(content: str):
    get_client().update_current_trace(
        user_id="user_123", session_id="session_abc",
        tags=["production", "orchestkit"],
    )
    # llm is the application's own LLM client; it is not defined in this snippet
    return await llm.generate(content)
```

```python
# PSI drift detection
import numpy as np

# calculate_psi and alert are application helpers, not library functions
psi_score = calculate_psi(baseline_scores, current_scores)
if psi_score >= 0.25:
    alert("Significant quality drift detected!")
```
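
The drift snippet above calls `calculate_psi` without defining it. A minimal sketch of one way to compute the Population Stability Index follows, assuming equal-width bins derived from the baseline range and a small `eps` floor for empty bins. The `bins` and `eps` parameters are illustrative choices, not something this skill prescribes, and the version here uses only the standard library rather than NumPy.

```python
import math
from typing import Sequence


def calculate_psi(baseline: Sequence[float], current: Sequence[float],
                  bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a baseline and a current sample."""
    lo, hi = min(baseline), max(baseline)
    # Fall back to a width of 1.0 when every baseline value is identical.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # The eps floor keeps empty bins from producing log(0) or a zero divisor.
        return [max(c / len(values), eps) for c in counts]

    expected = proportions(baseline)
    actual = proportions(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))
```

With this sketch, comparing a sample against itself yields a PSI of zero, and a sample concentrated in the upper half of the baseline range lands well above the 0.25 threshold used above.
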
Prometheus metrics, Grafana dashboards, and alerting for application health.
| Rule | File | Key Pattern |
|---|---|---|
| Prometheus Metrics | rules/monitoring-prometheus.md | RED method, counters, histograms, cardinality |
| Grafana Dashboards | rules/monitoring-grafana.md | Golden Signals, SLO/SLI, health checks |
| Alerting Rules | rules/monitoring-alerting.md | Severity levels, grouping, escalation, fatigue prevention |
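
The Prometheus rule lists cardinality as a key pattern. One common source of unbounded cardinality is using raw URL paths as the `endpoint` label value. The sketch below shows one possible way to collapse identifiers before labelling; `normalize_endpoint` and the `{id}` placeholder are hypothetical names chosen for illustration, and the rule file may recommend a different approach.

```python
import re

# Path segments treated as per-resource identifiers (an assumption for this sketch).
_NUMERIC = re.compile(r"^\d+$")
_UUID = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def normalize_endpoint(path: str) -> str:
    """Replace numeric and UUID path segments so the label set stays bounded."""
    path = path.split("?", 1)[0]  # the query string never belongs in a label
    segments = []
    for segment in path.split("/"):
        if _NUMERIC.match(segment) or _UUID.match(segment):
            segments.append("{id}")
        else:
            segments.append(segment)
    return "/".join(segments)
```

The normalized value can then be passed as the `endpoint` label, so `/users/123/orders/456` and `/users/9/orders/2` share one time series instead of creating two.
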
Langfuse-based tracing, cost tracking, and evaluation for LLM applications.
| Rule | File | Key Pattern |
|---|---|---|
| Langfuse Traces | rules/llm-langfuse-traces.md | @observe decorator, OTEL spans, agent graphs |
| Cost Tracking | rules/llm-cost-tracking.md | Token usage, spend alerts, Metrics API |
| Eval Scoring | rules/llm-eval-scoring.md | Custom scores, evaluator tracing, quality monitoring |
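
To make the cost tracking row concrete, here is a standalone sketch of turning token usage into spend and checking it against a budget. It does not call Langfuse. The model names and per-token prices are hypothetical placeholders, and `Usage`, `call_cost`, and `over_budget` are illustrative names rather than part of any SDK.

```python
from dataclasses import dataclass

# HYPOTHETICAL prices in USD per 1M tokens; substitute your provider's real rates.
PRICES_PER_MILLION = {
    "example-small": {"input": 0.50, "output": 1.50},
    "example-large": {"input": 5.00, "output": 15.00},
}


@dataclass
class Usage:
    model: str
    input_tokens: int
    output_tokens: int


def call_cost(usage: Usage) -> float:
    """Cost in USD of a single LLM call."""
    price = PRICES_PER_MILLION[usage.model]
    return (usage.input_tokens * price["input"]
            + usage.output_tokens * price["output"]) / 1_000_000


def over_budget(usages: list[Usage], daily_budget_usd: float) -> bool:
    """True when the summed cost of the calls exceeds the budget."""
    return sum(call_cost(u) for u in usages) > daily_budget_usd
```

In practice the token counts would come from the traced generations, and the budget check would feed whatever spend alert the deployment uses.
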
Statistical and quality drift detection for production LLM systems.
| Rule | File | Key Pattern |
|---|---|---|
| Statistical Drift | rules/drift-statistical.md | PSI, KS test, KL divergence, EWMA |
| Quality Drift | rules/drift-quality.md | Score regression, baseline comparison, canary prompts |
| Drift Alerting | rules/drift-alerting.md | Dynamic thresholds, correlation, anti-patterns |
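
The statistical drift rule mentions EWMA. As a rough illustration, the sketch below keeps an exponentially weighted mean and variance over a stream of quality scores and reports how far each new score sits from the running baseline. The class name `EwmaBaseline` and the default `alpha` are assumptions made for this example.

```python
from typing import Optional


class EwmaBaseline:
    """Exponentially weighted moving mean and variance of a score stream."""

    def __init__(self, alpha: float = 0.2):
        self.alpha = alpha
        self.mean: Optional[float] = None
        self.var = 0.0

    def update(self, value: float) -> float:
        """Fold in a new value and return its z-score against the prior baseline."""
        if self.mean is None:
            self.mean = value  # the first observation seeds the baseline
            return 0.0
        deviation = value - self.mean
        std = self.var ** 0.5
        z = deviation / std if std > 0 else 0.0
        # Incremental EWMA recurrences for the mean and the variance.
        self.mean += self.alpha * deviation
        self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return z
```

A larger `alpha` makes the baseline adapt faster but also makes it absorb a genuine regression sooner, so the value is a trade-off to tune per metric.
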
Detection and alerting for silent failures in LLM agents.
| Rule | File | Key Pattern |
|---|---|---|
| Tool Skipping | rules/silent-tool-skipping.md | Expected vs actual tool calls, Langfuse traces |
| Quality Degradation | rules/silent-degraded-quality.md | Heuristics + LLM-as-judge, z-score baselines |
| Silent Alerting | rules/silent-alerting.md | Loop detection, token spikes, escalation workflow |
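
Two of the patterns above, expected versus actual tool calls and loop detection, can be sketched with plain data structures. The trace shape used here (a list of dicts with `tool` and `args` keys) is a hypothetical simplification, not the Langfuse trace schema, and `missing_tools` and `looping_calls` are illustrative names.

```python
from collections import Counter


def missing_tools(expected: set[str], calls: list[dict]) -> set[str]:
    """Tools the agent was expected to call but never did."""
    called = {call["tool"] for call in calls}
    return expected - called


def looping_calls(calls: list[dict], max_repeats: int = 3) -> list[str]:
    """Tools invoked with identical arguments more than max_repeats times."""
    counts = Counter(
        (call["tool"], repr(sorted(call.get("args", {}).items())))
        for call in calls
    )
    return sorted({tool for (tool, _), n in counts.items() if n > max_repeats})
```

A non-empty result from either function is a candidate signal for the escalation workflow; the `max_repeats` default is arbitrary and would need tuning per agent.
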
| Decision | Recommendation | Rationale |
|---|---|---|
| Metric methodology | RED method (Rate, Errors, Duration) | Industry standard, covers essential service health |
| Log format | Structured JSON | Machine-parseable, supports log aggregation |
| Tracing | OpenTelemetry | Vendor-neutral, auto-instrumentation, broad ecosystem |
| LLM observability | Langfuse (not LangSmith) | Open-source, self-hosted, built-in prompt management |
| LLM tracing API | @observe + get_client() | OTEL-native, automatic span creation |
| Drift method | PSI for production, KS for small samples | PSI is stable on large datasets; the KS test is more sensitive on small samples |
| Threshold strategy | Dynamic (95th percentile) over static | Reduces alert fatigue, context-aware |
| Alert severity | 4 levels (Critical, High, Medium, Low) | Clear escalation paths, appropriate response times |
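
The dynamic threshold recommendation can be illustrated with the standard library. This sketch derives the alert threshold from the 95th percentile of recent observations (for example, tokens per request) instead of a fixed number. `dynamic_threshold` and `should_alert` are hypothetical helper names, and the history needs at least two data points.

```python
import statistics


def dynamic_threshold(history: list[float], percentile: int = 95) -> float:
    """Threshold at the given percentile of recent observations."""
    # quantiles(n=100) returns 99 cut points; index p - 1 is the p-th percentile.
    cut_points = statistics.quantiles(history, n=100, method="inclusive")
    return cut_points[percentile - 1]


def should_alert(value: float, history: list[float], percentile: int = 95) -> bool:
    """True when value exceeds the percentile threshold of the history."""
    return value > dynamic_threshold(history, percentile)
```

Because the threshold moves with the observed distribution, a service whose normal load grows over time does not start alerting on routine traffic, which is the alert-fatigue benefit the table refers to.
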
| Resource | Description |
|---|---|
| references/ | Logging, metrics, tracing, Langfuse, drift analysis guides |
| checklists/ | Implementation checklists for monitoring and Langfuse setup |
| examples/ | Real-world monitoring dashboard and trace examples |
| scripts/ | Templates: Prometheus, OpenTelemetry, health checks, Langfuse |
- defense-in-depth - Layer 8 observability as part of security architecture
- devops-deployment - Observability integration with CI/CD and Kubernetes
- resilience-patterns - Monitoring circuit breakers and failure scenarios
- llm-evaluation - Evaluation patterns that integrate with Langfuse scoring
- caching - Caching strategies that reduce costs tracked by Langfuse