Help us improve
Share bugs, ideas, or general feedback.
From launchdarkly
Recommends metrics for LaunchDarkly experiments, guarded rollouts, and release policies. Surfaces existing auto-attach configuration and inventories available metrics to guide selection.
npx claudepluginhub launchdarkly/ai-toolingHow this skill is triggered — by the user, by Claude, or both
Slash command
/launchdarkly:launchdarkly-metric-chooseThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
You're using a skill that helps users select the right metrics before setting up an experiment, guarded rollout, or release policy. Your job is to understand the feature context, surface what will auto-attach from existing project policies, inventory what's available and healthy, and produce a clear typed recommendation.
Creates a LaunchDarkly metric for experiments or rollouts. Instruments events (SDK setup, .env) first, then creates and verifies the metric.
Monitors PostHog A/B experiments for validity threats (SRM, contamination, exposure stalls, flag mutations) and lifecycle drift (zombie experiments, decided-yet-running, stale flag variants).
Guides selection, migration, and multi-platform coordination for experimentation tools (Statsig, PostHog, GrowthBook, Optimizely, Amplitude, Eppo, Kameleoon). Triggers on platform evaluation, switching, consolidation, or migration cost discussions.
Share bugs, ideas, or general feedback.
You're using a skill that helps users select the right metrics before setting up an experiment, guarded rollout, or release policy. Your job is to understand the feature context, surface what will auto-attach from existing project policies, inventory what's available and healthy, and produce a clear typed recommendation.
This skill is advisory. It does not create metrics, attach them to experiments, or configure rollouts. For those tasks, see the related skills at the end of this document.
This skill requires the remotely hosted LaunchDarkly MCP server to be configured in your environment.
Required MCP tools:
list-metrics — inventory available metrics with their types and event keyslist-metric-events — check which event keys have recent activityOptional MCP tools (enhance workflow):
list-release-policies — fetch project-level policies that configure which metrics auto-attach to guarded rollouts. Use this for the guarded rollout and release policy paths.Ask two questions upfront:
What is this for?
What is the change?
For experiments — skip this step. There is no pre-existing configuration to surface.
For guarded rollouts and release policy work, call list-release-policies first:
list-release-policies(projectKey)
Surface the results before making any recommendations:
Your project has 2 release policies:
Policy: "Production guardrails" (applies to: environment=production)
Auto-attaches to guarded rollouts:
✓ api-error-rate (count, LowerThanBaseline)
✓ p95-latency (value, LowerThanBaseline)
✓ [Metric group] Core Platform Health (3 metrics)
Policy: "Default" (applies to: all environments)
No metrics configured.
This tells the user what's already covered before they choose anything additional. For a guarded rollout, these metrics will appear automatically — the recommendation is about what to add on top, not rebuild from scratch.
If no policies exist or none have metrics configured, note that all metrics must be selected manually.
Call list-metrics to see all metrics in the project, then cross-reference with list-metric-events.
Organize into two groups:
| Group | Criteria | Note |
|---|---|---|
| Healthy | Event key appears in list-metric-events | Safe to recommend |
| At-risk | Event key absent from list-metric-events | Warn: may not produce data |
Show this inventory before recommending — it may reveal that a metric the user has in mind has no events flowing.
The reasoning differs meaningfully by context.
Start with the hypothesis, not the metric list.
Ask the user to complete this sentence before looking at available metrics:
"If this change succeeds, [metric] will [increase / decrease]."
The primary metric must directly measure that hypothesis — not a proxy, not a correlation. If the user can't complete the sentence, help them get there first.
Propose one primary metric. It must:
HigherThanBaseline or LowerThanBaseline)Propose typed secondary metrics. Suggest at least one of each type that applies:
| Type | Purpose | Example |
|---|---|---|
| Guardrail | Did the change break anything? | Error rate, crash rate, latency p95 |
| Counter-metric | Did A improve at the cost of B? | If primary is conversion, add support tickets or session length |
| Supporting signal | Does correlated behavior confirm the hypothesis? | If primary is signup, add onboarding step 2 completion |
One of each type is usually the right amount. More secondary metrics add noise and interpretation burden.
Guarded rollouts are safety mechanisms, not experiments. Each metric you add is a potential automatic rollback trigger — if it regresses beyond its threshold before the rollout completes, LaunchDarkly can stop and revert the release.
Start from what auto-attaches. After surfacing the release policy results in Step 2, ask: "Are the auto-attached metrics enough, or do you want to add more for this specific rollout?"
When recommending additional metrics:
Suggested starting point for any guarded rollout (if not already covered by a policy):
Release policies apply to every rollout in the project that matches their conditions. This is the highest bar.
Start from the current state. After surfacing existing policies in Step 2, ask: "Which policy are you editing, or do you want to create a new one? What environments or flag conditions will it apply to?"
When recommending metrics for a policy:
environment=production only applies to production rollouts. Help the user think through whether they want the same metrics in staging (where baselines may differ) or a separate policy.Typical strong policy candidates: error rate, a core conversion or engagement metric, latency.
Output a clear, named list. Be explicit about what each metric is for and what's already covered:
Recommended metrics for: new checkout flow guarded rollout (environment: production)
AUTO-ATTACHED (from "Production guardrails" policy):
✓ api-error-rate (count, LowerThanBaseline)
✓ p95-latency (value, LowerThanBaseline)
ADDITIONAL — recommended for this rollout:
✓ checkout-conversion (occurrence, HigherThanBaseline)
→ Confirms the rollout isn't degrading the core conversion the feature targets
⚠ page-load-time — no recent events. Instrument the event before including it,
or remove it from the list to avoid a false rollback trigger.
Then close with next steps:
device context but the experiment randomizes on user, the event won't be attributed correctly. Confirm that the context kind in track() calls matches the experiment's randomization unit.launchdarkly-metric-create — create a metric that doesn't exist yetlaunchdarkly-metric-instrument — add a track() call so events start flowing