By posthog
Query PostHog analytics and dashboards, manage feature flags and A/B experiments, audit SDK integrations and data warehouse health, instrument apps with SDKs across frameworks, analyze LLM traces, costs, evaluations, and traffic patterns, diagnose issues, and capture Claude Code sessions for LLM analytics—all directly from your IDE.
npx claudepluginhub anthropics/claude-plugins-official --plugin posthog
Manually send a Claude Code session log to PostHog LLM Analytics
Set up PostHog LLM Analytics to capture Claude Code sessions
Check if Claude Code sessions are being sent to PostHog LLM Analytics
Analyze session replay patterns across experiment variants to understand user behavior differences. Use when the user wants to see how users interact with different experiment variants, identify usability issues, compare behavior patterns between control and test groups, or get qualitative insights to complement quantitative experiment results.
Audit PostHog experiments and feature flags for configuration issues, staleness, and best-practice violations. Read when the user asks to audit, health-check, or review experiments or feature flags, check flag hygiene, or verify experiment setup.
Audit the health of a PostHog project's data warehouse — find every broken or degraded pipeline item across sources, sync schemas, materialized views, batch exports, and transformations. Use when the user asks "what's broken in my warehouse?", "give me a health check", "audit my data pipeline", "why are some dashboards stale?", or wants a one-shot triage summary before deciding where to spend time. Produces a prioritized report of issues grouped by severity and type, with recommended next steps.
Identify and clean up stale feature flags in a PostHog project. Use when the user wants to find unused, fully rolled out, or abandoned feature flags, review them for safety, and then disable or delete them. Covers staleness detection, dependency checking, and safe removal workflows.
Configures the analytics side of a PostHog experiment — exposure criteria (default `$feature_flag_called` vs custom exposure events), primary and secondary metrics, the supported metric types (count, sum, ratio with `math` and `math_property`, retention with `retention_window_start` and `start_handling`), multivariate user handling ("Exclude" vs "First seen variant"), and how to read results once the experiment is live. Use when the user adds or edits a primary or secondary metric (e.g. "add a secondary metric tracking 'downloaded_file' per user"), sets up a ratio metric (e.g. "revenue from purchase_completed / pageviews"), sets up a retention metric (e.g. "$pageview → uploaded_file, 7-day window"), configures custom exposure (e.g. "only count users who hit /checkout"), changes multivariate handling, or asks "who is in the analysis?", "how do I measure impact?", "is this winning?", "what's the confidence level?", or "should I ship?".
Configures the rollout shape of a PostHog experiment — the variant split (50/50, 80/20, A/B/C ratios), the overall rollout percentage that gates how many users enter the experiment, and the disambiguation when a percentage like "roll out to 25%" could mean either. Use when the user mentions a rollout percentage, variant split, or traffic distribution; gives a ratio like 60/40, 70/30, or 80/20; asks "who sees the test variant?"; wants to increase, decrease, or change the rollout or split on a draft or running experiment; weighs equal vs uneven splits; or proposes a mid-experiment split change (often an anti-pattern that needs reset or end-and-restart).
Copy a feature flag from one PostHog project to one or more target projects in the same organization. Use when the user wants to duplicate a flag, promote a flag from staging to production, sync flags across projects, or replicate a flag configuration in a different workspace. Covers cohort remapping, scheduled-change handling, encrypted payloads, and the safe defaults (disabled in target, no scheduled changes).
Guides agents through the 3-step experiment creation flow: defining the hypothesis, configuring rollout, and setting up analytics. Delegates rollout decisions to configuring-experiment-rollout and metric setup to configuring-experiment-analytics. TRIGGER when: user asks to create a new experiment or A/B test, OR when you are about to call experiment-create. DO NOT TRIGGER when: user is updating an existing experiment, managing lifecycle, or only browsing experiments.
Debugs why session recordings aren't appearing in the local dev environment. Use when a developer reports that local replay ingestion isn't working, recordings aren't showing up despite /s calls, or the replay pipeline seems broken after hogli start. Covers the full local pipeline: SDK capture, Caddy proxy, capture-replay (Rust), Kafka, ingestion-sessionreplay (Node), recording-api (Node), SeaweedFS, and common failure modes like orphaned processes, stuck phrocs workers, and trigger misconfiguration.
Diagnose why a data warehouse sync is failing and recommend the right recovery action. Use when the user asks "why isn't my Stripe/Postgres/Hubspot sync working?", "this table has been stuck for hours", "the data in the warehouse looks wrong", or wants to troubleshoot a specific source or schema. Covers source-level vs schema-level failures, stuck Running states, credential and schema-drift errors, incremental-field misconfig, CDC prerequisite failures, and the cancel / reload / resync / delete-data recovery actions.
Diagnoses why a session recording is missing or was not captured. Use when a user asks why a session has no replay, why recordings aren't appearing, or wants to troubleshoot session replay capture issues for a specific session ID or across their project. Covers SDK diagnostic signals, project settings, sampling, triggers, ad blockers, and quota/billing scenarios.
Diagnoses the health of a project's PostHog SDK integrations — which SDKs are up to date, which are outdated, and what to do about it. Use when a user asks about PostHog SDK versions, outdated SDKs, upgrade recommendations, "SDK health", "SDK doctor", or when events or features seem off and it might be due to using an old SDK.
Investigates distributed application performance using PostHog APM (OpenTelemetry span) data via MCP. Use when the user asks about service traces, slow HTTP/database spans, error spans, trace IDs, or span attributes — not LLM analytics traces or product logs. Uses posthog:query-apm-spans, posthog:apm-trace-get, posthog:apm-services-list, posthog:apm-attributes-list, and posthog:apm-attribute-values-list.
Guides exploration of $autocapture events captured by posthog-js to understand user interactions, find CSS selectors (especially data-attr attributes), evaluate selector uniqueness, query matching clicks ad-hoc, and create actions. Use when the user asks about autocapture data, wants to find what users are clicking, needs to build actions from click events, asks about elements_chain, wants to build a trend or funnel filtered by clicks or other autocapture interactions, asks which properties autocapture sends, or asks how to filter $autocapture events. Only applies to projects using posthog-js autocapture.
Investigate LLM analytics clusters — understand usage patterns in AI/LLM traffic, compare cluster behavior, compute cost/latency metrics, and drill into individual traces within clusters.
Investigate LLM spend in PostHog — total cost over time, cost by model, provider, user, trace, or custom dimension, token and cache-hit economics, and cost regressions. Use when the user asks "how much are we spending on LLMs?", "which model / user / feature is most expensive?", "why did cost spike?", wants to build a cost dashboard or alert, or pastes a trace URL and asks about its cost.
Investigate LLM analytics evaluations of both types — `hog` (deterministic code-based) and `llm_judge` (LLM-prompt-based). Find existing evaluations, inspect their configuration, run them against specific generations, query individual pass/fail results, and generate AI-powered summaries of patterns across many runs. Use when the user asks to debug why an evaluation is failing, surface common failure modes, compare results across filters, dry-run a Hog evaluator, prototype a new LLM-judge prompt, or manage the evaluation lifecycle (create, update, enable/disable, delete).
ABSOLUTE MUST for debugging and inspecting LLM/AI agent traces using PostHog's MCP tools. Use when the user pastes a trace URL (e.g. /llm-observability/traces/<id>), asks to debug a trace, figure out what went wrong, check if an agent used a tool correctly, verify context/files were surfaced, inspect subagent behavior, investigate LLM decisions, or analyze token usage and costs.
Set up an LLM-judge evaluation that extracts canonical use cases for a PostHog feature at scale and streams the results to a Slack channel as a live feed. Use when someone wants to understand how users are actually using a specific AI/LLM-powered feature in production — what they're investigating, what questions they're trying to answer, and what patterns surface — without manually reading hundreds of traces. Assumes the feature emits `$ai_generation` and `$ai_evaluation` events with `$session_id` linkage to the trigger user's recording (the standard setup following the session-summary linkage PRs).
Resolves a PostHog experiment reference from natural language to a concrete experiment ID by browsing `experiment-list` (not feature-flag tools), with disambiguation when multiple experiments match. Use when the user names or quotes an experiment ("split test demo", "the File engagement boost experiment", "onboarding retention test", "landing page hero experiment", "pricing experiment"), describes it loosely ("the signup experiment", "my pricing test", "the one with the new checkout"), uses a relative reference ("latest", "most recent", "the one I created yesterday"), filters by status (running, draft, stopped, archived), or otherwise refers to an experiment by anything other than its concrete ID.
Finds the most informative session recording linked to an error tracking issue. Use when a user has an error tracking issue ID and wants to watch a replay showing what the user was doing when the error occurred. Ranks linked sessions by recency, activity score, and journey completeness, then summarizes the pre-error context. Replaces blind session picking from potentially hundreds of linked recordings.
Explore PostHog's Inbox — the surface where signal reports appear as actionable issues and trends. Use when the user asks "what's in my inbox?", "what should I look at?", "which reports are actionable?", "what's PostHog flagged recently?", asks about a specific report by ID or title, or wants to see which signal sources are configured. Covers listing, filtering, and drilling into reports, plus pointers to the deeper `signals` skill when raw signals or semantic search are needed.
Add PostHog error tracking to capture and monitor exceptions. Use after implementing features or reviewing PRs to ensure errors are tracked with stack traces and source maps. Also handles initial PostHog SDK setup if not yet installed.
Add PostHog feature flags to gate new functionality. Use after implementing features or reviewing PRs to ensure safe rollouts with feature flag controls. Also handles initial PostHog SDK setup if not yet installed.
Add PostHog SDK integration to your application. Use when setting up PostHog for the first time or reviewing PRs that need PostHog initialization. Covers SDK installation, provider setup, and basic configuration for any framework.
Add PostHog LLM analytics to trace AI model usage. Use after implementing LLM features or reviewing PRs to ensure all generations are captured with token counts, latency, and costs. Also handles initial PostHog SDK setup if not yet installed.
Add PostHog log capture to track application logs. Use after implementing features or reviewing PRs to ensure meaningful log events are captured with structured properties. Also handles initial OTLP exporter setup if not yet configured.
Add PostHog product analytics events to track user behavior. Use after implementing new features or reviewing PRs to ensure meaningful user actions are captured. Also handles initial PostHog SDK setup if not yet installed.
Diagnose why a product metric changed (dropped, spiked, or plateaued) by orchestrating breakdowns, actors, paths, lifecycle, retention, and annotations queries. Use when the user reports an anomaly, asks "why did X change?", or needs root-cause analysis for a trend, funnel, retention, stickiness, or lifecycle metric.
Investigates a session recording by gathering metadata, person profile, same-session events, and linked error tracking issues in one pass. Use when a user provides a recording or session ID and wants to understand what happened — who the user was, what they did, what errors occurred, and whether there are related error tracking issues. Replaces the manual chain of session-recording-get, persons-retrieve, execute-sql, and error-tracking-issues-list.
Guides experiment state transitions: launching, pausing, resuming, ending, shipping variants, archiving, resetting, and duplicating. Covers preconditions, implications for variant assignment and analysis, and the decision framework for when to use each action. TRIGGER when: user asks to launch, pause, resume, end, ship, archive, reset, or duplicate an experiment. DO NOT TRIGGER when: user is creating an experiment (use creating-experiments), configuring rollout (use configuring-experiment-rollout), or setting up metrics (use configuring-experiment-analytics).
Manage PostHog subscriptions — scheduled email, Slack, or webhook deliveries of insight or dashboard snapshots. Use when the user wants to subscribe to an insight or dashboard, check existing subscriptions, change delivery frequency, add or remove recipients, or stop receiving updates.
Required reading before writing any HogQL/SQL or calling execute-sql against PostHog. Use whenever the user wants to search, find, or run complex aggregations over PostHog entities (insights, dashboards, cohorts, feature flags, experiments, surveys, hog flows, data warehouse, persons, etc.) or query analytics data (trends, funnels, retention, lifecycle, paths, stickiness, web analytics, error tracking, logs, sessions, LLM traces). Covers HogQL syntax differences from ClickHouse SQL, system table schemas (system.*), available functions, query examples, and the schema-discovery workflow.
Guide the user through connecting a new data warehouse source — Postgres, MySQL, Stripe, Hubspot, MongoDB, Salesforce, BigQuery, Snowflake, and so on. Use when the user wants to "connect Stripe", "import data from Postgres", "add a new data source", "sync my warehouse tables", or wants to pick sync methods for each table. Walks through source-type discovery, credential validation, table discovery, per-table sync_type selection, and the final create call. Also covers picking a good prefix and what to do right after creation.
How to query the document_embeddings table for raw signal data using HogQL. Use when you need to perform semantic search over signals, fetch every signal that contributed to a specific report, or list signal types. For browsing the curated report layer (the Inbox) — listing reports, filtering by status/source, drilling into a single report by ID — use the `inbox-exploration` skill first; drop into this skill afterwards if the user wants the underlying observations.
Discover and use shared team skills stored in PostHog. Use when the user asks to list, browse, load, or manage "shared skills", "team skills", or references the "skills store" / "skill store".
Use when the user asks about revenue, payments, subscriptions, billing, CRM deals, support tickets, production database tables, or other data that PostHog does not collect natively. Also use when a query fails because a table does not exist or returns no results for expected external data. The data warehouse can import from SaaS tools (Stripe, Hubspot, etc.), production databases (Postgres, MySQL, BigQuery, Snowflake), and other arbitrary data sources. Covers checking existing sources, identifying the right source type, and guiding the setup.
Inspects PostHog Visual Review (VR) runs that gate PR merges with screenshot regression checks. Use when the user mentions "visual review", "VR", "snapshot diff", "screenshot test", "storybook regression", "playwright snapshot", asks why a PR is blocked or what changed visually, wants to triage the VR backlog, decide whether a snapshot diff is real vs flaky, or check whether a story has been changing across runs. Also invoke when a PR has a failing `visual-review` status check, when a PR comment mentions "Visual review", or when the user is on a branch with an open VR run.
Change the sync configuration of an existing data warehouse schema — switch sync_type, pick a different incremental_field, set primary_key_columns, choose cdc_table_mode, or change sync_frequency. Use when the user asks "switch my orders table from full refresh to incremental", "this table is syncing too slowly / too frequently", "I need to pick a different incremental column", "set up CDC for this Postgres table", or when diagnosis of a failing sync pointed to an incremental-field or PK misconfiguration.
Official PostHog plugin for AI clients. Access PostHog products directly from your AI coding tool.
Install the plugin:
claude plugin install posthog
Authenticate via OAuth:
# Just enter Claude Code anywhere
claude
# Then, use the /mcp command within Claude, select plugin:posthog:posthog, and press Enter
/mcp
Then follow the browser prompts to log into PostHog.
(Optional) Send Claude Code sessions to PostHog LLM Analytics.
Add to ~/.claude/settings.json (global) or .claude/settings.local.json (per-project):
{
  "env": {
    "POSTHOG_LLMA_CC_ENABLED": "true",
    "POSTHOG_API_KEY": "phc_...",
    "POSTHOG_HOST": "https://eu.i.posthog.com"
  }
}
Both POSTHOG_LLMA_CC_ENABLED=true and POSTHOG_API_KEY are required. Sessions are sent when Claude Code exits. Set POSTHOG_LLMA_PRIVACY_MODE=true to redact prompt/output content. Add custom properties to all events with POSTHOG_LLMA_CUSTOM_PROPERTIES (JSON string, e.g. '{"ai_product": "my-app"}').
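As a sketch, a fuller settings block that also enables privacy mode and custom properties might look like this (the API key and the "my-app" value are placeholders; only combine the variables you actually need):
{
  "env": {
    "POSTHOG_LLMA_CC_ENABLED": "true",
    "POSTHOG_API_KEY": "phc_...",
    "POSTHOG_HOST": "https://eu.i.posthog.com",
    "POSTHOG_LLMA_PRIVACY_MODE": "true",
    "POSTHOG_LLMA_CUSTOM_PROPERTIES": "{\"ai_product\": \"my-app\"}"
  }
}
Note that POSTHOG_LLMA_CUSTOM_PROPERTIES is a JSON string, so the inner quotes must be escaped inside settings.json.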
Install from the Cursor Marketplace or add manually in Cursor Settings > Plugins.
Add the marketplace:
codex plugin marketplace add PostHog/ai-plugin
Install the plugin from inside Codex:
codex
# Then run /plugins, select PostHog, and install
/plugins
gemini extensions install https://github.com/PostHog/ai-plugin
Clone and install the plugin:
git clone https://github.com/PostHog/ai-plugin
claude --plugin-dir ./ai-plugin
Authenticate via OAuth:
/mcp
Then follow the browser prompts to log into PostHog.
This plugin provides access to 27+ PostHog tools spanning feature flags, error tracking, product analytics, experiments, dashboards and insights, surveys, and event data.
The plugin also ships 30+ task-specific skills that your AI client loads on demand to follow PostHog best practices — covering HogQL query patterns, experiment creation and lifecycle, feature flags, data warehouse setup and troubleshooting, LLM analytics exploration, session replay diagnostics, and SDK instrumentation. Skills activate automatically when their description matches your request (e.g. "create an experiment", "why isn't my Stripe sync working?", "audit my feature flags"), so you generally don't need to invoke them by name.
> What feature flags do I have?
> Create a feature flag called new-onboarding for 50% of users
> Show me errors from the last 24 hours
> Which errors are affecting the most users?
> How many users signed up this week?
> What's the conversion rate for the checkout funnel?
> Show me all my experiments
> What are the results of the checkout-flow experiment?
> Create a new dashboard called Product Metrics
> Add the signup funnel insight to the Growth dashboard
> What are the responses to the NPS survey?
> Create a feedback survey for the checkout page
> What's my most triggered event?
> Show me the top 10 pages by pageviews
For self-hosted PostHog instances, set the POSTHOG_MCP_URL environment variable to point to your instance:
export POSTHOG_MCP_URL="https://mcp.your-posthog-instance.com/mcp"
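If you prefer per-project configuration, the same env block used above should also work, assuming the plugin picks the variable up from the session environment (a sketch, not verified against the plugin's source):
{
  "env": {
    "POSTHOG_MCP_URL": "https://mcp.your-posthog-instance.com/mcp"
  }
}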
MIT