From honeydew-ai
Guides you step-by-step through defining a business metric (aggregation) on a Honeydew entity. Covers SQL expression building and pushes to Honeydew via the MCP tools.
npx claudepluginhub honeydew-ai/honeydew-ai-coding-agents-plugins --plugin honeydew-aiThis skill uses the workspace's default tool permissions.
Before creating metrics, ensure you are on the correct workspace and branch. Use `get_session_workspace_and_branch` to check the current session context. For development work, create a branch with `create_workspace_branch` (the session switches automatically). See the `workspace-branch` skill for the full workspace/branch tool reference.
Defines metrics using a standardized template covering plain English definition, formula, components, segmentation, data sources, thresholds, limitations, and drivers. Ensures clarity in analysis.
Produce a complete metrics definition doc — metric name, formula, data source, segmentation, SQL or event tracking spec, and what good/bad looks like. Given a product area, outputs the full metrics spec. Use when asked to "define KPIs", "metrics framework", "what should we measure", "north star metric", or "instrument this feature".
Provides step-by-step guidance, code, and configurations for metric calculator operations in data analytics, covering SQL queries, data visualization, statistical analysis, and business intelligence. Activates on 'metric calculator' phrases.
Share bugs, ideas, or general feedback.
Before creating metrics, ensure you are on the correct workspace and branch. Use get_session_workspace_and_branch to check the current session context. For development work, create a branch with create_workspace_branch (the session switches automatically). See the workspace-branch skill for the full workspace/branch tool reference.
A Honeydew metric is a named, reusable aggregation anchored to an entity. Unlike a calculated attribute (which is per-row), a metric collapses multiple rows into a single value. Metrics are context-sensitive: they automatically respond to whatever filters and groupings the consuming BI tool or analyst applies.
Your job is to build an aggregation function that users can later group by.
If a user asks for a group by ("sum sales by category"), ignore the group. You're building the aggregation to sum sales by ANYTHING. The user will later use it in a query with their chosen dimensions.
Use a metric when:
Do not use a metric when the value is per-row (use a calculated attribute instead).
entity.count) if available,
or COUNT(entity.key_field) on the primary key. No DISTINCT needed — primary keys are
unique by definition.entity.metric_name) anywhere you'd write a raw aggregation — inside FILTER (WHERE ...), inside
GROUP BY (...) (both fixed GROUP BY (dim) and nested GROUP BY (*, dim)), and as operands
in derived arithmetic. Always check whether an existing metric can be the building block before
reaching for SUM(...), COUNT(...), etc. The named-metric form expresses business intent,
inherits future changes to the base metric, and keeps the model DRY.related_entity.count FILTER (WHERE source_entity.key IS NOT NULL) for a filtered count.
See reference.md "Cross-entity counts" for mechanics.entity.attribute_name) rather than repeating its SQL logic.entity.attribute, not just attributeSee reference.md — particularly the "Composing from Existing Metrics" section — for the canonical patterns: filtered, fixed grouping, nested grouping, cross-entity counts, and derived metrics, each with named-metric and raw-fallback forms shown side by side.
Always use create_object with full YAML to ensure proper datatype and all properties are set.
Call create_object with yaml_text:
type: metric
entity: <entity_name>
name: <snake_case_name>
display_name: <Human Readable Name>
description: |-
<business description>
owner: <owner_email_or_team>
datatype: float|number|string|date|timestamp
sql: |-
<aggregation SQL expression>
Required fields:
type: metricentity — the entity this metric belongs toname — snake_case identifierowner — CRITICAL: always set to current username (from workspace context)datatype — CRITICAL: always set explicitly (default to float for most metrics, number for counts)sql — the aggregation expressionOptional fields:
display_name — human readable namedescription — business contextformat_string — display format (e.g., $#,##0.00)labels — categorization tagsfolder — organizational pathTo modify an existing metric:
get_entity with the entity name to find the metric and its details.search_model (with search_mode: EXACT) to find the metric's object_key.update_object with the full updated YAML (yaml_text) and the object_key.Minimal diff rule: When updating, preserve the existing field order and formatting from the current YAML. Only change the fields you need to modify. Objects are versioned in git, so unnecessary reordering or reformatting creates noisy diffs.
After a successful create_object or update_object call, the response includes a ui_url field. Always display this URL to the user so they can quickly open the object in the Honeydew application.
search_model (with search_mode: EXACT) to find the metric's object_key.delete_object with that object_key.See examples.md for full worked examples covering: basic, derived, filtered, ratio, count, distinct count, fixed grouping, nested aggregation, text summary, update, and delete.
Use these MCP tools to explore existing metrics:
get_entity — Get entity details including all its metrics, attributes, datasets, and relationsget_field — Get detailed info about a specific metric by entity and field namesearch_model — Search for metrics across the model by name (use search_mode: EXACT for known names, OR for broad discovery)list_entities — List entities to identify where to anchor new metricsMany metric requests are ambiguous. ALWAYS clarify before implementing:
| Ambiguous Term | Possible Interpretations | Ask User |
|---|---|---|
| "per day/week/month" | A) Breakdown by period (multiple rows) | "Do you want revenue for each day (fixed grouping) or average daily revenue (single KPI)?" |
| B) Average per period (single value) | ||
| "rate" | A) Ratio (X / Y) | "Is this a ratio (e.g., conversion rate = orders/visits) or velocity (e.g., orders per hour)?" |
| B) Velocity (X per time unit) | ||
| "growth" | A) Absolute difference | "Do you want absolute growth ($100 → $150 = $50) or percentage growth (50%)?" |
| B) Percentage change | ||
| "average" | A) Simple mean | "Simple average or weighted average? If weighted, by what?" |
| B) Weighted mean |
After user clarifies intent, use the correct SQL pattern. Wherever a raw aggregation appears below, prefer the named-metric form when an existing metric covers the base aggregation — see reference.md for the full named-metric ↔ raw fallback table.
| User Choice | SQL Pattern (preferred named-metric form / raw fallback) | Example |
|---|---|---|
| Breakdown by period | entity.metric GROUP BY (time_field) / AGG(field) GROUP BY (time_field) | order_header.total_revenue GROUP BY (order_header.order_date) |
| Average per period | AVG(entity.metric GROUP BY (*, time_field)) / AVG(AGG(field) GROUP BY (*, time_field)) | AVG(order_header.total_revenue GROUP BY (*, order_header.order_date)) |
| Ratio | entity.metric_a / NULLIF(entity.metric_b, 0) | orders.order_count / NULLIF(orders.customer_count, 0) |
| Velocity (per time) | entity.metric / COUNT(DISTINCT time_field) / AGG(field) / COUNT(DISTINCT time_field) | order_header.total_revenue / NULLIF(COUNT(DISTINCT order_header.order_date), 0) |
| Absolute growth | current - previous | Requires time comparison logic |
| Percentage growth | (current - previous) / NULLIF(previous, 0) * 100 | Requires time comparison logic |
CRITICAL: "Breakdown by X" or "for each X" = Fixed GROUP BY
order_header.total_revenue GROUP BY (order_header.order_date)order_header.total_revenue ← requires manual grouping at query timeWhen the user's request contains phrases suggesting a specific granularity, ask before creating:
| Phrase | Likely Intent | Clarifying Question |
|---|---|---|
| "in an order", "per order", "by order" | Fixed grouping by order_id | "Should this always be calculated per order (fixed grouping), or flexible to group by any dimension?" |
| "in a day", "per day", "daily" | Fixed grouping by date | "Should this always be at daily granularity, or flexible?" |
| "per customer", "by customer" | Fixed grouping by customer_id | "Should this always be per customer, or flexible?" |
Rule: If the request mentions "per X" or "in an X", clarify whether they want:
entity.metric GROUP BY (entity.x_id) (named-metric form preferred)entity.metric that users can group by anything laterUse the honeydew-docs MCP tools to search the Honeydew documentation when:
reference.md coversSearch for topics like: "metrics", "aggregation", "derived metrics", "fixed grouping", "time intelligence", "period over period", "YTD", "trailing window".
FILTER (WHERE ...) for filtered aggregations — NOT CASE WHEN.
The FILTER syntax is cleaner, more readable, and the native Honeydew pattern.
orders.amount FILTER (WHERE orders.is_promotional) (named-metric form preferred)SUM(orders.amount) FILTER (WHERE orders.is_promotional) (raw fallback)SUM(CASE WHEN orders.is_promotional THEN orders.amount ELSE 0 END)FILTER (WHERE ...), inside GROUP BY (dim) and GROUP BY (*, dim), and as
operands in derived arithmetic. Whenever an existing metric covers the base aggregation, use the
named-metric form rather than re-deriving from raw aggregation functions. Examples:
orders.revenue FILTER (WHERE orders.region='US') instead of SUM(orders.amount) FILTER (...)orders.revenue GROUP BY (orders.region) instead of SUM(orders.amount) GROUP BY (...)orders.revenue GROUP BY (*, orders.order_date) instead of SUM(...) GROUP BY (*, ...)orders.revenue - orders.cost instead of SUM(price) - SUM(cost)
This keeps definitions DRY and ensures changes propagate automatically.related_entity.count FILTER (WHERE source_entity.entity_key IS NOT NULL). Example: users.count FILTER (WHERE bookings.booking_id IS NOT NULL)related_entity.count directly, or COUNT(DISTINCT source_entity.fk_column) as a raw fallback.
See reference.md "Cross-entity counts" for why the filter matters.SUM(orders.net_price) instead of SUM(orders.price - orders.discount)
when orders.net_price is a calculated attribute that already encodes that subtraction.bookings.count FILTER (WHERE bookings.is_cancelled) instead of
bookings.count FILTER (WHERE bookings.status = 'cancelled') when bookings.is_cancelled is
a boolean attribute. Likewise, orders.revenue FILTER (WHERE orders.is_promotional) reuses the
is_promotional attribute rather than repeating its SQL condition.orders.revenue - orders.cost reuses two existing metrics rather than
re-deriving SUM(price) - SUM(cost) from raw columns.
Referenced objects propagate definition changes automatically and express business concepts
rather than SQL implementation details.entity.count) when available.
Otherwise, use COUNT(entity.key_field) on the primary key. No DISTINCT needed — primary
keys are unique by definition.gross_margin is better than revenue_minus_cogs_divided_by_revenue.orders.amount, not just amount.After creating ANY metric, you MUST invoke the validation skill to test and validate results.
See validation skill for:
get_data_from_fieldsQuick validation:
Call get_data_from_fields with:
metrics: ["<entity>.<metric_name>"]For filtered cross-entity count metrics — run get_data_from_fields with just the new
metric and compare it against related_entity.count standalone. If you built a filtered count
(e.g. users who have at least one booking), the result should be less than or equal to the
related entity's total. If the two values are equal, the join-forcing filter isn't working as
intended — revisit the filter expression.
COUNT(*) — use the entity's built-in count metric if available, or
COUNT(entity.key_field) on the primary key. No DISTINCT needed — primary keys are unique.GROUP BY (dim), a nested GROUP BY (*, dim), or arithmetic, the named-metric
form is preferred over the raw aggregation. The raw form is only correct when no suitable named
metric exists.FILTER (WHERE source_entity.entity_key IS NOT NULL). See reference.md for mechanics.COUNT(DISTINCT entity.fk_column) without checking if a related entity has a count
metric first. Entities auto-generate a count metric. If a relation exists,
prefer the related_entity.count FILTER (...) form. COUNT(DISTINCT) is the fallback
when there's no relation or no count metric.