From honeydew-ai
Guides creation of instructions, skills, knowledge pointers, and memory events for persistent organizational context. Enforces the hard boundary between context (rules/procedures) and the semantic layer (logic/calculations).
How this skill is triggered — by the user, by Claude, or both
Slash command
/honeydew-ai:context-item-creationThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Before creating context items, ensure you are on the correct workspace and branch. Use `get_session_workspace_and_branch` to check the current session context. For development work, create a branch with `create_workspace_branch` (the session switches automatically). See the `workspace-branch` skill for the full workspace/branch tool reference.
Before creating context items, ensure you are on the correct workspace and branch. Use get_session_workspace_and_branch to check the current session context. For development work, create a branch with create_workspace_branch (the session switches automatically). See the workspace-branch skill for the full workspace/branch tool reference.
Context items are the mechanism for giving the AI analyst persistent, organizational knowledge: standing rules it must always follow, on-demand procedures it can retrieve, external references it can look up, and a historical record of notable events.
Context items belong to the context layer — they are consumed by the Honeydew AI analyst at query time to shape its behavior. They are separate from the semantic layer (entities, metrics, attributes, relations, domains), which defines your data model and business logic such as metric calculations.
There are four types:
| Type | Subtype | When retrieved | Use for |
|---|---|---|---|
| Instruction | instruction | Every prompt, automatically | Short standing rules or constraints the model must always respect |
| Skill | skill | On demand, when relevant | Long-form analysis playbooks and procedural guides |
| Knowledge | knowledge | On demand, when relevant | Pointers to external content (Confluence, Notion, etc.) |
| Memory | event | On demand, when relevant | Historical events and decisions that affect data interpretation |
Before creating anything, read the critical boundary section below. Most requests for context items are actually requests to define or document logic — which belongs in the semantic layer, not in context.
This is the most important rule in this skill. Re-read it every time you are asked to create a context item.
Context items are for rules, procedures, and history. They are NOT for logic, calculations, or data definitions.
If the information describes how to compute something, what a value means mathematically, or how data is structured — it belongs in the semantic layer (as a metric, attribute, filter, relationship, or entity definition), not in a context item.
| Request | ❌ Wrong approach | ✅ Correct approach |
|---|---|---|
| "Add a rule that revenue = price × quantity" | Instruction: "revenue is price × quantity" | Calculated attribute or metric with sql: orders.price * orders.quantity |
| "Explain how to calculate LTV" | Instruction with formula | Metric with the LTV SQL expression |
| "Document that active customers are those with an order in 90 days" | Instruction with definition | Calculated attribute is_active with sql: orders.last_order_date >= CURRENT_DATE - 90 |
| "Add a rule that we always filter to the current fiscal year" | Instruction with filter logic | Domain filter or global parameter — unless the preference is about which existing filter to apply, in which case an instruction referencing the filter by name is fine |
| "Record that the orders table has columns order_id, customer_id..." | Knowledge item | Entity and dataset definition via entity-creation skill |
| "Add a note that revenue excludes returns" | Instruction | Metric sql with FILTER (WHERE NOT orders.is_return), plus a clear description on the metric |
orders.net_revenue rather than orders.gross_revenue for revenue questions."The test: Can you express this as a SQL expression or a schema definition? If yes, it belongs in the semantic layer. Context items should reference semantic objects by name, not define or duplicate their logic.
Context items are only one of three places organizational context reaches the AI analyst. Before drafting an item, check whether the fact belongs in one of the other two — both are already part of every relevant session automatically, with no glob configuration.
| Where the context lives | When the AI sees it | Use for |
|---|---|---|
Field / entity description (AI metadata) | Whenever that field or entity is in scope | The meaning of a field, its allowed values, synonyms for an entity — facts that travel with a single semantic object |
Agent description + markdown body | Every session of that agent | What the agent covers, user-perspective phrasing, agent-wide scope or tone |
| Context item | Instructions: every prompt. Skills / knowledge / memory: only when retrieval matches. | Cross-cutting rules, on-demand playbooks, external pointers, historical events |
Routing examples:
orders.status codes mean: 'A' = active, 'L' = legacy (pre-2022 migration, treat as active), 'X' = test record." → field description (non-SQL facts about allowed values). If the rule were "active means status in ('A','L')", that's logic → make it a calculated attribute, not a description.synonyms, not an instruction.**-globbed instruction.orders.net_revenue over orders.gross_revenue." → instruction (cross-cutting, references existing semantic objects).The test: is this fact about one semantic object, one agent, or the organization more broadly? Only the third belongs in a context item. The first two are already loaded into AI context via the semantic-object or agent definition — putting them in a context item duplicates and risks drift.
Default to extending an existing item rather than creating a sibling, whenever the new fact describes the same domain concept as an existing item.
The most important "extend, don't create" target is often not a sibling context item — it's the field/entity description or the agent body (see "where does this context belong?" above). Check those first. The check below applies once you've confirmed the fact is genuinely cross-cutting and belongs in a context item at all.
The check, before drafting anything:
list_context_items and scan items whose folder or name touches the new fact's domain.get_context_item, then update_context_item extending it. Broaden the title and description to reflect the new scope; keep the existing name to avoid breaking agent globs that reference it.Smell test for "this should have been an update": two items in the same folder whose descriptions could be merged into a single coherent paragraph without losing information. Memory events especially are prone to this — a "launch" event and a "sunset" event for the same dataset are bookends of one window, not two independent facts. One memory event with both from_date and to_date is correct; two events split by sub-aspect is the smell.
Why this matters: each on-demand item is an independent retrieval candidate. Splitting one fact into N items means the model has to retrieve all N to reconstruct the concept, and any one being missed produces a partial-truth answer. One coherent item retrieves once and answers fully.
Call create_context_item with the frontmatter fields as context_item parameters and the prose body as markdown_text.
After a successful create_context_item or update_context_item call, the response includes a ui_url field. Always display this URL to the user so they can quickly open the item in the Honeydew application.
To modify an existing context item:
get_context_item with the item's name to read its current content.update_context_item with the item's name and the updated context_item fields and markdown_text.Minimal diff rule: Only change the fields you need to modify. Context items are versioned in git, so unnecessary reformatting creates noisy diffs.
Call delete_context_item with the item's name.
| Task | Tool |
|---|---|
| Create | create_context_item |
| Update | update_context_item |
| Delete | delete_context_item |
| Get one | get_context_item |
| List all | list_context_items |
Names use folder/item-name with lowercase kebab-case and / as a hierarchy separator:
finance/use-net-revenue
customer/churn-analysis-playbook
confluence/data-governance-policy
finance/revenue-redefinition-2024
A flat name (no folder) is fine when the item is genuinely cross-cutting:
root-cause-analysis
gdpr-compliance-rollout
When configuring an AI agent, context items are referenced using glob patterns. Folders let you include a coherent group of items with a single pattern:
finance/* — all finance-specific rules and skillscustomer/* — all customer analysis guidelines*/churn-* — all churn-related items regardless of team foldermenu/** — all items under menu/ including subfolders (e.g. menu/drinks/latte-art)** — all context items across all folders and subfolders (use sparingly; typically for a catch-all agent)Organize by the team or domain that owns the context, not by the item type. Items of different types (instruction + skill + knowledge) for the same domain belong in the same folder:
finance/use-net-revenue ← instruction
finance/variance-analysis ← skill
finance/accounting-policy ← knowledge
finance/revenue-redefinition ← memory
This way, an agent configured with finance/* automatically gets all finance context.
finance/use-net-revenue is good. instructions/use-net-revenue is redundant.exclude-canceled-orders is better than filter-status-not-canceled.revenue-redefinition or gdpr-rollout beats data-change-3.data/launch-may-2023 locks the scope to the launch event; when the dataset later acquires a cutoff date, the narrow name tempts you into a sibling item rather than extending. data/data-window or data/data-coverage stays accurate as the facts grow. Same principle for instructions: bookings/revenue-definition survives a redefinition; an implementation-pinned name like bookings/use-confirmed-revenue-v2 does not.Context items are stored as frontmatter documents: a YAML block between --- delimiters, followed by a markdown prose body. When calling create_context_item, pass the YAML fields as context_item parameters and the prose body as the markdown_text parameter.
The description field is the retrieval signal — the model reads it to decide whether to fetch an on-demand item. It is used for skills, knowledge, and memory events. Do not add description to instructions; their only content is the prose body.
See examples.md for what the full frontmatter documents look like.
Use when: There is a standing preference, constraint, or fact that the model must respect in every conversation. These are injected into every prompt automatically.
Keep instructions concise. They are never filtered out, so verbose instructions bloat every prompt. If the content is more than a short paragraph, turn it into a skill instead.
Parameters:
type: instructionsubtype: instructionname: folder/rule-nametitle: short human labelrelated_objects: (optional) only the fields or entities directly named or constrained by this instruction — not downstream consumers, not lineage, not entities that happen to join to a relevant one; use an entity reference instead of a field list when the instruction applies to most fields of that entitylabels: optional tags for groupingowner: (optional) defaults to the caller; only set if a different team or person should be listedGood instruction topics:
orders.gmv metric")Use when: There is a multi-step analysis procedure the model should follow when asked about a specific topic. Skills are retrieved when the model judges them relevant based on the description.
Write a rich, keyword-dense description — this is what the model reads to decide whether to retrieve the skill. A weak description means the skill will be missed.
Parameters:
type: instructionsubtype: skillname: topic-or-folder/topictitle: short human labeldescription: required — precise, keyword-rich description of when to use this skill and what it covers (this is the retrieval signal)---)labels: optional tags for groupingowner: (optional) defaults to the callerGood skill topics:
Use when: There is an external document (Confluence page, Notion database, runbook, policy) that the model should consult when answering relevant questions. The model fetches the content via an MCP server at retrieval time.
Write a precise description that includes the topics the document covers — this is the retrieval signal.
Parameters:
type: instructionsubtype: knowledgename: source-folder/document-nametitle: short human labeldescription: required — what topics this document covers (used for retrieval)external_source.tool: the MCP server name (e.g. confluence, notion)external_source.resource_id: the URI within that server (e.g. confluence://Page-Title)labels: optional tags for groupingowner: (optional) defaults to the callerUse when: Something happened that affects how data should be interpreted: a metric was redefined, a process changed, a data migration occurred, a business event shifted baseline numbers.
Memory items are retrieved on demand when the model judges them relevant to the current question.
Parameters:
type: memorysubtype: eventname: domain/event-nametitle: short human labeldescription: required — what this event is about (used for retrieval)from_date: when the event occurred (point-in-time) or when it started (if a duration)to_date: (optional) end date for events spanning a period. Use the from_date / to_date bookend pattern for windows — a dataset's data-coverage window, a feature's availability period, an experiment's run dates. Set both bounds on a single item; do not split a window into two "start" and "end" events.related_objects: (optional) only the fields or entities directly changed or affected by this event — not downstream consumers, not lineage, not entities that happen to join to a relevant one; use an entity reference instead of a field list when the event affects most fields of that entitylabels: optional tags for groupingowner: (optional) defaults to the callerGood memory topics:
See examples.md for full worked examples of each type, plus right-vs-wrong comparisons.
list_context_items — see all existing items (avoid duplicates)get_context_item — read a specific item's current content before updatinglist_agents / get_agent — see which agents reference which context globs, to understand the scoping impact of adding a new itemUse the honeydew-docs MCP tools to search the Honeydew documentation when:
Search for topics like: "context items", "instructions", "agent context", "memory events", "skills".
Semantic layer first. If the request is about logic, calculations, definitions, or data structure — build it in the semantic layer. Context items reference semantic objects; they do not replace them.
Instructions must be short. They're injected into every prompt. If it doesn't fit in 2–3 sentences, turn it into a skill.
Skill descriptions are the retrieval key. Write rich, specific, keyword-heavy descriptions. "Guide for root cause analysis of revenue drops, covering cohort isolation, period-over-period comparison, and funnel decomposition" is far better than "revenue analysis".
Folder by domain, not by type. finance/ should contain instructions, skills, knowledge, and events all related to finance. This makes agent glob patterns meaningful.
Check for adjacent items, not just duplicates, before creating. Run list_context_items first. Overlapping or contradictory instructions are worse than none — but adjacency matters as much as overlap. If an existing item describes the same domain concept (the data window, a metric's definition, a policy area), extend it rather than create a sibling. See "Before creating: extend or create?" above.
Use labels for cross-cutting concerns. Labels like pii, finance, or deprecated enable filtering that cuts across folder hierarchy.
Before creating any context item, verify:
description / AI metadata — already loaded into every relevant session, no glob needed.description + markdown body — already loaded into every session of that agent.If you are unsure, ask the user: "Should this live on the field/entity, on the agent, or as a separate context item?"
npx claudepluginhub honeydew-ai/honeydew-ai-coding-agents-plugins --plugin honeydew-aiInterviews users about their datasets and databases to generate reusable data context skills that document schema, entities, metrics, and domain knowledge.
Enriches DataHub metadata: adds/updates descriptions, tags, glossary terms, ownership, deprecation, domains, data products, structured properties, and documents.