From skill-concierge
Use when measuring whether a skill-concierge gate-threshold change helped real skill usage or adoption — "did the new gate values help", "skill usage impact after the change", "assess skill helpfulness", "audit skill usage post-deploy", "which telemetry is valid for skill-usage analysis". Stops the reflex of using the skill-invocation-ledger or treating offer→take as usage; routes to the transcript SKILL-FIRST trail instead.
How this skill is triggered — by the user, by Claude, or both
Slash command
/skill-concierge:skill-usage-auditThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Whether agents use the **right** skill is NOT the enforcer's offer→take. The invocation-ledger
Whether agents use the right skill is NOT the enforcer's offer→take. The invocation-ledger measures gate compliance and is INVALID for usage analysis (operator-flagged). The real signal lives in the transcript store — and most of it is the SKILL-FIRST declaration trail, which no invocation counter records.
Violating the letter here is violating the spirit: "I'll just use the ledger, it has invocations" is the exact failure this skill exists to stop.
| Source | Path | Measures | Use for |
|---|---|---|---|
| invocation-ledger | ~/.claude/skill-telemetry/logs/skill-invocation-ledger.log | gate compliance (offer→take; auto/manual/search) | gate firing only — NOT usage |
| skill-usage-tracker | transcripts → ~/.claude/audits/skill-usage-stats/ | usage frequency (Skill tool + /slash) | how often each skill actually ran |
| SKILL-FIRST trail | assistant text in ~/.claude/projects/**/*.jsonl | agent KNEW + chose a skill (USING/SEARCH/SKIPPING declarations) | the operator's metric |
Inline SKILL-FIRST use (declare USING <skill> → read its SKILL.md → execute) fires no Skill
tool, so the ledger AND the usage-tracker both miss it. The declaration trail is the proxy that
catches it; subagent/Task skill use is missed by all three.
python3 scripts/audit_skill_usage.py --since "<ship/commit time, e.g. 2026-06-29 01:06:35>"
Outputs the scoped post-change counts (Skill-tool, /slash, and the USING/SEARCH/SKIPPING
trail), self/meta sessions flagged, plus a false-SKIPPING rate — per turn, a SKIPPING
declared with NO same-turn search_skills call (the doctrine's hardest rule). A turn carrying the
enforcer's SKILL-CHECK: marker (AUTHORIZED_SKIP_MARKER, injected on the enforcer's two silent
verdict legs — see hooks/scripts/enforcer.py) is a lawful, hook-pre-authorized skip: it is
excluded from the false-skip count and tallied separately as authorized_skip, reported alongside
the false-skip figure so "false-SKIPPING" stays honestly defined. Run --help for flags;
--selftest pins the false-SKIPPING verdict logic, including the authorized-skip case.
ck:journal ≡ journal); raw
set-intersection silently undercounts namespaced-vs-bare._embed/_retrieve/_is_imperative/
_intent_conversational) over a labeled corpus — never reimplement the gate.prompt_intent is built from the same corpus, so _intent_conversational
classifies in-sample (~73% noise-catch vs ~53% held-out). Build a temp collection on a train
split (SKILL_PROMPT_INTENT_COLLECTION=...heldout), evaluate on the test split, delete it.skill-concierge/plans/reports/impact-analysis-260629-1020-*.md.v0.10.0 caveat (multi-vector, ADR-0012). The "cosine anti-correlated with adoption / taken offers score LOWER than dodged" findings above were measured on the SINGLE-vector index. Multi-vector MAX-pool roughly doubled positive↔negative separation, so the cosine↔adoption relationship must be re-measured on post-v0.10.0 traffic before reuse — do not carry the old anti-correlation forward as fact. The methodology (held-out sweeps, absolute coverage, drop self/meta, gate on volume) is unchanged.
| Rationalization | Reality |
|---|---|
| "The ledger records invocations, so it's the source." | Ledger = gate compliance; operator flagged it INVALID for usage. Use transcripts. |
| "offer→take = how often the right skill is used." | Conditional on the gate firing; misses inline SKILL-FIRST use entirely. |
| "Post-change window is enough to judge." | Usually thin + dogfood-contaminated. Gate on organic volume; drop self/meta. |
| "Sweep the corpus through the intent gate as-is." | The intent corpus self-scores in-sample. Held-out split only. |
| "Compare pre vs post conversion." | Denominator shifts (the floor changes who is offered). Use within-population replay / absolute coverage. |
"Offered ck:journal ≠ invoked journal → not used." | Canonicalize names before joining. |
skill-invocation-ledger.log to answer a usage question.prompt_intent collection in-sample.npx claudepluginhub thinhkhuat/skill-concierge --plugin skill-conciergeBuilds a throwaway prototype to answer a design question about UI appearance or state/logic behavior. Guides you through two branches: interactive terminal app for logic validation, or multiple UI variations for visual exploration.