From quoth
Beta-Bernoulli confidence tracking with empirical Bayes cold-start and exponential forgetting. Use when building pattern reliability scores from noisy reward signals, cold-starting new items with partial pooling, or implementing temporal decay in non-stationary bandit systems.
How this skill is triggered — by the user, by Claude, or both
Slash command
/quoth:bayesian-confidenceThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
- You have binary/continuous reward signals for items (patterns, articles, recommendations)
Each item has Beta(α, β) posterior. Start at Beta(1,1) = uniform.
μ = α / (α + β)σ² = αβ / ((α+β)² · (α+β+1))Binary reward r ∈ {0,1}:
α ← α + r
β ← β + (1-r)
Continuous reward r ∈ [0,1]:
α ← α + r
β ← β + (1-r)
New item in cluster C. Estimate cluster-level prior via method of moments:
μ̂_C = mean(r_i for i in C)
σ̂²_C = var(r_i for i in C)
ν = μ̂_C(1-μ̂_C)/σ̂²_C - 1 # effective sample size
α_C = μ̂_C · ν
β_C = (1-μ̂_C) · ν
New item inherits α_new = α_C / prior_strength, β_new = β_C / prior_strength (e.g., prior_strength=5 → weak inheritance).
Decay sufficient statistics (Garivier & Moulines ALT'11):
α_t ← γ · α_{t-1} + r_t
β_t ← γ · β_{t-1} + (1-r_t)
Typical γ = 0.99/day. Equivalent to exponentially-weighted moving average over rewards.
UPDATE patterns SET
alpha = alpha + :r,
beta = beta + (1 - :r),
confidence = (alpha + :r) / (alpha + beta + 1)
WHERE id = :id
function sampleBeta(alpha, beta) {
const g1 = sampleGamma(alpha)
const g2 = sampleGamma(beta)
return g1 / (g1 + g2)
}
function sampleGamma(shape) {
if (shape < 1) return sampleGamma(shape + 1) * Math.pow(Math.random(), 1/shape)
const d = shape - 1/3, c = 1 / Math.sqrt(9*d)
while (true) {
let x, v
do { x = normalRandom(); v = 1 + c*x } while (v <= 0)
v = v*v*v
const u = Math.random()
if (u < 1 - 0.0331 * Math.pow(x, 4)) return d * v
if (Math.log(u) < 0.5*x*x + d*(1 - v + Math.log(v))) return d * v
}
}
npx claudepluginhub montinou/quothCreates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.