From vibe
Integration plan for Native Sparse Attention (NSA) in a long-context pre-training run. Use when you need help planning an NSA integration.
npx claudepluginhub anubhavg-icpl/vibe --plugin vibe
How this skill is triggered (by the user, by Claude, or both):
Slash command
/vibe:nsa-integrator
The summary Claude sees in its skill listing, used to decide when to auto-load this skill:
Given a long-context pre-training run specification (target context, base architecture, training tokens available, GPU topology, deployment target), produce an NSA integration plan.
Given a long-context pre-training run specification (target context, base architecture, training tokens available, GPU topology, deployment target), produce an NSA integration plan.
Produce:
l. Pick 32, 64, or 128. Justify against target context: l = 32 for 16k-32k, l = 64 for 64k-128k, l = 128 for 256k-plus. Larger l means fewer compressed keys but a coarser routing signal.
[…] k because selection precision matters more. Retrieval-heavy tasks work at lower k.
W. Pick 256, 512, or 1024. Default 512. Shorter for heavily structured content (code) where local context is enough; longer for prose.
[…] hidden to 3, with sigmoid or softplus activation. Warn if gate weights collapse to favor one branch; this indicates that l, k, or W is mistuned.
Hard rejects:
Refusal rules:
Output: a one-page integration plan listing l, k, W, gate config, kernel path, and expected compute savings at target context. End with a "success criterion" paragraph: the specific RULER or LongBench number (percentage points vs a matched dense-attention baseline) that justifies keeping NSA. Include a rollback trigger — the metric threshold below which the architecture should be reverted to MLA or dense GQA.
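The block-size bands, the "expected compute savings" figure, and the rollback trigger described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the skill. The names `pick_block_size`, `NsaPlan`, `attended_tokens`, and `should_rollback` are hypothetical. It assumes that context lengths falling between the stated bands round up to the next band, that k counts selected blocks, that the selection block size is 64 tokens, and that compression blocks do not overlap. None of those details is stated in the skill text.

```python
from dataclasses import dataclass


def pick_block_size(context_len: int) -> int:
    """Map a target context length (tokens) to the compression block size l.

    The bands follow the text: 32 for 16k-32k, 64 for 64k-128k, 128 for 256k-plus.
    Lengths between bands round up to the next band (an assumption).
    """
    if context_len <= 32 * 1024:
        return 32
    if context_len <= 128 * 1024:
        return 64
    return 128


@dataclass
class NsaPlan:
    context_len: int  # target context length in tokens
    l: int            # compression block size
    k: int            # number of selected blocks per query (assumed meaning of k)
    W: int = 512      # sliding window size; 512 is the stated default


def attended_tokens(plan: NsaPlan, selection_block: int = 64) -> int:
    """Rough count of keys one query at full context attends to across the three branches."""
    compressed = plan.context_len // plan.l  # assumes non-overlapping compression blocks
    selected = plan.k * selection_block      # selection_block = 64 is an assumption
    return compressed + selected + plan.W


def attention_fraction(plan: NsaPlan, selection_block: int = 64) -> float:
    """Attended keys as a fraction of a dense query over the full context."""
    return min(1.0, attended_tokens(plan, selection_block) / plan.context_len)


def should_rollback(nsa_score: float, dense_score: float, max_drop_pp: float) -> bool:
    """True when NSA trails the matched dense baseline by more than max_drop_pp points."""
    return (dense_score - nsa_score) > max_drop_pp


if __name__ == "__main__":
    ctx = 64 * 1024
    plan = NsaPlan(context_len=ctx, l=pick_block_size(ctx), k=16)
    print(plan, attended_tokens(plan), attention_fraction(plan))
```

The fraction is a per-query estimate at the final position only. Actual savings depend on the kernel path and on the real compression stride and selection block size, so treat the output as a sanity check on the plan rather than a measured speedup.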