Help us improve
Share bugs, ideas, or general feedback.
From gstack-distilled
Working with AI coding agents — User Sovereignty, agreement-as-signal, Karpathy/Willison framing, per-model overlays, and the rule to measure your prompt nudges.
npx claudepluginhub 0xabrar/gstack-distilled --plugin gstack-distilledHow this skill is triggered — by the user, by Claude, or both
Slash command
/gstack-distilled:agent-managementThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
How to work with AI coding agents (Claude, Codex, GPT, Gemini, etc.) without losing the plot.
Creates Claude Code agents from scratch or by adapting templates. Guides requirements gathering, template selection, and file generation following Anthropic best practices (v2.1.63+).
Create custom agents for Claude Code including YAML frontmatter, system prompts, tool restrictions, and discovery optimization. Use when creating, building, or designing agents, or when asked about agent creation, subagent configuration, Task tool delegation, or agent best practices.
Provides Claude Code templates for agentic design patterns like prompt chaining, routing, reflection, tool use, planning, and multi-agent workflows using 4-layer stack. For LLM task decomposition.
Share bugs, ideas, or general feedback.
How to work with AI coding agents (Claude, Codex, GPT, Gemini, etc.) without losing the plot.
Source: gstack ETHOS.md, model-overlays/, CLAUDE.md.
The one rule that overrides all others:
"AI models recommend. Users decide."
"Two AI models agreeing on a change is a strong signal. It is not a mandate. When Claude and Codex both say 'merge these two things' and the user says 'no, keep them separate' — the user is right. Always."
Karpathy: AI is the suit. You're the pilot.
Simon Willison: "Agents are merchants of complexity."
Anthropic's own research: "Experienced users interrupt Claude more often, not less. Expertise makes you more hands-on, not less."
"AI generates recommendations. The user verifies and decides. The AI never skips the verification step because it's confident."
Confidence ≠ correctness. Verify even when both models agree.
Each model has a default failure mode. Encode a one-line nudge against it.
Inherits Claude base, plus:
Anti-verbosity supplement: "Cap answers at the shortest form that contains the answer."
"You have strong internal reasoning. Use it, but do not expose chain-of-thought in outputs unless the user asks. Surface the conclusion plus evidence, not the reasoning chain."
"Bias toward action. Run commands and show results rather than explaining what commands you would run."
"If you ship system-prompt nudges for Claude, measure them."
In gstack v1.10.1.0, the team pulled a "Fan out explicitly" nudge for Opus 4.7 because evals showed:
| Setup | Fanout rate |
|---|---|
| Default Opus | 70% |
| With "fan out explicitly" nudge | 10% |
| With Anthropic's canonical text | 0% |
Three eval runs at $7 total. Pulled the bullet, shipped the retraction as a release.
The lesson: intuitions about what helps a model are often backwards. Measure before shipping prompt changes.
When two AI models agree:
When the user disagrees with both models:
Most "AI orchestration" advice treats consensus as ground truth. This explicitly rejects that.
When using o-series or thinking models, surface conclusions and evidence, not the reasoning chain. The user wants the answer with receipts, not the scratchpad.
For high-stakes decisions, emit a single question per turn. Stop. Wait for the user. Don't batch three questions hoping to save a round-trip.
"When in doubt, ask one question per turn."
"A false positive (invoking a skill that wasn't needed) is cheaper than a false negative (answering ad-hoc when a structured workflow exists)."
Most AI tooling celebrates open-ended reasoning. This inverts that — bias toward routing into structured workflows with quality gates, not letting the model "think it through."
When 3+ sessions are open in parallel:
"Every question re-grounds context (project, branch, what's happening) because they're juggling windows."
Even a small tax on context-switching pays off when you're managing multiple agent windows.
When asking the user a question, the universal format:
[Context: 1-2 sentences re-grounding]
[Question]
RECOMMENDATION: Choose X because ___
Options:
A) ...
B) ...
C) ...
Don't ask open-ended questions when you have a recommendation. Don't bury the recommendation. Don't omit it under "no preamble" rules.
Terseness is the wrong default for:
Terseness is the right default for: