Guides decisions on model sovereignty: prompting, RAG, fine-tuning (LoRA/QLoRA), distillation, local hosting for privacy, cost, and customization needs.
npx claudepluginhub habitat-thinking/ai-literacy-superpowers --plugin ai-literacy-superpowers

This skill uses the workspace's default tool permissions.
Model sovereignty is the practice of making deliberate decisions about which models to use, where they run, and whether to create custom models. It extends the framework's Theme #2 (Agency and Sovereignty) into the model layer.
This skill guides practitioners through the decision framework from cross-cutting Theme #17 and Appendix P of the framework.
Exhaust simpler approaches before escalating complexity. Each step adds maintenance burden.
Walk through these questions in order. Stop at the first "yes."
1. Does your data require local processing? PII, regulated data, trade secrets, or data subject to residency requirements → local hosting is non-negotiable for those interactions. Consult references/decision-framework.md.
2. Does knowledge change frequently? Information that changes weekly or monthly → add a RAG layer regardless of hosting choice. RAG updates instantly; fine-tuning requires retraining.
3. Does the model need consistent domain behaviour at scale? Reliable format compliance, style consistency, or decision logic across thousands of requests → fine-tune with LoRA/QLoRA. Consult references/technique-comparison.md.
4. Is baseline load above the break-even threshold? ~30M tokens/day sustained → self-hosted inference is economically justified within 4 months. Consult references/hosting-options.md.
5. None of the above? Use cloud API models with good prompting and context engineering.
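The walk-through above can be sketched as a small routine. This is a minimal illustration, not part of the skill itself: the `Workload` fields, the return strings, and the stop-at-first-"yes" ordering are assumptions that mirror the questions as written.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    has_restricted_data: bool         # PII, regulated data, trade secrets, residency
    knowledge_changes_often: bool     # information shifts weekly/monthly
    needs_consistent_behaviour: bool  # format/style/decision logic at scale
    daily_tokens: int                 # sustained tokens per day

BREAK_EVEN_TOKENS_PER_DAY = 30_000_000  # ~30M/day heuristic from the framework

def recommend(w: Workload) -> str:
    """Walk the questions in order and stop at the first 'yes'."""
    if w.has_restricted_data:
        return "local hosting"
    if w.knowledge_changes_often:
        return "RAG layer"
    if w.needs_consistent_behaviour:
        return "fine-tune with LoRA/QLoRA"
    if w.daily_tokens >= BREAK_EVEN_TOKENS_PER_DAY:
        return "self-hosted inference"
    return "cloud API + prompting"

print(recommend(Workload(False, False, False, 2_000_000)))  # cloud API + prompting
```

Each step escalates complexity, so the order matters: the routine only reaches fine-tuning or self-hosting after the simpler answers have been ruled out.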
Ask: "If my API provider changed pricing, rate-limited me, or discontinued my model tomorrow, what would happen?"
A sovereign engineer has an answer:
For data sovereignty: Start with data classification. List every type of data that flows through your AI interactions. Classify each as Public, Internal, Sensitive, or Restricted. Update MODEL_ROUTING.md with routing rules based on classification.
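A classification-driven routing table like the one MODEL_ROUTING.md describes might look like this. The tier names and routing targets here are illustrative assumptions, not values prescribed by the framework:

```python
from enum import Enum

class DataClass(Enum):
    PUBLIC = 1
    INTERNAL = 2
    SENSITIVE = 3
    RESTRICTED = 4

# Hypothetical routing rules: each classification maps to a deployment target.
ROUTING = {
    DataClass.PUBLIC:     "cloud-api",
    DataClass.INTERNAL:   "cloud-api",
    DataClass.SENSITIVE:  "cloud-api-zero-retention",
    DataClass.RESTRICTED: "local-model",
}

def route(classification: DataClass) -> str:
    """Return the deployment target allowed for this data classification."""
    return ROUTING[classification]

print(route(DataClass.RESTRICTED))  # local-model
```

The point of making the table explicit is that routing becomes auditable: every interaction can be checked against one declared rule per classification.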
For cost sovereignty: Calculate your monthly token usage. Compare API costs against self-hosted alternatives at your volume. The break-even is typically 30M tokens/day sustained.
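The break-even comparison is simple arithmetic. In this sketch the dollar figures ($3/Mtok blended API price, $8k of hardware, $500/month of power and ops) are invented for illustration; only the ~30M tokens/day threshold comes from the framework:

```python
def payback_months(tokens_per_day: float, api_cost_per_mtok: float,
                   hardware_capex: float, self_host_opex_per_month: float) -> float:
    """Months until self-hosted hardware pays for itself versus API spend."""
    api_monthly = tokens_per_day * 30 / 1_000_000 * api_cost_per_mtok
    monthly_savings = api_monthly - self_host_opex_per_month
    if monthly_savings <= 0:
        return float("inf")  # self-hosting never breaks even at this volume
    return hardware_capex / monthly_savings

# At 30M tokens/day with the assumed prices, payback lands under 4 months.
print(round(payback_months(30_000_000, 3.0, 8_000, 500), 1))  # 3.6
```

Run the same function with your real volume and prices; well below the threshold, the function returns infinity because API spend never exceeds self-hosting opex.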
For domain sovereignty: Identify your three most common AI failure modes. If failures come from missing knowledge → evaluate RAG. If failures come from inconsistent behaviour → evaluate fine-tuning. If failures come from reasoning capability → stay on frontier APIs with better prompting.
For operational sovereignty: Identify your vendor dependency. Could you switch providers in a week? A month? Never? The answer determines your urgency.
Custom models accumulate maintenance debt. Budget for ongoing retraining, evaluation, and serving infrastructure before committing.
For detailed technique comparisons, hosting option evaluation, and current-era model recommendations, consult the reference files.