Run GGUF models locally with Mozilla Llamafile, launching OpenAI-compatible API servers configurable for GPU/CPU inference, SDK integrations, installations, startups, and connection troubleshooting in offline setups.
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
npx claudepluginhub jamie-bitflight/claude_skills --plugin llamafileBuild FastMCP 3.x Python MCP servers — covers provider/transform architecture (including CodeMode, Tool Search, and server-level transforms), component versioning, session state, authorization (MultiAuth, PropelAuth, connection-pooled token verifiers), evaluation creation, Pydantic validation, async patterns, STDIO and HTTP transports, nginx reverse proxy deployment, background tasks, Prefab Apps UI, security patterns, client SDK usage, testing, deployment, and migration from FastMCP v2. TypeScript is a legacy reference only and is not updated for v3.
Read The Fucking Prompt — finds the strongest user reaction to an AI instruction-following failure in a chosen session, reconstructs the triggering assistant output, and renders a shareable terminal-style PNG.
This skill should be used when the model needs to ensure code quality through comprehensive linting and formatting. It provides automatic linting workflows for orchestrators (format → lint → resolve via concurrent agents) and sub-agents (lint touched files before task completion). Prevents claiming "production ready" code without verification. Includes linting rules knowledge base for ruff, mypy, and bandit, plus the linting-root-cause-resolver agent for systematic issue resolution.
When setting up commit message validation for a project. When project has commitlint.config.js or .commitlintrc files. When configuring CI/CD to enforce commit format. When extracting commit rules for LLM prompt generation. When debugging commit message rejection errors.
Faithful information summarization with fidelity preservation, structured output, and anti-hallucination methodology. Provides skills for file, URL, and image summarization; agents for autonomous summarization tasks; and hooks for validating agent output structure.
Run AI models locally with Ollama - free alternative to OpenAI, Anthropic, and other paid LLM APIs. Zero-cost, privacy-first AI infrastructure.
When calling LLM APIs from Python code. When connecting to llamafile or local LLM servers. When switching between OpenAI/Anthropic/local providers. When implementing retry/fallback logic for LLM calls. When code imports litellm or uses completion() patterns.
Local-first resolver for Hugging Face models (GGUF, MLX, safetensors). The agent checks your own storage and any mounted drives before downloading anything.
Delegate heavy code generation to a local LLM (Ollama / LM Studio). Save tokens, keep oversight.
Run any model with an Anthropic- or OpenAI-compatible API (e.g. DeepSeek, GLM, Kimi, Qwen, MiniMax) — even your Codex subscription — as real Claude Code workflows, agent-team teammates, or one-shot subagents, driven exactly like native ones. Your main session's own auth is untouched (OAuth subscription or API key, either works); API-key providers bill the provider key via apiKeyHelper, while a Codex subscription bills through a local OAuth daemon — each worker receives its credential on demand, never through its env or argv. Requires the `cc-fleet` binary on PATH, installed separately.
Editorial "LLM Application Developer" bundle for Claude Code from Antigravity Awesome Skills.