From ai
LLM evaluation and red-teaming toolkit using promptfoo. Covers promptfooconfig.yaml configuration, 40+ assertion types (deterministic, model-graded, RAG), provider setup (OpenAI, Anthropic, Google, Ollama, HTTP, custom JS/Python), red teaming (134+ plugins, jailbreak strategies, compliance frameworks), CLI commands, caching, and CI/CD integration. Use when writing promptfooconfig.yaml, designing LLM test suites, running adversarial red-team evaluations, or integrating LLM quality gates into CI/CD. Detects: promptfooconfig.yaml, or promptfoo in package.json. For general LLMOps, use designing-genai-patterns. For general test methodology (TDD/AAA), use testing-code.
Slash command
/ai:evaluating-with-promptfoo
general-purpose
For detailed procedures and guidelines, see `INSTRUCTIONS.md`.
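As a rough illustration of the kind of file this skill targets, a minimal promptfooconfig.yaml might look like the sketch below. It pairs one deterministic assertion with one model-graded assertion. The prompt text, test variables, rubric wording, and provider model are illustrative assumptions and are not taken from this skill's `INSTRUCTIONS.md`.

```yaml
# Minimal sketch of a promptfooconfig.yaml (illustrative values only).
description: Summarization smoke test

# Prompt template; {{text}} is filled in from each test's vars.
prompts:
  - "Summarize the following text in one sentence: {{text}}"

# Provider under test. The model name is an assumption; use whichever
# provider and model your project actually targets.
providers:
  - openai:gpt-4o-mini

tests:
  - vars:
      text: "promptfoo is a tool for testing and evaluating LLM prompts."
    assert:
      # Deterministic check: case-insensitive substring match.
      - type: icontains
        value: promptfoo
      # Model-graded check: another LLM scores the output against a rubric.
      - type: llm-rubric
        value: The response is a single sentence and does not add facts missing from the input.
```

Assuming the file is saved as promptfooconfig.yaml in the working directory and an API key for the chosen provider is set in the environment, `npx promptfoo@latest eval` runs the suite.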
npx claudepluginhub sumik5/sumik-llm-plugin --plugin ai