By vanman2024
LLM testing and evaluation framework with promptfoo, DeepEval, golden datasets, and Supabase-backed eval tracking
npx claudepluginhub vanman2024/ai-dev-marketplace --plugin llm-evals
Manages golden datasets for LLM evaluation - test case creation, categorization, and version control
Specializes in DeepEval pytest-style LLM testing with built-in metrics and custom evaluations
Orchestrates LLM evaluation workflows - coordinates promptfoo, DeepEval, datasets, and tracking
Specializes in promptfoo configuration, prompt regression testing, and multi-provider comparison
DeepEval pytest-style LLM testing patterns with built-in metrics, custom evaluators, and CI integration. Use when creating LLM tests, evaluating RAG quality, or measuring faithfulness/relevance.
Supabase-backed evaluation tracking with runs, cases, and scores tables. Use when storing eval results, building dashboards, or tracking regression over time.
promptfoo configuration patterns for prompt regression testing, multi-provider comparison, and assertion-based validation. Use when setting up prompt testing, comparing LLM providers, or creating eval pipelines.
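The runs, cases, and scores layout described for eval tracking can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual schema: the table columns, the helper names (`open_store`, `record_score`, `regressions`), and the 0.05 tolerance are all assumptions, and Python's built-in `sqlite3` stands in for a Supabase (Postgres) database so the example runs without credentials.

```python
import sqlite3

# Hypothetical schema: column names are illustrative, not taken from the plugin.
SCHEMA = """
CREATE TABLE runs (
    id INTEGER PRIMARY KEY,
    label TEXT NOT NULL
);
CREATE TABLE cases (
    id INTEGER PRIMARY KEY,
    category TEXT NOT NULL,
    prompt TEXT NOT NULL
);
CREATE TABLE scores (
    run_id INTEGER NOT NULL REFERENCES runs(id),
    case_id INTEGER NOT NULL REFERENCES cases(id),
    metric TEXT NOT NULL,
    value REAL NOT NULL,
    PRIMARY KEY (run_id, case_id, metric)
);
"""


def open_store():
    """Create an in-memory database with the three tracking tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record_score(conn, run_id, case_id, metric, value):
    """Store one metric value for one test case in one eval run."""
    conn.execute(
        "INSERT INTO scores (run_id, case_id, metric, value) VALUES (?, ?, ?, ?)",
        (run_id, case_id, metric, value),
    )


def regressions(conn, baseline_run, candidate_run, tolerance=0.05):
    """Return (case_id, metric, baseline, candidate) rows where the
    candidate run scored lower than the baseline by more than `tolerance`."""
    return conn.execute(
        """
        SELECT b.case_id, b.metric, b.value, c.value
        FROM scores AS b
        JOIN scores AS c
          ON c.case_id = b.case_id AND c.metric = b.metric
        WHERE b.run_id = ? AND c.run_id = ? AND b.value - c.value > ?
        ORDER BY b.case_id, b.metric
        """,
        (baseline_run, candidate_run, tolerance),
    ).fetchall()
```

Keeping scores in their own table, keyed by run, case, and metric, is what would let a dashboard compare any two runs with a single join. Against a real Supabase project the same query could run as Postgres SQL, with the `?` placeholders adapted to the client in use.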
Comprehensive skill pack with 66 specialized skills for full-stack developers: 12 language experts (Python, TypeScript, Go, Rust, C++, Swift, Kotlin, C#, PHP, Java, SQL, JavaScript), 10 backend frameworks, 6 frontend/mobile, plus infrastructure, DevOps, security, and testing. Features progressive disclosure architecture for 50% faster loading.
Comprehensive .NET development skills for modern C#, ASP.NET, MAUI, Blazor, Aspire, EF Core, Native AOT, testing, security, performance optimization, CI/CD, and cloud-native applications
Comprehensive PR review agents specializing in comments, tests, error handling, type design, code quality, and code simplification
Team-oriented workflow plugin with role agents, 27 specialist agents, ECC-inspired commands, layered rules, and hooks skeleton.
Upstash Context7 MCP server for up-to-date documentation lookup. Pull version-specific documentation and code examples directly from source repositories into your LLM context.
Comprehensive startup business analysis with market sizing (TAM/SAM/SOM), financial modeling, team planning, and strategic research