Plugins listed here are tagged for this topic and auto-indexed from public GitHub repositories.
Plugins focused on test authoring, coverage analysis, mocking, and test automation across frameworks.
Jest, Vitest, pytest, Playwright, Cypress, and Go's testing package are commonly supported. Filter by technology to find plugins for your test runner.
Several include agents or commands that analyze source files and generate corresponding unit tests, including edge cases and mock setups.
Some generate CI pipeline configurations for test execution. Plugins with hooks can run tests on file save. Check component types for automation support.
Enforce strict TDD cycles, generate detailed multi-step implementation plans, execute them in batches or via parallel subagents, manage isolated git worktrees for features, perform root-cause debugging and technical code reviews, verify tests/builds/lints before commits or PRs, all within Claude Code sessions.
Equip Claude with 13 targeted skills to run disciplined bug diagnosis loops (reproduce-minimize-hypothesize-instrument-fix-regression-test), prototype designs via throwaway terminal/UI apps, triage and vertically slice GitHub issues, enforce TDD red-green-refactor cycles, generate structured PRDs, grill plans against domain models, and deepen codebase architecture for testability and AI navigation.
Automate comprehensive PR reviews on git diffs or pull requests using specialized AI agents that analyze code quality, test coverage, error handling, type design, comments, and simplification opportunities. Get a categorized summary of critical issues, important issues, suggestions, and strengths, plus an action plan.
Iteratively create custom Claude Code skills from scratch, refine existing ones via drafting and description optimization, run test evaluations, and benchmark performance with quantitative metrics and variance analysis.
Delegate expert-level code reviews, security audits, penetration tests, QA automation, accessibility compliance checks, performance optimizations, chaos engineering, and compliance validations to specialized sub-agents across codebases, infrastructure, and systems.
Manage Python projects via structured tracks for features, bugs, refactors: initialize context artifacts like product.md and tech-stack.md, create detailed specs and phased plans, implement tasks with strict TDD workflow using pytest coverage and git commits, monitor status, revert commits, and validate artifacts for consistency.
Generate production-ready stateful CLI harnesses for GUI applications from local paths or GitHub repos, implementing Click CLI with REPL/JSON support, pytest unit/E2E tests, and docs. List installed harnesses, refine coverage gaps, run tests to verify functionality, and validate against standards.
Build scalable production Python backends and APIs with Django 5.x async views, FastAPI microservices, Celery tasks, SQLAlchemy/Pydantic data handling, pytest testing strategies, and architecture optimizations using uv/ruff for modern 3.12+ codebases.
Run PluginEval certification pipeline on Claude plugins or skills to compute quality scores, badges (Platinum/Gold/Silver/Bronze), dimension breakdowns, anti-patterns, and recommendations via static analysis and LLM judging across 10 criteria including triggering, orchestration, and output quality. Compare skills head-to-head or evaluate directories for actionable insights.
Design scalable SEO strategies with feasibility indexes, set up validated A/B tests including sample sizes and metrics, audit analytics for reliable data signals, optimize content via brand voice analysis and Python tools, craft targeted email sequences, and diagnose site SEO issues for marketing growth workflows.
Conduct DevSecOps security audits on CI/CD pipelines, SDLC controls, and threat models; execute authorized penetration tests on web apps with Burp Suite, cloud infrastructure across AWS/Azure/GCP, and Linux systems via reconnaissance, enumeration, privilege escalation; scan projects for OWASP Top 10 vulnerabilities and reference 100 critical web exploits with mitigations.
Enforce rigorous QA and testing workflows in Claude Code sessions: drive TDD for features and fixes, debug via four-phase root cause analysis, automate browsers with Playwright/Puppeteer best practices, plan A/B experiments with gates, apply code review checklists, build reliable E2E suites, and triage pytest failures systematically.
Equip AI coding agents with production engineering skills to handle full dev lifecycles: refine ideas to specs, implement via TDD slices, run tests/debug, perform multi-axis code reviews, optimize perf/security, automate CI/CD, and execute ship checklists.
Automate full TDD cycles from GitHub issues: write specific failing tests using Jest/Vitest, pytest, JUnit, or NUnit (Red); implement minimal passing code (Green); refactor for quality and security while keeping tests green. Explore sites via Playwright to generate, run, and debug TypeScript E2E tests.
Architect production-grade autonomous AI agents using bundled skills: design tool-using multi-agents with ReAct planning and safety, build stateful LangGraph systems with persistence and human-in-loop, implement optimized RAG pipelines, engineer prompts for reliability, evaluate via benchmarks and monitoring, and set up MCP servers for LLM-tool interactions.
Automate browsers and run end-to-end tests with Playwright directly in Claude. Interact with web pages by clicking elements, filling forms, taking screenshots, generating traces, and executing testing workflows locally via npx subprocess.
Scaffold new Claude Agent SDK apps in TypeScript or Python by interactively gathering requirements, installing dependencies, and configuring projects. Verify apps post-creation or changes for SDK best practices, code quality, security, type safety, documentation, and deployment readiness.
Run isolated local Playwright servers via stdio to control headless browsers for automating interactions, vision-based verification, tracing, and devtools access in testing and debugging workflows.
Migrate React 16/17 class-component codebases to React 18.3.1 via AI agents that audit deprecations, convert unsafe lifecycles/refs/context to modern patterns, fix automatic batching regressions, upgrade dependencies to exact versions, and rewrite Enzyme tests to React Testing Library until tests pass.
Access 754 cybersecurity skills to analyze malware samples, audit cloud and Kubernetes configs, detect threats in logs and traffic, perform authorized pentests and red team simulations, harden endpoints and infrastructure, build detection rules, and conduct incident response across web, network, endpoint, cloud, mobile, and OT environments.
Organize and manage browser tabs to streamline workflows and boost productivity. Automate browser tasks via CLI—navigate pages, inspect and interact with elements using selectors or text, scrape content, compare snapshots, export screenshots or PDFs, and control multiple instances. Develop the PinchTab Go server with React dashboard, executing dev commands, unit and E2E tests, git operations, checks, and PR checklists.
Run persistent dev sessions saving 98% context via MCP server with session continuity, FTS5 knowledge base, and sandboxed execution. Orchestrate parallel subagents for GitHub issue/PR triage, TDD cycles, architecture deepening, bug diagnosis loops, CLI output processing, real-time metrics dashboards, and self-upgrades without state loss across compactions.
Conduct full product discovery cycles in your IDE: brainstorm ideas and experiments for new/existing products from PM/designer/engineer views, identify/prioritize assumptions and features, triage requests, generate interview scripts, summarize transcripts, and design metrics dashboards.
Direct AI coding agents to create or update promptfoo evaluation suites with configs, prompts, tests, deterministic assertions, and provider setups following best practices. Streamline LLM eval coverage, regression debugging, and new eval matrix generation in JavaScript or Python projects using OpenAI or Anthropic models.
Generate PRDs, OKRs, outcome roadmaps, user stories, job stories, sprint plans, release notes, and stakeholder maps. Run pre-mortems for risk analysis, retrospectives for team feedback, prioritization frameworks, meeting summaries, and test scenarios with dummy data to manage agile product execution workflows.
Generate SQL queries from natural language descriptions using your database schema for PostgreSQL, MySQL, or BigQuery. Analyze CSV or Excel user data to produce cohort retention heatmaps, engagement trends, churn insights, and research recommendations. Evaluate A/B tests for statistical significance, confidence intervals, lift, and ship/extend/stop decisions with Python-powered reports.
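As a rough illustration of the A/B evaluation this entry describes (significance, confidence interval, lift), here is a minimal sketch using a two-proportion z-test. The function name, its parameters, and the choice of a z-test are assumptions made for illustration; the plugin's actual method is not documented here.

```python
import math


def two_proportion_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> dict:
    """Compare conversion rates of control (A) and variant (B).

    Hypothetical helper: assumes both groups are non-empty, the control
    converts at least once, and the pooled rate is strictly between 0 and 1.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    diff = p_b - p_a

    # Pooled standard error under the null hypothesis of equal rates.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_null = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_null

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))

    # Approximate 95% confidence interval for the difference (unpooled SE).
    se_diff = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    ci = (diff - 1.96 * se_diff, diff + 1.96 * se_diff)

    return {
        "z": z,
        "p_value": p_value,
        "ci_95": ci,
        "relative_lift": diff / p_a,
    }


if __name__ == "__main__":
    result = two_proportion_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print(result)
```

A ship/extend/stop rule could then be layered on top, for example shipping only when the p-value is below a chosen threshold and the whole interval sits above zero; those thresholds are a product decision rather than something this sketch prescribes.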
Automate full Playwright E2E testing workflows: set up projects in React/Next.js/Vue/Angular/Svelte, generate tests from user stories/URLs/specs using 55+ templates, diagnose/fix flaky failures with agents, migrate Cypress/Selenium suites, run on BrowserStack, sync results with TestRail, analyze coverage/gaps, review anti-patterns, and output reports to Markdown/Slack/GitHub.
Run a localhost iMessage-style web chat to test Claude Code surfaces with file upload and edit capabilities, without tokens or access control.
Implement Trail of Bits handbook security testing workflows: fuzz Rust, Python, C/C++, Ruby code with AFL++, libFuzzer, cargo-fuzz, Atheris; instrument AddressSanitizer; run static analysis via Semgrep, CodeQL; generate coverage reports, dictionaries, and bypass obstacles for vulnerability detection.
Build multi-language code graphs to map call graphs, attack surfaces, blast radius, taint propagation, privilege boundaries, and complexity hotspots for security audits. Visualize architecture with Mermaid diagrams, compare snapshots across git commits for evolution analysis, triage mutation testing survivors, generate crypto test vectors, diagram protocols, and project SARIF findings onto graphs.
Design structured workflow skills for Claude Code using multi-step phases, decision trees, subagent delegation, and progressive disclosure for pipelines, routing, and safety gates. Audit skills via 6-phase review detecting structural issues, pattern adherence, tool correctness, and anti-patterns.
Automatically generate production-ready unit tests from source code files or snippets in JavaScript/TypeScript (using Jest, Vitest, or Mocha), Python (pytest), Java (JUnit 5), and Go. Auto-detects frameworks, covers happy paths, edge cases, boundaries, errors, and provides mocks for robust testing.
Generate and execute end-to-end browser tests for full user workflows spanning frontend and backend using Playwright, Cypress, or Selenium. Create tests with page objects, scenarios, and assertions, then run them to validate complete user journeys in browsers.
Streamline end-to-end Obsidian plugin development and vault management: scaffold projects with TypeScript setups, implement UI views/events/data handling, optimize performance/security, establish local dev loops/CI/CD/release pipelines, migrate content, and troubleshoot errors using 24 specialized skills.
Generate E2E test suites in JavaScript and automate testing for iOS/Android mobile apps on simulators/emulators using Appium, Detox, XCUITest, Espresso, or Maestro. Validate UI interactions, gestures, navigation, permissions, and platform behaviors.
Automate OWASP Top 10 vulnerability scans and penetration testing on JavaScript, Python, and Java codebases using Semgrep, ESLint-security, Bandit, and dependency audits. Delegate comprehensive security audits to a specialized agent covering injections, XSS, CSRF, authentication flaws, access control, and misconfigurations.
Orchestrate autonomous multi-agent sprints to develop full features from specs.md: agents handle architecture, parallel implementation of Next.js frontends and Python/FastAPI backends, CI/CD setup, automated testing, UI QA, reviews, and iterative convergence with structured reports and git safety.
Configure and optimize mewt/muton mutation testing campaigns by scoping targets, tuning timeouts, and streamlining long-running tests for Rust, Go, TypeScript, and JavaScript codebases.
Orchestrate multi-agent teams in Claude Code to decompose complex features into atomic subtasks with dependencies, execute them in parallel, discover and load project context/standards, implement via TDD with vitest/jest/pytest, run self-reviews, and deliver security-vetted code.
Backtest crypto and stock trading strategies on historical data to compute performance metrics like Sharpe and Sortino ratios, maximum drawdowns, equity curves, and optimize parameters via grid search.
Detect UI visual regressions by capturing screenshots of components or pages with Playwright or Cypress, comparing against baselines across responsive breakpoints, generating pixel diffs, analyzing changes, and producing markdown reports with recommendations. Integrates with Percy, Chromatic, BackstopJS, and CI workflows.
Track regression tests across code releases by mapping git commits to pytest or Jest tests, tagging markers for suites, flagging coverage gaps, generating pass/fail reports with flaky detection, viewing history, and enforcing runs in CI/CD pipelines.
Automate overnight software development by configuring Git hooks for TDD enforcement with tests and lints, then run Claude autonomously for 6-8 hours to build features that pass all checks by morning.
Generate test reports by parsing JUnit XML, Jest JSON, pytest results, and coverage data into Markdown/HTML formats with metrics, failures, slowest tests, trends, and CI annotations. Aggregate results across frameworks for summaries and exports in HTML, PDF, or JSON.
Design, execute, and analyze load, stress, spike, soak, and endurance tests on APIs, web apps, and databases using k6, Artillery, JMeter, Locust, and autocannon. Identify bottlenecks, review metrics, and verify SLAs to optimize performance.
Verify blockchain smart contracts match specifications from whitepapers, PDFs, Markdown, or URLs, detecting implementation gaps, undocumented behaviors, logic discrepancies, and security issues via structured audits and generating compliance reports.
Implement property-based testing strategies across multiple languages and smart contracts to verify invariants like serialization roundtrips, idempotence, parsing, validation, normalization, and algorithms for stronger coverage than example-based tests.
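To make the idea of a serialization-roundtrip invariant concrete, the following is a minimal sketch of a property-based check written with only the Python standard library. The generator `random_value` and the runner `check_roundtrip` are hypothetical names; dedicated property-based testing libraries add input shrinking and richer strategies that this sketch omits.

```python
import json
import random
import string


def random_value(rng: random.Random, depth: int = 0):
    """Generate a random JSON-compatible value (hypothetical generator)."""
    kinds = ["int", "str", "bool", "none"]
    if depth < 3:  # cap nesting so generation always terminates
        kinds += ["list", "dict"]
    kind = rng.choice(kinds)
    if kind == "int":
        return rng.randint(-10**6, 10**6)
    if kind == "str":
        length = rng.randint(0, 8)
        return "".join(rng.choice(string.printable) for _ in range(length))
    if kind == "bool":
        return rng.choice([True, False])
    if kind == "none":
        return None
    if kind == "list":
        return [random_value(rng, depth + 1) for _ in range(rng.randint(0, 4))]
    # JSON object keys must be strings, so only string keys are generated.
    result = {}
    for _ in range(rng.randint(0, 4)):
        key = "".join(rng.choice(string.ascii_letters) for _ in range(rng.randint(1, 5)))
        result[key] = random_value(rng, depth + 1)
    return result


def check_roundtrip(trials: int = 500, seed: int = 0) -> int:
    """Property: decoding an encoded value yields the original value."""
    rng = random.Random(seed)  # seeded so failures are reproducible
    for _ in range(trials):
        value = random_value(rng)
        assert json.loads(json.dumps(value)) == value, value
    return trials


if __name__ == "__main__":
    print(f"roundtrip held for {check_roundtrip()} random inputs")
```

Floats are deliberately left out of the generator, since values such as NaN do not compare equal to themselves and would need a separate property.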
Configure VSCode extensions to test APIs with httpYac including auth scripts and CI/CD workflows, monitor multiple dev server ports like Vite and Next.js in real-time, and deploy static sites via SFTP to Nginx servers with secure setups.
Format and validate code files or directories with Prettier for JavaScript, TypeScript, CSS, Markdown, JSON, HTML, Vue, and Svelte. Check compliance without changes for CI via exit codes. Automatically create configs, pre-commit hooks, and .prettierignore. For Python projects, block sensitive env file edits and run pytest suites after file operations.
Build and deploy Vizro dashboards end-to-end via an enforced two-phase workflow: gather requirements, design layouts and visualizations in YAML specs, implement Python code from examples and schemas, then run Playwright E2E tests with browser automation.
Generate k6, Artillery, wrk, or Gatling scripts for API load, stress, and soak tests to validate performance under configurable loads. Run tests locally to measure response times, throughput, error rates, scalability, and identify bottlenecks.
Evaluate machine learning models using metrics like accuracy, precision, recall, and F1-score to perform performance analysis, validation, model comparison, and optimization. Generate production-ready AI/ML code that includes validation, error handling, performance metrics, saved artifacts, and documentation.
Run interactive penetration tests on web apps and codebases: scan HTTP security headers for CSP/HSTS issues, audit npm/pip dependencies for vulnerabilities, analyze code for secrets/injections with bandit, get severity-prioritized findings, fix suggestions, and JSON reports.
Equip Claude Code, Cursor, and 17 similar AI tools with 20 Chinese skills (14 translated + 6 original) to enforce TDD workflows, systematic debugging, design-first planning, Chinese git conventions for Gitee/Coding.net, structured code reviews, parallel multi-agent task execution, and automated verification before commits—tailored for Chinese developers building production code.
Build, debug, test, deploy, secure, monitor, and optimize production LangChain applications using 24 Claude Code skills that generate LCEL chains and agents, implement RAG pipelines, set up CI/CD and FastAPI/Express APIs, handle migrations/upgrades, apply cost/performance tuning, and enforce security best practices.
Generate OpenAPI specs and Pact consumer contracts from API code, designs, or schemas to enable consumer-driven contract testing, documentation, code generation, verification tests, and CI/CD setup.
Automate development workflows by walking through code files line-by-line in VSCode or Vim, logging timestamped work sessions with file changes in daily Markdown, generating detailed issue specs staged in Git, engaging in adaptive Socratic quizzes for learning, and delegating UI validation tasks to a browser agent using Chrome DevTools.
Execute 175 slash commands to automate git workflows like branching/PR creation/issue syncing with Linear, code quality reviews/refactors/fixes, test generation/setup/coverage, CI/CD pipelines, security/performance audits, documentation generation, project scaffolding/setup, and deployments across JS/TS/Python/Go/Rust/Svelte stacks.
Generate realistic, relationally consistent test data and idempotent seed scripts by analyzing database schemas, respecting foreign keys, constraints, and data types with Faker libraries for dev/test environments across JS, Python, C#, Prisma, Node, and TypeScript.
Automate performance regression detection in CI/CD pipelines by generating test suites, baselines, thresholds, reporting, and PR integrations. Statistically compare response times, throughput, and resource usage against baselines to validate builds and spot trends early.
Generate test doubles—mocks, stubs, spies, fakes—for unit testing by analyzing code dependencies. Produces implementations, fixtures, example tests, and rationale. Works across JavaScript (Jest, Vitest, Sinon), Python (pytest, unittest.mock), Go (gomock), and more frameworks.
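The distinction between the kinds of test double named in this entry can be sketched in a few lines with Python's `unittest.mock`. The function under test (`send_welcome`), the `FakeClock` class, and the mailer interface are all hypothetical and exist only for this illustration.

```python
from unittest.mock import Mock


class FakeClock:
    """Fake: a simplified but working replacement for a real clock."""

    def __init__(self, now: float):
        self._now = now

    def time(self) -> float:
        return self._now


def send_welcome(mailer, clock, address: str) -> dict:
    """Hypothetical code under test: sends one email and returns a record."""
    mailer.send(to=address, subject="Welcome")
    return {"to": address, "sent_at": clock.time()}


def demo() -> dict:
    mailer = Mock()                  # mock/spy: records every call made to it
    mailer.send.return_value = True  # stub: canned return value for send()

    record = send_welcome(mailer, FakeClock(1700000000.0), "a@example.com")

    # Verify the interaction with the dependency, not just the return value.
    mailer.send.assert_called_once_with(to="a@example.com", subject="Welcome")
    return record


if __name__ == "__main__":
    print(demo())
```

Injecting the mailer and clock as parameters is what makes the substitution possible; code that constructs its dependencies internally would need patching instead.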
Scan codebases for reflected, stored, and DOM-based XSS vulnerabilities across HTML, JavaScript, CSS, and URLs. Test WAF bypass techniques and CSP protections, then receive reports on risks with remediation suggestions via commands or natural language triggers.
Generate and run mock API servers from OpenAPI specifications to simulate stateful CRUD operations with realistic Faker.js data, latency delays, error conditions, and request recording—ideal for frontend testing and backend prototyping without a live server.
Generate and execute comprehensive test suites for REST and GraphQL APIs directly from OpenAPI specs, automating request generation, schema/response validation, CRUD coverage, auth handling, error/performance checks, and idempotency tests, with reporting in Jest, pytest, Supertest, or REST-assured.
Generate realistic test data for users, products, orders, technical fields, and custom schemas to populate fixtures, factories, seeds, edge cases, and databases in JS/TS/Python/Ruby apps using Faker.js, Fishery, pytest fixtures, and factory patterns.
Provision and manage isolated test environments using Docker Compose and Testcontainers for databases, caches, and queues such as PostgreSQL, MySQL, Redis, and DynamoDB. Generate docker-compose files, env vars, seed data scripts, startup scripts, and cleanup code to enable reliable, reproducible testing without local setup conflicts.
Run mutation testing on JavaScript, Python, Java, Go, C#, or Ruby codebases to evaluate test suite quality. Introduce code mutants with tools like Stryker, mutmut, PITest, or go-mutesting, check detection rates, identify coverage gaps, and generate reports with survival scores and improvement suggestions.
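The core loop of mutation testing (introduce a small change, rerun the tests, see whether any test fails) can be sketched by hand with Python's `ast` module. This does not use Stryker, mutmut, PITest, or go-mutesting; the function `is_adult`, the single mutation operator, and both test suites are hypothetical examples.

```python
import ast

SOURCE = """
def is_adult(age):
    return age >= 18
"""


class GtEToGt(ast.NodeTransformer):
    """Mutation operator: replace every `>=` comparison with `>`."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.Gt() if isinstance(op, ast.GtE) else op for op in node.ops]
        return node


def load(tree):
    """Compile a module AST and return its is_adult function."""
    namespace = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace["is_adult"]


def weak_suite(fn) -> bool:
    # Never exercises the boundary value, so it cannot tell >= from >.
    return fn(30) is True and fn(5) is False


def strong_suite(fn) -> bool:
    # Adds the boundary case that distinguishes the mutant from the original.
    return weak_suite(fn) and fn(18) is True


def mutant_killed(suite) -> bool:
    """A mutant is 'killed' when the suite fails against the mutated code."""
    mutant_tree = ast.fix_missing_locations(GtEToGt().visit(ast.parse(SOURCE)))
    return not suite(load(mutant_tree))


if __name__ == "__main__":
    print("weak suite kills mutant:", mutant_killed(weak_suite))
    print("strong suite kills mutant:", mutant_killed(strong_suite))
```

A surviving mutant, as with the weak suite here, points at a behavior the tests never pin down, which is the kind of gap a survival report is meant to surface.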
Generate and run load tests with k6, JMeter, or Artillery to validate web app and API performance under stress, spike, soak, and scalability scenarios. Detect bottlenecks, set thresholds, and integrate into CI/CD pipelines for automated validation.
Validate API responses against OpenAPI and JSON schemas to ensure contract compliance, detect schema drift, and verify data integrity. Generate JSON Schema definitions, Ajv validators, Express middleware, tests, docs, and monitoring directly in Node.js, Python, or Java workflows.
Test load balancing strategies by validating traffic distribution, health checks, failover, session persistence, and SSL on live NGINX, HAProxy, AWS ALB/NLB, GCP, and Kubernetes Ingress setups. Generate Jest test suites to verify these behaviors across backends.
Analyze test coverage reports from Jest/nyc, pytest, Go test, and JaCoCo across JavaScript, Python, Go, and Java projects to identify untested code paths, branch gaps, low-coverage files, enforce thresholds, and generate detailed reports with targeted test recommendations.
Create and manage snapshot tests for UI components and data using Jest, Vitest, or pytest to catch regressions. Analyze test failures with intelligent diff reviews, selectively update snapshots for intentional changes, validate and organize snapshot files, then generate detailed analysis reports.
Perform consumer-driven contract testing with Pact (JavaScript, Python, JVM) and OpenAPI validation for REST, GraphQL, gRPC APIs to detect breaking changes, generate tests, and produce detailed reports integrable into CI/CD pipelines.
Automate database testing workflows by generating test suites with data factories, transaction wrappers for automatic rollback, schema validation, assertions, cleanup, fixtures, migrations, integrity checks, and performance monitoring across PostgreSQL, MySQL, MongoDB, SQLite, and Redis using Prisma, Drizzle, Jest, and pytest.
Audit web pages and components for WCAG 2.1/2.2 accessibility compliance using axe-core, Playwright, Pa11y, Lighthouse, and more. Detect ARIA errors, keyboard navigation issues, color contrast violations, and screen reader incompatibilities, then generate markdown reports with prioritized fixes and code examples.
Orchestrate complex test workflows across Jest, Vitest, pytest, Playwright, and Cypress with parallel execution, test sharding, dependency management, flaky-test retries, affected test selection, and result aggregation in GitHub Actions or GitLab CI. Generate optimized configs for CI/CD pipelines.
Generate fast (<5min) smoke test suites in Jest-style JavaScript for critical paths like system health, authentication, and core features, then run them post-deployment via Playwright, curl, Bash scripts, or CI/CD to verify UI, APIs, and functionality.
Fuzz test REST and GraphQL APIs using OpenAPI specs to detect crashes, vulnerabilities, edge cases, and unexpected behaviors with tools like Schemathesis, RESTler, OWASP ZAP. Generate test suites, security reports, and reproducible payloads for input validation and security auditing.
Build production .NET backends with Akka.NET actors, Aspire orchestration, EF Core patterns, concurrency primitives, and Kubernetes clustering; run integration tests via Testcontainers/Playwright; optimize performance/databases; analyze CRAP scores, benchmarks, and concurrency bugs using 61 skills and 11 agents.
Integrate SerpApi into Python and Node.js/TypeScript apps to extract structured search data from Google, Bing, YouTube, Shopping, News, and Maps. Automate setup, auth, cost-free local testing with pytest/Vitest fixtures, Redis caching, rate limiting, proxy deployment to Vercel/GCP/Fly.io, security hardening, production checklists, SEO monitoring, and legacy migrations via 18 Claude Code skills.
Enforce Test-Driven Development by auto-detecting test frameworks like Vitest, Jest, Storybook, pytest, or Go tests, installing reporters, configuring JSON output, and using pre-tool hooks to block file writes/edits until tests pass.
Run integration test suites for APIs, databases, services, queues, and files using real Dockerized dependencies without mocks. Automates full workflow: environment setup, database seeding, service orchestration, test execution with coverage reporting, and teardown cleanup. Select suites and configure environments via CLI flags.
Design and execute chaos engineering experiments that inject failures such as network latency, service crashes, and resource exhaustion into Kubernetes clusters and Docker containers, validating distributed system resilience and recovery using tools like Chaos Mesh and AWS FIS.
Verify suspected security bugs as true or false positives through data flow tracing, exploitability proofs with bounds analysis and race checks, then generate PoCs, executable exploits, unit tests, or negative proofs with diagrams and evidence.
Run a local ActionBook MCP server to give AI agents like Claude real-time browser automation: launch sessions, navigate sites, fill forms, click elements, handle multi-tabs, extract structured data to JSON/CSV with Playwright scripts, generate HTML research reports, and retrieve verified CSS/XPath selectors for any website.
Test web apps for cross-browser compatibility using Playwright locally across Chromium, Gecko, WebKit, and mobile viewports, or on real devices via BrowserStack, Sauce Labs, LambdaTest, Kobiton. Run interactive tests, scan JS/CSS risks, and generate reports with browser matrices.
Enforce strict TDD workflow: write minimal failing tests first for complex logic or public APIs, verify red phase failure, implement green-passing code without internal mocks, then refactor safely. Supports unit and integration tests with Jest.
Audit C/C++/Rust codebases for missing zeroization of sensitive data and compiler optimizations removing wipes. Analyze source code, LLVM IR, and control flow across optimization levels; generate PoCs to verify exploitability; validate with compilation runs and semantic checks; assemble JSON/markdown reports.
Automate comprehensive project management: audit health and permissions, generate architecture/user docs and roadmaps, handle git workflows/PRs/releases, test UX/onboarding/responsiveness via browser, consult multi-AI models, and post team updates with feedback triage.
Generate UX research artifacts like personas, empathy maps, journey maps, interview scripts, usability test plans, and diary studies. Analyze qualitative data from interviews, card sorts, and observations into affinity diagrams, themes, jobs-to-be-done, insights, and prioritized opportunities. Run full research cycles or targeted syntheses.
Build, debug, test, deploy, and optimize browser-based Node.js apps and IDEs using StackBlitz SDK and WebContainers: boot in-browser environments, embed interactive playgrounds in docs, fix common errors, run Playwright CI tests, deploy to Vercel/Netlify with production headers, and tune performance/security.
Automate AI-guided TDD/SDD workflows in Claude Code: decompose requirements into plans and tasks, execute Red-Green-Refactor cycles, auto-debug tests/builds, verify coverage/quality, generate screen specs/EARS docs, run Playwright E2E tests, and reverse-engineer designs/requirements from codebases across JS/Python/Rust/Go projects.
Orchestrate requirements-driven feature development: scan repo context, interactively confirm specs, generate optimized technical requirements, delegate code implementation for models/logic/APIs/migrations/tests, apply pragmatic reviews, and enforce quality gates for functionality/integration/maintainability.
Build robust LLM evaluation pipelines by auditing setups for issues, conducting error analysis on traces, generating synthetic test data, designing and validating LLM-as-judge prompts, evaluating RAG with custom metrics, and creating browser-based UIs for human annotation and labeling.
Automate browser workflows with persistent page state to navigate websites, fill forms, capture screenshots, extract data, test web apps, handle logins, and execute multi-step tasks without losing context.
Debug REST API failures by loading OpenAPI specs and HTTP logs into Claude Code. Uncover root causes with severity ratings, validate specs, receive targeted fix suggestions, and generate ready-to-run cURL or fetch commands for reproduction and testing.