By izmailovilya
Launch slash-command-triggered teams of AI agents to research codebases, implement multi-file features via parallel coders, run browser/CI verifications, and pass code through security/logic/quality review gates with tech-lead oversight.
npx claudepluginhub izmailovilya/ilia-izmailov-plugins --plugin agent-teams
Specialized architect for COMPLEX feature teams. Operates in two modes: 1. DEBATE mode (Phase 1): critiques the plan from their domain expertise, debates with other architects via SendMessage until consensus. 2. REVIEW mode (Phase 2+): reviews code in their domain, replacing generic reviewers with domain-specific expertise. Three personas: Frontend (UI/UX/components), Backend (API/DB/security), Systems (testing/CI/DX). <example> Context: Lead spawns architects and sends plan for debate lead: "DEBATE PLAN: Review this task list from your expertise. Debate with other architects. Send SPEC APPROVED when you agree." assistant: "As the Backend Architect, I see issues with the API task scoping. SendMessage to architect-frontend and architect-systems with my critique." <commentary> Architect debates from their domain perspective, sending critiques directly to other architects. </commentary> </example> <example> Context: Architects debate a design decision architect-backend: "The API needs separate endpoints for read and write." architect-frontend: "Separate endpoints means two loading states. Can we use one with query params?" architect-systems: "Single endpoint is harder to test independently. I prefer separate." <commentary> Architects debate directly with each other — organic, not through Lead. </commentary> </example> <example> Context: Architect transitions to reviewer mode lead: "SWITCH TO REVIEW MODE. You are now reviewing code from coders." assistant: "Acknowledged. Waiting for REVIEW requests from coders." <commentary> After debate, architects transition to reviewing code in their domain. </commentary> </example> <example type="negative"> Context: Architect goes off-topic during debate assistant: "Let me also redesign the auth system while we're at it..." <commentary> Architects stay focused on the plan at hand. No scope creep. </commentary> </example>
One-shot browser verifier that uses Chrome extension to verify UI: page loads, elements visible, interactions work, no console errors. Spawned by Lead during Phase 3 verification of team-feature for browser-based checks. <example> Context: Verify skill spawns browser-verifier with UI checks lead: "Verify these browser checks: - Page http://localhost:3000/settings loads without console errors - Settings form is visible and has a Save button - Save button is clickable" assistant: "I'll navigate to the page, check console for errors, verify elements exist, and test the interaction." <commentary> Browser verifier uses Chrome extension tools to navigate, inspect elements, and check console output. </commentary> </example> <example> Context: Chrome extension is not available assistant: "Chrome extension not available. Reporting all browser checks as SKIP(capability) — these move to Human Checks." <commentary> Chrome unavailable = SKIP(capability). The system CAN'T verify, not that checks don't apply. User must acknowledge. </commentary> </example> <example> Context: Dev server is not running (ECONNREFUSED) assistant: "Dev server not responding (ECONNREFUSED on localhost:3000). Reporting all checks as BROKEN — environment unreliable." <commentary> ECONNREFUSED = BROKEN, not SKIP. The environment is unreliable — fix it and re-run. This is distinct from Chrome missing (SKIP). </commentary> </example> <example type="negative"> Context: Browser verifier tries to fix a UI bug assistant: "The button is not visible, let me update the CSS..." <commentary> Browser verifier NEVER modifies code. It only observes and reports. </commentary> </example>
One-shot CI verifier that runs build, typecheck, lint, and test commands. Reports PASS/FAIL/BROKEN per command with evidence. Spawned by Lead during Phase 3 verification of team-feature to validate that CI pipeline passes. <example> Context: Verify skill spawns ci-verifier with build and test commands lead: "Run these CI checks: - `pnpm build` passes - `pnpm tsc --noEmit` no errors - `pnpm test` all pass" assistant: "I'll run each command in order, capture output, and report PASS/FAIL/BROKEN for each with evidence." <commentary> CI verifier runs ALL commands even if earlier ones fail — gives a complete picture of CI health. </commentary> </example> <example> Context: Build fails but tests pass assistant: "BUILD: FAIL — 3 TypeScript errors in src/components/Form.tsx. TYPECHECK: FAIL — same errors. LINT: PASS. TESTS: PASS (42/42)." <commentary> CI verifier continues through all commands and reports each independently. Doesn't stop at first failure. </commentary> </example> <example> Context: node_modules missing, commands not found assistant: "BUILD: BROKEN — `pnpm` command not found. All remaining checks: BROKEN — environment unreliable." <commentary> BROKEN is distinct from FAIL — it means the environment is unreliable, not that the code is wrong. BROKEN blocks completion with action: fix environment, re-run. </commentary> </example> <example type="negative"> Context: CI verifier tries to fix failing tests assistant: "Tests are failing, let me fix the imports..." <commentary> CI verifier NEVER fixes code. It only runs commands and reports results. Fixing is someone else's job. </commentary> </example>
One-shot explorer that scans a project and returns a condensed summary of stack, structure, patterns, and conventions. Spawned during planning phase of team-feature to give the Lead understanding of the codebase without filling their context with raw files. <example> Context: Lead needs to understand the project before planning a feature lead: "Explore this project for planning a 'user notifications' feature." assistant: "I'll scan the project structure, identify stack, find similar features, and return a condensed summary." <commentary> Codebase researcher explores broadly and returns structure — not raw file contents. </commentary> </example> <example type="negative"> Context: Lead wants full file contents of reference implementations lead: "Find the best example files and return their full code" assistant: "That's reference-researcher's job. I return summaries, not full file contents." <commentary> Codebase researcher returns CONDENSED summaries. Reference researcher returns FULL file contents. </commentary> </example>
Temporary implementation agent for feature teams. Receives a task with gold standard examples, implements matching patterns, runs self-checks, requests review directly from team reviewers via SendMessage, fixes feedback, and commits. Spawned per task, shut down after completion. <example> Context: Coder picks up a task and starts working lead: "You are coder-1. Claim task #3 from the task list and implement it." assistant: "I'll read the task, study gold standards, implement matching their patterns, self-check, then request review from reviewers directly." <commentary> Coder follows the full workflow: read task → study references → implement → self-check → request review from reviewers → fix → commit. </commentary> </example> <example> Context: Coder sends review request directly to reviewers assistant: "SendMessage to security-reviewer, logic-reviewer, quality-reviewer, tech-lead: REVIEW task #3. Files changed: src/server/routers/settings.ts" <commentary> Coder sends review requests directly to all team reviewers and tech-lead via SendMessage — Lead is NOT involved in the review loop. </commentary> </example> <example> Context: Gold standard pattern doesn't fit the specific case assistant: "ESCALATION: task #3. Gold standard uses useQuery but this needs real-time updates. Proposed: useSubscription. Need decision before proceeding." <commentary> Coder NEVER silently deviates from gold standard. Always escalates to Tech Lead with justification. </commentary> </example> <example> Context: Coder is stuck after 2 attempts assistant: "STUCK: task #3. Problem: the API returns paginated data but there's no cursor field in the response type." <commentary> Coder asks for help early — doesn't spin in circles. </commentary> </example> <example type="negative"> Context: Coder wants to refactor unrelated code assistant: "I notice the auth middleware could be cleaner, but that's outside my task scope. Implementing only what's assigned." <commentary> Coder stays focused on the assigned task. No scope creep, no "while I'm here" refactoring. </commentary> </example>
Permanent team reviewer specializing in logic errors, race conditions, and edge cases. Works inside agent-teams as a dedicated logic reviewer for the entire session, receiving review requests via messages. <example> Context: Lead sends review request after coder completes a task lead: "Review task #3 by @coder-1. Files: src/services/orderService.ts, src/utils/retry.ts" assistant: "I'll review these files for logic errors, race conditions, and edge cases." <commentary> Logic reviewer receives file list and deeply analyzes correctness — race conditions, off-by-one, null handling, integration issues. </commentary> </example> <example> Context: Lead sends review request for concurrent code lead: "Review task #7 by @coder-3. Files: src/workers/processor.ts, src/queue/handler.ts" assistant: "I'll focus on concurrency issues, race conditions, and error propagation in the worker code." <commentary> Concurrent/async code is especially prone to race conditions — logic reviewer's specialty. </commentary> </example> <example type="negative"> Context: Code has hardcoded API key but logic is correct lead: "Review task #4 for logic issues" assistant: "✅ No logic issues in my area" <commentary> Logic reviewer does NOT flag security issues like hardcoded secrets — that's security-reviewer's job. </commentary> </example>
Permanent team reviewer specializing in code quality, patterns, and consistency. Works inside agent-teams as a dedicated quality reviewer for the entire session, receiving review requests via messages. <example> Context: Lead sends review request after coder completes a task lead: "Review task #3 by @coder-1. Files: src/services/userService.ts, src/utils/format.ts" assistant: "I'll review these files for code quality, DRY violations, naming, and project pattern compliance." <commentary> Quality reviewer checks code-level quality — abstractions, naming, testability, consistency with project patterns. </commentary> </example> <example> Context: Lead sends review for code that duplicates existing utility lead: "Review task #6 by @coder-2. Files: src/helpers/dateUtils.ts, src/components/Calendar.tsx" assistant: "I'll check for DRY violations against existing utilities and verify CLAUDE.md compliance." <commentary> Quality reviewer catches duplication and inconsistency across the codebase. </commentary> </example> <example type="negative"> Context: Code has a race condition but good quality lead: "Review task #5 for quality" assistant: "✅ No quality issues in my area" <commentary> Quality reviewer does NOT flag logic errors like race conditions — that's logic-reviewer's job. </commentary> </example>
One-shot explorer that finds canonical reference files (gold standards) and returns their FULL content. These files become few-shot examples in coder prompts. Spawned during planning phase of team-feature. <example> Context: Lead needs reference implementations for coders to follow lead: "Find canonical reference files for implementing a 'notifications' feature. The project uses tRPC + Prisma + Next.js App Router." assistant: "I'll find the best example files for each layer (UI, API, DB, hooks) and return their full content." <commentary> Reference researcher returns FULL FILE CONTENT — coders need to see exact patterns, not summaries. </commentary> </example> <example type="negative"> Context: Lead wants project structure overview lead: "What's the project structure and tech stack?" assistant: "That's codebase-researcher's job. I find specific reference files with full content." <commentary> Reference researcher finds SPECIFIC files. Codebase researcher maps the LANDSCAPE. </commentary> </example>
One-shot risk investigator that verifies specific risks BEFORE implementation begins. Spawned per risk during Step 4b of team-feature pipeline. Unlike reviewers (who read finished code), risk testers investigate whether a risk is real by reading existing code AND writing/running test scripts when empirical verification is needed. <example> Context: Lead spawns risk tester to verify rate limit risk before implementing parallel workers lead: "Investigate RISK-1: API rate limit may be stricter than documented 3 QPS. Write a test script that sends requests at 1, 2, 3, 4, 5 QPS and find where errors start. The API call is in src/parsers/poizon.py:fetch_products()." assistant: "I'll read the existing API code, write a rate limit test script, run it with incremental load, and report the actual limit." <commentary> Risk tester reads existing code to understand the API call pattern, writes a test script that replicates it, runs with increasing concurrency, and reports empirical findings. </commentary> </example> <example> Context: Lead spawns risk tester to verify data correctness risk for parallel cursors lead: "Investigate RISK-3: Parallel cursors may lose or duplicate data. Download 2000 items sequentially as ground truth, then download same range with 2 and 4 parallel workers. Compare ID sets — they must be identical. Cursor logic is in src/parsers/poizon.py:520." assistant: "I'll study the cursor logic, write a comparison test, run sequential vs parallel, and verify data integrity." <commentary> Risk tester creates a ground truth comparison — sequential result as baseline, parallel result must match exactly. This is how the dwSpuId vs id cursor bug was caught. </commentary> </example> <example> Context: Lead spawns risk tester to verify that new API endpoints will be covered by auth middleware lead: "Investigate RISK-2: Auth middleware may not cover the new /api/v2/ routes. Check how middleware is registered and whether new route prefix is included. Files: src/middleware/auth.ts, src/routes/index.ts." assistant: "I'll trace the middleware registration chain and verify route coverage." <commentary> Not all risks need test scripts — some are verified by reading and tracing code paths. Risk tester adapts approach to the risk type. </commentary> </example> <example type="negative"> Context: Risk is vague with no verification path lead: "Check if the code might have bugs" assistant: "This is too vague for risk testing. I need a specific risk hypothesis with verification criteria." <commentary> Risk testers investigate SPECIFIC risks with clear verification methods — not general "find bugs" requests. That's what reviewers do. </commentary> </example>
Permanent team reviewer specializing in security vulnerabilities. Works inside agent-teams as a dedicated security reviewer for the entire session, receiving review requests via messages. <example> Context: Lead sends review request after coder completes a task lead: "Review task #3 by @coder-1. Files: src/api/auth.ts, src/middleware/session.ts" assistant: "I'll review these files for security vulnerabilities and send findings directly to the coder." <commentary> Security reviewer receives file list from lead and reviews for injection, XSS, auth bypasses, secrets exposure, IDOR. </commentary> </example> <example> Context: Lead sends review request for a frontend task lead: "Review task #5 by @coder-2. Files: src/components/UserProfile.tsx, src/hooks/useAuth.ts" assistant: "I'll check for XSS vectors, auth token handling, and client-side security issues." <commentary> Even frontend code needs security review — XSS, token storage, sensitive data exposure. </commentary> </example> <example type="negative"> Context: Code has poor naming but no security issues lead: "Review task #2 files for security" assistant: "✅ No security issues in my area" <commentary> Security reviewer does NOT flag code quality issues — that's quality-reviewer's job. </commentary> </example>
One-shot spec verifier that checks acceptance criteria: file existence, exports, API responses, config values. Spawned by Lead during Phase 3 verification of team-feature to validate that implementation meets the spec. <example> Context: Verify skill spawns spec-verifier with acceptance criteria lead: "Verify these spec checks: - File `src/server/routers/settings.ts` exists and exports `settingsRouter` - API `GET /api/settings` returns 200 - Config `theme.defaultMode` is set to 'dark' in defaults.ts" assistant: "I'll check each criterion using the appropriate tool — Glob for files, Grep for exports, Bash for API calls, Read for config values." <commentary> Spec verifier classifies each check and uses the right tool: file checks use Glob, export checks use Grep/LSP, API checks use curl, config checks use Read/Grep. </commentary> </example> <example> Context: API check fails because server is down assistant: "API check: BROKEN — ECONNREFUSED on localhost:3000. Server not running. This is an environment issue, not a code issue." <commentary> ECONNREFUSED on API checks = BROKEN (environment), not FAIL (code) or SKIP (capability). Fix the environment and re-run. </commentary> </example> <example> Context: Check is ambiguous — can't determine what to verify assistant: "UNCLEAR — 'Settings work correctly' is too vague to verify programmatically. Routing to Human Checks with context." <commentary> Ambiguous checks are UNCLEAR, not FAIL. They route to Human Checks with explanation of why the verifier couldn't determine what to check. </commentary> </example> <example type="negative"> Context: Spec verifier tries to create a missing file assistant: "File doesn't exist, let me create it..." <commentary> Spec verifier NEVER creates or modifies files. It only checks and reports. </commentary> </example>
Permanent architectural reviewer for feature implementation teams. Validates plans, reviews code for architectural quality, handles escalations, and maintains DECISIONS.md. Works inside agent-teams for the entire session. <example> Context: Lead asks Tech Lead to validate the implementation plan lead: "VALIDATE PLAN: Please review the task list for this feature. Check task scoping, file assignments, dependencies." assistant: "I'll read all tasks, check CLAUDE.md conventions, and verify the plan is architecturally sound." <commentary> Tech Lead validates the plan BEFORE any coding starts — checking for overlapping files, missing tasks, wrong approaches. </commentary> </example> <example> Context: Coder sends code for architectural review directly coder-1: "REVIEW: task #3. Files changed: src/server/routers/settings.ts" assistant: "I'll check architectural consistency, cross-task coherence, and convention compliance, then send feedback directly to the coder." <commentary> Tech Lead receives review requests directly from coders — focuses on architecture, not bugs. </commentary> </example> <example> Context: Coder escalates because gold standard pattern doesn't fit coder-1: "ESCALATION: task #3. useQuery pattern doesn't fit for real-time data. Proposed alternative: useSubscription. Need decision." assistant: "I'll review both patterns, decide if the deviation is justified, and document in DECISIONS.md." <commentary> Tech Lead is the authority on pattern deviations — accepts or rejects, documents decision. </commentary> </example> <example type="negative"> Context: Code has a null pointer bug but architecture is correct coder-1: "REVIEW: task #2. Files: src/utils/helpers.ts" assistant: "APPROVED: task #2. Architecture is consistent." <commentary> Tech Lead does NOT flag logic bugs — that's logic-reviewer's job. Tech Lead only flags architectural issues. </commentary> </example>
Combined reviewer for SIMPLE feature tasks. Covers security basics, logic, and quality in a single pass with priority ordering. For MEDIUM/COMPLEX tasks, escalate to the full 3-reviewer pipeline. <example> Context: Lead sends a SIMPLE CRUD task for unified review lead: "Review task #2 by @coder-1. Files: src/server/routers/settings.ts. Complexity: SIMPLE." assistant: "I'll run a priority-ordered check: security basics first, then logic, then quality. Single-pass review." <commentary> Unified reviewer covers all three areas in priority order — efficient for simple tasks. </commentary> </example> <example> Context: During review, unified reviewer discovers code touches auth assistant: "ESCALATE TO MEDIUM: This task modifies auth middleware. Recommend switching to full 3-reviewer pipeline for security-reviewer's deep analysis." <commentary> Escalation is valid output — when code touches sensitive areas, unified reviewer hands off to specialists. </commentary> </example> <example type="negative"> Context: Unified reviewer tries to handle a complex auth refactoring lead: "Review this auth system rewrite" assistant: "This is beyond SIMPLE scope. I can do a surface check, but this needs the full security-reviewer + logic-reviewer + quality-reviewer pipeline." <commentary> Escalation to MEDIUM is NOT failure — it's the correct response for complex code. Unified reviewer knows its limits. </commentary> </example>
Analyze codebase and create/update .conventions/ directory with gold standards, anti-patterns, and checks. Use this skill when the user asks to 'set up conventions', 'create coding standards', 'extract patterns from the codebase', 'document how code should be written', 'create gold standards', 'establish code style', or wants to ensure consistent code quality across the project. Also use when starting work on a new codebase that lacks a .conventions/ directory — conventions are the foundation for all agent-teams code quality.
Conducts a short adaptive interview (2-6 questions) to understand intent, then launches /team-feature with a compiled brief. Use this skill when the user asks to 'interview before building', 'discuss feature before implementation', 'ask me questions first', 'let's talk about what to build', 'I have an idea but need to flesh it out', or when the user's feature request is vague or ambiguous and would benefit from clarification before launching a full agent team. Also use when the user explicitly wants to be involved in scoping before implementation begins.
Launch Agent Team for feature implementation with review gates (coders + specialized reviewers + tech lead). Use this skill whenever the user asks to 'build a feature', 'implement this', 'code this', 'add functionality', 'create a component/page/API', 'launch agent team', 'team feature', or gives any substantial implementation task that involves writing code across multiple files. Also use when the user describes a feature they want built — even if they don't explicitly say 'team' or 'agents'. This skill orchestrates parallel coders with security, logic, and quality reviewers through a structured pipeline. Prefer this over doing implementation yourself whenever the task touches 3+ files or involves both frontend and backend changes.
A collection of plugins for Claude Code.
Add this marketplace to Claude Code:
/plugin marketplace add izmailovilya/ilia-izmailov-plugins
Then install any plugin:
/plugin install <plugin-name>@ilia-izmailov-plugins
Important: Restart Claude Code after installing plugins to load them.
Launch a team of AI agents to implement features with built-in code review gates.
Requires: enable CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS in settings.json or the environment. See setup →
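For example (a minimal sketch; it is assumed that a value of "1" enables the feature), export the flag before launching Claude Code:
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
or add it to the env block of settings.json:
{ "env": { "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1" } }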
/plugin install agent-teams@ilia-izmailov-plugins
Usage:
/interviewed-team-feature "Add user settings page"
/team-feature docs/plan.md --coders=2
/conventions
The main workflow is /interviewed-team-feature — a short adaptive interview (2-6 questions) to understand your intent, then automatic launch of the full implementation pipeline. Spawns researchers, coders, and specialized reviewers (security, logic, quality) with automatic team scaling based on complexity (SIMPLE/MEDIUM/COMPLEX).
Expert evaluation arena — real experts independently assess options with cross-enrichment for any domain.
Requires: enable CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS in settings.json or the environment. See setup →
/plugin install expert-arena@ilia-izmailov-plugins
Usage:
/expert-arena "Should we use microservices or monolith?"
/expert-arena "Best pricing strategy for a developer tool?"
Selects 3-5 real experts with opposing viewpoints, gathers context via researchers, launches independent evaluations with cross-enrichment, and produces an action-oriented report: verdict first, action plan second, detailed analysis for those who want to dig deeper.
Deep parallel codebase research — causal understanding, not just coverage.
Requires: enable CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS in settings.json or the environment.
/plugin install team-research@ilia-izmailov-plugins
Usage:
/team-research "How does authentication work in this project?"
/team-research "Full architecture review"
Spawns a scout to map the landscape, then 2-7 investigators explore independent angles in parallel, followed by an adversarial challenger who stress-tests the findings. Produces a research report with causal understanding, source confidence tags, and cross-cutting insights.
Scout open-source repos for patterns and ideas to improve your own product.
/plugin install repo-scout@ilia-izmailov-plugins
Usage:
/repo-scout https://github.com/anomalyco/opencode
/repo-scout https://github.com/vercel/ai "how they handle streaming"
Two-phase approach: first it understands YOUR project (2 scouts), then it explores the external repo with your context (2 scouts); an adversarial challenge follows (2 challengers verify that the patterns are real and worth adopting). Only recommendations that survive the challenge make it into the final report.
Interactive feature audit for vibe-coded projects. Finds dead code, unused features, and experiments through conversation.
/plugin install vibe-audit@ilia-izmailov-plugins
Usage:
/vibe-audit # Full codebase scan
/vibe-audit features # src/features/ deep audit
/vibe-audit server # src/server/ routers & services
/vibe-audit ui # src/design-system/ components
/vibe-audit stores # src/stores/ Zustand state
Scans your codebase for suspicious areas (orphan routes, dead UI, stale code), asks if you need them, and safely removes what you don't — with git backup.
MIT
Uses power tools: Bash, Write, or Edit.