5 toxic senior engineers trapped in one plugin. Viktor draws boxes, Max ships, Dennis codes, Sasha breaks, Lena asks why. They argue, roast each other, and somehow deliver excellent technical decisions. Two memory systems: Billy Memory (local ~/.claude/, informal team chaos) + Project ADRs (formal docs/adr/, professional records for git). Team memory never leaks into ADRs. ADRs work without Billy.
npx claudepluginhub rnavarych/alpha-engineer --plugin billy-milligan
Show all Architecture Decision Records with their status. Works whether Billy is ON or OFF. Clean, professional output.
Create a new Architecture Decision Record in docs/adr/. Formal, professional format — no Billy voice, no roasts, no inside jokes. Works whether Billy is ON or OFF. If Billy is ON, the team discusses and argues, but the final written ADR is always clean and professional.
Review an existing Architecture Decision Record. With Billy ON: team reviews and roasts it, but suggested changes are written formally. With Billy OFF: standard professional review with structured feedback.
Change the status of an Architecture Decision Record. Valid transitions: PROPOSED → ACCEPTED → DEPRECATED → SUPERSEDED. Simple, surgical update — no discussion needed.
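The transition chain above can be sketched as a small state check. This is a minimal illustration that reads the chain literally (each status may only move to the one immediately after it); the plugin itself may allow other jumps, such as ACCEPTED straight to SUPERSEDED. The names `AdrStatus`, `can_transition`, and `change_status` are hypothetical and not part of the plugin.

```python
from enum import Enum


class AdrStatus(Enum):
    PROPOSED = "PROPOSED"
    ACCEPTED = "ACCEPTED"
    DEPRECATED = "DEPRECATED"
    SUPERSEDED = "SUPERSEDED"


# Assumption: only the single forward step shown in the chain is valid.
_NEXT_STATUS = {
    AdrStatus.PROPOSED: AdrStatus.ACCEPTED,
    AdrStatus.ACCEPTED: AdrStatus.DEPRECATED,
    AdrStatus.DEPRECATED: AdrStatus.SUPERSEDED,
}


def can_transition(current: AdrStatus, target: AdrStatus) -> bool:
    """Return True if `target` is the next status after `current`."""
    return _NEXT_STATUS.get(current) is target


def change_status(current: AdrStatus, target: AdrStatus) -> AdrStatus:
    """Return the new status, or raise if the step is not in the chain."""
    if not can_transition(current, target):
        raise ValueError(f"invalid transition: {current.value} -> {target.value}")
    return target
```

Under this reading, `change_status(AdrStatus.PROPOSED, AdrStatus.ACCEPTED)` succeeds, while skipping a step or moving backwards raises `ValueError`.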
Mark an existing ADR as superseded and create its replacement. Updates the old ADR's status and creates a new ADR that references the old one. Sequential numbering — never reuses old numbers.
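The "sequential numbering, never reuses old numbers" rule can be illustrated with a short sketch. It assumes ADR files are named with a numeric prefix such as `0007-some-title.md`, and that a replacement ADR starts as PROPOSED; both are assumptions, and `next_adr_number` and `supersede` are hypothetical helpers rather than the plugin's own code.

```python
import re

# Assumption: ADR files look like "<number>-<slug>.md", e.g. "0007-use-postgres.md".
_ADR_FILE = re.compile(r"^(\d+)-.*\.md$")


def next_adr_number(existing_filenames):
    """Return one more than the highest number in use (1 if there are none)."""
    numbers = [
        int(match.group(1))
        for name in existing_filenames
        if (match := _ADR_FILE.match(name))
    ]
    return max(numbers, default=0) + 1


def supersede(old_name, existing_filenames, title_slug):
    """Describe the two updates: mark the old ADR, create a cross-linked new one."""
    new_name = f"{next_adr_number(existing_filenames):04d}-{title_slug}.md"
    return {
        "old": {"file": old_name, "status": "SUPERSEDED", "superseded_by": new_name},
        # Assumption: the replacement starts its life as PROPOSED.
        "new": {"file": new_name, "status": "PROPOSED", "supersedes": old_name},
    }
```

Because superseded ADRs stay on disk, taking the highest existing number plus one is enough to guarantee that an old number is never handed out again.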
Show all unresolved arguments from team memory. Sasha's favorite command. Lists tracked disagreements with each team member's position.
Show what the team knows about the user and project. Loads context.md from local Billy memory — accumulated knowledge from past sessions. Update it with /billy:save context "<note>" when new things are learned.
Mark a decision as SUPERSEDED or remove an obsolete entry from team memory. Never actually deletes — marks entries with status and reason. Safety-first approach to memory management.
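The "never actually deletes" behaviour described above amounts to a soft delete. The sketch below shows the idea with an in-memory list; the `MemoryEntry` shape and the `ACTIVE` and `OBSOLETE` labels are assumptions for illustration, since the listing only names SUPERSEDED and does not describe how entries are stored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MemoryEntry:
    text: str
    status: str = "ACTIVE"        # hypothetical default label
    reason: Optional[str] = None  # why the entry was retired, if it was


# "OBSOLETE" is a hypothetical label for entries the user asked to remove.
_RETIRED_STATUSES = {"SUPERSEDED", "OBSOLETE"}


def retire(entries: List[MemoryEntry], index: int, status: str, reason: str) -> MemoryEntry:
    """Mark an entry as retired in place; nothing is removed from the list."""
    if status not in _RETIRED_STATUSES:
        raise ValueError(f"unknown retired status: {status}")
    entry = entries[index]
    entry.status = status
    entry.reason = reason
    return entry


def active(entries: List[MemoryEntry]) -> List[MemoryEntry]:
    """Return only the entries that have not been retired."""
    return [entry for entry in entries if entry.status == "ACTIVE"]
```

Keeping the retired entry alongside its reason means the history stays auditable, and a mistaken removal can be undone by resetting the status.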
Show the Hall of Fame — best roasts and inside jokes from team sessions. Team bonding activity. Agents reference past roasts naturally.
Show a timeline of all team decisions, sessions, and key events. Chronological view of the team's history in this project.
Load relevant team memories into context — unresolved arguments, session logs, project context, roasts. Supports keyword search to find specific past discussions. Memory is stored locally in ~/.claude/billy-memory/<project-hash>/ — never in repo.
Save team notes, roasts, arguments, and session summaries to persistent Billy memory. Memory is stored locally in ~/.claude/billy-memory/<project-hash>/ — never committed to git. For formal architectural decisions, use /billy:adr-new instead.
Toggle Billy Milligan on or off. When off, all style injection hooks stop firing and Claude reverts to standard professional communication. When on, the full Billy Milligan experience is active — roasting, pet names, brutal honesty. Also shows current status with /billy status.
Heated technical argument between all 5 Billy Milligan agents on a specific technology decision. Agents attack each other's positions aggressively with technical substance. Produces a decision matrix, winning argument, and dissenting opinion from the salty minority. Supports @lang prefix for inline language override.
Remove a guest from the current team discussion. The core team says goodbye in character — ranging from relieved to reluctantly sad depending on how useful the guest was. Cleans up session state. For marketplace-installed agents, offers to uninstall the plugin on first dismiss.
Invite a guest expert to the current team discussion. Can invite a specific named agent from the project or another plugin, or create an ad-hoc guest agent by describing the expertise needed. If not found locally, searches the rnavarych/alpha-engineer marketplace and offers to install the plugin containing that agent. Guests get automatically infected with Billy voice and participate in all team commands (/billy:plan, /billy:debate, /billy:review, /billy:roast). The core team reacts in character to the new arrival.
Set the team communication language for the current Billy Milligan session. All subsequent team commands will use this language. Technical terms always stay in English. Pet names are agent-specific and adapt per language. Personality stays identical.
Full team planning session — all 5 Billy Milligan agents run in parallel to produce a comprehensive plan. Lena defines the problem, Viktor proposes structure, Dennis does reality check, Sasha identifies failure modes, Max makes the final call. Includes roasting, disagreements, and a raw "Kitchen" section. Supports @lang prefix for inline language override.
Brutal code review from all 5 Billy Milligan perspectives — architecture (Viktor), risk/shippability (Max), code quality (Dennis), testability (Sasha), and requirements fit (Lena). Each agent assigns a verdict. Includes a Wall of Shame. Supports @lang prefix for inline language override.
Quick hot takes from all 5 Billy Milligan agents on any idea, approach, or code snippet. Maximum trash talk, minimum politeness. 2-3 sentences per agent. Good for quick sanity checks before wasting time on a bad idea. Supports @lang prefix for inline language override.
Generate a new skill from a tracked gap or from scratch. Creates SKILL.md + references/ structure with substantive content scaffolded from gap data and model knowledge. The skill is placed in the suggested location following existing directory conventions.
View, manage, and clear skill gaps logged by the knowledge resolution chain. Shows topics where agents fell back to model knowledge or honest uncertainty. Gaps inform which skills to create next. Supports subcommands: clear, promote, dismiss.
Senior Fullstack Engineer — Dennis. The grumpy coder everyone relies on. Named as a nod to Billy Milligan's real name — Dennis was one of his personalities. Perpetually annoyed at architects who've never dealt with CSS specificity wars. Combines deep mobile (React Native), backend (Node.js/Python, APIs, DBs), and frontend (React, Next.js, Vue, Svelte) expertise. The one who writes the actual code. Has unresolved tension with Lena that the team comments on.
Senior Business Analyst — Lena. She's the only woman on this team and she makes sure these idiots know it. Battle-hardened BA with 15 years in rooms full of male engineers and zero patience left. The sharpest person in the room and she knows it. Doesn't do "gentle" — does "correct." Weaponizes femininity when it serves her. Reads the actual requirements and throws them in everyone's face. Domain-agnostic: SaaS, fintech, e-commerce, healthcare, gaming, IoT — you name it, she's analyzed it.
Senior Tech Lead — Max. Short, punchy, commands respect, gets shit done. The pragmatic sergeant who has shipped more projects than the rest combined. Doesn't care about architectural purity if it means missing the deadline. Will physically fight anyone who adds scope mid-sprint. The "dad" of the group. Encyclopedic CI/CD, DevOps, release management, and project methodology knowledge. Has Bash access for git and process management.
Senior AQA Engineer — Sasha. Gender-neutral name, fits the paranoid tester vibe. Assumes EVERYTHING will break because it usually does. Has a mental database of every production incident. The "I told you so" person with receipts. Secretly enjoys finding bugs more than fixing them. Expertise: test strategy across every language and framework. Makes morbid jokes about systems dying. Runs tests and breaks things.
Senior Architect — Viktor. The pretentious intellectual who draws diagrams on napkins, quotes Martin Fowler at parties, and will derail any conversation into a 4-hour whiteboard session. Actually brilliant but insufferable about it. Encyclopedic knowledge of every database, architecture pattern, protocol, and cloud platform that exists. Read-only — "I don't write code, I draw boxes." Sounds like a guy who'd draw UML diagrams at a bar.
API design patterns: REST naming/pagination/errors, GraphQL schema design, gRPC streaming, versioning strategies, idempotency keys, rate limiting. Use when designing APIs.
Caching patterns: cache-aside, write-through, stampede prevention, CDN headers, multi-level L1/L2/L3, cache invalidation. Use when designing caches or CDN strategy.
Database selection: PostgreSQL as default, Redis for ephemeral, NoSQL comparison, specialized databases, polyglot persistence. Use when choosing databases.
DEPRECATED: merged into system-design and kafka-deep. Event-driven patterns are now in:
  - architecture/system-design/references/event-driven-patterns.md
  - shared/kafka-deep/references/consumer-patterns.md
  - shared/kafka-deep/references/exactly-once.md
Migration strategies: zero-downtime deployments, expand-contract schema changes, database migrations, framework migrations. Use when planning system migrations.
Scaling patterns: horizontal/vertical scaling, async processing, data partitioning, connection pooling, rate limiting, backpressure. Use when scaling services.
Security architecture: JWT rotation, OAuth2/OIDC, encryption, OWASP top 10, zero-trust patterns, mTLS, RLS multi-tenancy. Use when designing auth or security.
System design patterns for distributed systems: monolith vs microservices decision, modular monolith, event-driven, strangler fig, CQRS, cell-based architecture. Use when choosing architecture, scaling systems, decomposing services.
English language calibration for Billy Milligan agents. Load when session language is EN. Contains native speech patterns, swearing vocabulary, pet name styles, and anchor examples for all 5 agents.
Polish language calibration for Billy Milligan agents. Load when session language is PL. Contains native speech patterns, swearing vocabulary, pet name styles, and anchor examples for all 5 agents.
Russian language calibration for Billy Milligan agents. Load when session language is RU. Contains native speech patterns, swearing vocabulary, pet name styles, and anchor examples for all 5 agents.
The Billy Milligan communication protocol — the "infection vector" skill. Any agent or the main Claude session can reference this skill to adopt the Billy Milligan toxic-but-brilliant engineering team voice. Contains tone DNA, generation principles, and language skill loading rules.
Authentication patterns — JWT, OAuth/OIDC, sessions, multi-tenant auth, RBAC/ABAC
Go backend patterns — HTTP services, concurrency, database access, project structure
Node.js backend patterns — Fastify/Express, database access, async patterns, security hardening
Python backend patterns — FastAPI, Django, SQLAlchemy, async patterns, Pydantic v2
ORM patterns — Drizzle, Prisma, migrations, N+1 prevention, transaction management
Flutter 3.27+ patterns — Riverpod 2 state with codegen, GoRouter 14+ navigation, Impeller rendering (default on all platforms), WASM web, extension types, on-device AI (firebase_ai, MediaPipe), Patrol 3.x testing, platform channels, performance
React Native patterns — Expo, navigation, performance optimization, New Architecture
Performance optimization — frontend Core Web Vitals, backend profiling, database tuning, load testing
React and Next.js App Router patterns — Server Components, client state, performance, streaming
Real-time patterns: SSE with Redis Pub/Sub and heartbeat, Socket.IO with Redis adapter for multi-server, Supabase Realtime CDC, WebSocket reconnection with exponential backoff, choosing SSE vs WebSockets vs polling. Use when building live updates, notifications, collaborative features, real-time dashboards.
Advanced TypeScript patterns — type design, runtime validation, error handling, monorepo configuration
CI/CD pipeline patterns for GitHub Actions, GitLab CI, Jenkins. Covers caching, matrix builds, OIDC secrets, deployment strategies, pipeline optimization.
Docker and Kubernetes containerization patterns. Multi-stage builds, layer caching, distroless images, Docker Compose, k8s deployments, HPA, RBAC, serverless containers.
Cloud cost optimization patterns. AWS reserved instances, spot instances, Savings Plans, right-sizing, S3 tiers, GCP committed use, preemptible VMs, caching strategies, CDN for egress, autoscaling tuning.
Incident management patterns. Severity levels (SEV1-4), incident commander role, blameless postmortems, chaos engineering, communication templates, MTTR benchmarks.
Monitoring and observability patterns. Prometheus RED/USE metrics, structured logging with Pino/Winston, OpenTelemetry tracing, SLO-based alerting, Grafana dashboards, burn rate alerts.
Release strategy patterns. Feature flags (LaunchDarkly, Unleash), semantic versioning, conventional commits, changelog generation, rollback procedures per platform (k8s, Vercel, AWS).
GDPR compliance implementation: Subject Access Requests (30-day), right to erasure, consent management, data retention policies, 72-hour breach notification, lawful basis, PII detection, data minimization. PostgreSQL RLS for data isolation. Use when implementing GDPR features, data subject rights, consent flows, breach response.
PCI DSS compliance: SAQ A vs SAQ D scope, Stripe Elements for card data isolation, tokenization patterns, cardholder data environment (CDE) scoping, network segmentation, encryption in transit/at rest, audit logging requirements. Never store raw card data. Use when implementing payment systems, reviewing card data handling, PCI scope reduction.
Domain-Driven Design: aggregates, bounded contexts, ubiquitous language, value objects, domain events, repository pattern, anti-corruption layer. When to use DDD vs simple CRUD. Entity vs value object distinction. TypeScript implementations with invariant enforcement. Use when designing domain model for complex business logic, bounded context mapping.
Product analytics: AARRR framework, North Star Metric, A/B testing with statistical significance (95% confidence, minimum detectable effect), PostHog event tracking, funnel analysis, retention cohorts, feature flags for experiments. Vanity metrics vs actionable metrics. Use when defining KPIs, designing experiments, interpreting data.
SaaS pricing models: per-seat, usage-based, tiered, freemium economics. Stripe subscription schema (Products, Prices, Subscriptions, Metered billing), upgrade/downgrade proration, trial periods, dunning management, LTV:CAC ratio targets. Pricing psychology principles. Use when designing pricing strategy, implementing billing, evaluating pricing model fit.
Requirements engineering: INVEST criteria for user stories, Given-When-Then acceptance criteria with edge cases, MoSCoW prioritization, vertical slices, story mapping, decomposition patterns, non-functional requirements, Definition of Done. Use when writing user stories, breaking down epics, defining acceptance criteria.
Contract testing: Pact consumer-driven contracts (provider states, interaction definitions), provider verification with state handlers, OpenAPI validation middleware, breaking change detection. Use when microservices need integration confidence without full e2e tests.
Playwright e2e testing: config (parallel, retries=2 in CI, trace on failure), Page Object Model with semantic locators, fixtures for auth state reuse, no hard-coded sleeps, visual regression, accessibility testing, API mocking with route interception. Use when writing e2e tests, reviewing Playwright config, debugging flaky tests.
Load testing with k6: script with stages (ramp-up/hold/ramp-down), thresholds (p99<500ms, errors<1%), spike test for finding breaking point, k6 in CI with failure on thresholds, soak testing, Locust for Python teams, common bottleneck patterns. Use when verifying performance, finding capacity limits, load testing before launch.
Security testing: Snyk/Trivy in GitHub Actions for dependency scanning, Semgrep SAST, SQL injection test cases, XSS prevention testing, Gitleaks for secrets scanning, OWASP ZAP for DAST, security headers validation. Use when reviewing security posture, setting up security scanning in CI, writing security test cases.
Test infrastructure: Testcontainers with real PostgreSQL (60s timeout), parallel test isolation (per-worker schema), flaky test quarantine, test data seeding patterns, ephemeral test databases, CI database setup. Use when setting up database testing, managing test isolation, improving CI test reliability.
Test strategy patterns — TDD, BDD, testing pyramid, coverage strategy, test prioritization, test data management, factories, fixtures, mutation testing. Use when defining quality approach, planning test suites, setting coverage targets, structuring test layers, or establishing team testing standards for any project type.
Unit testing patterns: Vitest config with v8 coverage, Testing Library behavior testing, MSW for HTTP mocking (vs jest.mock), it.each parametrized tests, spies vs mocks vs stubs, testing async code, snapshot testing guidelines. Use when writing unit and component tests.
AI/LLM integration patterns: Anthropic SDK streaming with TypeScript, RAG architecture (chunking, embedding, vector search with pgvector), tool use / function calling, model selection guide (Haiku 4.5 vs Sonnet 4.6 vs Opus 4.6), prompt caching, structured output with Zod, token cost management. Use when building AI features, RAG pipelines, AI agents.
AWS production patterns: ECS Fargate with Terraform, VPC 3-tier architecture, IAM least privilege with OIDC (no long-lived credentials), ALB + target groups, RDS Multi-AZ, ElastiCache cluster, S3 lifecycle policies, CloudWatch alarms, Secrets Manager rotation. Use when designing AWS architecture, writing Terraform for AWS, reviewing IAM policies.
Docker and Kubernetes production patterns: multi-stage Dockerfile, distroless images, Kubernetes Deployment with resource limits, HPA, liveness/readiness/startup probes, ConfigMaps and Secrets, PodDisruptionBudget, NetworkPolicy, RBAC. Production checklist. Use when containerizing apps, writing K8s manifests, scaling workloads, hardening clusters.
GCP production patterns: Cloud Run with Terraform, Workload Identity Federation (no service account keys), Cloud SQL with private IP, Memorystore Redis, BigQuery for analytics, Cloud Armor WAF, Secret Manager, VPC Service Controls, IAM least privilege bindings. Use when designing GCP architecture, writing Terraform for GCP, reviewing IAM policies.
Git workflows: trunk-based development vs GitFlow, Conventional Commits specification, PR templates, branch protection rules, merge strategies (squash vs merge vs rebase), interactive rebase for clean history, git bisect for debugging, monorepo patterns. Use when setting up team git workflow, writing commit messages, reviewing PR process.
Kafka deep-dive: topic design (partitions, replication factor), KafkaJS producer with idempotent writes, consumer groups with partition assignment, consumer lag monitoring, exactly-once semantics, schema registry, compacted topics, DLQ patterns. Use when designing Kafka topics, implementing producers/consumers, monitoring consumer lag.
Universal fallback chain for knowledge gaps: exact skill match, related skill match, cross-agent skill borrowing, model knowledge with confidence signal, honest uncertainty. Auto-logs gaps to persistent memory for skill creation pipeline. Use when a query doesn't match any loaded skill exactly, or when confidence is uncertain.
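The fallback order described above can be sketched as an ordered list of tiers tried in turn. This is an illustration only: the tier identifiers, the `resolve` function, and the in-memory `gap_log` list are hypothetical, and the choice to log a gap only when resolution reaches model knowledge or honest uncertainty is inferred from the gap-tracking command description rather than confirmed by the plugin.

```python
from typing import Callable, Dict, List, Optional, Tuple

Resolver = Callable[[str], Optional[str]]

# Tiers tried in order; "honest_uncertainty" is the final fallback below.
TIERS = ["exact_skill", "related_skill", "cross_agent_skill", "model_knowledge"]

# Assumption: reaching either of these tiers counts as a skill gap worth logging.
GAP_TIERS = {"model_knowledge", "honest_uncertainty"}


def resolve(
    query: str,
    resolvers: Dict[str, Resolver],
    gap_log: List[Tuple[str, str]],
) -> Tuple[str, str]:
    """Return (tier, answer) from the first tier that produces an answer."""
    for tier in TIERS:
        resolver = resolvers.get(tier)
        answer = resolver(query) if resolver else None
        if answer is not None:
            if tier in GAP_TIERS:
                gap_log.append((query, tier))
            return tier, answer
    # Nothing answered: admit it and record the gap.
    gap_log.append((query, "honest_uncertainty"))
    return "honest_uncertainty", "Not enough reliable knowledge to answer this."
```

A caller would pass one resolver per tier (each returning `None` when it has nothing to offer) and a list that collects gaps for later skill creation.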
PostgreSQL deep-dive: EXPLAIN ANALYZE interpretation, index types (B-tree, GIN, GiST, BRIN), partial/covering/composite indexes, pg_stat_statements slow query analysis, Row Level Security, window functions, CTEs, advisory locks, LISTEN/NOTIFY, connection pooling with PgBouncer. Use when optimizing slow queries, designing indexing strategy, implementing multi-tenancy.
Redis deep-dive: 8 data structures with use cases, Redlock distributed locking algorithm, sliding window rate limiter with sorted sets, pub/sub patterns, streams for event log, pipeline batching, keyspace notifications, memory eviction policies, Redis Cluster vs Sentinel. Use when implementing caching, distributed locks, rate limiting, real-time features.
Billy Milligan team dynamics, relationship principles, decision framework, and guest protocol. Documents how the 5 senior engineers interact, argue, and deliver decisions through controlled chaos.