Manage the full software lifecycle from idea to production with recursive agent swarms that research, plan, build, audit, upgrade, and deploy codebases. Evaluate quality across 10+ dimensions, chain security reviews, run automated QA with headless browser testing, and enforce gates on commits and deploys.
Niche-agnostic agentic evaluator using CLEAR v2.0 framework — 6-domain assessment, 8 analysis dimensions, 6-tier source prioritization, evidence strength ratings, and decision trees. Evaluates any plan, codebase, or research output.
Idea-to-running-code lifecycle orchestration. 10-phase pipeline with 5 hard decision gates, wave-based parallelism, and STATE.json resumability. Composes /deep-research, /auto-swarm-nth, /production-upgrade, /security-audit, and /ship into a single end-to-end flow.
Self-improving agent optimization — generates challenger variants of any agent/command, benchmarks against baseline, promotes winners, logs learnings to instincts. Inspired by Karpathy's autoresearch pattern.
Nth-iteration agent swarm — spawns parallel agent waves, evaluates strictly per wave, re-swarms gaps until 100% coverage and 10/10 quality. Can invoke any ProductionOS skill or command within waves.
Distributed agent swarm orchestrator — spawns parallel subagent clusters for any task with configurable depth, swarm size, and convergence criteria
Red-team agent that attacks every assumption, breaks every feature, and finds every way to abuse the system. READ-ONLY — never modifies code. Uses hostile-user thinking to surface issues other agents miss.
AI/ML integration specialist. Designs model pipelines (inference, fine-tuning, LoRA adapters), selects infrastructure (GPU provisioning, model serving), implements evaluation frameworks, and optimizes for cost/latency tradeoffs. Covers Hugging Face, Replicate, Modal, RunPod, vLLM, and managed APIs.
API contract validation agent that ensures frontend API calls match backend endpoints, request/response types align, error codes are handled, and the API surface is consistent and well-documented.
HumanLayer-inspired approval gate. Enforces human-in-the-loop for HIGH-stakes operations. Classifies actions by risk level, blocks until approved.
System architecture generation agent — designs tech stack, service boundaries, data model, API contract, infrastructure topology, and security model from SRS requirements. Produces SYSTEM-ARCHITECTURE.md, DATA-MODEL.md, API-CONTRACT.md with Architecture Decision Records for every major choice.
Nth-iteration agent swarm — spawns parallel agent waves, evaluates strictly per wave, re-swarms gaps until 100% coverage and 10/10 quality. Can invoke any ProductionOS skill or command within waves.
Nuclear-scale autonomous research — deploys 500-1000 agents in ONE massive simultaneous wave for exhaustive topic saturation. Deep-research methodology x auto-swarm scale = maximum parallel intelligence. WARNING: Extreme resource consumption.
Nth-iteration omni-plan — recursive orchestration that chains ALL ProductionOS skills and agents, evaluates strictly per iteration, and loops until 10/10 is achieved. Each iteration can invoke any command or skill in the system.
ProductionOS flagship — 13-step orchestrative pipeline with tri-tiered evaluation, recursive convergence, CEO/Eng/Design review chain, CLEAR framework evaluation, multi-model judge tribunal, and autonomous PIVOT/REFINE/PROCEED decisions. Targets 100% production-ready output.
CEO/founder-mode plan review — rethink the problem, find the 10-star product, challenge premises. Four modes: SCOPE EXPANSION, SELECTIVE EXPANSION, HOLD SCOPE, SCOPE REDUCTION.
Executes bash commands
Hook triggers when Bash tool is used
Modifies files
Hook triggers on file write and edit operations
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
Uses power tools
Uses Bash, Write, or Edit tools
Uses power tools
Uses Bash, Write, or Edit tools
One command. Your entire codebase reviewed, scored, and improved.
ProductionOS is a dual-target AI engineering OS for Claude Code and Codex with 80 agents, 41 commands, 51 skills, and 15 hooks. It deploys specialized agents that review your code, find issues, fix them, and keep improving until every quality dimension hits the target. Smart routing dispatches the right workflow for your goal automatically.
# Step 1: Add the marketplace (one time)
claude plugin marketplace add ShaheerKhawaja/ProductionOS
# Step 2: Install the plugin
claude plugin install productionos
# Step 3: Restart Claude Code, then run on any codebase
/production-upgrade
Alternative: Manual install (if marketplace commands fail)
# Clone directly into the plugins directory
git clone https://github.com/ShaheerKhawaja/ProductionOS.git \
~/.claude/plugins/marketplaces/productionos
# Restart Claude Code — hooks, commands, and agents load automatically
Verify installation:
# Check the plugin is recognized
claude plugin list
# Validate the plugin schema
claude plugin validate ~/.claude/plugins/marketplaces/productionos/.claude-plugin/marketplace.json
That's it. ProductionOS discovers your stack, deploys 7 review agents in parallel, scores your code across 10 dimensions, and generates a fix plan. Run it again — the score goes up.
npx productionos@latest --codex
This installs:
~/.codex/skills/productionos~/.codex/plugins/productionos~/.codex/skills/productionos-<workflow> aliases for every ProductionOS workflowRestart Codex to pick up the new skill and plugin.
npx productionos@latest --all-targets
ProductionOS now ships a native Codex plugin manifest at .codex-plugin/plugin.json plus a plugin skill at skills/productionos/SKILL.md.
Recommended Codex usage:
$productionos for the umbrella workflow router$productionos-review, $productionos-plan-eng-review, $productionos-production-upgrade, etc. for direct workflow entrypointsDISCOVER → REVIEW → PLAN → FIX → VALIDATE → repeat until 10/10
/production-upgrade — Audit + fix any codebase. Deploys 7 parallel agents (CEO strategic, engineering, code quality, security, UX, backend, database). Shows BEFORE → AFTER scores.
/omni-plan-nth — The recursive orchestrator. Chains every available skill, loops until every quality dimension scores 10/10. No shortcuts, no "good enough."
/auto-swarm-nth — Throw 7 parallel agents at any task. Each wave covers gaps the last missed. Supports worktree isolation for zero-conflict parallel execution.
/designer-upgrade — Full UI/UX redesign: audit → design system → interactive HTML mockups → browser annotation → implementation plan.
Every action runs through these checks:
.env, credentials, keys, certs blocked from writes.You need the output of a 10-person team but have 1-2 people. ProductionOS fills the roles:
| Role | ProductionOS Equivalent |
|---|---|
| Code Reviewer | code-reviewer agent (2-pass: critical + informational) |
| QA Engineer | /qa command + ux-auditor agent |
| Security Auditor | security-hardener + semgrep-scanner + gitleaks hook |
| Solutions Architect | architecture-designer + /plan-eng-review |
| CTO / Strategic | /plan-ceo-review (rethink the problem, find the 10-star product) |
| Designer | /designer-upgrade (audit → design system → mockups) |
| Release Manager | /ship (merge → test → version → push → PR) |
Start with: /production-upgrade to see where you stand, then /omni-plan-nth to fix everything.
You need consistent code quality across contributors. ProductionOS provides:
npx claudepluginhub shaheerkhawaja/productionos --plugin productionosFeature development with code-architect/explorer/reviewer agents, CLAUDE.md audit and session learnings, and Agent Skills creation with eval benchmarking from Anthropic.
Production-grade engineering skills for AI coding agents — covering the full software development lifecycle from spec to ship.
Access thousands of AI prompts and skills directly in your AI coding assistant. Search prompts, discover skills, save your own, and improve prompts with AI.
Complete developer toolkit for Claude Code
Intelligent draw.io diagramming plugin with AI-powered diagram generation, multi-platform embedding (GitHub, Confluence, Azure DevOps, Notion, Teams, Harness), conditional formatting, live data binding, and MCP server integration for programmatic diagram creation and management.
Orchestrate multi-agent teams for parallel code review, hypothesis-driven debugging, and coordinated feature development using Claude Code's Agent Teams