By mathews-tom
Install a library of 88 skills, 17 agents, and 5 commands in Claude Code to automate code and PR reviews, security and dependency audits, architecture diagrams and reviews, test generation and QA, documentation creation, git workflows, release management, and custom AI agent building.
npx claudepluginhub mathews-tom/armory --plugin armory
Co-evolutionary skill generation command. Given a task domain or research paper, generates a complete skill package through automated generate-verify-refine loops. Wraps the test-engineer agent for one-shot skill creation. Supports paper-to-skill conversion via arXiv URLs and optional Haiku distillation. Triggers on: "/evolve", "evolve a skill", "generate skill from paper", "auto-generate skill", "create skill via evolution", "evo-skill command". Use this command for streamlined skill creation rather than invoking the full test-engineer agent workflow manually. NOT for running application tests (use /tdd). NOT for security scanning (use /security-scan).
Code simplification workflow that identifies and cleans recently modified files. Uses git diff to find changed files, analyzes cyclomatic complexity, detects duplicated logic, simplifies nested conditionals, extracts helpers only when justified, and verifies behavior preservation through tests. Triggers on: "/refactor", "clean up recent changes", "simplify this code", "reduce complexity", "clean before PR", "tidy up my code". Use this command when code works correctly but needs simplification before committing or merging.
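As a rough illustration of the cyclomatic complexity analysis mentioned above, the sketch below approximates the metric for Python source by counting decision points in the syntax tree. The function name `cyclomatic_complexity` and the set of counted node types are assumptions made for this example; the listing does not say how the /refactor command computes complexity.

```python
import ast

# Node types treated as one extra decision point each (an assumption for this sketch).
_BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two short-circuit branches.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def f(x, items):
    if x > 0 and x < 10:
        return 1
    elif x == 0:
        return 0
    for i in items:
        pass
    return -1
'''

print(cyclomatic_complexity(SAMPLE))
```

Here the sample has an `if`, an `elif`, a `for`, and one `and`, so four decision points are added to the base value of 1.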
Skill router that maps user tasks to the best armory packages. Provides a decision-tree organized by development lifecycle phase (define, plan, build, verify, review, ship) plus cross-cutting domains (research, content, business). Triggers on: "/route", "which skill", "what package", "find the right skill", "help me pick a skill", "discover packages". Use this command when unsure which armory skill, agent, or command to use for a task.
Security audit workflow that scans a codebase path or scope for vulnerabilities across six categories: hardcoded secrets, input validation, authentication and authorization logic, dependency vulnerabilities, HTTP security headers, and misconfiguration. Produces severity-ranked findings with remediation guidance. Triggers on: "/security-scan", "scan for vulnerabilities", "security audit this code", "check for hardcoded secrets", "find security issues", "vulnerability scan". Use this command when auditing code for security issues before deployment or during review.
Test-driven development workflow for functions, modules, or features. Guides Claude through the red-green-refactor cycle: understand requirements, write a failing test first, implement minimal code to pass, refactor, repeat. Enforces test-first discipline with rules for naming, assertion specificity, edge case generation, and cycle termination. Triggers on: "/tdd", "test-driven", "write tests first", "red green refactor", "TDD workflow", "test first development". Use this command when starting implementation work where tests should drive the design.
Multi-phase code review agent with severity-ranked findings across naming conventions, cyclomatic complexity, error handling, DRY violations, security surface, and test coverage gaps. Produces structured reports with CRITICAL/HIGH/MEDIUM/LOW classification and file:line references. Triggers on: "review code", "code review", "check code quality", "audit code", "review my code", "code quality check", "lint my code", "find code issues". Use this agent when code has been written or modified and needs systematic quality review before commit or merge.
Unified multi-dimensional codebase quality assessment that spawns specialized review agents in parallel and aggregates findings into a single prioritized report. Covers code quality, security vulnerabilities, secret detection, architectural concerns, dependency health, and test coverage gaps. Triggers on: "audit the codebase", "full quality check", "comprehensive review", "audit everything", "run all checks", "codebase health check", "quality assessment", "audit this repo", "check everything before release", "multi-dimensional review". Use this agent when a broad quality assessment across multiple dimensions is needed rather than a single focused review.
Technical content creation and adaptation engine that transforms topics, research, and source material into channel-optimized content across multiple formats. Orchestrates research, writing, slide generation, PDF production, and humanization into a unified content pipeline. Produces LinkedIn posts, blog articles, HTML slide decks, and PDF reports from a single brief. Triggers on: "create content for", "write a LinkedIn post about", "turn this into a blog post", "build a slide deck on", "content strategy for", "repurpose this for", "adapt this content", "write about this topic", "create a presentation on", "multi-channel content for". Use this agent when content needs to be created, adapted, or distributed across multiple channels or formats from a single topic or source.
End-to-end implementation agent that takes an architecture document or feature spec and delivers production-ready code with tests, API documentation, and security validation. Orchestrates quality skills throughout the build process rather than bolting them on at the end. Covers project scaffolding, incremental implementation sprints, and pre-delivery review gates. Triggers on: "build this feature", "implement the spec", "scaffold a new project", "full-stack implementation", "build from architecture doc", "implement end to end", "create the app", "build it out". Use this agent when a complete implementation from spec to production-ready code is needed across multiple components.
Business idea validation pipeline that orchestrates parallel market research, competitive analysis, and feasibility assessment to produce a scored validation report with GO/CAUTION/NO-GO verdict. Constructs a Lean Canvas, synthesizes SWOT and PESTLE frameworks, and recommends low-cost experiments to test highest-risk assumptions. Provides honest assessment backed by quantified data. Triggers on: "validate this idea", "is this idea viable", "business idea assessment", "market validation", "evaluate this business concept", "idea feasibility check", "should I build this", "startup idea review". Use this agent when a business idea needs structured validation across market, competitive, and feasibility dimensions.
Visual and video asset creation with intelligent format routing. Analyzes concepts and automatically selects the optimal output format — static images, architecture diagrams, animated explainers, motion graphics, interactive dashboards, or slide presentations. Orchestrates specialized production skills and provides styling guidance for consistent, high-quality visual output. Triggers on: "create a visual for", "make a diagram of", "generate a video explaining", "build a presentation about", "visualize this architecture", "produce an infographic for", "animate this concept", "create slides for", "make a product demo video", "design a visual showing". Use this agent when a concept needs to become a visual or video asset and the user has not specified a single production skill by name.
System architecture agent that conducts phased requirements discovery and produces production-ready architecture documents with technology stack justification, Mermaid diagrams, data flow design, and implementation phases. Gathers business context before proposing technical solutions. Triggers on: "architect this system", "design the architecture", "system design for", "technical architecture", "help me architect", "design a system for", "create architecture document", "what tech stack should I use", "architecture discovery", "system design session", "design this project", "architect a solution". Use this agent when starting a new project or major feature that needs structured requirements gathering and architecture design.
Task breakdown and project planning agent that decomposes work into dependency-mapped items with three-point estimates, milestones, and risk tracking. Produces actionable project plans with parallelization flags and realistic timelines calibrated against historical data. Triggers on: "plan this project", "break down the work", "create a project plan", "estimate the timeline", "task breakdown", "what are the milestones", "decompose this into tasks", "how long will this take", "plan the implementation", "scope this work". Use this agent when a structured project plan with task dependencies, estimates, and risk assessment is needed rather than ad-hoc task listing.
Technical proposal generation with ROI calculation, three-tier pricing, and business-value framing. Gathers project scope and client context, models return on investment with calibrated estimates, structures a Problem-Agitate-Solve narrative, and produces a complete proposal document with optional PDF export. Triggers on: "write a proposal", "draft a proposal", "create a project proposal", "generate a proposal for", "proposal with pricing tiers", "ROI proposal", "client proposal with estimates", "proposal with cost breakdown", "prepare a bid", "scope and pricing document". Use this agent when a structured client-facing proposal with pricing, ROI modeling, and professional formatting is needed — not for internal planning documents or architecture decisions.
Ship lifecycle manager that drives code from branch to PR through quality gates, secret scanning, changelog generation, and dependency audits. Blocks on failing tests or CRITICAL findings. Produces versioned commits, changelog entries, and opens the pull request with full traceability. Triggers on: "ship this", "create a release", "open a PR", "prepare for release", "cut a release", "ship it", "ready to merge", "open pull request", "release this branch", "time to ship". Use this agent when code is ready to ship and needs quality gates, changelog, and PR creation rather than further development.
Deep research agent that conducts multi-source investigation and produces structured synthesis reports. Spawns parallel research agents across web, academic, video, and document sources, cross-references findings, identifies gaps and contradictions, and delivers cited analysis with confidence ratings. Triggers on: "research this topic", "investigate", "deep dive into", "what does the research say about", "survey the landscape", "analyze this space", "comprehensive research on", "gather evidence for", "research report on", "explore the state of", "literature survey", "multi-source analysis". Use this agent when a thorough investigation across multiple source types is needed rather than a quick web search or single-source lookup.
Pre-commit secret detection agent scanning for hardcoded API keys, passwords, tokens, connection strings, private keys, and high-entropy strings. Detects known provider key patterns (AWS, GitHub, Slack, Stripe, Google, Azure), .env values leaked into source code, and PEM-encoded private keys. Designed for fast pre-commit gating with zero false-negative tolerance. Triggers on: "scan for secrets", "check for hardcoded keys", "secret detection", "credential scan", "find leaked keys", "check for passwords in code", "pre-commit secret check". Use this agent when code is about to be committed and needs a secrets gate.
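To make the two detection ideas above concrete (known provider key patterns and high-entropy strings), here is a minimal sketch. The names `scan_line` and `shannon_entropy`, the entropy threshold of 4.0 bits per character, and the minimum token length of 20 are all assumptions chosen for illustration, not the agent's actual rules, and the sketch makes no claim about false-negative rates.

```python
import math
import re
from collections import Counter
from typing import List

# AWS access key IDs: "AKIA" or "ASIA" followed by 16 uppercase alphanumerics.
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")
# Header line of a PEM-encoded private key.
PEM_PRIVATE_KEY = re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")
# Candidate tokens for the entropy check (hypothetical character class).
TOKEN = re.compile(r"[A-Za-z0-9+/=_\-]+")


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def scan_line(line: str, entropy_threshold: float = 4.0, min_len: int = 20) -> List[str]:
    """Return the labels of all checks that fire on one line of source."""
    findings = []
    if AWS_ACCESS_KEY_ID.search(line):
        findings.append("aws-access-key-id")
    if PEM_PRIVATE_KEY.search(line):
        findings.append("pem-private-key")
    for token in TOKEN.findall(line):
        if len(token) >= min_len and shannon_entropy(token) >= entropy_threshold:
            findings.append("high-entropy-string")
            break
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
print(scan_line('aws_key = "AKIAIOSFODNN7EXAMPLE"'))
```

Pattern matching catches keys with a recognizable shape, while the entropy check is a fallback for random-looking tokens that match no known provider format; both thresholds would need tuning against real repositories.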
Vulnerability scanner for OWASP Top 10 patterns including SQL injection, cross-site scripting (XSS), broken authentication, insecure deserialization, path traversal, SSRF, and security misconfiguration. Produces severity-ranked findings with exploit scenarios and remediation guidance. Triggers on: "security review", "scan for vulnerabilities", "check security", "audit security", "find security issues", "OWASP scan", "vulnerability check", "security analysis". Use this agent when code needs security-focused review for injection, authentication, or data exposure vulnerabilities.
Reflective write-phase orchestrator that observes completed task transcripts and proposes additions or augmentations to the armory skill library. Implements the write phase of the Memento-Skills reflective loop (arXiv 2603.18743): classifies a conversation as handled by an existing skill, needing augmentation, or requiring a new draft. Routes drafts through paper-to-skill's specification stage and test-engineer's generate-verify loop, gates on package-evaluator, and opens a pull request tagged for human review. Triggers on: "librarian review", "review this transcript for new skills", "propose a skill from this conversation", "draft skill from transcript", "catalog this workflow", "suggest skill addition", "analyze conversation for skill gaps", "reflective skill review". NOT for refining in-progress skills (use test-engineer directly) or creating skills from research papers (use paper-to-skill directly).
Outcome-weighted package selector that maps a task prompt to the armory packages most likely to handle it successfully, ranked by historical pass rate rather than static description matching. Implements the read phase of the Memento-Skills reflective loop (arXiv 2603.18743): consults dist/router_index.json (built nightly from evals/history.jsonl) to find packages that have historically succeeded on signature-similar tasks, and falls back to the static /route command when no history exists or confidence is low. Triggers on: "route this task", "which skill for this", "best package for", "skill router", "pick the best package", "recommend packages for", "rank packages for this task", "outcome-weighted routing". NOT for static decision-tree lookup (use /route command directly). NOT for installing or managing packages (use skill-library skill).
Meta-orchestrator agent that analyzes complex requests, decomposes them into agent-sized tasks, delegates to specialized agents, manages sequencing and parallelism, and synthesizes results into unified deliverables. Acts as the intelligent router and coordinator across the full agent team. Triggers on: "handle this end to end", "take care of everything", "coordinate the team", "manage this project", "orchestrate this work", "delegate this across agents", "team lead", "run the full pipeline", "I need the whole workflow", "end to end", "full lifecycle". Use this agent when a task spans multiple agent domains and needs coordination rather than a single focused agent.
Co-evolutionary skill evolution agent that orchestrates generate-verify-refine loops to produce high-quality skill packages. Implements EvoSkills Algorithm 1: a Generator (Opus) writes multi-file skill packages, a Surrogate Verifier (Sonnet) generates test assertions and failure diagnostics in an informationally isolated session, and a Ground-Truth Oracle returns opaque pass/fail. Iterates until convergence or budget exhaustion. Triggers on: "evolve a skill", "generate skill for", "create and test skill", "skill evolution", "co-evolutionary skill", "improve this skill", "refine skill quality", "evo-skill", "auto-generate skill", "evolve skill from scratch". Use this agent when creating new skills or significantly improving existing ones through automated refinement rather than manual editing. NOT for generating application tests (use test-harness skill). NOT for reviewing existing code (use code-reviewer agent).
UX design expert for auditing and redesigning pages, dashboards, and data-dense interfaces via 4-phase collaborative reviews across 8 UX dimensions. Triggers on: "review UX", "audit this page", "redesign the dashboard", "UX review", "improve the layout", "dashboard UX audit", "component recommendations".
Web content fetching via curl and WebFetch when a specific URL is provided. Covers HTTP GET/POST, JSON APIs, HTML, auth, cookies. Triggers on: "fetch this URL", "download HTML", "call this API", "curl this endpoint". NOT for search, use tavily.
Audits direct and transitive dependencies for license compliance, maintenance health, CVEs, abandoned packages, and bloat. Triggers on: "audit dependencies", "license check", "dependency health", "abandoned packages", "unused dependencies", "license compliance", "supply chain", "dependency risk".
Challenges AI-generated plans, code, and designs via pre-mortem, inversion, and Socratic questioning to surface blind spots and failure modes. Triggers on: "challenge this", "devils advocate", "stress test this plan", "poke holes in this", "what am I missing".
DEPRECATED: The base model handles document condensation and summarization natively at high quality. This skill no longer provides meaningful uplift. Retained for reference only.
Git-based engineering retrospective analyzing commits, PRs, and velocity over configurable windows with monorepo path scoping. Triggers on: "retrospective", "sprint retro", "weekly review", "what did we ship", "engineering retro", "dev summary", "commit analysis".
Validates .env files against code references and manifests for missing vars, type mismatches, insecure defaults, and unused entries. Triggers on: "validate env file", "check environment variables", "missing env vars", "check .env", "dotenv validation". NOT for secret scanning, use repo-sentinel.
Produces calibrated three-point PERT estimates (best/likely/worst) with confidence intervals, unknowns, and assumptions. Triggers on: "estimate this", "how long will this take", "effort estimate", "confidence interval", "story points", "t-shirt sizing". NOT for task decomposition, use task-decomposer.
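For readers unfamiliar with three-point estimates, the sketch below applies the classic PERT weighting: expected value (best + 4 × likely + worst) / 6 and standard deviation (worst − best) / 6. The class name `PertEstimate` and the default interval width of two standard deviations are assumptions for this example; the listing does not state which formulas or calibration the skill uses.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PertEstimate:
    """Three-point estimate in any consistent unit (hours, days, points)."""
    best: float
    likely: float
    worst: float

    @property
    def expected(self) -> float:
        # Classic PERT weighting: the most likely value counts four times.
        return (self.best + 4 * self.likely + self.worst) / 6

    @property
    def std_dev(self) -> float:
        # Conventional PERT approximation of the spread.
        return (self.worst - self.best) / 6

    def interval(self, z: float = 2.0) -> Tuple[float, float]:
        """Range of expected +/- z standard deviations."""
        margin = z * self.std_dev
        return (self.expected - margin, self.expected + margin)


estimate = PertEstimate(best=2, likely=4, worst=12)
low, high = estimate.interval()
print(f"expected={estimate.expected:.2f} days, range={low:.2f} to {high:.2f} days")
```

With a best case of 2 days, a likely case of 4, and a worst case of 12, the expected value is 5 days: the long worst-case tail pulls the estimate above the most likely figure.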
Evaluates whether a business idea is technically buildable and financially viable. Covers unit economics (CAC, LTV), revenue modeling, break-even, and go/no-go verdicts. Triggers on: "feasibility assessment", "viability analysis", "unit economics", "build vs buy", "go/no-go decision", "ROI projection".
File and directory operations via Claude Code built-in tools, replacing the Filesystem MCP server. Triggers on: "read this file", "write to file", "edit file", "find files matching", "search for text in files", "list directory", "show directory tree", "rename file".
Convert Markdown to styled PDFs with Mermaid diagrams, LaTeX/KaTeX math, tables, and code highlighting. Triggers on: "convert markdown to pdf", "make a pdf from this md", "export markdown as pdf", "pdf from markdown with equations".
Analyzes database migration scripts for lock contention, downtime, rollback strategy, and deployment risk. Triggers on: "analyze this migration", "migration risk", "is this migration safe", "schema change risk", "DDL risk", "rollback strategy", "migration review".
Full NotebookLM API via notebooklm-py CLI: create notebooks, add sources, generate podcasts, videos, infographics, slides, quizzes, flashcards, mind maps. Triggers on: "notebooklm", "create a podcast", "audio overview", "generate flashcards", "generate infographic", "/notebooklm".
Evaluates RAG pipeline quality across retrieval (precision, recall, MRR) and generation (groundedness, hallucination rate). Triggers on: "audit RAG pipeline", "RAG quality", "hallucination detection", "why is RAG failing", "grounding check". NOT for general architecture audits, use architecture-reviewer.
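The retrieval metrics named above can be sketched as follows, using their standard definitions. The function names, the document IDs, and the choice to measure precision and recall at a cutoff `k` are assumptions for illustration; the listing does not describe how the skill computes them.

```python
from typing import List, Sequence, Set, Tuple


def precision_recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> Tuple[float, float]:
    """Precision and recall over the top-k retrieved document IDs."""
    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average of 1/rank of the first relevant document per query (0 if none is found)."""
    if not runs:
        return 0.0
    total = 0.0
    for retrieved, relevant in runs:
        for rank, doc_id in enumerate(retrieved, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(runs)


# Hypothetical query results: retrieved IDs in ranked order, plus the relevant set.
print(precision_recall_at_k(["a", "b", "c", "d"], {"b", "d", "e"}, k=3))
print(mean_reciprocal_rank([(["a", "b"], {"b"}), (["x", "y"], {"z"})]))
```

Precision and recall describe how much of the top-k is useful and how much of the relevant set was found, while MRR rewards placing the first relevant document near the top of the ranking.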
DEPRECATED: The base model generates, explains, and tests regex patterns natively with high accuracy. This skill no longer provides meaningful uplift. Retained for reference only.
Create motion graphics and videos using Remotion (React) with audio sync, web fonts, and TailwindCSS. Triggers on: "create a Remotion video", "React video", "motion graphics", "branded video", "product demo video", "video with voiceover". NOT for math animations, use concept-to-video.
Full security audit for public repositories across 12 attack surfaces: git history, secrets, CI/CD, containers, dependencies, licenses. Triggers on: "push to GitHub", "make repo public", "open source this", "is this safe to push", "release audit", "secret leaks".
Critical analysis of research papers evaluating methodology, claims-evidence alignment, and contribution significance. Triggers on: "critique this paper", "review this research", "analyze this study", "evaluate the methodology", "is this paper sound". NOT for formatting or submission readiness, use manuscript-review.
DEPRECATED: Opus 4.7's adaptive thinking covers most cases natively. Structured, reflective problem-solving through sequential chain-of-thought reasoning that replaced the Sequential Thinking MCP server. Retained as a reference pattern when deterministic, reviewable reasoning traces are required regardless of the model's adaptive-thinking choice.
Automated release pipeline: merges main, runs tests, pre-landing review, version bump, changelog, bisectable commits, and PR creation. Triggers on: "ship it", "release this", "prepare for release", "open a PR", "push and PR", "land this", "/ship-workflow".
Converts Opus-quality skills into deterministic Haiku-executable workflows via trace-driven distillation and cross-model validation. Triggers on: "distill this skill", "make this skill work on Haiku", "cross-model optimization", "optimize skill for cost". NOT for code simplification, use code-refiner.
Convert any file or URL to clean Markdown: PDF, DOCX, XLSX, PPTX, HTML, images (OCR), audio, CSV, YouTube. Optimised for LLM pipelines. Triggers on: "convert to markdown", "extract text from PDF", "parse this document", "ingest for RAG".
Audit a Claude Code setup for token waste and context bloat. Checks MCP servers, CLAUDE.md, skills, and settings against bloat filters. Triggers on: "audit my context", "usage audit", "token audit", "context bloat". NOT for codebase audits.
Audits and enhances FastAPI and REST API documentation: missing descriptions, response codes, examples, docstrings, Pydantic models, OpenAPI spec. Triggers on: "generate API docs", "document this API", "OpenAPI for", "FastAPI docs", "document endpoints", "swagger docs".
Generate layered architecture diagrams as self-contained HTML with inline SVG icons, CSS Grid containers, and connection overlays. Triggers on: "architecture diagram", "infra diagram", "system diagram", "deployment diagram", "topology", "draw architecture". NOT for architecture reviews, use architecture-reviewer.
Architecture reviews across 7 dimensions (structural, scalability, enterprise readiness, performance, security, ops, data) with scored reports. Triggers on: "review architecture", "critique design", "audit system", "assess scalability", "enterprise readiness", "technical due diligence". NOT for diagrams, use architecture-diagram.
Designs structured benchmarks comparing algorithms, models, or implementations with metrics, test cases, hardware context, and reproduction steps. Triggers on: "benchmark", "compare performance", "which is faster", "latency comparison", "run benchmark", "throughput test", "speed test".
Generates structured changelogs and release notes from git history and PRs, classifying breaking changes, features, fixes, performance, docs. Triggers on: "generate changelog", "write release notes", "what changed since", "prepare release", "release notes for", "diff since tag".
Deep code simplification and refactoring preserving behavior across Python, Go, TypeScript, Rust. Targets complexity, anti-patterns, readability debt. Triggers on: "simplify this code", "refactor for clarity", "reduce complexity", "make this more readable", "tech debt cleanup", "too much nesting".
Competitive landscape analysis: Porter's Five Forces, competitor discovery, feature/pricing matrices, positioning maps, moat assessment via WebSearch. Triggers on: "competitive analysis", "competitor comparison", "competitive landscape", "Porter's Five Forces", "market positioning", "moat assessment", "defensibility analysis".
Turn concepts into static HTML visuals exported as PNG or SVG files via HTML/CSS/SVG. Triggers on: "create an image of", "export as PNG", "save as SVG", "concept to image", "screenshot this HTML". NOT for interactive HTML, use static-web-artifacts-builder.
Turn concepts into animated explainer videos using Manim (Python) with MP4/GIF output, audio overlay, multi-scene composition. Triggers on: "create a video", "animate this", "make an explainer", "manim animation", "motion graphic". NOT for React video, use remotion-video.
Hypothesis-driven debugging with ranked hypotheses, git bisect strategy, instrumentation planning, and minimal reproduction design. Triggers on: "debug this systematically", "root cause analysis", "bisect this bug", "rank hypotheses", "isolate this issue", "minimal reproduction". NOT for general reasoning.
Generates Architecture Decision Records capturing context, rationale, alternatives, and consequences in numbered status-tracked format. Triggers on: "write an ADR", "document this decision", "architecture decision record", "decision record", "design decision", "ADR for".
Build AI agents and automate Claude Code programmatically via the Claude Agent SDK and headless CLI mode. Covers Python SDK, claude -p, SDK MCP servers, hooks, sessions. Triggers on: "build an agent", "agent SDK", "headless mode", "automate Claude", "programmatic agent".
GitHub CLI operations via `gh` for issues, PRs, Actions, releases, and REST/GraphQL API with `--json`/`--jq` parsing. Triggers on: "create an issue", "submit a PR", "check CI status", "why did CI fail", "merge a PR", or pasted GitHub URLs.
GPU optimization for consumer NVIDIA GPUs (8-24GB VRAM) covering mixed precision, gradient checkpointing, XGBoost GPU, CuPy/cuDF migration, and torch.compile. Triggers on: "optimize GPU training", "speed up CUDA", "reduce OOM", "migrate NumPy to CuPy", "manage GPU memory", "benchmark PyTorch".
Converts documents, outlines, or notes into self-contained HTML slide decks with horizontal (Reveal.js) or vertical scroll navigation and multiple themes. Triggers on: "create a presentation", "slide deck", "pitch deck", "HTML presentation", "web-based slides", "reveal.js deck", "convert document into slides".
Detects and removes AI-generated writing patterns while preserving meaning and facts. Triggers on: "humanize text", "make this sound human", "remove AI patterns", "rewrite to sound natural", "make this less AI", "de-slop this", "not sound like ChatGPT", "human pass".
Orchestrates business idea validation via parallel sub-agents: Lean Canvas, JTBD, market/competitive/feasibility research, SWOT/PESTLE, and a weighted scorecard with verdict. Triggers on: "validate this idea", "evaluate my startup idea", "is this idea worth pursuing", "score this business idea", "validate my pitch".
Hybrid adaptive memory: Cheatsheet (positive patterns pre-generation) and Immune (negative patterns post-generation) with Hot/Cold tiered auto-learning. Triggers on: "scan for errors", "immune scan", "check output quality", "antibody scan". NOT for PR review (use pr-review) or repo audits (use repo-sentinel).
Converts MCP servers into on-demand skills to cut context window usage, classifying each tool by replacement strategy and generating the skill package. Triggers on: "convert MCP", "MCP to skill", "reduce context size", "too many tools", "tool token bloat", "MCP migration".
Lightweight headless browser automation via Lightpanda and agent-browser CLI: 9x lower memory, 11x faster than Chromium, for scraping and DOM interaction without rendering. Triggers on: "lightpanda", "lightweight browser", "fast headless browser", "headless scraping", "low memory browser", "browser without rendering".
Writes LinkedIn posts in a direct, analytical, dry-humored technical voice with visual companion guidance. Triggers on: "write this in my style", "draft a post", "rewrite this for LinkedIn", "post about this", "how should I phrase this".
Systematic literature review workflow: scope, search (arXiv, Semantic Scholar, Google Scholar), screen, extract, synthesize, and identify gaps. Triggers on: "literature review", "survey the literature", "related work", "systematic review", "synthesize the research", "find papers about", "research gap analysis".
Computational provenance audit verifying every number, table, and figure in a manuscript derives from code, not manual entry. Triggers on: "check provenance", "verify reproducibility", "audit my pipeline", "are my numbers from code", "provenance audit". Companion to manuscript-review (prose audit).
Pre-publication manuscript audit producing a section-level refactoring report with citation hygiene and submission-readiness checks. Triggers on: "review my paper", "check before submission", "is this ready to submit", "pre-pub checklist", "refactor my paper", "check my references", "does the abstract work".
Structured market analysis with TAM/SAM/SOM sizing, trends, and competitive landscape via WebSearch, producing investor-grade cited reports. Triggers on: "market size", "TAM SAM SOM", "market opportunity", "industry analysis", "how big is the market", "market trends". NOT for financial modeling or pricing.
Author MARP markdown slide decks exportable to PDF, PPTX, and HTML via marp-cli. Covers Marpit directives, custom CSS themes, SVG chart recipes, and dashboard components. Triggers on: "marp", "marp deck", "markdown slides", "slides from markdown", "marp-cli", "pdf from markdown", "pptx from markdown".
Scan a repository for Opus-4.6-era patterns that break or degrade on Opus 4.7 — fixed-budget Extended Thinking parameters, retired model ID aliases, and prompts that assumed verbose default output or eager sub-agent delegation. Produces a categorized report with file:line references and migration actions. Triggers on: "opus 4.7 migration", "migrate to opus 4.7", "audit for opus 4.7", "opus 4.6 to 4.7", "scan for budget_tokens", "find retired model IDs", "adapt repo to opus 4.7". NOT for implementing migrations — this skill identifies candidates.
Evaluates Claude Code package quality across 6 dimensions for all 7 package types, producing scored audit reports. Triggers on: "evaluate package", "audit agent quality", "score this hook", "package audit", "skill quality check". NOT for LLM prompts, use prompt-lab.
Converts research papers into executable skill packages via document conversion, critical analysis, and co-evolutionary refinement. Triggers on: "convert this paper to a skill", "paper-to-skill", "extract methodology from paper", "make a skill from this paper". NOT for literature review, use research-critique.
Pre-implementation plan audit stress-testing scope, assumptions, risks, and failure modes before code is written. Triggers on: "review this plan", "is this plan solid", "what am I missing", "challenge my assumptions", "stress-test this", "/plan-review".
Diff-based PR review across code quality, test coverage, silent failures, type design, and comment quality with severity-ranked findings. Triggers on: "review my PR", "review this code", "check my changes", "audit this PR", "code review". NOT for pre-landing gate, use pre-landing-review.
Gate-oriented safety audit for code changes before landing, using a checklist with two-pass severity triage. Triggers on: "is this safe to land", "pre-landing review", "safety check before merge", "gate check", "/pre-landing-review". NOT for diff review, use pr-review.
LLM prompt engineering: analyzes failure modes, generates variants (direct, few-shot, CoT), designs rubrics, produces test suites. Triggers on: "prompt engineering", "generate prompt variants", "A/B test prompts", "optimize prompt", "improve this prompt". NOT for SKILL.md files, use skill-evaluator.
Systematic web application QA testing with issue taxonomy, health scoring, and regression tracking. Triggers on: "QA this", "test the app", "smoke test", "run QA", "systematic test", "regression test", "full QA", "/qa-systematic".
Agent-native catalog and installer for armory packages across all 7 types. Browse, search, install, update, and remove without leaving session. Triggers on: "list available packages", "install skill", "armory install", "armory search", "/library list", "package catalog".
Analyzes SQL queries for missing indexes, N+1 patterns, suboptimal joins, and full table scans. Interprets EXPLAIN, detects anti-patterns, rewrites queries. Triggers on: "optimize this query", "slow query", "add indexes", "explain plan", "N+1 query", "why is this query slow".
Build self-contained static HTML artifacts opened in a browser: interactive diagrams, dashboards, infographics. Pure HTML5+CSS3+inline SVG, zero toolchain. Triggers on: "interactive HTML", "open in browser", "HTML artifact", "visual dashboard", "HTML infographic". NOT for PNG/SVG output, use concept-to-image.
Generates structured test assertions and failure diagnostics for skill packages from a definition and task prompt. Triggers on: "verify this skill", "generate assertions", "surrogate verification", "diagnose skill failure". NOT for code review, use pr-review.
Produces phased task boards from feature requests: dependency-mapped work items, parallelization flags, risk flags, edge cases, test matrices. Triggers on: "decompose this feature", "task breakdown with dependencies", "phased implementation plan", "work breakdown structure". NOT for effort estimates, use estimate-calibrator.
AI-optimized web search via Tavily API when no URL is provided. Returns clean AI-ready snippets for current info and news. Triggers on: "search the web", "look up online", "find recent information", "web search". NOT for known URLs, use web-fetch.
Generates pytest test suites with happy path, edge cases, error conditions, fixture scaffolding, mocks, async patterns. Triggers on: "generate tests", "write tests for", "test this function", "create test suite", "pytest for", "unit tests for", "mock strategy for".
Extract YouTube transcripts and produce structured concept analysis with multi-level summaries, key concepts, takeaways. Uses youtube-transcript-api with yt-dlp fallback. Triggers on: "analyze youtube video", "youtube transcript", "summarize this video", "extract concepts from video", "video key points", or any youtube.com/youtu.be URL.
Search YouTube by keyword and return structured video metadata (title, URL, channel, views, duration, date) via yt-dlp. No API keys. Triggers on: "search youtube", "find youtube videos", "top youtube videos on", "trending videos on", "youtube results for", "yt search", "/yt-search".