By rjmurillo
Equip GitHub Copilot CLI with 81 reusable skills and agent definitions to automate code quality gates, advanced git operations like PR reviews and merge conflict resolution, ADR generation and validation, security scans for vulnerabilities, documentation audits and syncing, multi-agent planning for complex tasks, and DevOps pipeline fixes.
npx claudepluginhub rjmurillo/ai-agents

BLOCKING INTERCEPT: When ANY github.com URL appears in user input, STOP and use this skill. Never fetch GitHub HTML pages directly; they are 5-10 MB and will exhaust your context window. This skill routes URLs to efficient API calls (1-50 KB). Triggers on: pull/, issues/, blob/, tree/, commit/, compare/, discussions/.
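As an illustration of the routing idea (hypothetical function and patterns, not the skill's actual implementation), a github.com web URL can be rewritten to the much smaller REST API endpoint before any fetch happens:

```python
import re

# Hypothetical sketch: map a github.com web URL to its REST API equivalent
# so the agent fetches a small JSON payload instead of a multi-megabyte page.
def route_github_url(url: str):
    m = re.match(r"https://github\.com/([^/]+)/([^/]+)/pull/(\d+)", url)
    if m:
        owner, repo, number = m.groups()
        return f"https://api.github.com/repos/{owner}/{repo}/pulls/{number}"
    m = re.match(r"https://github\.com/([^/]+)/([^/]+)/issues/(\d+)", url)
    if m:
        owner, repo, number = m.groups()
        return f"https://api.github.com/repos/{owner}/{repo}/issues/{number}"
    return None  # unrecognized pattern: fall through to other handling

print(route_github_url("https://github.com/rjmurillo/ai-agents/pull/42"))
```

The real skill also covers blob/, tree/, commit/, compare/, and discussions/ URLs; only the pull/issue cases are sketched here.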
Validate and complete session logs before commit. Auto-populates session end evidence (commit SHA, lint results, memory updates) and runs validation. Use when finishing a session, before committing, or when session validation fails.
Match file paths against steering file glob patterns to determine applicable steering guidance. Use when orchestrator needs to inject context-aware guidance based on files being modified.
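A minimal sketch of that matching step, assuming hypothetical steering file names and glob patterns (the real skill reads these from the repo):

```python
from fnmatch import fnmatch

# Hypothetical mapping of steering files to the glob patterns they govern.
STEERING = {
    "docs/steering/python.md": ["**/*.py", "scripts/*.py"],
    "docs/steering/ci.md": [".github/workflows/*.yml"],
}

def applicable_steering(changed_files):
    """Return steering files whose patterns match any changed file."""
    hits = []
    for steering_file, patterns in STEERING.items():
        if any(fnmatch(f, p) for f in changed_files for p in patterns):
            hits.append(steering_file)
    return hits

print(applicable_steering(["src/app.py", "README.md"]))
```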
Detect agent conversation loops via topic-signature similarity and emit a self-reflection nudge. Use as an orchestrator guard against repetitive responses and token-burning loops.
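One way topic-signature similarity can work, sketched with a simple Jaccard measure over keyword bags (the thresholds and signature scheme here are illustrative assumptions, not the skill's actual algorithm):

```python
# Hypothetical loop detector: treat each turn as a bag of lowercase words
# and flag a loop when the last three turns are nearly identical.
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def looks_like_loop(turns, threshold=0.8):
    sigs = [set(t.lower().split()) for t in turns[-3:]]
    return len(sigs) == 3 and all(
        jaccard(sigs[i], sigs[i + 1]) >= threshold for i in range(2)
    )

turns = ["retry the build again", "retry the build again", "retry the build again"]
print(looks_like_loop(turns))  # → True
```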
Validate code against style rules from .editorconfig, StyleCop.json, and Directory.Build.props. Detects line ending violations, naming convention issues, indentation problems, and charset mismatches across C#, Python, PowerShell, and JavaScript. Produces JSON reports for pre-commit hooks and CI pipelines.
Custom lints with agent-readable remediation instructions. Enforces taste invariants (file size, naming conventions, structured logging, complexity) and surfaces errors that agents can act on directly. Use when writing or reviewing code to catch style violations early.
Prove it works. Multi-dimensional quality validation across functional, non-functional, security, DevOps, DX, and observability. Run after /build.
Synchronizes CLAUDE.md navigation indexes and README.md architecture docs across a repository. Use when asked to "sync docs", "update CLAUDE.md files", "ensure documentation is in sync", "audit documentation", or when documentation maintenance is needed after code changes.
Systematically populate the Forgetful knowledge base using Serena's LSP-powered symbol analysis for accurate, comprehensive codebase understanding.
Manage execution plans as versioned artifacts with progress tracking and decision logs. Use when creating, updating, or archiving plans for complex multi-step work.
Guidance for deep knowledge graph traversal across memories, entities, and relationships. Use when needing comprehensive context before planning, investigating connections between concepts, or answering "what do you know about X" questions.
Repair malformed markdown code fence closings. Use when markdown files have closing fences with language identifiers or when generating markdown with code blocks to ensure proper fence closure.
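The core repair can be sketched as a state machine over fence lines: an opening fence may carry a language identifier, but a closing fence must be bare. This is an illustrative sketch, not the skill's actual code; the fence string is built indirectly so the example nests cleanly in markdown:

```python
import re

FENCE = "`" * 3  # literal triple backtick, built indirectly for nesting

def repair_fences(text: str) -> str:
    """Strip language identifiers that appear on closing code fences."""
    out, in_block = [], False
    pattern = re.compile(r"^(\s*)" + FENCE + r"(\w*)\s*$")
    for line in text.splitlines():
        m = pattern.match(line)
        if m:
            if in_block:
                line = m.group(1) + FENCE  # closing fence: drop language id
                in_block = False
            else:
                in_block = True            # opening fence: keep as-is
        out.append(line)
    return "\n".join(out)

broken = "\n".join([FENCE + "python", "print('hi')", FENCE + "python"])
print(repair_fences(broken))
```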
Advanced Git workflows including rebasing, cherry-picking, bisect, worktrees, and reflog. Use when managing complex Git histories, collaborating on feature branches, or recovering from repository issues.
Research external topics, create comprehensive analysis, and incorporate learnings into memory systems
Review before merge. Five-axis code review across architecture, security, quality, tests, and standards. Run after /test.
Detect infrastructure and security-critical file changes and trigger security-agent review recommendations, ensuring proper security oversight for sensitive modifications.
Scan code content for CWE-22 (path traversal) and CWE-78 (command injection) vulnerabilities before PR submission. Lightweight pattern-based detection for Python, PowerShell, Bash, and C# files. Use when preparing code for review or as a pre-commit gate.
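To show what "lightweight pattern-based detection" means in practice, here is a hypothetical sketch with two Python-oriented patterns (real scanners need many more patterns per language, plus context to suppress false positives):

```python
import re

# Illustrative patterns only: flag common Python sinks for command
# injection (CWE-78) and string-built file paths (CWE-22).
PATTERNS = {
    "CWE-78": re.compile(r"os\.system\(|subprocess\.\w+\([^)]*shell\s*=\s*True"),
    "CWE-22": re.compile(r"open\([^)]*\+|os\.path\.join\([^)]*request\."),
}

def scan(source: str):
    """Return (line number, CWE id, offending line) tuples."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for cwe, pat in PATTERNS.items():
            if pat.search(line):
                findings.append((lineno, cwe, line.strip()))
    return findings

sample = 'import os\nos.system("rm -rf " + user_input)\n'
print(scan(sample))
```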
Architectural analysis workflow using Serena symbols and Forgetful memory. Use when analyzing project structure, documenting architecture, creating component entities, or building knowledge graphs from code.
Multi-phase documentation verification treating code as source of truth. Consolidates incoherence, doc-coverage, doc-sync, and comment-analyzer into a single workflow. Use when auditing documentation accuracy, verifying code examples compile, checking behavioral claims, or running pre-release doc audits.
Intelligent skill router and creator. Analyzes ANY input to recommend existing skills, improve them, or create new ones. Uses deep iterative analysis with 11 thinking models, regression questioning, evolution lens, and multi-agent synthesis panel. Phase 0 triage ensures you never duplicate existing functionality.
Create comprehensive Architectural Decision Records (ADRs). Researches the destination directory to detect existing template conventions, gathers context, determines next ADR number, generates the ADR, validates completeness, and saves. Supports multiple ADR formats (MADR, Nygard, Alexandrian, project canonical). Use when documenting technical decisions or creating new ADR files.
Multi-agent debate orchestration for Architecture Decision Records. Automatically triggers on ADR create/edit/delete. Coordinates architect, critic, independent-thinker, security, analyst, and high-level-advisor agents in structured debate rounds until consensus.
Identify code ownership before modifying validators or linters. Checks file headers for provenance indicators, reviews documentation, and determines provenance as UPSTREAM, LOCAL, VENDOR, or UNKNOWN. Prevents accidental modification of upstream tools.
Systematic multi-step codebase analysis producing prioritized findings with file-line evidence. Covers architecture reviews, security assessments, and code quality evaluations through guided exploration, investigation planning, and synthesis.
Build incrementally. Implement changes in thin vertical slices with TDD and atomic commits. Run after /plan.
Strategic framework for evaluating build, buy, partner, or defer decisions with a four-phase process, tiered TCO analysis, and integration with decision-quality tools.
Design and document chaos engineering experiments. Guide steady state baseline, hypothesis formation, failure injection plans, and results analysis. Use for resilience testing, game days, failure injection experiments, and building confidence in system stability.
Investigate historical context of existing code, patterns, or constraints before proposing changes. Automates git archaeology, PR/ADR search, and dependency analysis to prevent removing structures without understanding their purpose.
Assess code maintainability through 5 foundational qualities (cohesion, coupling, encapsulation, testability, non-redundancy) with quantifiable scoring rubrics. Works at method/class/module levels across multiple languages. Produces markdown reports with remediation guidance.
Scaffold project documentation (README, ARCHITECTURE, API, CODE_COMMENTS) from templates with documented standards. Use when bootstrapping docs for a new or under-documented codebase.
Execute CodeQL security scans with language detection, database caching, and SARIF output. Use when performing static security analysis on Python or GitHub Actions code.
Use when setting up new development environment or troubleshooting MCP connectivity. Configures Context Hub dependencies including Forgetful MCP server and plugin prerequisites.
Detect missing documentation in code (XML docs, docstrings, JSDoc) and project files (CHANGELOG gaps). Produces coverage reports with specific gaps by file and symbol. Use for pre-PR validation, CI gates, or documentation audits.
Analyze skill content for optimal placement (Skill vs Passive Context vs Hybrid), compress markdown to pipe-delimited format (60-80% token reduction), and validate compliance against the decision framework. Based on Vercel research showing passive context achieves 100% pass rates vs 53-79% for skills.
Gather comprehensive context from Forgetful Memory, Context7 docs, and web sources before planning or implementation. Use when starting complex tasks requiring multi-source context.
Guidance for maintaining memory quality through curation. Covers updating outdated memories, marking obsolete content, and linking related knowledge. Use when memories need modification, when new information supersedes old, or when building knowledge graph connections.
Systematic abstraction discovery using Commonality Variability Analysis. Build matrix of what varies vs what's constant, then let patterns emerge. Prevents wrong abstractions by deferring pattern selection until requirements are analyzed. Use when facing multiple similar requirements and need to discover natural abstractions.
Classify problems into Cynefin Framework domains (Clear, Complicated, Complex, Chaotic, Confusion) and recommend appropriate response strategies. Use when unsure how to approach a problem, facing analysis paralysis, or needing to choose between expert analysis and experimentation.
Structured decision critic that systematically stress-tests reasoning before commitment, surfacing hidden assumptions, verifying claims, and generating adversarial perspectives to improve decision quality.
Execute GitHub operations (PRs, issues, milestones, labels, comments, merges) using Python scripts with structured output and error handling. Use when working with pull requests, issues, review comments, CI checks, or milestones instead of raw gh.
Scan repository for golden principle violations with agent-readable remediation. Enforces GP-001 through GP-008 from .agents/governance/golden-principles.md. Use when auditing compliance, preparing PRs, or running garbage collection scans.
Detect contradictions between documentation and code, ambiguous specs, and policy violations across a codebase. Use when documentation seems stale, specs conflict with implementation, or a pre-release consistency audit is needed. Produces an actionable incoherence report with resolution workflow.
Generate evidence-based documentary reports by searching across all 4 memory systems (Claude-Mem, Forgetful, Serena, DeepWiki), .agents/ artifacts, and GitHub issues. Produces investigative journalism-style analysis with full citation chains.
Manage memory citations, verify code references, and track confidence scores. Use when adding citations to memories, checking memory health, or verifying code references are still valid.
Unified four-tier memory system for AI agents. Tier 1 Semantic (Serena+Forgetful search), Tier 2 Episodic (session replay), Tier 3 Causal (decision patterns). Enables memory-first architecture per ADR-007.
Resolve merge conflicts by analyzing git history and commit intent. Handles PR conflicts, branch conflicts, and session file conflicts with automated resolution for known patterns.
Collect agent usage metrics from git history and generate health reports. Use when measuring agent adoption, reviewing system health, or producing periodic dashboards. Implements 8 key metrics from agent-metrics.md.
Deal intelligence skill for offer analysis and counter-proposal drafting. Trigger on `review this offer`, `analyze counter`, `value gap`, `draft counter`, `should I walk`. Apply when reviewing any offer (real estate, compensation, vendor, resource allocation) or designing negotiation analysis behavior in agentic systems. Quantifies value gaps, applies RADAR protocol, enforces senior-tier model routing.
Query and analyze agent JSONL event logs for debugging, performance analysis, and decision tracing. Use when investigating agent behavior, finding slow tool calls, tracing decisions, or analyzing session performance.
Triage raw unstructured input (transcripts, brain dumps) into evaluated thread inventories and a synthesized gold-found file across three phases.
Discovers, triggers, and monitors Azure DevOps pipelines (PR, Buddy Build, Buddy Release) for the current repo and branch. Auto-diagnoses failures from build logs, applies fixes, commits, pushes, and re-triggers until all pipelines pass or max retries reached. Validates PR existence and description completeness. Designed to be invoked automatically after any change-making skill creates a PR.
Plan how to build it. Decompose specs into milestones with dependencies and risk mitigations. Run after /spec.
Interactive planning and execution for complex tasks. Use when breaking down multi-step projects (planning) or executing approved plans through delegation (execution). Planning creates milestones with specifications; execution delegates to specialized agents.
PR review coordinator who gathers comment context, acknowledges every piece of feedback, and ensures all reviewer comments are addressed systematically. Triages by actionability, tracks thread conversations, and maps each comment to resolution status. Use when handling PR feedback, review threads, or bot comments.
Use when responding to PR review comments on the specified pull request(s).
Guide prospective hindsight analysis to identify project risks before failure occurs. Teams imagine the project has failed spectacularly, then work backward to identify causes. Increases risk identification by 30% compared to traditional planning.
Evaluate existing solutions (libraries, SaaS, open source) before custom development to avoid reinventing the wheel. Use when considering building new features, asking "should I build or use existing", or need build vs buy cost analysis with token estimates.
Optimize system prompts for Claude Code agents using proven prompt engineering patterns. Use when users request prompt improvement, optimization, or refinement for agent workflows, tool instructions, or system behaviors.
Commit, push, and open a PR
Grade quality per product domain and architectural layer with gap tracking. Produces markdown or JSON reports showing grades (A-F), file counts, gaps, and trends over time. Use when auditing repo quality, tracking improvement, or identifying domains that need attention.
CRITICAL learning capture. Extracts HIGH/MED/LOW confidence patterns from conversations to prevent repeating mistakes and preserve what works. Use PROACTIVELY after user corrections ("no", "wrong"), after praise ("perfect", "exactly"), when discovering edge cases, or when skills are heavily used. Without reflection, valuable learnings are LOST forever. Acts as continuous improvement engine for all skills. Invoke EARLY and OFTEN - every correction is a learning opportunity.
Adversarial requirements interview that walks the design tree to elicit testable requirements before any code is written. Implements the grill-me pattern - ask relentlessly, recommend an answer for every question, and resolve dependencies between decisions one branch at a time. Skip any question the codebase can already answer.
Research external topics, create comprehensive analysis, determine project applicability, and incorporate learnings into Serena and Forgetful memory systems. Transforms knowledge into searchable, actionable project context.
Create protocol-compliant JSON session logs with verification-based enforcement. Autonomous operation with auto-incremented session numbers and objective derivation from git state. Use when starting any new session.
Fix session protocol validation failures in GitHub Actions. Use when a PR fails with "Session protocol validation failed", "MUST requirement(s) not met", "NON_COMPLIANT" verdict, or "Aggregate Results" job failure in the Session Protocol Validation workflow. With deterministic validation, failures show exact missing requirements directly in Job Summary - no artifact downloads needed.
Migrate session logs from markdown to JSON format. Use when PRs contain markdown session logs that need conversion to the new JSON schema, or when batch-migrating historical sessions.
Check investigation session QA skip eligibility per ADR-034. Validates if staged files qualify for investigation-only exemption by checking against allowed paths (.agents/sessions/, .agents/analysis/, .serena/memories/, etc).
Session management and protocol compliance skills. Use Test-InvestigationEligibility to check if staged files qualify for investigation-only QA skip per ADR-034 before committing with 'SKIPPED investigation-only' verdict.
Ship it. Pre-flight validation, CI check, and PR creation. Run after /review.
Autonomous meta-skill for creating high-quality custom slash commands using 5-phase workflow with multi-agent validation and quality gates. Use when user requests new slash command, reusable prompt automation, or wants to convert repetitive workflows into documented commands.
Design Service Level Objectives (SLOs) with SLIs, targets, alerting thresholds, and error budget calculations following Google SRE best practices. Use when defining reliability targets, designing SLOs, calculating error budgets, or establishing service level indicators.
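The error-budget arithmetic behind that workflow is small enough to show directly; this sketch assumes a 30-day period and follows the standard SRE formula (budget = 1 − SLO target, as a share of the period):

```python
# Error budget: the fraction of the period a service may fail while still
# meeting its SLO. A 99.9% monthly SLO leaves 0.1% of the month as budget.
def error_budget_minutes(slo: float, period_minutes: float = 30 * 24 * 60) -> float:
    return (1.0 - slo) * period_minutes

budget = error_budget_minutes(0.999)
print(round(budget, 1))  # ~43.2 minutes of allowed downtime per 30-day month
```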
Define what to build. Transform a problem into testable requirements with acceptance criteria.
Structured security analysis using OWASP Four-Question Framework and STRIDE methodology. Generates threat matrices with risk ratings, mitigations, and prioritization. Use for attack surface analysis, security architecture review, or when asking what can go wrong.
Guidance for using Forgetful semantic memory effectively. Applies Zettelkasten atomic memory principles. Use when deciding whether to query or create memories, structuring memory content, or understanding memory importance scoring.
Guidance for using Serena's LSP-powered symbol analysis. Use when exploring codebases, finding symbol definitions, tracing references, or when grep/text search would be imprecise.
Use when validating a PR title and description for conventional commit format, issue linking keywords, and template compliance before submission
Treat upstream validators as authoritative. Align local config to them. Use when validation fails unexpectedly, before modifying validator behavior, or when tempted to change upstream tool code.
Automates Windows container image migration for OneBranch pipelines. Bumps AdoPipelineGeneration package, regenerates pipeline configs via ConfigGen, and verifies old image reference is removed. Use for LTSC2019 to LTSC2022 migration, container image updates, OneBranch pipeline image upgrades.
Run a 5-layer interview to elicit how a team actually works (rhythms, decisions, dependencies, institutional knowledge, friction) and emit a structured operating model.
DEPRECATED: Workflow commands replaced by lifecycle commands (/spec, /plan, /build, /test, /review, /ship). Scripts in this directory may still be referenced. Use lifecycle commands for new work.
Twenty-minute diagnostic mapping a team to a world-model paradigm (vector DB, structured ontology, signal-fidelity). Use for AI-readiness assessments or for auditing where automated judgment is safe.
Complete project development toolkit: 23 agents, 23 slash commands, 29 lifecycle hooks, and 69 reusable skills for Claude Code workflows
The operational layer for coding agents. Bookkeeping, validation, and flows that compound knowledge between sessions.
Meta prompts that help you discover and generate curated GitHub Copilot agents, instructions, prompts, and skills.
Task-focused agents for test, review, debug, docs, CI, security, refactoring, research, performance, and search-replace — with teammate and subagent role guidance
Complete AI coding workflow system. Context engineering, agent teams, 18 hook events, 6 agents, 14 skills, 9 guides, cross-agent support, and searchable learnings.
Production-grade engineering skills for AI coding agents — covering the full software development lifecycle from spec to ship.