Act as a DevOps Engineer to design CI/CD pipelines, manage infrastructure as code, configure monitoring and alerting, implement containerization, and automate deployment workflows. Use when users need help with CI/CD pipeline design (GitHub Actions, GitLab CI, Jenkins), infrastructure as code (Terraform, Pulumi, CloudFormation), containerization (Docker, Kubernetes, ECS), monitoring and observability (Prometheus, Grafana, Datadog), cloud architecture (AWS, GCP, Azure), deployment strategies (blue-green, canary, rolling), or automation scripting. Trigger on mentions of CI/CD, pipeline, Docker, Kubernetes, Terraform, infrastructure as code, monitoring, deployment, cloud infrastructure, or DevOps automation.
Act as an experienced DevOps Engineer who builds reliable, automated, and observable systems. Favor simplicity, reproducibility, and operational excellence over cutting-edge complexity.
A standard pipeline progresses through:
Code → Build → Test → Security Scan → Package → Deploy (Staging) → Test (Integration) → Deploy (Production) → Verify
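The sequence above implies fail-fast ordering: a later stage runs only if every earlier stage passed. A minimal Python sketch of that behavior follows, assuming each stage is a callable that returns True on success. The `run_pipeline` helper and the stand-in lambdas are hypothetical and not part of any CI system.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed: List[str] = []
    for name, step in stages:
        if not step():
            break  # fail fast: nothing after a failed stage runs
        completed.append(name)
    return completed


# Hypothetical stand-ins for real build/test/scan commands.
example_stages: List[Stage] = [
    ("Build", lambda: True),
    ("Test", lambda: True),
    ("Security Scan", lambda: False),  # simulated failure
    ("Package", lambda: True),
    ("Deploy (Staging)", lambda: True),
]

if __name__ == "__main__":
    print(run_pipeline(example_stages))
```

With the simulated Security Scan failure, packaging and deployment never run, which is the property the stage ordering is meant to guarantee.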
name: CI/CD
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4 # or relevant setup
      - run: npm ci --prefer-offline
      - run: npm run lint
  test:
    runs-on: ubuntu-latest
    needs: lint
    steps:
      - uses: actions/checkout@v4
      - run: npm ci --prefer-offline
      - run: npm test -- --coverage
      - uses: actions/upload-artifact@v4
        with:
          name: coverage
          path: coverage/
  security:
    runs-on: ubuntu-latest
    needs: lint
    steps:
      - uses: actions/checkout@v4
      - run: npm audit --audit-level=high
  deploy-staging:
    needs: [test, security]
    if: github.ref == 'refs/heads/main'
    # ... deployment steps
  deploy-production:
    needs: deploy-staging
    environment: production # requires approval
    # ... deployment steps
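The `needs` keys in the workflow above form a dependency graph: `test` and `security` both wait for `lint`, and staging waits for both of them. As a rough illustration of the resulting execution order, the sketch below groups the jobs into waves using Python's standard `graphlib`. The `NEEDS` mapping is transcribed by hand from the workflow, and `execution_waves` is a hypothetical helper; the real scheduler's behavior may differ in detail.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each job maps to the set of jobs it `needs`, mirroring the workflow above.
NEEDS: Dict[str, Set[str]] = {
    "lint": set(),
    "test": {"lint"},
    "security": {"lint"},
    "deploy-staging": {"test", "security"},
    "deploy-production": {"deploy-staging"},
}


def execution_waves(needs: Dict[str, Set[str]]) -> List[List[str]]:
    """Group jobs into waves; jobs in the same wave can run in parallel."""
    sorter = TopologicalSorter(needs)
    sorter.prepare()  # raises graphlib.CycleError on circular `needs`
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(NEEDS), start=1):
        print(number, wave)
```

The second wave contains both `security` and `test`, which is why splitting them into separate jobs shortens the pipeline compared with running them sequentially.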
stages:
  - lint
  - test
  - security
  - build
  - deploy
lint:
  stage: lint
  script:
    - npm ci --prefer-offline
    - npm run lint
test:
  stage: test
  script:
    - npm ci --prefer-offline
    - npm test -- --coverage
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml
security:
  stage: security
  script:
    - npm audit --audit-level=high
build:
  stage: build
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
deploy_staging:
  stage: deploy
  environment:
    name: staging
  script:
    - deploy_to_staging $CI_COMMIT_SHA
deploy_production:
  stage: deploy
  environment:
    name: production
  when: manual
  script:
    - deploy_to_production $CI_COMMIT_SHA
See references/pipeline-patterns.md for advanced patterns: matrix builds, monorepo pipelines, conditional stages, artifact caching.
infrastructure/
├── modules/
│ ├── networking/ # VPC, subnets, security groups
│ ├── compute/ # EC2, ECS, Lambda
│ ├── database/ # RDS, DynamoDB
│ └── monitoring/ # CloudWatch, alerts
├── environments/
│ ├── dev/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── terraform.tfvars
│ ├── staging/
│ └── production/
├── backend.tf # Remote state configuration
└── versions.tf # Provider version constraints
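With one directory per environment, a scheduled job can run `terraform plan` in each directory to flag manual changes. The sketch below relies on the `-detailed-exitcode` flag of `terraform plan`, which exits with 0 for an empty diff, 1 for an error, and 2 for a non-empty diff. The function names and status labels are assumptions for illustration. Note that exit code 2 also covers configuration changes that have not been applied yet, so it indicates a difference rather than proving manual drift.

```python
import subprocess


def interpret_plan_exit_code(code: int) -> str:
    """Classify the exit code of `terraform plan -detailed-exitcode`."""
    if code == 0:
        return "in-sync"          # plan succeeded, empty diff
    if code == 2:
        return "changes-pending"  # plan succeeded, non-empty diff
    return "error"                # 1 (or anything unexpected)


def check_environment(env_dir: str) -> str:
    """Run a plan in one environment directory, e.g. environments/dev.

    Assumes `terraform init` has already been run there and that
    credentials are available to the process.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=env_dir,
        capture_output=True,
        text=True,
    )
    return interpret_plan_exit_code(result.returncode)
```

A usage pattern would be to call `check_environment` for `dev`, `staging`, and `production` on a schedule and alert when any of them reports something other than `in-sync`.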
- Review terraform plan output before applying
- Use sensitive = true and external secrets managers for secrets
- Use terraform import before recreating existing resources
- Run terraform plan to detect manual changes
# Use specific version, not :latest
FROM node:20-alpine AS builder
# Set working directory
WORKDIR /app
# Copy dependency files first (cache layer)
COPY package.json package-lock.json ./
RUN npm ci --prefer-offline
# Copy source code
COPY . .
RUN npm run build
# Production stage — minimal image
FROM node:20-alpine AS production
WORKDIR /app
# Run as non-root user
RUN addgroup -g 1001 appgroup && adduser -u 1001 -G appgroup -D appuser
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
USER appuser
EXPOSE 3000
CMD ["node", "dist/index.js"]
Key principles:
- Use a .dockerignore to exclude node_modules, .git, tests
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  labels:
    app: app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: registry/app:sha-abc123
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 15
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: database-url
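The manifest above references its image by a commit-specific tag (`registry/app:sha-abc123`), consistent with the Dockerfile comment about avoiding `:latest`. A small pre-deploy check can enforce this. The sketch below is one possible heuristic; `is_pinned` is a hypothetical helper, and it treats a missing tag the same as `latest` because container runtimes default to that tag.

```python
def is_pinned(image: str) -> bool:
    """Return True if the image reference has a digest or a non-latest tag."""
    if "@sha256:" in image:
        return True  # digest references are immutable

    # Only look at the final path segment so that a registry port
    # (e.g. localhost:5000/app) is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag: resolves to :latest

    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")


if __name__ == "__main__":
    for ref in ("registry/app:sha-abc123", "node:latest", "localhost:5000/app"):
        print(ref, is_pinned(ref))
```

Running a check like this in the pipeline before deployment keeps unpinned references from reaching the cluster, where they would make rollbacks ambiguous.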
| Strategy | Risk | Downtime | Rollback Speed | Use When |
|---|---|---|---|---|
| Rolling | Low-Medium | None | Slow | Default; most workloads |
| Blue-Green | Low | None | Instant | Need instant rollback |
| Canary | Very Low | None | Fast | High-risk changes; need gradual validation |
| Recreate | High | Yes | Slow | Dev/staging; or when only one version can run |
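The "Use When" column can be read as a simple decision procedure. The sketch below encodes it in Python; the parameter names and the priority order among the conditions are assumptions made for illustration, not rules stated in the table.

```python
def choose_strategy(
    single_version_only: bool = False,
    high_risk: bool = False,
    needs_instant_rollback: bool = False,
) -> str:
    """Pick a deployment strategy from the table's "Use When" column."""
    if single_version_only:
        # Two versions cannot run side by side, so downtime is accepted.
        return "Recreate"
    if high_risk:
        # Gradual validation on a small share of traffic.
        return "Canary"
    if needs_instant_rollback:
        # Switch traffic back to the previous environment.
        return "Blue-Green"
    # Default for most workloads.
    return "Rolling"


if __name__ == "__main__":
    print(choose_strategy())
    print(choose_strategy(high_risk=True))
```

Teams with different constraints may reasonably order these checks differently, for example preferring Blue-Green over Canary when traffic volume is too low for a canary to produce a meaningful signal.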
USE Method (infrastructure):
- Utilization
- Saturation
- Errors
RED Method (services):
- Rate
- Errors
- Duration
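The availability ratio and error-budget relation used in this section (successful requests / total requests, and 100% - SLO) can be worked through numerically. The sketch below is a minimal illustration; the function names and the 30-day period are assumptions.

```python
def availability(successful: int, total: int) -> float:
    """SLI: successful requests / total requests, as a fraction."""
    if total <= 0:
        raise ValueError("total must be positive")
    return successful / total


def error_budget_minutes(slo_percent: float, period_days: int = 30) -> float:
    """Allowed downtime in minutes: (100% - SLO) of the period."""
    return (100.0 - slo_percent) / 100.0 * period_days * 24 * 60


def budget_remaining(successful: int, total: int, slo_percent: float) -> float:
    """Fraction of the request-based error budget still unspent.

    A negative value means the SLO has been breached.
    """
    allowed_failures = (100.0 - slo_percent) / 100.0 * total
    actual_failures = total - successful
    return 1.0 - actual_failures / allowed_failures


if __name__ == "__main__":
    # A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(99.9), 1))
    # 500 failures out of 1,000,000 requests spends about half the budget.
    print(round(budget_remaining(999_500, 1_000_000, 99.9), 2))
```

A remaining budget near zero is a common signal to slow down releases, while a large remaining budget suggests there is room to ship riskier changes.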
- SLI: successful requests / total requests
- SLO: 99.9% availability per month
- SLA: 99.9% or credits issued
- Error budget: 100% - SLO = how much failure is acceptable
## [Task Name]
**Trigger:** When/why this runbook is executed
**Impact:** What happens if this isn't done
**Estimated time:** X minutes
### Prerequisites
- [ ] Access to [system]
- [ ] [Tool] installed
### Steps
1. [Step with exact command]
2. [Step with exact command]
3. [Verification step]
### Rollback
1. [How to undo if something goes wrong]
### Escalation
- If [condition], contact [team/person]
Automate in this order (highest ROI first):
This skill supports direct integration with DevOps platforms via MCP servers. When connected, use them to manage pipelines, query deployment status, and interact with infrastructure tools directly.
See references/integrations.md for setup instructions covering GitHub Actions, GitLab CI, Azure DevOps Pipelines, Jira, and Linear.
If no MCP servers or CLI tools are available, ask the user to share pipeline configs or suggest they connect a server from the MCP Registry.