From vfm-agent-company
Site Reliability Engineering practices from Google, the company that invented SRE. Master SLOs, error budgets, incident response, and toil elimination. Use when designing reliable systems, implementing SRE practices, or improving operational excellence. Learn from the team that runs Google Search, Gmail, and YouTube at billion-user scale.
npx claudepluginhub duylinhdang1998/claude-template-agent --plugin vfm-agent-company
This skill uses the workspace's default tool permissions.
**Expert**: Alex Kim (Google SRE, 11 years)
Level: 10/10 - Google invented SRE
Site Reliability Engineering (SRE), as practiced at Google, is what happens when you ask a software engineer to design an operations team. It is not traditional ops or DevOps: it applies software engineering to infrastructure and operations problems.
Google runs services for billions of users (Search, Gmail, YouTube, Maps) with 99.99%+ uptime. These practices made that possible.
100% uptime is the wrong target. Use error budgets to balance reliability vs velocity.
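As a rough illustration of the error budget arithmetic, the sketch below converts an availability SLO into allowed downtime and reports how much of a request-based budget is left. The function names, the 30-day window, and the sample numbers are assumptions made for this example, not part of any Google tooling.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_remaining(slo: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the request-based error budget still unspent.

    1.0 means nothing is spent, 0.0 means the budget is exhausted,
    and a negative value means the SLO has been violated.
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    allowed_failures = (1.0 - slo) * total_requests
    if allowed_failures == 0:
        raise ValueError("no requests in the window, budget is undefined")
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # 250 failures out of 1,000,000 requests spends about a quarter of the budget.
    print(round(budget_remaining(0.999, 1_000_000, 250), 2))
```

One common policy built on this number: while budget remains, ship features; once it is exhausted, prioritize reliability work until the window recovers.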
Define and measure service quality with service level indicators (SLIs), service level objectives (SLOs), and service level agreements (SLAs).
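To make the distinction concrete: an SLI is a measured ratio, an SLO is the internal target for that ratio, and an SLA is the external contract that usually sits at or below the SLO. The minimal sketch below computes an availability SLI and checks it against an SLO. The `Slo` class, the function names, and the 99.5% target are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """An internal reliability target for one SLI (hypothetical structure)."""
    name: str
    target: float  # e.g. 0.995 means 99.5% of requests must be good


def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI as the ratio of good events to all valid events."""
    if total_requests == 0:
        # Assumption for this sketch: no traffic counts as meeting the objective.
        return 1.0
    return good_requests / total_requests


def meets_slo(sli: float, slo: Slo) -> bool:
    """True when the measured SLI is at or above the SLO target."""
    return sli >= slo.target


if __name__ == "__main__":
    slo = Slo(name="checkout availability", target=0.995)
    sli = availability_sli(good_requests=9_990, total_requests=10_000)
    print(sli, meets_slo(sli, slo))
```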
Automate manual, repetitive work. Keep toil below 50% of each engineer's time.
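One simple way to track the toil target is to log hours by task and compute the share that counts as toil. The sketch below assumes a hypothetical time log keyed by task name. Which tasks count as toil is a team judgment, passed in here as a set.

```python
def toil_share(hours_by_task: dict[str, float], toil_tasks: set[str]) -> float:
    """Fraction of logged hours spent on tasks classified as toil."""
    total = sum(hours_by_task.values())
    if total == 0:
        return 0.0
    toil = sum(h for task, h in hours_by_task.items() if task in toil_tasks)
    return toil / total


if __name__ == "__main__":
    # Hypothetical one-week log for a single engineer.
    week = {"manual deploys": 10.0, "ticket triage": 6.0, "automation work": 24.0}
    share = toil_share(week, toil_tasks={"manual deploys", "ticket triage"})
    print(f"toil share: {share:.0%}, within target: {share < 0.5}")
```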
Alert on symptoms (user-facing), not causes. Use the four golden signals: latency, traffic, errors, and saturation.
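A symptom-based check might look like the sketch below, which evaluates a snapshot of the four golden signals (latency, traffic, errors, saturation) against thresholds. The `GoldenSignals` structure and every threshold value are assumptions for illustration. Real thresholds should come from the service's SLOs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoldenSignals:
    """One snapshot of the four golden signals for a service."""
    latency_p99_ms: float  # latency: 99th percentile request time
    traffic_rps: float     # traffic: requests per second
    error_rate: float      # errors: failed requests / total requests
    saturation: float      # saturation: utilization of the tightest resource, 0..1


def symptom_alerts(
    s: GoldenSignals,
    max_latency_ms: float = 500.0,  # hypothetical threshold
    max_error_rate: float = 0.01,   # hypothetical threshold
    max_saturation: float = 0.9,    # hypothetical threshold
) -> list[str]:
    """Return the names of the user-facing symptoms that are out of bounds.

    Traffic is recorded for context but not alerted on in this sketch,
    since high traffic by itself is not a user-visible failure.
    """
    alerts = []
    if s.error_rate > max_error_rate:
        alerts.append("errors")
    if s.latency_p99_ms > max_latency_ms:
        alerts.append("latency")
    if s.saturation > max_saturation:
        alerts.append("saturation")
    return alerts


if __name__ == "__main__":
    snapshot = GoldenSignals(latency_p99_ms=820.0, traffic_rps=1200.0,
                             error_rate=0.002, saturation=0.95)
    print(symptom_alerts(snapshot))
```

Note that nothing here inspects causes such as a specific host's CPU or a restarted process. Those belong in dashboards used during diagnosis, not in pages.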
Run blameless postmortems, define clear escalation paths, and reduce mean time to recovery (MTTR).
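MTTR is only useful if it is computed consistently. The sketch below takes it as the mean of (resolved time minus detected time) across incidents. That definition and the incident tuples are assumptions for this example, since some teams measure from the start of impact instead.

```python
from datetime import datetime, timedelta


def mttr(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """Mean time to recovery over (detected_at, resolved_at) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    for detected_at, resolved_at in incidents:
        if resolved_at < detected_at:
            raise ValueError("incident resolved before it was detected")
    total = sum((end - start for start, end in incidents), timedelta())
    return total / len(incidents)


if __name__ == "__main__":
    # Hypothetical incidents: one lasting 30 minutes, one lasting 90 minutes.
    history = [
        (datetime(2026, 1, 5, 10, 0), datetime(2026, 1, 5, 10, 30)),
        (datetime(2026, 1, 19, 22, 15), datetime(2026, 1, 19, 23, 45)),
    ]
    print(mttr(history))  # mean of 30 and 90 minutes
```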
Plan for growth, forecast demand, optimize resource usage.
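A very simple demand forecast fits a straight line to recent peak usage and estimates when it will cross provisioned capacity minus a safety headroom. The sketch below does this with ordinary least squares. The linear-growth assumption, the 20% headroom, and the sample numbers are all illustrative, and real forecasts usually need to account for seasonality and launches.

```python
from typing import Optional


def linear_fit(ys: list[float]) -> tuple[float, float]:
    """Least-squares slope and intercept for ys observed at x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_capacity(
    usage: list[float], capacity: float, headroom: float = 0.2
) -> Optional[float]:
    """Periods from the latest observation until fitted usage reaches the limit.

    The limit is capacity * (1 - headroom). Returns 0.0 if the fitted line is
    already at the limit, and None if usage is flat or shrinking.
    """
    slope, intercept = linear_fit(usage)
    limit = capacity * (1.0 - headroom)
    latest = len(usage) - 1
    if slope * latest + intercept >= limit:
        return 0.0
    if slope <= 0:
        return None
    return (limit - intercept) / slope - latest


if __name__ == "__main__":
    # Hypothetical monthly peak usage, in the same unit as capacity (e.g. k requests/s).
    monthly_peak = [100.0, 110.0, 120.0, 130.0]
    print(periods_until_capacity(monthly_peak, capacity=200.0))
```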
SRE practices power:
Last Updated: 2026-02-03