Platform Engineering: Internal Developer Platforms (IDP), CNCF Platform definition, Team Topologies, IDP components (Service Catalog, Self-Service Infra, Golden Paths, Developer Portal), platform maturity model, make-vs-buy (Backstage vs Port vs Cortex), adoption strategy, DORA correlation.
Reference for building Internal Developer Platforms (IDPs) — from strategy to implementation.
"Platform engineering is the discipline of designing and building toolchains and workflows that enable self-service capabilities for software engineering organizations." — CNCF Platform Engineering Working Group
Platform as a Product:
Platform Engineering vs. DevOps:
| Aspect | DevOps | Platform Engineering |
|---|---|---|
| Focus | Culture + collaboration | Tooling + self-service |
| Scope | Team practices | Cross-team infrastructure |
| Measurement | Process metrics | Developer Experience (DevEx) |
| Output | Cultural shift | Paved roads (Golden Paths) |
From Skelton & Pais — four team types:
┌──────────────────────────────────────────────────────────┐
│                   Stream-Aligned Teams                   │
│         (Product teams — build and run features)         │
│         ┌──────────┐  ┌──────────┐  ┌──────────┐         │
│         │  Team A  │  │  Team B  │  │  Team C  │         │
│         └──────────┘  └──────────┘  └──────────┘         │
├──────────────────────────────────────────────────────────┤
│                      Platform Team                       │
│ (Reduce cognitive load via self-service + Golden Paths)  │
│     ┌──────────────────────────────────────────────┐     │
│     │ Developer Portal (Backstage) + Infra + CI/CD │     │
│     └──────────────────────────────────────────────┘     │
├──────────────────────────────────────────────────────────┤
│ Enabling Team            │ Complicated-Subsystem Team    │
│ (Coaching, upskilling)   │ (ML platform, data mesh)      │
└──────────────────────────────────────────────────────────┘
Key principle: the platform team exists to reduce the cognitive load of stream-aligned teams. If teams must deeply understand the platform to use it, it's not a platform; it's a dependency.
Central inventory of all services, APIs, libraries, and teams.
Backstage catalog-info.yaml:
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: order-service
  description: Handles order creation, payment, and fulfillment
  annotations:
    github.com/project-slug: myorg/order-service
    pagerduty.com/integration-key: abc123
    sonarqube.org/project-key: order-service
  tags:
    - java
    - kafka
    - postgres
spec:
  type: service
  lifecycle: production
  owner: group:order-team
  system: ecommerce
  dependsOn:
    - resource:orders-db
    - resource:payments-queue
  providesApis:
    - order-api
  consumesApis:
    - payment-api
    - inventory-api
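The `owner` and `dependsOn` fields are what make the catalog queryable: once every component declares them, a question such as "which teams are affected if this queue goes down?" becomes a lookup. The following is a minimal sketch of that idea, assuming the entities have already been parsed into plain dicts. `billing-service` and its owner are hypothetical entries added for illustration, and the helper names are not part of any Backstage API.

```python
from collections import defaultdict

# Entities as plain dicts, mirroring the catalog-info.yaml fields.
# Assumption: billing-service is a hypothetical second component.
ENTITIES = [
    {
        "name": "order-service",
        "owner": "group:order-team",
        "dependsOn": ["resource:orders-db", "resource:payments-queue"],
    },
    {
        "name": "billing-service",
        "owner": "group:billing-team",
        "dependsOn": ["resource:payments-queue"],
    },
]


def dependents_by_resource(entities):
    """Invert dependsOn: map each dependency ref to the components using it."""
    index = defaultdict(list)
    for entity in entities:
        for ref in entity.get("dependsOn", []):
            index[ref].append(entity["name"])
    return dict(index)


def owners_affected_by(entities, ref):
    """Return the sorted, de-duplicated owners of components depending on ref."""
    return sorted(
        {e["owner"] for e in entities if ref in e.get("dependsOn", [])}
    )


if __name__ == "__main__":
    print(dependents_by_resource(ENTITIES))
    print(owners_affected_by(ENTITIES, "resource:payments-queue"))
```

In a real portal this lookup is answered by the catalog's own relation graph; the sketch only shows why consistent `owner` and `dependsOn` metadata is the prerequisite.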
What the catalog enables:
Developers create infrastructure via templates — no Ops ticket required.
# Backstage Scaffolder template — provision a new database
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: provision-postgres
  title: Provision PostgreSQL Database
spec:
  parameters:
    - title: Database Configuration
      properties:
        name:
          type: string
          description: Database name (will create myorg-{name}-db)
        environment:
          type: string
          enum: [dev, staging, production]
        size:
          type: string
          enum: [small, medium, large]
          description: "small: 10GB, medium: 100GB, large: 1TB"
  steps:
    - id: trigger-terraform
      name: Trigger Terraform
      action: github:actions:dispatch
      input:
        repoUrl: github.com?repo=infra&owner=myorg
        workflowId: provision-database.yml
        branchOrTagName: main
        workflowInputs:
          db_name: ${{ parameters.name }}
          env: ${{ parameters.environment }}
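Conceptually, the dispatch step amounts to one call to GitHub's "create a workflow dispatch event" REST endpoint. The sketch below builds that request without sending it, to show how the template inputs map onto the URL and body. It is an illustration under assumptions, not Backstage's implementation: the helper names are made up, and the host is assumed to be `github.com` with its API at `api.github.com`.

```python
import json
from urllib.parse import parse_qs


def parse_repo_url(repo_url):
    """Split a Scaffolder-style repoUrl such as 'github.com?repo=infra&owner=myorg'."""
    host, _, query = repo_url.partition("?")
    params = parse_qs(query)
    return host, params["owner"][0], params["repo"][0]


def build_dispatch_request(repo_url, workflow_id, ref, inputs):
    """Build (url, json_body) for a workflow_dispatch call; nothing is sent."""
    host, owner, repo = parse_repo_url(repo_url)
    # Assumption: public github.com, whose REST API is served from api.github.com.
    url = (
        f"https://api.{host}/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_id}/dispatches"
    )
    body = json.dumps({"ref": ref, "inputs": inputs})
    return url, body


if __name__ == "__main__":
    url, body = build_dispatch_request(
        "github.com?repo=infra&owner=myorg",
        "provision-database.yml",
        "main",
        {"db_name": "orders", "env": "dev"},
    )
    print("POST", url)
    print(body)
```

The target workflow has to declare a `workflow_dispatch` trigger with matching input names (`db_name`, `env`) for the dispatch to be accepted.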
Pre-built, opinionated templates for the most common service types:
Golden Path: NodeJS REST API
├── Repository template (Backstage Scaffolder)
├── Dockerfile (optimized, multi-stage)
├── GitHub Actions CI/CD (test → build → deploy)
├── Kubernetes manifests (Deployment, Service, HPA)
├── Observability (OpenTelemetry pre-wired)
├── catalog-info.yaml (pre-filled)
└── README with onboarding guide
Time from idea → running service: 10 minutes (vs. 2 weeks without)
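A Golden Path only stays "golden" if services keep its pieces after scaffolding. One way to watch for drift is a small conformance check over a repository's file list. The sketch below assumes a particular file layout: every pattern in `GOLDEN_PATH_FILES` is hypothetical and would need to match whatever the real template generates.

```python
from fnmatch import fnmatchcase

# Assumption: this layout is illustrative; adjust patterns to the real template.
GOLDEN_PATH_FILES = {
    "Dockerfile": "Dockerfile",
    "CI/CD workflow": ".github/workflows/*.yml",
    "Kubernetes manifests": "k8s/*.yaml",
    "catalog-info.yaml": "catalog-info.yaml",
    "README": "README.md",
}


def missing_golden_path_items(repo_files):
    """Return the Golden Path items with no matching file in repo_files."""
    return [
        item
        for item, pattern in GOLDEN_PATH_FILES.items()
        if not any(fnmatchcase(path, pattern) for path in repo_files)
    ]


if __name__ == "__main__":
    files = ["Dockerfile", "README.md", ".github/workflows/ci.yml", "src/index.js"]
    print(missing_golden_path_items(files))
```

A check like this could feed a scorecard in the portal rather than block merges, which keeps the path opinionated without making it mandatory.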
Single entry point for all developer tools and documentation:
| Section | What's there |
|---|---|
| Service Catalog | All services, APIs, teams |
| Templates | Golden Paths, database provisioning |
| Docs | TechDocs, architecture decisions |
| CI/CD | GitHub Actions status per service |
| Incidents | PagerDuty active incidents |
| Cost | AWS cost per team/service |
Standardized logs/metrics/traces:
| Level | Description | Indicators |
|---|---|---|
| 1 — Reactive | No platform team, Ops does everything manually | Tickets for every deployment, weeks to provision DB |
| 2 — Managed | Shared infra, but still manual processes | Same tools, some automation, but requires Ops help |
| 3 — Self-Service | Teams deploy without Ops tickets | Golden Paths exist, 80%+ self-service |
| 4 — Ecosystem | Platform itself is extensible by teams | Teams contribute plugins, templates, feedback loop |
Quick assessment:
Q1: How long to create a new database in production? (hours → days = Level 1-2)
Q2: How long to onboard a new engineer to their first commit? (days → weeks = Level 1)
Q3: Can teams deploy without opening an Ops ticket? (no = Level 1-2)
Q4: Do teams know who owns a service that's causing issues? (no = Level 1-2)
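The level indicators can be turned into a rough self-assessment. The sketch below is one possible encoding, not an official scoring model: it assumes three inputs (whether a platform team exists, the share of routine requests teams complete without an Ops ticket, and whether teams contribute plugins or templates) and uses the 80% self-service threshold from the table.

```python
def maturity_level(has_platform_team, self_service_ratio, teams_contribute):
    """Estimate platform maturity (1-4) from the table's indicators.

    self_service_ratio: fraction (0.0-1.0) of routine requests, such as
    deployments or database provisioning, done without an Ops ticket.
    """
    if not 0.0 <= self_service_ratio <= 1.0:
        raise ValueError("self_service_ratio must be between 0.0 and 1.0")
    if not has_platform_team:
        return 1  # Reactive: Ops handles everything manually
    if self_service_ratio < 0.8:
        return 2  # Managed: shared infra, still needs Ops help
    if not teams_contribute:
        return 3  # Self-Service: 80%+ self-service via Golden Paths
    return 4  # Ecosystem: teams extend the platform themselves


if __name__ == "__main__":
    print(maturity_level(True, 0.5, False))
    print(maturity_level(True, 0.9, True))
```

Treating contribution as meaningful only once self-service is in place is a design choice of this sketch: a team that contributes templates but still files tickets for most work is scored as Level 2.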
| Tool | Type | Strengths | Weaknesses | Cost |
|---|---|---|---|---|
| Backstage | OSS (self-hosted) | Fully customizable, huge ecosystem, CNCF project | High maintenance, requires dedicated team | Infrastructure + team time |
| Port | SaaS | Fast setup, good UX, flexible data model | Cost at scale, vendor lock-in | ~$10-20/dev/mo |
| Cortex | SaaS | Strong scorecards/standards enforcement | Less flexible catalog | ~$15-25/dev/mo |
| OpsLevel | SaaS | Good maturity tracking | Smaller ecosystem | ~$15/dev/mo |
| Roadie | Hosted Backstage | Backstage UX without maintenance burden | Still expensive | ~$25/dev/mo |
Decision framework:
< 20 engineers + fast time-to-value needed → Port or Cortex (SaaS)
20-100 engineers + Kubernetes-heavy + custom needs → Backstage (self-hosted)
> 100 engineers + large existing k8s infra → Backstage or Roadie
Compliance-heavy (HIPAA, SOC2) → Self-hosted Backstage
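The four rules can be written as a small function, which also exposes what they leave open. This is a sketch under assumptions: the compliance rule is checked first (the framework does not state a precedence), "large existing k8s infra" is folded into a single `kubernetes_heavy` flag, and combinations that no rule covers return an empty list rather than a guess.

```python
def recommend_portal(
    engineers,
    fast_time_to_value=False,
    kubernetes_heavy=False,
    custom_needs=False,
    compliance_heavy=False,
):
    """Return the candidate tools for a given organization profile."""
    if engineers < 0:
        raise ValueError("engineers must be non-negative")
    # Assumption: compliance requirements override the size-based rules.
    if compliance_heavy:
        return ["Backstage (self-hosted)"]
    if engineers < 20 and fast_time_to_value:
        return ["Port", "Cortex"]
    if 20 <= engineers <= 100 and kubernetes_heavy and custom_needs:
        return ["Backstage (self-hosted)"]
    if engineers > 100 and kubernetes_heavy:
        return ["Backstage", "Roadie"]
    return []  # no rule in the framework covers this profile


if __name__ == "__main__":
    print(recommend_portal(10, fast_time_to_value=True))
    print(recommend_portal(500, kubernetes_heavy=True))
    print(recommend_portal(50))
```

An empty result, for example 50 engineers without heavy Kubernetes use, is a cue to weigh the strengths and weaknesses in the comparison table directly.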
Platform engineering targets each of the four DORA metrics:
| DORA Metric | Platform Improvement |
|---|---|
| Deployment Frequency | Self-service CI/CD templates → teams deploy more often |
| Lead Time | Golden Paths remove setup friction → faster first deploy |
| Change Failure Rate | Standardized configs/tests → fewer config mistakes |
| MTTR | Unified observability + ownership in catalog → faster diagnosis |
"Teams using IDPs deploy 2.1× more frequently and have 40% shorter lead times." — Puppet State of DevOps 2023
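To check whether a platform is moving these numbers, the four metrics have to be computed from deployment data before and after adoption. The sketch below shows one way to do that, assuming a hypothetical `Deployment` record with commit, deploy, and restore timestamps. Time to restore is measured from the failed deployment, which is a simplification; incident-based tracking would use the incident's start time instead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def dora_metrics(deployments, window_days):
    """Compute the four DORA metrics over a window of deployments."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deployments if d.failed]
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at
    ]
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    return {
        "deployment_frequency_per_day": len(deployments) / window_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        # None when no failed deployment has been restored in the window.
        "mean_time_to_restore": (
            sum(restore_times, timedelta()) / len(restore_times)
            if restore_times
            else None
        ),
    }


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 9, 0)
    sample = [
        Deployment(t0, t0 + timedelta(hours=1)),
        Deployment(t0, t0 + timedelta(hours=3), failed=True,
                   restored_at=t0 + timedelta(hours=3, minutes=30)),
    ]
    print(dora_metrics(sample, window_days=1))
```

Comparing the same metrics for teams on and off the Golden Paths gives a more direct signal of platform impact than an organization-wide average.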
The #1 platform failure mode: build it, mandate it, watch teams route around it.
What works:
Anti-patterns:
backstage-patterns — catalog YAML, Scaffolder templates, plugins, TechDocs
engineering-metrics — DORA metrics for measuring platform impact
dora-implementation — technical setup for DORA tracking