Use to answer "where does service X run / which cluster / which namespace / what image / what's actually deployed" for Twig backend services and host apps. Covers Twig's eight EKS clusters, the NGSS vs OSM vs TAP service families, the per-repo Helm chart pattern, the Jenkins shared pipeline (`TwigWorld/docker-pipeline`), and the FluxCD GitOps repo (`TwigWorld/dip-infra`). This skill teaches the lookup procedure; it does NOT cache live state. Always derive volatile facts (replica counts, image tags, ingress hosts, what's currently running) from the source: `values.<env>.yaml`, the gitops repo, or `kubectl`. Triggers on cluster names (`twig-internal`, `twig-dev`, `twig-ngss`, `twig-tap-us`, `twig-osm-us`/`uk`/`eu`, `twig-sandbox`), agent labels (`K8S_QA`, `K8S_STAGING`, `K8S_NGSS_PROD`), and questions like "which cluster runs X", "what namespace is Y in", "what's the image for Z", "where do twig charts deploy".
Slash command: `/twig:k8s-services`
This skill is a **how-to-look-this-up procedure**, not a frozen inventory. Replica counts, image tags, ingress hostnames, and "is service X currently running" change every release — answer those by reading `values.<env>.yaml`, the FluxCD gitops repo, or `kubectl`, **never from this skill's mental model**.
What's worth pinning here are the slow-moving facts: which clusters exist, the per-family conventions, the deploy pipeline's structure, and the non-obvious gotchas we've discovered.
Use this skill when the user asks:

- "Where does `<service>` run?" / "Which cluster hosts X?"
- "What's in `twig-internal` / `twig-ngss` / `twig-tap-us` / etc.?"
- "What image does `<service>` build?"
- "What namespace is `<service>` in?"
- "Why doesn't this repo have a `charts/` folder?"
- "What is `K8S_NGSS_PROD` / `K8S_QA` / `K8S_STAGING`?"
- "What's running in `<cluster>`?"
- Anything about `twig-internal`'s redirect namespaces.

For CMS-specific architecture (which service owns a GraphQL field), defer to the `content-delivery` skill; this skill is purely about where things run.
Twig runs eight EKS clusters, one per AWS account:
| Cluster | Account | Region | Role | EKS cluster name |
|---|---|---|---|---|
| `twig-internal` | 874528425052 | eu-west-1 | OSM legacy + shared platform + brand redirects | `twig-internal-0prPwPXu` |
| `twig-dev` | 743504312842 | eu-west-1 | NGSS qa + staging | `twig-dev-5GghY4oJ` |
| `twig-ngss` | 785933498059 | us-west-2 | NGSS production | `twig-ngss-7jrreD6K` |
| `twig-tap-us` | 285549682109 | us-west-2 | TAP product (US) | `twig-tap-us-jbDwRIPu` |
| `twig-osm-us` | 053695247981 | us-west-2 | OSM production (US region) | (not yet inventoried) |
| `twig-osm-uk` | 168230745404 | eu-west-2 | OSM production (UK region) | (not yet inventoried) |
| `twig-osm-eu` | 175425057340 | eu-west-1 | OSM production (EU region) | (not yet inventoried) |
| `twig-sandbox` | 299035209821 | eu-west-1 | sandbox | (not yet inventoried) |
Source of truth for the cluster list: ~/code/other/infra/README.md (terragrunt profile section) and ~/code/other/infra/terraform/stages/ directory.
kubectl context names depend on how you generated them. aws eks update-kubeconfig builds context names differently based on the flags used:

- `--profile <profile>` (no alias) → context is the cluster ARN, e.g. `arn:aws:eks:us-west-2:785933498059:cluster/twig-ngss-7jrreD6K`. This is what you get on a fresh setup. Long but unambiguous.
- `--alias <name>` → context is exactly `<name>`. Use this for short, memorable contexts.
- `--role-arn <arn>` → context is `<cluster-name>@<account>_<role>`, e.g. `twig-ngss-7jrreD6K@785933498059_IL-PowerDeveloper`. This is the form some older Twig setups use.

Always run `kubectl config get-contexts -o name | grep -i twig` to list what's actually on the current machine; don't guess the format. Aliases and ARN forms can coexist for the same cluster.
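
As a rough illustration of the three context-name shapes described above, the sketch below labels each context by its form. The function names `twig_context_kind` and `twig_label_contexts` are hypothetical helpers, not part of any Twig tooling, and the patterns assume the shapes are exactly as listed here.

```shell
# Classify a kubectl context name by its shape.
# Hypothetical helper: assumes only the three forms described in this section.
twig_context_kind() {
  case "$1" in
    arn:aws:eks:*:cluster/*) echo "arn" ;;    # default form: the cluster ARN
    *@*_*)                   echo "role" ;;   # <cluster-name>@<account>_<role>
    *)                       echo "alias" ;;  # anything else is treated as an alias
  esac
}

# Read context names on stdin and print "<kind><TAB><context>" for each.
twig_label_contexts() {
  while IFS= read -r ctx; do
    printf '%s\t%s\n' "$(twig_context_kind "$ctx")" "$ctx"
  done
}
```

Typical use would be `kubectl config get-contexts -o name | grep -i twig | twig_label_contexts`, which makes it easier to spot when the same cluster appears under more than one form.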
Generating a context for a cluster you don't have yet:
# Long ARN-style context (default):
aws eks update-kubeconfig --name twig-ngss-7jrreD6K --region us-west-2 \
--profile 785933498059_IL-PowerDeveloper
# Short alias (recommended for daily use):
aws eks update-kubeconfig --name twig-ngss-7jrreD6K --region us-west-2 \
--profile 785933498059_IL-PowerDeveloper --alias twig-ngss
Required SSO permission set: IL-PowerDeveloper. All Twig EKS access goes through SSO permission sets in the cluster's account, and the only one wired up to actually authenticate against the cluster is IL-PowerDeveloper. Other commonly-listed permission sets (Developer, IL-Administrator) will appear in kubectl config get-contexts output but fail with ForbiddenException: No access when used. Always pick the ..._IL-PowerDeveloper context.
Logging in (one command for all Twig accounts):
aws sso login --sso-session il-sso
All <account>_IL-PowerDeveloper profiles in ~/.aws/config share sso_session = il-sso, so a single login refreshes credentials for every Twig account at once. You do not need to run aws sso login --profile <each> per account. (The il-sso session points at https://d-9a67047592.awsapps.com/start#.)
Per-profile login (only if your config doesn't use a shared sso_session):
aws sso login --profile 785933498059_IL-PowerDeveloper # for twig-ngss
aws sso login --profile 743504312842_IL-PowerDeveloper # for twig-dev
aws sso login --profile 874528425052_IL-PowerDeveloper # for twig-internal
# ...etc per account
If you don't see IL-PowerDeveloper contexts on your machine:

1. Confirm your SSO user has `IL-PowerDeveloper` in the relevant account (874528425052 / 743504312842 / 785933498059 / 285549682109 / 053695247981 / 168230745404 / 175425057340 / 299035209821). If it doesn't, that's an access request, not a config issue.
2. Log in: `aws sso login --sso-session il-sso`.
3. Generate the context, e.g. `aws eks update-kubeconfig --name twig-internal-0prPwPXu --region eu-west-1 --profile 874528425052_IL-PowerDeveloper`.

If `kubectl` still fails after this, fall back to chart-derived data or to the FluxCD gitops repo (see Q2).
These are categories, not service lists. To get the list of services in a cluster, run kubectl get deploy -A against it (when access permits) or read ~/code/other/infra/gitops/clusters/<cluster>/.
| Cluster | What lives there |
|---|---|
| `twig-internal` | OSM family application services (e.g. `tocs` and its celery/heavy/tree/flower variants in the `production` namespace), plus ~30 brand-redirect namespaces (`clickview-*`, `tigtag-*`, `twig-prep-redirects`, `twig-secondary-redirects`, `twigsciencereporter-redirect`, `aksorn-redirect`), plus all the cluster's shared platform tooling: FluxCD (`flux-system`), Karpenter, ChartMuseum, cert-manager, external-dns, ingress-nginx (internal + external), Jenkins agents (`swarm-k8s-jenkins-slave`), monitoring (Prometheus / Thanos / Grafana), MongoDB operator, NewRelic, Retool, Sorry-Cypress (Cypress dashboard), releases-dashboard, zpa-connector. Do not assume a "service" you find here is application code: most of `twig-internal`'s deployments are infra. |
| `twig-dev` | NGSS family non-prod environments. Standard namespaces are `qa` and `staging`; some repos add `review` (per-PR review apps), `preview`, `ci` (test runs), or other env-named namespaces. |
| `twig-ngss` | NGSS family production. Default namespace is `production`; some repos add additional prod-side namespaces (e.g. `tsc-react` runs a `translate` namespace here for the translation workflow). |
| `twig-tap-us` | TAP product workloads (e.g. `k8s-tap-fe-service`). |
| `twig-osm-{us,uk,eu}` | OSM regional production: TigTag, Twig World, and other legacy-brand application services per region. Most are not in `~/code/services/` (they're owned by the OSM team). |
| `twig-sandbox` | One-off experiments. |
Twig has three deploy conventions. The family decides the image registry path, the namespace pattern, the Jenkins agent label, and ultimately which cluster runs the workload.
| Family | Image registry path | Charts location | Jenkins agent label by env | Production cluster |
|---|---|---|---|---|
| NGSS | 817276302724.dkr.ecr.eu-west-1.amazonaws.com/ngss/<repo> | charts/<repo>/ in the service repo | K8S_QA / K8S_STAGING / K8S_NGSS_PROD (most repos), or a literal cluster-name label like twig-ngss-1 (some host-apps) — read deploy.groovy to confirm | twig-ngss |
| OSM | 817276302724.dkr.ecr.eu-west-1.amazonaws.com/osm/<repo> | charts/<repo>/ in the service repo (legacy repo_chart.yaml for some) | varies | twig-internal (and OSM regionals) |
| TAP | varies | k8s-tap-fe-service/charts/ (chart-only sibling pattern) | varies | twig-tap-us |
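
The registry-path column in the table above can be turned into a quick check. The sketch below is a hypothetical helper (`twig_family_from_image` is not an existing Twig script) that reads the family off an image reference. It only recognises the `ngss/` and `osm/` prefixes from the table; TAP registry paths vary, so they fall through to `other`.

```shell
# Infer the Twig service family from an image reference.
# Hypothetical helper: only the ngss/ and osm/ prefixes are taken from the table.
twig_family_from_image() {
  case "$1" in
    817276302724.dkr.ecr.eu-west-1.amazonaws.com/ngss/*) echo "NGSS" ;;
    817276302724.dkr.ecr.eu-west-1.amazonaws.com/osm/*)  echo "OSM" ;;
    *) echo "other" ;;  # TAP or shared platform tooling: check the chart by hand
  esac
}
```

An `other` result is not an error; it just means the image string alone does not settle the family and `deploy.groovy` should be read instead.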
Identifying the family for a repo:

- Open `values.yaml` and look at the image repository string. The `ngss/` vs `osm/` prefix is decisive.
- The `FAMILY` parameter in the repo's `deploy.groovy` is also explicit (e.g. `FAMILY = 'NGSS'`).

ECR / SSM / IRSA accounts (consistent across NGSS):

- ECR: 817276302724 (eu-west-1)
- SSM: 817276302724 (config under `/NGSS/<svc>/<env>/`)
- IRSA, qa and staging: 743504312842, same as `twig-dev`
- IRSA, production: 785933498059, same as `twig-ngss`

Three patterns to recognise in `~/code/services/` and `~/code/host-apps/`:

Pattern 1: chart in the service repo

Repo contains `charts/<repo>/Chart.yaml` plus `values.yaml`, `values.qa.yaml`, `values.staging.yaml`, `values.production.yaml`, and `values.default.yaml`. The `Jenkinsfile` builds the Docker image; `deploy.groovy` registers per-env Jenkins jobs (`NGSS_<repo>_Deploy-{qa,staging,production}`) that point at the chart and a `K8S_NODE` agent label.

Examples: `~/code/services/twig-graph/`, `~/code/services/content-service/`, `~/code/host-apps/tsc-react/`.

Pattern 2: chart-only repos (`k8s-*` siblings)

Standalone repos that hold only `repo_chart.yaml` + `values.{default,qa,production}.yaml` and reference an image built in a sibling repo. The non-`k8s-*` sibling builds the Docker image; the `k8s-*` repo is the chart.

Pairs (live as of writing):

- `services/binumi-video-integration-service` ↔ `services/k8s-binumi-video-integration-service`
- `services/ts-rostering-clever-worker` ↔ `services/k8s-ts-rostering-clever-worker`
- `services/ts-rostering-or-worker` ↔ `services/k8s-ts-rostering-or-worker`
- `~/code/other/k8s-tap-fe-service`

When investigating one of these, you usually need both repos open: the sibling for code, the `k8s-*` repo for chart values.

Pattern 3: docker-only repos

Repo has a `Jenkinsfile` that does `docker build` and a `deploy.groovy` JobDSL file but no `charts/` directory of its own. The Jenkins job created by `deploy.groovy` plugs into the shared pipeline (`TwigWorld/docker-pipeline`), which materialises chart values at deploy time. Examples:

`assignment-center-service`, `digital-glossary-service`, `lesson-answers-service`, `standards-explorer-service`, `ts-site-navigation-service`, `user-subscription-service`, `favourites-service-otk2`, `binumi-video-integration-service` (the image-side of the pair), `toxy`.

For these, the chart and per-env values live in `TwigWorld/docker-pipeline` (a separate GitHub repo, not on disk by default). To get the chart for a docker-only service, fetch it from GitHub (see Lookup recipes below).

Repos that are not deployed to Kubernetes:

- `go-associated-standards-service`: abandoned Go rewrite (last commit ~2023). Has a `Dockerfile` but no `Jenkinsfile`, no chart, not deployed. Reference-only.
- `cachinator`: runs on an EC2 cron host, not in Kubernetes. `deploy.sh` rsyncs a Go binary to `/home/admin/cachinator-linux-amd64`. If a question mentions cachinator and Kubernetes in the same breath, the assumption is wrong.
- Everything under `~/code/mfes/`: not k8s-deployed. They publish to CDN. Do not document them here.

How `deploy.groovy` works (the JobDSL pattern)

Most NGSS service repos have either `jobs/deploy.groovy` or `deploy.groovy` at the root. It's a Jenkins JobDSL script that registers per-env pipeline jobs. Skim it to learn:

- `FAMILY`: `'NGSS'` for the new fleet, `'OSM'` for legacy.
- `RELEASE_NAME`: Helm release name (e.g. `cont-sync`, `qa-cont-sync`).
- `NAMESPACE`: usually the env name (`qa`, `staging`, `production`).
- `K8S_NODE`: the Jenkins agent label that selects the cluster (`K8S_QA` → `twig-dev`, `K8S_NGSS_PROD` → `twig-ngss`, etc.).
- `VALUES_FILE`: path to the env-specific values file inside the repo.
- The `cpsScm` block points at the shared pipeline repo: `git@github.com:TwigWorld/docker-pipeline.git`. That's where the actual `helm upgrade` logic lives.

You will NOT find `kubectl --context <foo>` or `helm upgrade -n <ns>` in the service repo's `Jenkinsfile`. That logic is in `docker-pipeline`. If you need to read it (e.g. to confirm an agent-label → context mapping), see Lookup recipes.

Lookup recipes

Pick the question, run the procedure. Resolve repo paths via `~/.claude/twig/repo-paths.json` exactly the way the `content-delivery` skill does; see that skill's "Resolving repo paths" section if a repo's local clone path isn't already known.

Q1: "Where does `<service>` run?"

1. Find the chart: `~/code/services/<service>/charts/<service>/`, `~/code/host-apps/<service>/charts/<service>/`, or one of the `k8s-*` paired repos. If none exist, it's a docker-only service; see Q4.
2. Read `Chart.yaml` and `values.yaml` for `image.repository`. The `ngss/` or `osm/` prefix tells you the family.
3. Map family and env to a cluster:
   - NGSS qa / staging → `twig-dev` (namespace = env name).
   - NGSS production → `twig-ngss` (namespace `production`).
   - OSM → `twig-internal` (namespace `production`) or an OSM regional cluster; read `deploy.groovy` for the agent label to disambiguate.
   - TAP → `twig-tap-us`.
4. Check `deploy.groovy` if the agent labels are non-standard. The `K8S_NODE` value is the source of truth.
5. For replica counts, resources, and ingress hosts, read `values.<env>.yaml` directly. Don't recall them; they change. Common keys: `replicaCount`, `autoscaling.minReplicas`/`maxReplicas`, `resources.requests`, `ingress.hosts`, `ambassador.prefix`.

Q2: "What's running in `<cluster>` right now?"

Try in this order; the first one that succeeds wins:
1. `kubectl` (most authoritative):
kubectl --context '<full-context-name>' get deploy -A \
-o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,REPLICAS:.spec.replicas,IMAGE:'.spec.template.spec.containers[0].image' \
--no-headers
For statefulsets / cronjobs / ingresses, repeat with get statefulset -A, get cronjob -A, get ingress -A. If you get ForbiddenException: No access, your SSO user doesn't have access to that cluster's account — fall through to step 2.
2. FluxCD gitops repo (for `twig-internal` and any cluster with FluxCD):

   - `~/code/other/infra/gitops/clusters/<cluster>/`: the cluster's Kustomization root.
   - `~/code/other/infra/gitops/apps/<cluster>/`: per-app HelmRelease / Kustomization manifests.
   - `~/code/other/infra/gitops/apps/common/`: shared workloads applied to every cluster (Karpenter, cert-manager, etc.).

3. Per-repo charts (last resort, for application services that don't live in gitops): walk `~/code/services/*/charts/*/values.<env>.yaml` and infer.

Q3: "What is `K8S_NGSS_PROD` / `K8S_QA` / `K8S_STAGING`?"

These are Jenkins agent labels, not kubectl contexts. The mapping lives in `TwigWorld/docker-pipeline` (Jenkins shared library on GitHub, not cloned by default). Conventional mapping based on the IRSA + service-family conventions:

- `K8S_QA` → `twig-dev` cluster, namespace `qa` (or whatever `NAMESPACE` is set to in `deploy.groovy`).
- `K8S_STAGING` → `twig-dev` cluster, namespace `staging`.
- `K8S_NGSS_PROD` → `twig-ngss` cluster, namespace `production`.
- OSM-family labels → `twig-internal` or OSM regional clusters; specifics vary by repo, so read `deploy.groovy`.

`K8S_NODE` is not always a `$K8S_*` variable. Some repos (especially host-apps like `tsc-react`) put a literal cluster-name string into `K8S_NODE` directly, e.g. `'k8sNode': 'twig-ngss-1'` in tsc-react's production env. When you see a value that doesn't start with `$K8S_`, treat it as a Jenkins agent label tied to that specific cluster (`twig-ngss-1` → the `twig-ngss` cluster). The cluster identity is in the substring; the `-1` is an agent-pool index.
The same deploy.groovy may register many envs beyond qa/staging/production. tsc-react, for example, registers five (qa, review, staging, production, translate). The NAMESPACE parameter defaults to envName so each becomes a namespace in its target cluster. When asked "where does X run?", read all of deploy.groovy's environments map — don't stop at the first three.
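
To make the conventional mapping concrete, here is a minimal sketch that turns a `K8S_NODE` value into a cluster name. `twig_cluster_for_node` is a hypothetical helper written for illustration; it encodes only the conventions stated above, and the authoritative mapping still lives in `TwigWorld/docker-pipeline`.

```shell
# Map a K8S_NODE value from deploy.groovy to a cluster name.
# Hypothetical helper: follows the conventional mapping only.
twig_cluster_for_node() {
  node=${1#\$}                               # accept both '$K8S_QA' and 'K8S_QA'
  case "$node" in
    K8S_QA|K8S_STAGING) echo "twig-dev" ;;
    K8S_NGSS_PROD)      echo "twig-ngss" ;;
    twig-*-[0-9]*)      echo "${node%-*}" ;; # literal label: drop the agent-pool index
    *)                  echo "unknown" ;;    # e.g. OSM labels: read deploy.groovy
  esac
}
```

An `unknown` result is deliberate: OSM labels vary by repo, so the sketch declines to guess rather than return a plausible but wrong cluster.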
To verify the mapping authoritatively, fetch the pipeline repo:
gh repo view TwigWorld/docker-pipeline
gh api repos/TwigWorld/docker-pipeline/contents/<path-to-agent-label-mapping> 2>/dev/null
Or clone it locally if you'll be debugging deploys regularly.

Q4: reading the chart for a docker-only service

The chart for docker-only services lives in `TwigWorld/docker-pipeline`, not the service repo. Two ways to read it:

1. Via the GitHub MCP tools:
   - `mcp__plugin_il_github__get_file_contents` on `TwigWorld/docker-pipeline` for the chart path. Common paths to try: `charts/`, `helm/`, `pipelines/<service>/`.
   - `mcp__plugin_il_github__search_code` with `repo:TwigWorld/docker-pipeline <service-name>` to locate per-service overrides.
2. Clone it locally: `git clone git@github.com:TwigWorld/docker-pipeline.git ~/code/other/docker-pipeline`. Subsequent reads use the local copy.

The `FAMILY` + service name in the service's own `deploy.groovy` is the join key.

Q5: which repo builds a given image

Given an image like `817276302724.dkr.ecr.eu-west-1.amazonaws.com/ngss/<x>` or `osm/<y>`:

- The repo is `~/code/services/<x>` (or `~/code/services/<y>`). NGSS-family images are eu-west-1 / `ngss/<repo>`; the repo basename matches `<repo>`.
- If the prefix is not an `ngss/` or `osm/` one (e.g. `infra/`, `quay.io/...`, `ghcr.io/...`), it's shared platform tooling, usually only deployed via `~/code/other/infra/gitops/`.

Q6: the repo has no `charts/` folder

Three possibilities, in likelihood order:

1. Docker-only service (chart in `docker-pipeline`); see Q4. Check for a `deploy.groovy` in the repo: if it exists and has `FAMILY = 'NGSS'`, this is almost certainly it.
2. Chart in a `k8s-*` sibling: search for `~/code/services/k8s-<repo>/` or grep other repos' chart values for the image name.
3. Not deployed: no `deploy.groovy`, no sibling. Confirm with the user.

Q7: "When was `<service>` last deployed?"

Three approaches, in increasing reliability:
1. The `Progressing` deployment condition (closest single-call answer):
kubectl --context '<ctx>' get deploy <name> -n <ns> \
-o jsonpath='rev={.metadata.annotations.deployment\.kubernetes\.io/revision} lastUpdate={.status.conditions[?(@.type=="Progressing")].lastUpdateTime}{"\n"}'
lastUpdateTime on the Progressing condition advances each time the deployment rolls out a new ReplicaSet, which is what a Helm-driven deploy does. Combined with the revision annotation, this gives you "rev N rolled out at TIME".
2. Pod `creationTimestamp` (fallback when conditions are pruned):
kubectl --context '<ctx>' get pods -n <ns> -l app.kubernetes.io/name=<name> \
-o jsonpath='{range .items[*]}{.metadata.creationTimestamp}{"\n"}{end}' | sort -u
A rollout creates new pods, so the oldest live pod's creationTimestamp is approximately the last deploy time (within a rollout window). Cheap, but lies if the deployment auto-restarted pods for non-deploy reasons (eviction, node rotation, HPA scale-out).
3. Helm history (most authoritative, full deploy ledger):
helm --kube-context '<ctx>' history <release> -n <ns>
Where <release> is the Helm release name. For NGSS-family charts the release name follows the deploy.groovy RELEASE_NAME pattern: <prefix><short-name> per env, e.g. qa-cont-sync, cont-sync (production), ts-content-service (production). If unsure, helm --kube-context '<ctx>' list -n <ns> enumerates them. Output gives revision, status (deployed / superseded / failed), chart version, app version, and UPDATED timestamp — the canonical "when was it deployed".
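
For the pod `creationTimestamp` fallback, the list of timestamps still has to be reduced to a single value. A minimal sketch, assuming the timestamps are the RFC 3339 UTC strings Kubernetes emits (which sort lexically in chronological order); `oldest_timestamp` is a hypothetical helper name:

```shell
# Read timestamps (one per line) on stdin and print the earliest.
# Relies on RFC 3339 UTC strings sorting lexically in time order.
oldest_timestamp() {
  sort | head -n 1
}
```

It would be used by piping the pod-listing command's output into it, in place of the trailing `sort -u`. The same caveat applies: pods restarted for non-deploy reasons skew the answer.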
Pitfalls:

- Don't use the Deployment's `metadata.creationTimestamp`: that's when the Deployment resource was first created (often years ago), not when it was last rolled out.
- `kubectl rollout history deploy/<name>` exists, but the `CHANGE-CAUSE` annotation is rarely populated by Helm, and the revision numbers don't carry timestamps. Prefer approach 1 or 3.

Gotchas

These are the things that catch people out:

- `twig-internal`'s `production` namespace is OSM, not NGSS. A deployment named `tocs-tocs` in `twig-internal/production` is the OSM TOCS instance (image `osm/tocs`), nothing to do with the NGSS `production` namespace in `twig-ngss`. Always check the cluster and the image registry path.
- The redirect namespaces in `twig-internal` (`clickview-*`, `tigtag-*`, `twig-prep-redirects`, `twigsciencereporter-redirect`, etc.) host static-redirect handlers for retired brands. They look like services. They aren't.
- `tocs` is OSM-family, not NGSS. Image is `osm/tocs`. The repo `~/code/services/tocs/` does NOT follow the NGSS conventions table; charts ship celery / heavy / tree / flower variants and deploy to `twig-internal/production`.
- `tocs` is one repo but eight deployments. Looking for a single `tocs` deployment in `twig-internal/production` will mislead: the chart fans out into `tocs-tocs` (main API), `tocs-tree`, `tocs-heavy` (separate web tiers serving different workload classes), `tocs-celery-0` through `tocs-celery-3` (four sharded celery workers), and `tocs-flower` (celery monitoring UI). All run the same `osm/tocs` image but with different command/args. When debugging a tocs request, identify which deployment served it by the ingress host (`tocs.twig-world.com` → `tocs-tocs`, `tree.twig.world` → `tocs-tree`, `heavy-tocs.twig-world.com` → `tocs-heavy`).
- `K8S_NGSS_PROD` is not a kubectl context. It's a Jenkins agent label. The actual kubectl context for the `twig-ngss` cluster is whatever `aws eks update-kubeconfig` generated locally (an ARN like `arn:aws:eks:us-west-2:785933498059:cluster/twig-ngss-7jrreD6K`, an alias, or a `<cluster>@<account>_<role>` string). Always check `kubectl config get-contexts` rather than assuming.
- Deploys use FluxCD on `twig-internal` (visible: `flux-system` namespace, helm-controller, kustomize-controller, source-controller) plus per-service Jenkins jobs for application services. If a question assumes ArgoCD, correct it.
- `deploy.groovy` says `K8S_NODE = '$K8S_NGSS_PROD'`. To read the live state, you need to translate that into the kubectl context; they are not the same string.
- `cachinator` is NOT k8s. Go binary on an EC2 cron host. Don't go looking for a chart.
- `twig-sandbox` (account 299035209821) is referenced in `infra/README.md` but isn't in the local kubectl context list by default. Don't assume it's decommissioned just because you can't reach it.
- The OSM regional clusters (`twig-osm-{us,uk,eu}`) hold most of the OSM application fleet. That fleet largely isn't in `~/code/services/`; it's in OSM-team repos. If a question is about a TigTag / Twig World production service that you can't find under `~/code/services/`, that's why.
- Image pulls are cross-region: even the `us-west-2` clusters pull from `817276302724.dkr.ecr.eu-west-1.amazonaws.com`. There is no per-region ECR replica.

This skill encodes patterns and conventions, not the live state. Trust ground truth over the skill when they disagree:

- Live state: `kubectl get` (when access permits) or the FluxCD gitops repo.
- Per-env configuration: `values.<env>.yaml` (or for docker-only services, `TwigWorld/docker-pipeline`).
- Cluster list: `~/code/other/infra/README.md` + `~/code/other/infra/terraform/stages/`.
- Agent-label mapping and deploy logic: `TwigWorld/docker-pipeline`.

If you discover the skill is wrong (a new cluster, a renamed service family, a changed convention), update the skill rather than working around it.

Related skills:

- `content-delivery`: which CMS service owns a GraphQL field / REST path. Use that skill for "where is field X served from" questions; this skill is purely about where the workload runs.
- `tocs-api`: TOCS REST contract. The `tocs` k8s deployment in `twig-internal/production` serves that API.
- `il:audit-deployment`, `il:manage-deployable`: IL platform deploy tooling. Not directly applicable to Twig (Twig doesn't use OTK or `il.yaml`), but worth knowing exists for cross-team work.