From truefoundry
Deploys applications to TrueFoundry. Handles single HTTP services, async/queue workers, multi-service projects, and declarative manifest apply. Supports `tfy apply`, `tfy deploy`, docker-compose translation, and CI/CD pipelines. Use when deploying apps, applying manifests, shipping services, or orchestrating multi-service deployments.
Slash command: `/truefoundry:deploy`
> Routing note: For ambiguous user intents, use the shared clarification templates in [references/intent-clarification.md](references/intent-clarification.md).
Route user intent to the right deployment workflow. Load only the references you need.
| User Intent | Action | Reference |
|---|---|---|
| "deploy", "deploy my app", "ship this" | Single HTTP service | deploy-service.md |
| "mount this file", "mount config file", "mount certificate file", "mount key file" | Single service with file mounts (no image rebuild) | deploy-service.md |
| "tfy apply", "apply manifest", "deploy from yaml" | Declarative manifest apply | deploy-apply.md |
| "deploy everything", "full stack", docker-compose, "docker-compose.yaml", "compose.yaml" | Multi-service: use compose as source of truth | deploy-multi.md + compose-translation.md |
| "async service", "queue consumer", "worker" | Async/queue service | deploy-async.md |
| "deploy LLM", "serve model" | Model serving intent (may be ambiguous) | Ask user: dedicated model serving (llm-deploy) or generic service deploy (deploy) |
| "deploy helm chart" | Helm chart intent | Confirm Helm path and collect chart details, then proceed with helm workflow |
| "deploy postgres docker", "dockerized postgres", "deploy redis docker", "database in docker/container" | Containerized database intent | Proceed with deploy workflow (do not route to Helm) |
| "deploy database", "deploy postgres", "deploy redis" | Ambiguous infra intent | Ask user: Helm chart (helm) or containerized service (deploy) |
Load only the reference file matching the user's intent. Do not preload all references.
When in doubt, ask. If any deployment parameter is ambiguous or missing — branch, workspace, image, port, resources, environment — ask the user rather than picking a value and proceeding silently. A wrong assumption can deploy to the wrong environment, from the wrong branch, or with the wrong configuration. The cost of one extra question is always lower than the cost of a bad deploy.
Examples of things to ask rather than assume:
Do NOT silently default to the current value of anything that could have changed or that the user has not explicitly confirmed for this deployment.
```bash
# 1. Check credentials
grep '^TFY_' .env 2>/dev/null || true
env | grep '^TFY_' 2>/dev/null || true

# 2. Derive TFY_HOST for CLI (MUST run before any tfy command)
export TFY_HOST="${TFY_HOST:-${TFY_BASE_URL%/}}"

# 3. Check CLI
tfy --version 2>/dev/null || echo "Install: pip install 'truefoundry==0.5.0'"

# 4. Check for existing manifests
ls tfy-manifest.yaml truefoundry.yaml 2>/dev/null
```
- `TFY_BASE_URL` and `TFY_API_KEY` must be set (env or `.env`).
- `TFY_HOST` must be set before any `tfy` CLI command. The export above handles this automatically.
- `TFY_WORKSPACE_FQN` is required. HARD RULE: Never auto-pick a workspace. Always ask the user to confirm, even if only one workspace exists or a preference is saved. See `references/prerequisites.md` for the full workspace confirmation flow.

> WARNING: Never use `source .env`. The `tfy-api.sh` script handles `.env` parsing automatically. For shell access: `grep KEY .env | cut -d= -f2-`
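
The `grep`/`cut` one-liner above can be wrapped in a small helper so a single value is read without sourcing the file. This is a minimal sketch: `env_value` is a hypothetical name, and it assumes plain `KEY=value` lines with no quoting or `export` prefix.

```bash
# Hypothetical helper: print the value of one key from a .env-style file
# without sourcing it. Assumes unquoted KEY=value lines.
env_value() {
  local key="$1" file="${2:-.env}"
  # Anchor on "KEY=" so keys that merely contain the name do not match,
  # and keep everything after the first "=" (values may contain "=").
  grep "^${key}=" "$file" 2>/dev/null | head -n 1 | cut -d= -f2-
}
```

Usage: `TFY_BASE_URL="$(env_value TFY_BASE_URL)"` reads the value from `.env` in the current directory; an empty result means the key is not set there.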
`tfy apply` vs `tfy deploy`

> HARD RULE: `tfy apply` does NOT support `build_source.type: local`. If the manifest has a local build source, you MUST use `tfy deploy -f <manifest>`. Using `tfy apply` with a local build source will fail with: `must match exactly one schema in oneOf`.
| Scenario | Command | Works? |
|---|---|---|
| Pre-built image (`image.type: image`) | `tfy apply -f manifest.yaml` | Yes |
| `build_source.type: git` | `tfy apply -f manifest.yaml` | Yes |
| `build_source.type: git` | `tfy deploy -f manifest.yaml` | Yes |
| `build_source.type: local` | `tfy deploy -f manifest.yaml` | Yes |
| `build_source.type: local` | `tfy apply -f manifest.yaml` | NO: will fail |
Before running any deploy command, check the manifest:
- `build_source.type: local` → use `tfy deploy -f`
- Anything else → `tfy apply -f` is fine

Before attempting any deploy/apply, run these checks. Fix issues before deploying; do not deploy a known-bad manifest.
`host`: If any port has `expose: true`, it must have a `host` field. Deploying without it will fail with: `Host must be provided to expose port`.
Auto-generate the host if missing:
```bash
TFY_API_SH=~/.claude/skills/truefoundry-deploy/scripts/tfy-api.sh

# Get cluster ID from workspace FQN (format: cluster-id:workspace-name)
CLUSTER_ID=$(echo "$TFY_WORKSPACE_FQN" | cut -d: -f1)

# Discover base domain from cluster manifest
bash "$TFY_API_SH" GET "/api/svc/v1/clusters/$CLUSTER_ID"
# → Response is at data.manifest.base_domains[] (array of strings)
# → Look for wildcard entry (e.g., "*.ml.example.truefoundry.cloud")
# → Strip "*." to get base domain: "ml.example.truefoundry.cloud"
# → Construct host: "{service-name}-{workspace-name}.{base_domain}"
```

Pattern: `{service-name}-{workspace-name}.{base_domain}`
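
The string handling in the last three steps can be sketched as a small function. `build_host` is a hypothetical helper, not part of the skill's scripts; it assumes the workspace FQN has the `cluster-id:workspace-name` form described above and that the wildcard entry has already been picked out of `base_domains[]`.

```bash
# Hypothetical helper: build the host for an exposed port.
# Arguments: service name, workspace FQN, wildcard base-domain entry.
build_host() {
  local service_name="$1" workspace_fqn="$2" wildcard_domain="$3"
  local workspace_name="${workspace_fqn#*:}"   # drop the "cluster-id:" prefix
  local base_domain="${wildcard_domain#\*.}"   # strip a leading "*." if present
  printf '%s-%s.%s\n' "$service_name" "$workspace_name" "$base_domain"
}
```

For example, `build_host my-service my-cluster:my-workspace '*.ml.example.truefoundry.cloud'` prints `my-service-my-workspace.ml.example.truefoundry.cloud`.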
`tfy deploy`: If the manifest contains `build_source.type: local`, ensure the deploy command is `tfy deploy -f`, NOT `tfy apply`.

`capacity_type` compatibility: `spot_fallback_on_demand` is not supported on all clusters. If you're unsure, use `on_demand` or omit `capacity_type` entirely to let the platform decide. Valid safe values: `on_demand`, `spot`.

`build_spec.type` must be exact: Only `dockerfile` and `tfy-python-buildpack` are valid. Do NOT use `docker`, `build`, `python`, or any other value.
If an existing manifest has build_source.type: git with a branch_name set, compare it to the current local branch before deploying:
```bash
# Use only the specific manifest file for this deployment (not both at once)
# Use -h to suppress the filename prefix so the bare value can be compared
grep -h 'branch_name:' "$MANIFEST_FILE" 2>/dev/null | head -1 | sed 's/.*branch_name:[[:space:]]*//'

# Get current local branch
git branch --show-current 2>/dev/null
```
If the branches differ, stop and ask the user:
> The manifest specifies `branch_name: {manifest_branch}`, but your current local branch is `{current_branch}`. Which branch should be deployed?
> - Keep manifest branch: `{manifest_branch}` (deploy as-is, no manifest change)
> - Use current branch: `{current_branch}` (update `branch_name` in the manifest)
Never silently override the manifest's branch_name with the current local branch.
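
The comparison itself can be sketched as two small functions. Both names are hypothetical, and the parsing assumes an unquoted `branch_name:` value on a single line, as in the `grep`/`sed` pipeline above.

```bash
# Hypothetical helper: print the first branch_name value in a manifest
# (prints nothing when the field is absent).
manifest_branch() {
  grep -h 'branch_name:' "$1" 2>/dev/null | head -n 1 | sed 's/.*branch_name:[[:space:]]*//'
}

# Hypothetical helper: a non-zero status means "stop and ask the user".
check_branch() {
  local from_manifest="$1" current_branch="$2"
  if [ -n "$from_manifest" ] && [ "$from_manifest" != "$current_branch" ]; then
    echo "MISMATCH: manifest=$from_manifest local=$current_branch"
    return 1
  fi
  echo "OK"
}
```

Usage: `check_branch "$(manifest_branch "$MANIFEST_FILE")" "$(git branch --show-current)"`. On a mismatch the function only reports it; choosing the branch stays with the user.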
```bash
# tfy CLI expects TFY_HOST when TFY_API_KEY is set
export TFY_HOST="${TFY_HOST:-${TFY_BASE_URL%/}}"

# Preview changes
tfy apply -f tfy-manifest.yaml --dry-run --show-diff

# Apply
tfy apply -f tfy-manifest.yaml
```

```bash
# tfy CLI expects TFY_HOST when TFY_API_KEY is set
export TFY_HOST="${TFY_HOST:-${TFY_BASE_URL%/}}"

# MUST use tfy deploy (not tfy apply) for local builds
tfy deploy -f truefoundry.yaml --no-wait
```

> Reminder: `tfy apply` does NOT support `build_source.type: local`. Use `tfy deploy -f` for local builds.
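
The apply-versus-deploy rule can be checked mechanically before running anything. The sketch below uses a hypothetical `choose_tfy_command` function that does a line-based scan rather than real YAML parsing. It assumes `type:` is written unquoted on its own line somewhere under `build_source:`.

```bash
# Hypothetical helper: print the tfy command that suits a manifest file.
choose_tfy_command() {
  local manifest="$1"
  # Find a "build_source:" key, then inspect the first "type:" line after it.
  if awk '
      /^[[:space:]]*build_source:/ { in_src = 1; next }
      in_src && /^[[:space:]]*type:/ { if ($2 == "local") found = 1; in_src = 0 }
      END { exit (found ? 0 : 1) }
    ' "$manifest"; then
    echo "tfy deploy -f $manifest"   # local build source: apply would fail
  else
    echo "tfy apply -f $manifest"    # pre-built image or git build source
  fi
}
```

The function only prints the command, so the result can be shown to the user for confirmation before anything is deployed.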
```yaml
name: my-service
type: service
image:
  type: image
  image_uri: docker.io/myorg/my-api:v1.0
ports:
  - port: 8000
    expose: false # Set true + add host for public access
    app_protocol: http
resources:
  cpu_request: 0.5
  cpu_limit: 1
  memory_request: 512
  memory_limit: 1024
  ephemeral_storage_request: 1000
  ephemeral_storage_limit: 2000
env:
  LOG_LEVEL: info
replicas: 1
workspace_fqn: "WORKSPACE_FQN_HERE"
```
```yaml
ports:
  - port: 8000
    expose: true
    host: my-service-my-workspace.ml.your-org.truefoundry.cloud # Auto-generate from cluster discovery
    app_protocol: http
```

> Host is REQUIRED when `expose: true`. Auto-generate it: `{service-name}-{workspace-name}.{base_domain}`. Get `base_domain` from cluster discovery (see `cluster-discovery.md`).
```bash
TFY_API_SH=~/.claude/skills/truefoundry-deploy/scripts/tfy-api.sh
bash "$TFY_API_SH" GET '/api/svc/v1/apps?workspaceFqn=WORKSPACE_FQN&applicationName=SERVICE_NAME'
```

Or use the `applications` skill.
> HARD RULE: After every successful `tfy apply` or `tfy deploy` command, you MUST monitor the deployment to completion. Do NOT stop after the apply/deploy command returns. Do NOT ask the user "should I monitor?"; just do it. Do NOT say "you can check the status"; YOU check the status. The deployment is not done until you confirm a terminal state.
Immediately after deploy/apply succeeds, start polling. Do not wait for the user to ask.
Poll loop — execute this yourself, do not delegate to the user:
```bash
TFY_API_SH=~/.claude/skills/truefoundry-deploy/scripts/tfy-api.sh

# Run this in a loop with sleep between checks:
# Every 15s for first 2 min, every 30s for min 2-5, every 60s after that
# Timeout after 10 minutes
bash "$TFY_API_SH" GET '/api/svc/v1/apps?workspaceFqn=WORKSPACE_FQN&applicationName=SERVICE_NAME'
```

Or use MCP tool call if available:

```
tfy_applications_list(filters={"workspace_fqn": "WORKSPACE_FQN", "application_name": "SERVICE_NAME"})
```
How to check: The status is at `data[0].deployment.currentStatus` in the response. Use `state.isTerminalState` as the authoritative check.

Terminal states (`state.isTerminalState === true`): stop polling.
- `DEPLOY_SUCCESS` → report success, replicas, endpoint URL
- `BUILD_FAILED`, `DEPLOY_FAILED`, `FAILED` → fetch logs, diagnose, suggest fix (see below)
- `PAUSED` → report paused
- `CANCELLED` → report cancelled

Non-terminal states: keep polling, report progress each time.
- `INITIALIZED` → "Deployment initialized, waiting..."
- `BUILDING` (status) or transition `BUILDING` → "Build in progress..."
- `BUILD_SUCCESS` → "Build succeeded, deploying..."
- `ROLLOUT_STARTED` or transition `DEPLOYING` → "Deploying (X/Y replicas ready)..."
- `DEPLOY_FAILED_WITH_RETRY` → "Deploy failed, retrying..."

On success, if the service has a public endpoint, check that it responds:

```bash
curl -sf -o /dev/null -w '%{http_code}' "https://ENDPOINT_URL" || true
```

On failure, fetch logs with the `logs` skill or the direct API.

On timeout, report the current state and elapsed time. Do NOT silently give up; tell the user:

```
Monitoring timed out after 10 minutes. Current status: ROLLOUT_STARTED (transition: DEPLOYING).
The deployment is still in progress. You can re-run monitoring or check the TrueFoundry dashboard.
```
> NEVER end your response after a deploy/apply command without reporting a terminal deployment status (`state.isTerminalState === true`). If you are about to end your response and you have not confirmed `DEPLOY_SUCCESS`, `DEPLOY_FAILED`, `BUILD_FAILED`, `FAILED`, `PAUSED`, or `CANCELLED`, you are violating this rule: go back and poll.
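
The polling schedule and the stop condition can be sketched as a loop. All function names here are hypothetical. The status source is passed in as a command name so the loop stays independent of how the status is fetched (`tfy-api.sh`, MCP, or anything else). Matching on status names is a fallback for illustration; `state.isTerminalState` from the API response remains the authoritative check.

```bash
# Seconds to wait before the next check, given the seconds elapsed so far:
# 15s during the first 2 minutes, 30s until minute 5, 60s after that.
poll_interval() {
  local elapsed="$1"
  if   [ "$elapsed" -lt 120 ]; then echo 15
  elif [ "$elapsed" -lt 300 ]; then echo 30
  else echo 60
  fi
}

# Fallback terminal check by status name (assumes the six names listed above).
is_terminal_status() {
  case "$1" in
    DEPLOY_SUCCESS|DEPLOY_FAILED|BUILD_FAILED|FAILED|PAUSED|CANCELLED) return 0 ;;
    *) return 1 ;;
  esac
}

# Poll until a terminal status or the 10-minute timeout.
# "$1" is a hypothetical command that prints the current status name.
monitor_deployment() {
  local get_status="$1" elapsed=0 current="" delay
  while [ "$elapsed" -lt 600 ]; do
    current="$("$get_status")"
    if is_terminal_status "$current"; then
      echo "terminal: $current after ${elapsed}s"
      return 0
    fi
    echo "still $current after ${elapsed}s"
    delay="$(poll_interval "$elapsed")"
    sleep "$delay"
    elapsed=$((elapsed + delay))
  done
  echo "timed out after ${elapsed}s, last status: $current"
  return 1
}
```

A non-zero return from `monitor_deployment` corresponds to the timeout case, where the current state and elapsed time are reported to the user rather than dropped.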
> After deployment succeeds (`DEPLOY_SUCCESS`), ask the user about the following configuration options. Do not silently skip these; present them as a checklist and let the user decide.
Ask the user:
```
Your service is deployed. How should it be accessed?
1. **Public URL** — Accessible from the internet (expose: true with a host)
2. **Private/Internal only** — Only accessible within the cluster (expose: false)
```

If the user picks public and the port doesn't already have `expose: true` + `host`, update the manifest and redeploy.
Ask the user:
```
Do you want to add authentication to your service?
1. **No auth** — Anyone with the URL can access it
2. **TrueFoundry login** — Users must log in via TrueFoundry (truefoundry_oauth)
3. **JWT auth** — Verify JWT tokens from a custom identity provider
4. **Basic auth** — Username/password protection
```
If the user picks an auth option, add the appropriate auth block to the port configuration and redeploy.
Ask the user:
```
Should the service auto-shutdown when idle?
1. **Always running** — Keep replicas up at all times (default)
2. **Auto-shutdown after idle** — Scale to zero after no requests for a period (saves cost)
   → Recommended wait_time: 900 seconds (15 min) for dev, longer for staging
```

If the user picks auto-shutdown, add the `auto_shutdown` block to the manifest:

```yaml
auto_shutdown:
  wait_time: 900 # seconds of inactivity before scaling to zero
```
Skip these prompts if the user explicitly said they don't want changes, or if this is a redeploy of an existing service that already has these configured.
See references/cli-fallback.md for converting YAML to JSON and deploying via tfy-api.sh.
Before creating any manifest, scan the project:
- Check for `docker-compose.yml`, `docker-compose.yaml`, or `compose.yaml` first. If present (or the user mentions docker-compose), treat it as the primary source of truth: load `deploy-multi.md` and `compose-translation.md`, generate manifests from the compose file, wire services per `service-wiring.md`, then complete deployment. Do not ask the user to manually create manifests when a compose file exists.
- Look for `Dockerfile` files across the project.
- Look for service directories: `services/`, `apps/`, `frontend/`, `backend/`.

Then load the matching reference:
- Compose file present → `deploy-multi.md` + `compose-translation.md`
- Single service → `references/deploy-service.md`
- Multiple services → `references/deploy-multi.md`

> HARD RULE: When deploying multiple services, you MUST deploy in dependency order, create secrets between tiers, and wire services before deploying dependents. Never deploy all services at once.
Tier-by-tier flow:
```
TIER 0: Infrastructure (DB, Cache, Queue) → deploy → wait for pods ready → create TFY secrets
TIER 1: Backend (APIs, workers) → deploy with secrets + DNS wiring → verify connectivity
TIER 2: Frontend / gateway → deploy with backend URLs → verify end-to-end
```
Key rules:
- `DEPLOY_SUCCESS` does NOT mean Helm pods are ready; poll actual readiness.

For step-by-step orchestration, examples, and common patterns, see `deploy-ordering.md`. For dependency graphs, DNS wiring, and compose translation, see `deploy-multi.md`, `service-wiring.md`, and `dependency-graph.md`.
> HARD RULE: NEVER put sensitive values directly in the manifest `env` block. ALWAYS create a TrueFoundry secret group first, then reference the secrets using `tfy-secret://` format. This is non-negotiable, even for "quick" or "test" deployments.
Workflow for any env var that looks sensitive (matches `*PASSWORD*`, `*SECRET*`, `*TOKEN*`, `*KEY*`, `*API_KEY*`, `*DATABASE_URL*`, `*CONNECTION_STRING*`, `*CREDENTIALS*`, or any value the user explicitly says is sensitive):

1. Create a secret group with the `secrets` skill:

   ```
   # Use the secrets skill to create a group with the sensitive keys
   # The skill will handle creating the group and individual secrets
   ```

2. Reference the secrets in the manifest using `tfy-secret://` format:

   ```yaml
   env:
     LOG_LEVEL: info # plain text OK
     DB_PASSWORD: tfy-secret://my-org:my-service-secrets:DB_PASSWORD # sensitive: ALWAYS use tfy-secret://
     API_KEY: tfy-secret://my-org:my-service-secrets:API_KEY # sensitive: ALWAYS use tfy-secret://
   ```

Pattern: `tfy-secret://<TENANT_NAME>:<SECRET_GROUP_NAME>:<SECRET_KEY>` where `TENANT_NAME` is the subdomain of `TFY_BASE_URL`.
If the user provides a raw secret value in the manifest or asks you to put it directly in env:
Use the secrets skill for guided secret group creation. For the full workflow, see references/deploy-service.md (Secrets Handling section).
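
The name matching and the reference format can be sketched in shell. The three functions below are hypothetical helpers. The matching is case-sensitive, and "subdomain of `TFY_BASE_URL`" is read here as the first DNS label of the host, which is an assumption.

```bash
# Succeeds when an env var name matches one of the sensitive patterns.
# (*API_KEY* is already covered by *KEY*.)
is_sensitive_key() {
  case "$1" in
    *PASSWORD*|*SECRET*|*TOKEN*|*KEY*|*DATABASE_URL*|*CONNECTION_STRING*|*CREDENTIALS*) return 0 ;;
    *) return 1 ;;
  esac
}

# Derive the tenant name from a base URL (assumption: first DNS label of the host).
tenant_from_base_url() {
  local host="${1#*://}"   # drop the scheme
  host="${host%%/*}"       # drop any path
  echo "${host%%.*}"       # keep the first label
}

# Build a reference in the tfy-secret://<tenant>:<group>:<key> form.
secret_ref() {
  printf 'tfy-secret://%s:%s:%s\n' "$1" "$2" "$3"
}
```

For example, `secret_ref "$(tenant_from_base_url "$TFY_BASE_URL")" my-service-secrets DB_PASSWORD` yields the value to place in the manifest `env` block once the secret group exists.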
When users ask to mount files into a deployment, prefer manifest mounts over Dockerfile edits:
- `type: secret` for sensitive file content (keys, certs, credentials)
- `type: config_map` for non-sensitive config files
- `type: volume` for writable/shared runtime data

See `references/deploy-service.md` (File Mounts section) for the end-to-end workflow.
These references are available for all workflows — load as needed:
| Reference | Contents |
|---|---|
| manifest-schema.md | Complete YAML field reference (single source of truth) |
| manifest-defaults.md | Per-service-type defaults with YAML templates |
| cli-fallback.md | CLI detection and REST API fallback pattern |
| cluster-discovery.md | Extract cluster ID, base domains, available GPUs |
| resource-estimation.md | CPU, memory, GPU sizing rules of thumb |
| health-probes.md | Startup, readiness, liveness probe configuration |
| gpu-reference.md | GPU types and VRAM reference |
| container-versions.md | Pinned container image versions |
| prerequisites.md | Credential setup and .env configuration |
| rest-api-manifest.md | Full REST API manifest reference |
| Reference | Used By |
|---|---|
| deploy-api-examples.md | deploy-service |
| deploy-errors.md | deploy-service |
| deploy-scaling.md | deploy-service |
| load-analysis-questions.md | deploy-service |
| codebase-analysis.md | deploy-service |
| tfy-apply-cicd.md | deploy-apply |
| tfy-apply-extra-manifests.md | deploy-apply |
| deploy-ordering.md | deploy-multi (tier-by-tier orchestration) |
| compose-translation.md | deploy-multi |
| dependency-graph.md | deploy-multi |
| multi-service-errors.md | deploy-multi |
| multi-service-patterns.md | deploy-multi |
| service-wiring.md | deploy-multi |
| deploy-debugging.md | All deploy/apply (when status is failed) |
| async-errors.md | deploy-async |
| async-queue-configs.md | deploy-async |
| async-python-library.md | deploy-async |
| async-sidecar-deploy.md | deploy-async |
Related skills:
- `workspaces` skill
- `monitor` skill to track deployment progress
- `applications` skill
- `logs` skill
- `secrets` skill
- `helm` skill
- `llm-deploy` skill
- `service-test` skill