Deploys to archivist@194.163.189.144 VPS and iteratively debugs until successful. Use when deploying to VPS, debugging deployment failures, investigating container issues, checking health endpoints, or fixing OtterStack errors. Triggers on "deploy to vps", "debug deployment", or "fix container failure".
npx claudepluginhub jayteealao/agent-skills --plugin daily-carry

This skill uses the workspace's default tool permissions.
Deploy to the production VPS (archivist@194.163.189.144) and iteratively debug failures until successful deployment.
This skill connects to the specific VPS, triggers OtterStack deployments, monitors output for failures, diagnoses root causes, applies fixes, and redeploys until the application is successfully running and healthy.
This skill can be used in two modes:
Invoked by the /deploy-otterstack command during Phase 6: Debug Loop

When invoked from the /deploy-otterstack command, this skill operates in iterative debug mode:
Workflow:
Context Passed from Orchestration:
- ERROR_MESSAGE - Full error output from the failed deployment
- DEPLOYMENT_STAGE - Which stage failed (validation, startup, health_check, lock_conflict, traefik, unknown)
- PROJECT_NAME - Name of the project being deployed
- ATTEMPT_NUMBER - Current debug iteration (1-6)
- DEPLOYMENT_TARGET - Always "vps" when this skill is invoked

Fix Determination:
Based on DEPLOYMENT_STAGE, the skill uses the corresponding decision tree section below:
Fix Application:
| Fix Type | Fixable Via | Orchestration Action |
|---|---|---|
| Missing env vars | SSH command | Auto-fix: ssh ... otterstack env set PROJECT VAR value |
| Stale lock file | SSH command | Auto-fix: ssh ... rm ~/.otterstack/locks/PROJECT.lock |
| Compose file issues | Local edit | Prompt user: "Edit compose file, then press Enter to retry" |
| Dockerfile issues | Local edit | Prompt user: "Edit Dockerfile, commit, push, then press Enter" |
| Code issues | Local edit | Prompt user: "Fix code, commit, push, then press Enter" |
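The dispatch logic in the table above can be sketched as a small case statement. This is a minimal sketch; the `handle_fix` name and the action strings are illustrative, not part of OtterStack:

```shell
# Hypothetical dispatcher mapping a fix type to its orchestration action.
# "auto" fixes run over SSH; "manual" fixes pause for a local edit.
handle_fix() {
  local fix_type="$1"
  case "$fix_type" in
    environment_variable) echo "auto:env-set" ;;
    lock_file)            echo "auto:remove-lock" ;;
    compose_file|dockerfile|code)
                          echo "manual:prompt-user" ;;
    *)                    echo "unknown" ;;
  esac
}
```

The orchestration command can branch on the `auto:`/`manual:` prefix to decide whether to run an SSH command or prompt the user.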
Return Values:
After each debugging iteration, provide guidance for retry:
Fix applied: [yes/no/partial]
Fix type: [environment_variable/lock_file/compose_file/dockerfile/code/unknown]
Confidence: [high/medium/low]
Retry recommended: [yes/no]
Notes: [Brief description of what was fixed]
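A minimal helper that emits this report in the format above (the `debug_report` name is illustrative):

```shell
# Hypothetical formatter for the per-iteration debug report.
debug_report() {
  local applied="$1" type="$2" confidence="$3" retry="$4" notes="$5"
  printf 'Fix applied: %s\nFix type: %s\nConfidence: %s\nRetry recommended: %s\nNotes: %s\n' \
    "$applied" "$type" "$confidence" "$retry" "$notes"
}
```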
| Aspect | Standalone Mode | Orchestrated Mode |
|---|---|---|
| Invocation | User runs skill directly | Called by /deploy-otterstack |
| Error Capture | Manual observation | Automatic from deployment output |
| Error Analysis | Manual diagnosis using decision tree | Automatic using DEPLOYMENT_STAGE |
| Fix Application | User decides when to fix | Semi-automated with prompts |
| Retry Logic | User manually redeploys | Automatic retry after fix |
| User Interaction | High - user drives process | Low - command drives, prompts for manual fixes |
| Iteration Limit | None - user decides | 6 attempts (configurable) |
| Context Awareness | User provides context | Full context from preparation phase |
The orchestration command integrates this skill as follows:
Before Invocation:
# Parse deployment error
ERROR_MESSAGE=$(cat deployment_output.txt)
DEPLOYMENT_STAGE=$(parse_error_stage "$ERROR_MESSAGE")
# Invoke debug skill with context
# The skill analyzes the error and provides fix guidance
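`parse_error_stage` is not shown above; a minimal sketch, assuming the error signatures listed in the debugging sections below (Traefik failures usually surface only as bad HTTP responses, so they fall through to `unknown` here):

```shell
# Hypothetical implementation of parse_error_stage: classify raw deployment
# output into a DEPLOYMENT_STAGE value by matching known error signatures.
parse_error_stage() {
  local error="$1"
  case "$error" in
    *"is not set"*|*"is undefined"*) echo "validation" ;;
    *"exited with code"*)            echo "startup" ;;
    *"is unhealthy"*)                echo "health_check" ;;
    *"deployment in progress"*)      echo "lock_conflict" ;;
    *)                               echo "unknown" ;;
  esac
}
```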
During Debugging:
# For auto-fixable issues (env vars, locks):
if [[ $FIX_TYPE == "environment_variable" ]]; then
ssh archivist@194.163.189.144 "${OTTERSTACK_PATH} env set ${PROJECT_NAME} ${VAR_NAME} '${VAR_VALUE}'"
fi
if [[ $FIX_TYPE == "lock_file" ]]; then
ssh archivist@194.163.189.144 "rm ~/.otterstack/locks/${PROJECT_NAME}.lock"
fi
# For manual fixes (compose, Dockerfile, code):
echo "Fix required: ${FIX_DESCRIPTION}"
read -p "Press Enter after fixing manually, or 'abort' to cancel: " response
[[ $response == "abort" ]] && exit 1
After Fix:
# Retry deployment
ssh archivist@194.163.189.144 "~/OtterStack/otterstack deploy ${PROJECT_NAME} -v"
# If successful: proceed to Phase 7: Verification
# If failed: increment attempt counter, continue debug loop
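The attempt-counting loop can be sketched as follows. The deploy step is injected as a command so the loop logic can be exercised without a live VPS; in the real flow it would be the `ssh ... otterstack deploy` invocation above:

```shell
# Sketch of the Phase 6 debug loop: retry the deploy up to MAX_ATTEMPTS times.
MAX_ATTEMPTS=6

debug_loop() {
  local deploy_cmd="$1" attempt=1
  while (( attempt <= MAX_ATTEMPTS )); do
    if $deploy_cmd; then
      echo "deployed after ${attempt} attempt(s)"
      return 0
    fi
    # In the real flow: analyze the error, apply a fix (auto or prompted), retry.
    (( attempt++ ))
  done
  echo "giving up after ${MAX_ATTEMPTS} attempts"
  return 1
}
```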
# SSH Connection
SSH_HOST="archivist@194.163.189.144"
SSH_USER="archivist"
OTTERSTACK_PATH="~/OtterStack/otterstack"
# Verify connection
ssh ${SSH_HOST} "echo 'Connection successful'"
# 1. Verify OtterStack is available
ssh ${SSH_HOST} "${OTTERSTACK_PATH} --help"
# 2. Check project exists
ssh ${SSH_HOST} "${OTTERSTACK_PATH} status <project-name>"
# 3. Deploy with verbose output
ssh ${SSH_HOST} "${OTTERSTACK_PATH} deploy <project-name> -v"
# 4. If it succeeds, verify endpoints. If it fails, proceed to debugging.
Before deploying, verify the environment is ready:
# Check SSH access
ssh ${SSH_HOST} "echo 'SSH OK'"
# Check OtterStack installation
ssh ${SSH_HOST} "${OTTERSTACK_PATH} --version" || \
ssh ${SSH_HOST} "ls -l ~/OtterStack/otterstack"
# List existing projects
ssh ${SSH_HOST} "${OTTERSTACK_PATH} project list"
# Check current deployment status
ssh ${SSH_HOST} "${OTTERSTACK_PATH} status <project-name>"
Deploy with verbose output to see all stages:
ssh ${SSH_HOST} "${OTTERSTACK_PATH} deploy <project-name> -v"
Watch the output for these sequential stages:

- Validation (compose config and environment variables)
- Container startup
- Health checks
- Traefik routing

If any stage fails, proceed to the corresponding debugging section below.
Symptoms:
- variable MYVAR is not set
- services.web.image is undefined

Diagnosis Commands:
# View full validation output
ssh ${SSH_HOST} "cd ~/.otterstack/repos/<project> && docker compose config"
# Check which env vars are set
ssh ${SSH_HOST} "${OTTERSTACK_PATH} env list <project-name>"
# View the env file being used
ssh ${SSH_HOST} "cat ~/.otterstack/envfiles/<project-name>.env"
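To compare what the compose file requires against what `env list` reports, the required names can be extracted mechanically. A minimal sketch (the `required_vars` name is illustrative, and it assumes the conventional `${UPPER_SNAKE}` interpolation syntax):

```shell
# List every ${VAR} reference in a compose file, deduplicated and sorted,
# so it can be diffed against the variables actually set for the project.
required_vars() {
  grep -oE '\$\{[A-Z_][A-Z0-9_]*\}' "$1" | tr -d '${}' | sort -u
}
```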
Common Causes:
Fix:
# Add missing environment variables
ssh ${SSH_HOST} "${OTTERSTACK_PATH} env set <project> VAR value"
# For invalid syntax: fix compose file locally, commit, push, redeploy
# Verify fix
ssh ${SSH_HOST} "${OTTERSTACK_PATH} deploy <project-name> -v"
Symptoms:
container exited with code 1

Diagnosis Commands:
# OtterStack automatically shows last 50 lines on failure
# For more context:
ssh ${SSH_HOST} "docker compose -p <project>-<sha> logs --tail=100"
# Check container status
ssh ${SSH_HOST} "docker ps -a --filter name=<project>"
# Inspect specific container
ssh ${SSH_HOST} "docker logs <container-name> --tail=50"
Common Causes & Fixes:
Error: ENOENT: no such file or directory
Diagnosis:
# Check if files exist in container
ssh ${SSH_HOST} "docker exec <container> ls -la /app"
Fix: Update Dockerfile COPY paths, commit, push, redeploy.
Error: EACCES: permission denied
Diagnosis:
# Check file/directory ownership
ssh ${SSH_HOST} "docker exec <container> ls -la /app/data"
Fix: Update Dockerfile to set correct ownership:
RUN mkdir -p /app/data && chown -R app:app /app/data
USER app
Error: Error: Could not locate the bindings file
Fix: Add rebuild step to Dockerfile:
RUN npm rebuild better-sqlite3 # Or other native module
Error: no such table: sessions or relation does not exist
Diagnosis:
# Check if migrations ran
ssh ${SSH_HOST} "docker logs <container> | grep -A5 'migration'"
Fix: Ensure migration files are copied to correct path in Dockerfile.
Error: connection refused or ECONNREFUSED
Diagnosis:
# Check if database service is running
ssh ${SSH_HOST} "docker ps | grep database"
# Test connection from container
ssh ${SSH_HOST} "docker exec <container> curl database:5432"
Fix: Verify database service exists in compose file and is healthy.
Symptoms:
container myapp-web-1 is unhealthy

Diagnosis Commands:
# Check container status
ssh ${SSH_HOST} "docker ps --format 'table {{.Names}}\t{{.Status}}'"
# View recent logs
ssh ${SSH_HOST} "docker logs <container> --tail=50"
# Test health check manually
ssh ${SSH_HOST} "docker exec <container> curl -f http://127.0.0.1:8080/health"
# Check what health check is defined
ssh ${SSH_HOST} "docker inspect <container> | grep -A10 Healthcheck"
Common Causes & Fixes:
Error: Health check returns 404
Diagnosis: Check application logs for available routes
Fix: Update health check in compose file to correct endpoint:
healthcheck:
test: ["CMD", "curl", "-f", "http://127.0.0.1:8080/healthz"] # Correct path
Error: connection refused when testing health check
Diagnosis:
# Check if localhost resolves to IPv6
ssh ${SSH_HOST} "docker exec <container> ping -c1 localhost"
# Test with explicit IPv4
ssh ${SSH_HOST} "docker exec <container> curl http://127.0.0.1:8080/health"
Fix: Use 127.0.0.1 instead of localhost in health check:
healthcheck:
test: ["CMD", "curl", "-f", "http://127.0.0.1:8080/health"]
Error: Timeout before app is actually ready
Diagnosis: Check how long app takes to start from logs
Fix: Increase start_period in health check:
healthcheck:
test: ["CMD", "curl", "-f", "http://127.0.0.1:8080/health"]
interval: 10s
timeout: 3s
retries: 3
start_period: 60s # Increased from 30s
Error: Connection refused on health check port
Diagnosis:
# Check what ports app is listening on
ssh ${SSH_HOST} "docker exec <container> netstat -tlnp"
Fix: Update health check to use correct port.
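The `start_period` fix above buys a slow-starting app time before health probes count against it. The same idea can be tested from the shell with a generic polling helper (a sketch; `wait_until_ready` and the timings are placeholders, and the probe would typically be the `curl -f .../health` command shown earlier):

```shell
# Poll a readiness command until it succeeds or a deadline passes.
wait_until_ready() {
  local probe="$1" timeout="${2:-60}" interval="${3:-1}" waited=0
  until $probe; do
    (( waited += interval ))
    if (( waited >= timeout )); then
      echo "timed out after ${timeout}s"
      return 1
    fi
    sleep "$interval"
  done
  echo "ready after ${waited}s"
  return 0
}
```

If the app consistently becomes ready only after, say, 45 seconds, that measurement is the value `start_period` should exceed.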
Symptoms:
Error: deployment in progress

Diagnosis:
# Check lock file
ssh ${SSH_HOST} "cat ~/.otterstack/locks/<project>.lock"
# Extract PID and check if process is running
LOCK_PID=$(ssh ${SSH_HOST} "grep PID ~/.otterstack/locks/<project>.lock | cut -d: -f2")
ssh ${SSH_HOST} "ps -p ${LOCK_PID}"
Fix:
# If process is dead, remove stale lock
ssh ${SSH_HOST} "rm ~/.otterstack/locks/<project>.lock"
# If process is alive, wait for it to finish
# Or check logs: tail -f ~/.otterstack/logs/<project>.log
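The PID check above can be wrapped into a single stale-lock test. A minimal sketch, assuming the lock file records the owning process as `PID: <n>` (the `lock_is_stale` name is illustrative):

```shell
# Decide whether a lock is stale: extract the recorded PID and probe it
# with kill -0, which signals nothing but reports whether the PID exists.
lock_is_stale() {
  local lock_file="$1" pid
  pid=$(grep -o 'PID:[[:space:]]*[0-9][0-9]*' "$lock_file" | grep -o '[0-9][0-9]*')
  [ -z "$pid" ] && return 0        # unreadable lock: treat as stale
  ! kill -0 "$pid" 2>/dev/null     # stale if the process is gone
}
```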
Symptoms:
Diagnosis:
# Check if containers are running
ssh ${SSH_HOST} "docker ps | grep <project>"
# Check Traefik routes
ssh ${SSH_HOST} "curl -s http://localhost:8080/api/http/routers | grep <project>"
# Check container labels
ssh ${SSH_HOST} "docker inspect <container> | grep -A20 Labels"
# Check container network
ssh ${SSH_HOST} "docker network inspect <network> | grep <container>"
Common Causes:
Fix: Verify compose file has correct Traefik configuration.
When deployment fails, follow this loop:
If environment variable issue:
ssh ${SSH_HOST} "${OTTERSTACK_PATH} env set <project> VAR value"
If compose file issue:
If application code issue:
If Dockerfile issue:
ssh ${SSH_HOST} "${OTTERSTACK_PATH} deploy <project-name> -v"
Continue the loop until deployment succeeds and all verifications pass.
This is the actual sequence of iterations from the Aperture deployment:
Error:
WARN The "DATABASE_URL" variable is not set. Defaulting to a blank string
Fix:
ssh archivist@194.163.189.144 "~/OtterStack/otterstack env set aperture DATABASE_URL 'postgres://...'"
Outcome: Variables loaded, but next issue appeared.
Error:
The container name "/aperture-web" is already in use
Fix: Removed container_name: directives from docker-compose.yml locally, committed, pushed.
Outcome: Containers started, but new issue appeared.
Error:
Error: Could not locate the bindings file. Tried: better_sqlite3.node
Fix: Added to Dockerfile:
RUN npm rebuild better-sqlite3
Committed, pushed, redeployed.
Outcome: Bindings loaded, but new issue appeared.
Error:
SqliteError: unable to open database file, code: 'SQLITE_CANTOPEN'
Fix: Added DATABASE_PATH environment variable pointing to /app/data/db/aperture.db and ensured directory ownership in Dockerfile.
Outcome: Database opened, but new issue appeared.
Error:
[DB] No migrations directory found, skipping migrations
Failed to start server: SqliteError: no such table: sessions
Fix: Updated Dockerfile to copy migrations to correct location:
COPY src/migrations ./dist/migrations # Was copying to ./src/migrations
Outcome: Migrations ran successfully, but new issue appeared.
Error:
container aperture-web-1 is unhealthy
Diagnosis: Health check used localhost which resolved to ::1 (IPv6), but nginx only listened on 0.0.0.0:80 (IPv4).
Fix: Changed health check to use 127.0.0.1 instead of localhost in docker-compose.yml.
Outcome: ✅ SUCCESS! Deployment completed.
# Check deployment
ssh archivist@194.163.189.144 "~/OtterStack/otterstack status aperture"
# Output: Status: active, SHA: e2d6223
# Verify endpoints
curl -I https://aperture-api.archivist.lol
# Output: HTTP/1.1 200 OK
curl -I https://aperture.archivist.lol
# Output: HTTP/1.1 200 OK
Total iterations: 6
Time: ~2 hours (including investigation time)
Key lesson: Each fix revealed the next issue; systematic debugging is essential.
Once deployment completes without errors, verify success:
ssh ${SSH_HOST} "${OTTERSTACK_PATH} status <project-name>"
Expected output:
Status: active
Commit: abc1234
Started: 2025-01-11 10:30:00
ssh ${SSH_HOST} "docker ps --format 'table {{.Names}}\t{{.Status}}' | grep <project>"
Expected output:
project-abc1234-web-1 Up 2 minutes (healthy)
project-abc1234-worker-1 Up 2 minutes (healthy)
All containers should show (healthy) status.
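This check can be scripted against the `docker ps` table output above. A sketch (the `all_healthy` name is illustrative):

```shell
# Given `docker ps --format 'table {{.Names}}\t{{.Status}}'` output, verify
# every container whose name matches the project reports (healthy).
all_healthy() {
  local project="$1" table="$2" rows
  rows=$(printf '%s\n' "$table" | grep "$project") || return 1  # no containers found
  ! printf '%s\n' "$rows" | grep -vq '(healthy)'                # fail on any other status
}
```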
# Test API endpoint
ssh ${SSH_HOST} "curl -I https://api.example.com/health"
# Or test from local machine
curl -I https://api.example.com/health
Expected output:
HTTP/1.1 200 OK
ssh ${SSH_HOST} "docker logs <container-name> --tail=20"
Expected: No errors, should see normal application startup and health check logs.
ssh ${SSH_HOST} "curl -s http://localhost:8080/api/http/routers | grep <project>"
Expected: Router entries with correct domains and priorities.
# Check restart count
ssh ${SSH_HOST} "docker ps -a | grep <project>"
# View logs across restarts
ssh ${SSH_HOST} "docker logs <container> --tail=100"
Common causes: Crash on startup, missing dependencies, wrong command
# List all project containers
ssh ${SSH_HOST} "docker ps -a | grep <project>"
# Manually stop old deployment
ssh ${SSH_HOST} "docker compose -p <old-project-name> down"
# Check container can reach internal services
ssh ${SSH_HOST} "docker exec <container> curl http://database:5432"
# Check environment variables in container
ssh ${SSH_HOST} "docker exec <container> env | grep DATABASE"
# Check file permissions
ssh ${SSH_HOST} "docker exec <container> ls -la /app"
# Check SSH keys
ssh ${SSH_HOST} "ssh -T git@github.com"
# Check network
ssh ${SSH_HOST} "ping -c3 github.com"
# Manually pull to see error
ssh ${SSH_HOST} "cd ~/.otterstack/repos/<project> && git pull"
If a deployment breaks production:
# Option 1: Deploy previous commit
ssh ${SSH_HOST} "${OTTERSTACK_PATH} deploy <project> --ref <previous-commit-sha>"
# Option 2: Manually switch back to old containers
# (Only if new containers still starting)
ssh ${SSH_HOST} "docker compose -p <old-project-name> up -d"
ssh ${SSH_HOST} "docker compose -p <new-project-name> down"
Deployment is successful when:
✅ Deployment command completes without errors
✅ All containers are healthy
docker ps shows all containers with (healthy) status
✅ Endpoints return expected responses
✅ No errors in logs
✅ Traffic is being routed correctly (if Traefik enabled)
/deploy-otterstack - Full orchestration command that uses this skill in Phase 6 for VPS deployments