From agentic-ai-skills
Diagnoses cloud instance access issues and performs SSH/CLI operations on Linux servers, AWS EC2, and Aliyun ECS, including file transfers, service checks, logs, and blockers like bastions or MFA.
```shell
npx claudepluginhub agenticaiplan/agenticaiskills --plugin agentic-ai-skills
```

This skill uses the workspace's default tool permissions.
Use this skill to manage cloud servers and instances like a careful human operator **within the current authorization boundary**. Prefer SSH and official cloud CLIs. Do **not** try to bypass MFA, SSO, approval workflows, captchas, IP allowlists, bastions, or least-privilege controls.
Bundled files:

- `agents/openai.yaml`
- `references/aliyun.md`
- `references/aws.md`
- `references/generic-linux.md`
- `references/install-and-examples.md`
- `references/interrupted-run-recovery.md`
- `references/model-deployment.md`
- `references/restricted-access.md`
- `references/shared-runtime-hygiene.md`
- `scripts/artifact_check.py`
- `scripts/deployment_final_check.py`
- `scripts/normalize_result.py`
- `scripts/password_ssh.py`
- `scripts/preflight.py`
- `scripts/remote_port_owner.py`
- `scripts/ssh_probe.py`
This skill is optimized for a shared core workflow that can be reused in Codex, OpenClaw, and Claude Code. Codex-specific UI metadata lives in agents/openai.yaml; the operational logic stays in this SKILL.md plus the bundled references and scripts.
Connection paths may be direct SSH, SSH via ProxyJump, or a bastion.

Normalize each request into these five inputs before acting: cloud, action, target, credential profile, and region.
Treat deploy, single-run, serve, accept, and cloud-op as explicit-action modes. Do not enter them from a vague “take a look at my machine” style request without confirming intent from the user prompt or environment.
If an input is missing, infer it from the environment first. Ask the user only for the minimum missing detail.
Run a lightweight local check before attempting privileged or remote work.
Typical commands:
```shell
python3 scripts/preflight.py --cloud generic --action inspect --host 10.0.0.8 --check-port
python3 scripts/preflight.py --cloud aws --action instance-check --profile prod --region ap-southeast-1
python3 scripts/preflight.py --cloud aliyun --action cloud-op --profile default --region cn-hangzhou
```
Preflight should verify only what is safe to verify locally, and classify the initial result as `ok`, `blocked`, `denied`, or `needs_user`.

Use the narrowest command that answers the request.
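As one illustration, the port-reachability part of a local preflight can be sketched in Python. This is a minimal sketch; the bundled `scripts/preflight.py` may implement the check differently.

```python
import socket

def probe_tcp(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify TCP reachability using the skill's status vocabulary.

    Returns "ok" when a connection succeeds and "blocked" when the
    port is closed, filtered, or the host is unreachable.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError:  # covers refused, timed out, and unreachable
        return "blocked"
```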
Useful wrappers:
```shell
python3 scripts/ssh_probe.py --host 10.0.0.8 --user ec2-user --identity-file ~/.ssh/prod.pem
python3 scripts/remote_port_owner.py --port 8080
```
If the environment requires human involvement, stop and return a structured handoff instead of guessing.
Common stop conditions:
Return a stable contract for both success and blocked cases so the result can be reused in later handoff, approval, or postmortem steps.
Minimum success shape:
```json
{
  "status": "ok",
  "action": "inspect",
  "target": "10.0.0.8",
  "evidence": {
    "network": {"10.0.0.8:22": "open"}
  },
  "next_step": "Proceed with the requested read-only inspection."
}
```
When blocked, return a concrete next step instead of guessing.
Expected shape:
```json
{
  "status": "blocked",
  "reason": "Target port 22 is unreachable from the current machine.",
  "next_step": "Connect through the approved bastion or ask for the current IP to be allowlisted.",
  "evidence": {
    "network": {"10.0.0.8:22": "closed"}
  }
}
```
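The contract can be enforced with a small normalizer. This is a sketch under the assumption that these are the only required keys; the bundled `scripts/normalize_result.py` is the authoritative implementation.

```python
ALLOWED_STATUSES = {"ok", "blocked", "denied", "needs_user", "unsupported"}

def normalize_result(status, action, target, evidence=None,
                     reason=None, next_step=None):
    """Build a result dict that matches the shared status contract."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    result = {"status": status, "action": action, "target": target,
              "evidence": evidence or {}}
    if status != "ok":
        # Non-ok results must explain themselves so the handoff is actionable.
        if not (reason and next_step):
            raise ValueError("non-ok results need a reason and a next_step")
        result["reason"] = reason
    if next_step:
        result["next_step"] = next_step
    return result
```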
Allowed status values: `ok`, `blocked`, `denied`, `needs_user`, `unsupported`.

Use this workflow whenever the task is “deploy a model / bring up an API / run acceptance checks against the task prompt / verify output files” instead of plain server ops.
Always verify the underlying model or command once before wrapping it in an API. Do not start with service orchestration unless the user explicitly asks for service-first diagnosis.
Verify the `curl` command or request shape from the task prompt.

Do not claim “deployed successfully” until all applicable checks pass:
Use the bundled scripts when helpful:
```shell
curl -fsS http://127.0.0.1:8080/docs
python3 scripts/artifact_check.py --path /data/exam/output.wav --kind audio
python3 scripts/deployment_final_check.py --url http://127.0.0.1:8080/health --repeat 2
```
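The repeat-check idea behind `scripts/deployment_final_check.py` can be sketched like this. It is a sketch, not the bundled implementation; the injectable `fetch` parameter exists only for illustration and testing.

```python
import time
import urllib.request

def health_ok(url, repeat=2, delay=1.0, fetch=None):
    """Return True only if every one of `repeat` probes answers HTTP 200."""
    if fetch is None:
        fetch = lambda u: urllib.request.urlopen(u, timeout=5).status
    for attempt in range(repeat):
        try:
            if fetch(url) != 200:
                return False
        except OSError:  # URLError/HTTPError are OSError subclasses
            return False
        if attempt < repeat - 1:
            time.sleep(delay)  # spaced probes catch flapping services
    return True
```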
In exam, benchmark, and shared-machine environments, assume the host may already have leftovers.
Before reusing a port, GPU, or task directory:
Never assume “service started” means “new service is handling requests”. Always verify the response path.
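Before binding a port on a shared host, a quick local check avoids fighting a leftover service. This is a simplified sketch; `scripts/remote_port_owner.py` additionally reports the owning process, which this version does not.

```python
import socket

def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))  # a successful bind means the port is free
            return True
        except OSError:           # EADDRINUSE: something is already there
            return False
```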
Prefer one environment per task (venv or conda). If isolation is impossible:
Password SSH is common in exam or public-IP environments. Use it carefully:
- Use secure prompt input (`--prompt-password`) for the current task.
- Reuse sessions keyed by `thread_id` + `user@host:port` so later commands in the same thread do not re-prompt for the password.
- If the user exports a password variable, run a tiny visibility check first; if the current tool process still cannot read it, do not ask the user to repeat the export. Switch to secure prompt input or a temp file instead.
- If `expect` or a similar wrapper is still required, keep the wrapper as small as possible and keep logs password-free.

Recommended visibility check before relying on an env var:
```shell
if [ -n "$CLOUD_INSTANCE_PASSWORD" ]; then echo set; else echo missing; fi
```
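The same check can be done from Python before deciding between the env var and a secure prompt. A sketch; `CLOUD_INSTANCE_PASSWORD` is the variable name used in the shell check above.

```python
import os
import getpass

def get_instance_password(var: str = "CLOUD_INSTANCE_PASSWORD") -> str:
    """Prefer an already-exported variable; fall back to hidden prompt input."""
    password = os.environ.get(var)
    if password:
        return password
    # Do not ask the user to repeat the export; prompt securely instead.
    return getpass.getpass(f"{var} is not visible to this process; password: ")
```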
Recommended helper usage:
```shell
python3 scripts/password_ssh.py --host 10.0.0.8 --user root --port 22 --prompt-password --ensure-session
python3 scripts/password_ssh.py --host 10.0.0.8 --user root --port 22 --remote-command "hostname && whoami"
python3 scripts/password_ssh.py --host 10.0.0.8 --user root --port 22 --close-session
```
Fallback modes:
- `--password-file` or `--password-env`
- `--prompt-password` and `--ensure-session`
- `--no-reuse-session`
- cross-task session sharing only when `--session-namespace` is explicitly provided

When the task was interrupted, never restart blindly. First check the existing state: logs, PID files, and partial artifacts.
If artifacts and downloads are already present, prefer resume. If state is ambiguous or contaminated, clean and restart only the affected slice.
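The resume-or-restart decision can be reduced to a small state check. This sketch only inspects leftover files in the task directory; real recovery should also verify whether the recorded processes are still running.

```python
from pathlib import Path

def recovery_action(task_dir: str) -> str:
    """Decide how to recover from leftover state in the task directory."""
    d = Path(task_dir)
    if any(d.glob("*.pid")):
        return "inspect"   # a PID file exists: check the process before anything
    if any(d.glob("*.log")):
        return "resume"    # partial outputs exist: prefer resume
    return "restart"       # clean directory: safe to start fresh
```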
For longer remote tasks, prefer a predictable work layout such as:
- `01_single_run.py`
- `02_api.py`
- `03_start.sh`
- `04_selftest.sh`
- `*.log`
- `*.pid`

Keep logs, PID files, and outputs in one task directory so interrupted runs are easy to inspect.
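Scaffolding that layout up front keeps interrupted runs inspectable. A sketch, assuming empty placeholders are acceptable until each step is written:

```python
from pathlib import Path

SCAFFOLD = ["01_single_run.py", "02_api.py", "03_start.sh", "04_selftest.sh"]

def init_task_dir(root: str) -> Path:
    """Create the predictable per-task layout; logs and PID files land here too."""
    task = Path(root)
    task.mkdir(parents=True, exist_ok=True)
    for name in SCAFFOLD:
        path = task / name
        if not path.exists():
            path.touch()   # placeholder only; contents are task-specific
    return task
```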
Treat these as confirmation-required even if technically possible:
Load only the reference relevant to the current task:
- `references/generic-linux.md`
- `references/aws.md`
- `references/aliyun.md`
- `references/restricted-access.md`
- `references/install-and-examples.md`
- `references/model-deployment.md`
- `references/shared-runtime-hygiene.md`
- `references/interrupted-run-recovery.md`

Bundled scripts:

- `scripts/preflight.py`: local preflight checks and initial status classification
- `scripts/ssh_probe.py`: non-destructive SSH connectivity/auth probe with optional bastion
- `scripts/password_ssh.py`: password-based SSH helper with secure prompt input, optional thread-scoped session reuse, and safer defaults for host key checking
- `scripts/normalize_result.py`: normalize command outcomes into the shared status contract
- `scripts/remote_port_owner.py`: inspect which local process is listening on a port
- `scripts/artifact_check.py`: verify output artifacts exist and look reasonable
- `scripts/deployment_final_check.py`: verify health/example endpoints and optional artifacts with repeat checks