Ansible automation conventions, patterns, and toolchain: playbook design, roles, inventory, vault, collections, execution environments, Event-Driven Ansible, testing, and performance tuning. Invoke whenever a task involves any interaction with Ansible: writing playbooks, creating roles, managing inventory, reviewing automation code, debugging runs, upgrading ansible-core, or working with AAP.
npx claudepluginhub xobotyi/cc-foundry --plugin infrastructure
This skill uses the workspace's default tool permissions.
Idempotency is the highest Ansible virtue. Every task must describe desired state, not a sequence of commands.
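As a minimal sketch of what "desired state, not a sequence of commands" can look like in practice, the tasks below contrast a declarative module with a guarded command. The account name, script path, and marker file are hypothetical placeholders, not part of any real project.

```yaml
- name: Describe desired state instead of running commands
  hosts: all
  tasks:
    # Declarative: states the end result, so a second run reports "ok".
    - name: Ensure the deploy user exists
      ansible.builtin.user:
        name: deploy                      # hypothetical account name
        shell: /bin/bash
        state: present
      become: true

    # When a command is unavoidable, guard it so it does not re-run.
    - name: Initialize the application database
      ansible.builtin.command:
        cmd: /opt/app/bin/init-db         # hypothetical script
        creates: /var/lib/app/.db_initialized   # task is skipped once this file exists
      become: true
```

The second task only stays idempotent if the script really creates the marker file; otherwise `changed_when:` with an explicit condition is the safer guard.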
Extended examples, patterns, and detailed rationale for the rules below live in ${CLAUDE_SKILL_DIR}/references/.
- [${CLAUDE_SKILL_DIR}/references/playbook-patterns.md]: Play execution order, static vs dynamic reuse comparison table, batched execution with serial, verification flags, standard directory layouts
- [${CLAUDE_SKILL_DIR}/references/role-structure.md]: Extended directory tree, three ways to use roles (play-level, import_role, include_role) with examples, platform-specific task splitting, argument_specs example, dependency mechanics, deduplication rules
- [${CLAUDE_SKILL_DIR}/references/inventory-management.md]: YAML/INI examples, group hierarchy, environment separation, AWS/Azure/GCP/NetBox/Terraform plugins, multi-cloud chaining, caching
- [${CLAUDE_SKILL_DIR}/references/vault-and-security.md]: File vs variable encryption, password sources, ansible-sign, GPG verification, hardening roles, compliance scanning, security smells
- [${CLAUDE_SKILL_DIR}/references/variables-and-templating.md]: Full 22-level precedence list, magic variables, YAML quoting gotcha, registered variables
- [${CLAUDE_SKILL_DIR}/references/error-handling.md]: Block execution flow, rescue variables, failed_when/changed_when, any_errors_fatal
- [${CLAUDE_SKILL_DIR}/references/handlers-and-delegation.md]: Handler execution order, flushing, delegate_to, delegate_facts, fire-and-forget async
- [${CLAUDE_SKILL_DIR}/references/testing-and-performance.md]: Molecule lifecycle, driver comparison, CI matrix testing, strategy plugins, SSH pipelining, fact caching, serial batching
- [${CLAUDE_SKILL_DIR}/references/execution-environments-and-collections.md]: EE vs local installs, version 3 schema, FQCN migration, collection certification, Galaxy publishing, automation mesh, AAP 2.5/2.6 platform architecture
- [${CLAUDE_SKILL_DIR}/references/event-driven-ansible.md]: Rulebook structure, event sources, conditions, actions, Event Streams, decision environments, Kafka vs webhooks, performance tuning, troubleshooting
- [${CLAUDE_SKILL_DIR}/references/porting-guide.md]: ansible-core 2.17/2.18 breaking changes, Python version requirements, removed modules, AAP 2.5/2.6 deprecations, upgrade strategy

- Set state: explicitly. Different modules have different defaults. state: present / state: absent makes intent visible.
- Use fully qualified collection names (FQCN): ansible.builtin.copy, not copy. Prevents ambiguity when multiple collections are installed.
- Prefer declarative modules (ansible.builtin.template, ansible.builtin.service, ansible.builtin.user) over imperative ones (ansible.builtin.command, ansible.builtin.shell).
- When command/shell is unavoidable, add creates:, removes:, or changed_when: to make it idempotent.

- import_tasks / import_role -- static, parsed at load time. Tags propagate to all imported tasks. Cannot loop. Use when structure is fixed.
- include_tasks / include_role -- dynamic, evaluated at runtime. Tags apply only to the include statement. Can loop and use when. Use when inclusion is conditional.

Default to import_* for predictability.
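The static/dynamic distinction above might look like this inside a role's task file. This is a sketch only: the file names (`install.yml`, `vhost.yml`, `debian.yml`) and the `webserver_vhosts` variable are hypothetical.

```yaml
# roles/webserver/tasks/main.yml (hypothetical layout)

# Static: parsed at load time, so the tag reaches every task in install.yml.
- name: Install packages
  ansible.builtin.import_tasks: install.yml
  tags: [install]

# Dynamic: evaluated at runtime, so it can loop over a list.
- name: Configure each virtual host
  ansible.builtin.include_tasks: vhost.yml
  loop: "{{ webserver_vhosts | default([]) }}"
  loop_control:
    loop_var: vhost

# Dynamic: the condition decides whether the file is included at all.
- name: Apply Debian-specific tasks
  ansible.builtin.include_tasks: debian.yml
  when: ansible_facts['os_family'] == 'Debian'
```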
inventories/
  production/
    hosts
    group_vars/
    host_vars/
  staging/
    hosts
    group_vars/
    host_vars/
site.yml                # imports tier playbooks
webservers.yml
dbservers.yml
roles/
  common/
  webserver/
  database/
site.yml imports tier playbooks. Each tier playbook maps host groups to roles.
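A minimal sketch of how these two levels could fit together, using the playbook, group, and role names from the layout above. The two files are shown in one block for brevity.

```yaml
# site.yml: only imports the tier playbooks
- name: Configure web tier
  ansible.builtin.import_playbook: webservers.yml

- name: Configure database tier
  ansible.builtin.import_playbook: dbservers.yml

# webservers.yml (a separate file): maps a host group to roles
# - name: Web servers
#   hosts: webservers
#   roles:
#     - common
#     - webserver
```

Running `ansible-playbook -i inventories/staging site.yml` would then apply every tier to staging, while running a tier playbook directly limits the run to that tier.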
A role manages one service or component — not an entire stack. Keep provisioning separate from configuration and application deployment. Roles are not programming constructs: avoid deep inheritance hierarchies, tight coupling, or hard dependencies on external variables.
roles/my_role/
  tasks/main.yml            # entry point
  handlers/main.yml         # auto-imported into play scope
  templates/*.j2            # Jinja2 templates
  files/                    # static files
  defaults/main.yml         # low-precedence (user-configurable)
  vars/main.yml             # high-precedence (internal constants)
  meta/main.yml             # dependencies
  meta/argument_specs.yml   # argument validation (2.11+)
- defaults/ -- easily overridden. Use for knobs users should change (ports, paths, feature flags).
- vars/ -- hard to override. Use for internal constants the role needs to function.

Naming examples:
- Role names: nginx-proxy, ssl-certs
- Variable names, prefixed with the role name: nginx_port, nginx_worker_count
- Handler names, prefixed with the role name: nginx : Restart nginx

Define expected parameters in meta/argument_specs.yml. Validation runs before role tasks execute.
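As an illustration, an argument spec for a hypothetical nginx role could look like the sketch below. The role name, descriptions, and default values are assumptions made for the example.

```yaml
# roles/nginx/meta/argument_specs.yml (hypothetical role)
argument_specs:
  main:                          # validates the role's main entry point
    short_description: Install and configure nginx
    options:
      nginx_port:
        type: int
        default: 80
        description: Port nginx listens on.
      nginx_worker_count:
        type: int
        required: true           # the role fails fast if this is missing
        description: Number of worker processes.
```

With this in place, passing `nginx_port: "eighty"` or omitting `nginx_worker_count` should stop the play before any role task runs.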
Role dependencies are defined in meta/main.yml and run before the role. They are deduplicated per play unless parameters differ or allow_duplicates: true is set.
Prefer YAML over INI. INI :vars sections treat all values as strings, causing type confusion.
Group along three dimensions:
- Function: webservers, dbservers, monitoring
- Location: dc1, dc2, us_east
- Environment: production, staging, development

Split large inventories by function or region; a single static file with 5,000+ hosts takes 15-30 seconds to load. Keep production and staging in separate inventory files or directories. Never mix environments in a single inventory -- developers using a mixed inventory need access to all vault passwords.
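A small YAML inventory grouped along function and location might look like the following sketch. Host names and the `http_port` variable are placeholders, and the file is named `hosts.yml` on the assumption that the YAML inventory plugin needs a recognized extension.

```yaml
# inventories/production/hosts.yml (placeholder hosts)
all:
  children:
    webservers:                  # function
      hosts:
        web1.example.com:
        web2.example.com:
      vars:
        http_port: 8080          # stays an integer, unlike an INI :vars entry
    dbservers:                   # function
      hosts:
        db1.example.com:
    dc1:                         # location
      hosts:
        web1.example.com:
        db1.example.com:
    dc2:                         # location
      hosts:
        web2.example.com:
```

Because the whole directory is production, the environment dimension is expressed by which inventory is passed with `-i` rather than by a group inside the file.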
Use inventory plugins (not scripts) for cloud providers:
- amazon.aws.aws_ec2 -- groups from tags, instance types, regions
- azure.azcollection.azure_rm -- conditional groups, keyed groups
- google.cloud.gcp_compute -- zones, machine types, labels
- netbox.netbox.nb_inventory -- single source of truth for hybrid environments, automatic group updates from tags/custom fields
- cloud.terraform.terraform_state -- parse state files as inventory

Mix static and dynamic sources in the same inventory directory.
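For example, a dynamic source for AWS could be configured roughly as below. The region, tag names, and cache path are assumptions; the plugin expects the file name to end in `aws_ec2.yml` or `aws_ec2.yaml`.

```yaml
# inventories/production/aws_ec2.yml
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1                    # assumed region
filters:
  instance-state-name: running
  tag:Environment: production    # assumed tagging scheme
keyed_groups:
  - key: tags.Role               # e.g. a group named role_web
    prefix: role
  - key: instance_type           # e.g. a group named type_t3_micro
    prefix: type
cache: true
cache_plugin: ansible.builtin.jsonfile
cache_connection: /tmp/aws_inventory_cache   # assumed cache directory
cache_timeout: 3600
```

`ansible-inventory -i inventories/production --graph` is a quick way to check which groups the plugin actually produced.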
Constructed inventories build groups dynamically from host metadata using Jinja2 logic. Chain multiple cloud inventories into a single constructed inventory for cross-cloud targeting. They are the successor to Smart Inventories in AAP.
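A sketch of a constructed source that derives new groups from groups created by other sources in the same directory. The group names are hypothetical, and the `zz_` prefix assumes sources in a directory are processed in alphabetical order so this file is evaluated last.

```yaml
# inventories/production/zz_constructed.yml
plugin: ansible.builtin.constructed
strict: false                    # skip expressions that reference undefined data
groups:
  # Hosts that are in both a function group and a location group.
  dc1_web: "'webservers' in group_names and 'dc1' in group_names"
  # Hosts present in at least two of the listed cloud groups (hypothetical names).
  multi_cloud: "(group_names | intersect(['aws', 'azure', 'gcp'])) | length >= 2"
```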
Role defaults/ is lowest. Extra vars (-e) always win. Most common layers, from lowest to highest precedence:
- defaults/main.yml
- group_vars/all.yml
- group_vars/<group>.yml
- host_vars/<host>.yml
- vars/main.yml
- --extra-vars

Define each variable in ONE place.
Values starting with {{ }} must be quoted:
app_path: "{{ base_path }}/app" # correct
app_path: {{ base_path }}/app # YAML parse error
- YAML parses yes, no, true, false, on, off as booleans. Quote strings that match: version: "yes", not version: yes
- Numbers with a leading zero are read as octal: mode: 0644 becomes 420 (decimal). Use mode: "0644" for file permissions.
- combine() does shallow merge. Nested dicts are replaced, not merged. Use combine(recursive=true) for deep merge.
- set_fact in a loop overwrites on each iteration. Use set_fact with {{ result | default([]) + [item] }} to accumulate.

All templating runs on the control node before task execution.
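The quoting, merge, and accumulation gotchas above can be tried out with a small local play. This is an illustrative sketch; all variable names and the file path are made up for the example.

```yaml
- name: Demonstrate YAML and Jinja2 gotchas
  hosts: localhost
  gather_facts: false
  vars:
    release_flag: "yes"            # quoted, so it stays a string
    base_config:
      db:
        host: db1
        port: 5432
    override_config:
      db:
        port: 6432
  tasks:
    - name: Write a file with a quoted octal mode
      ansible.builtin.copy:
        content: "flag={{ release_flag }}\n"
        dest: /tmp/gotcha-example.conf
        mode: "0644"

    - name: Deep-merge so db.host survives the override
      ansible.builtin.set_fact:
        merged_config: "{{ base_config | combine(override_config, recursive=true) }}"

    - name: Accumulate values across loop iterations
      ansible.builtin.set_fact:
        open_ports: "{{ open_ports | default([]) + [item] }}"
      loop: [80, 443, 8080]

    - name: Show the results
      ansible.builtin.debug:
        msg:
          - "{{ merged_config }}"  # db keeps host db1 and takes port 6432
          - "{{ open_ports }}"     # all three ports, not just the last one
```

Dropping `recursive=true` from the merge replaces the whole `db` dict, so `host` disappears from the result.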
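- Filters: {{ value | default('fallback') }}, {{ list | unique }}, {{ dict1 | combine(dict2) }}
- Tests: when: result is defined, when: path is file
- Control structures: {% for %}, {% if %}, {% macro %}.

Reference each secret through a plaintext variable that points at a vault-prefixed encrypted one:

# group_vars/production/vars.yml (plaintext, searchable)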
db_password: "{{ vault_db_password }}"
# group_vars/production/vault.yml (encrypted)
vault_db_password: "actual_secret"
Variable names remain greppable. Values stay encrypted.
Vault password sources:
- Password file (--vault-password-file vault_pass.txt) for local dev.
- Executable script (vault_pass.sh) that fetches from a secrets manager for team environments.
- ANSIBLE_VAULT_PASSWORD_FILE environment variable pointing to a pipeline secret in CI.

For enterprise or compliance-heavy environments, shift from static vault files to runtime secret fetching via lookup plugins.
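One possible shape for runtime secret fetching, sketched with the `community.hashi_vault.vault_kv2_get` lookup. It assumes the `community.hashi_vault` collection and the `hvac` Python library are installed on the control node and that the Vault address and token come from the environment. The secret path, mount point, key name, template, and destination are all hypothetical.

```yaml
- name: Fetch a database password at runtime
  hosts: dbservers
  gather_facts: false
  vars:
    # Assumption: a KV v2 secret at secret/myapp/db with a "password" key.
    db_secret: "{{ lookup('community.hashi_vault.vault_kv2_get', 'myapp/db', engine_mount_point='secret') }}"
  tasks:
    - name: Render the application config
      ansible.builtin.template:
        src: app.conf.j2                   # hypothetical template
        dest: /etc/myapp/app.conf
        mode: "0600"
      vars:
        db_password: "{{ db_secret.secret.password }}"
      no_log: true                         # keep the secret out of logs
```

Other secret stores have their own lookup plugins; the pattern of resolving the value at run time and marking the consuming task `no_log` stays the same.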
Use ansible-sign with GPG to sign project content. Creates checksum manifests (SHA256) of protected files with
detached GPG signatures. AAP automation controller verifies signatures on project sync -- tampered projects fail to
update and no jobs launch. Automate signing in CI via ANSIBLE_SIGN_GPG_PASSPHRASE environment variable.
- Use hardening roles (such as those from ansible-lockdown) for automated compliance. Customize via defaults/main.yml, select levels via tags.
- Set no_log: true on tasks that handle secrets
- Use become: true at task level, not play level

block:
  - name: Deploy new version
    # ... tasks that might fail
rescue:
  - name: Rollback
    # ... recovery tasks
always:
  - name: Send notification
    # ... runs regardless
- rescue runs only when a block task fails
- always runs regardless of block/rescue outcome
- Rescue tasks can inspect ansible_failed_task and ansible_failed_result
- Hosts that fail in block but succeed in rescue are reported as "rescued", not "failed" -- account for this in reporting

For multi-host runs, capture per-host status in block/rescue, then aggregate in always using ansible_play_hosts_all with delegate_to: localhost and run_once: true. This produces a single summary of all successes and failures across the fleet.
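The aggregation pattern just described could be sketched as follows. The deploy script path and the `deploy_status` fact name are hypothetical, and the example assumes the default linear strategy, so every host has recorded its status before the summary task runs.

```yaml
- name: Deploy with per-host status and a fleet summary
  hosts: webservers
  gather_facts: false
  tasks:
    - name: Deploy and record the outcome
      block:
        - name: Deploy new version
          ansible.builtin.command: /opt/app/bin/deploy   # hypothetical script
          changed_when: true

        - name: Record success
          ansible.builtin.set_fact:
            deploy_status: ok
      rescue:
        - name: Record which task failed
          ansible.builtin.set_fact:
            deploy_status: "failed at '{{ ansible_failed_task.name }}'"
      always:
        - name: Summarize every host once, from the control node
          ansible.builtin.debug:
            msg: "{{ item }}: {{ hostvars[item].deploy_status | default('unreachable') }}"
          loop: "{{ ansible_play_hosts_all }}"
          delegate_to: localhost
          run_once: true
```

Hosts that never set the fact (for example, unreachable ones) fall through to the `default('unreachable')` label in the summary.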
- failed_when: -- custom failure conditions
- changed_when: -- control when a task reports "changed"
- ignore_errors: true -- continue on failure (use sparingly)
- any_errors_fatal: true -- stop entire play on any host failure

Retry a task until a condition is met:

- name: Wait for service
  ansible.builtin.uri:
    url: http://localhost:8080/health
  register: result
  until: result.status == 200
  retries: 30
  delay: 10
- Handlers run at the end of the play; force them to run earlier when needed (meta: flush_handlers)
- Use listen: topics to group related handlers
- Reference role handlers as role_name : handler_name

Execute a task on a different host: delegate_to: lb.example.com. Use for load balancer operations, centralized notifications, cross-host coordination.
local_action: is shorthand for delegate_to: 127.0.0.1.
When multiple hosts delegate to the same target, use throttle: 1 or run_once: true to prevent race conditions.
become applies to the delegated host, not the original target -- verify escalation permissions.
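A rolling update that delegates pool changes to a load balancer might be sketched like this. The `lb-ctl` command is a hypothetical stand-in for whatever CLI or module the load balancer actually provides.

```yaml
- name: Rolling update behind a load balancer
  hosts: webservers
  serial: 2
  tasks:
    - name: Remove host from the load balancer pool
      ansible.builtin.command: /usr/local/bin/lb-ctl disable {{ inventory_hostname }}   # hypothetical CLI
      delegate_to: lb.example.com
      throttle: 1                  # one host at a time touches the shared target
      changed_when: true

    - name: Restart the application
      ansible.builtin.service:
        name: myapp                # hypothetical service name
        state: restarted
      become: true

    - name: Return host to the load balancer pool
      ansible.builtin.command: /usr/local/bin/lb-ctl enable {{ inventory_hostname }}    # hypothetical CLI
      delegate_to: lb.example.com
      throttle: 1
      changed_when: true
```

`inventory_hostname` still refers to the original web server inside a delegated task, which is what lets the load balancer command name the right pool member.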
- async: N, poll: M (M > 0) -- extended timeout, still blocks
- async: N, poll: 0 -- fire-and-forget, check later with async_status
- Avoid poll: 0 with tasks requiring exclusive locks (package managers)

EDA is the "Automation Decisions" component of AAP -- a decision engine that listens to event sources and triggers automated responses via rulebooks. Rulebooks are the event-driven equivalent of playbooks: YAML files with sources, conditions, and actions.
Key concepts:
- Rulebooks define sources, conditions (when), and actions (run_job_template, run_workflow_template, set_fact, debug)
- Event source plugins: alertmanager, aws_sqs_queue, azure_service_bus, kafka, pg_listener, webhook

See [${CLAUDE_SKILL_DIR}/references/event-driven-ansible.md] for rulebook structure, event filters, scaling, and troubleshooting.
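For orientation, a small rulebook with one webhook source and one rule could look like the sketch below. The port, payload field, job template name, and organization are assumptions for illustration.

```yaml
- name: Respond to webhook alerts
  hosts: all
  sources:
    - ansible.eda.webhook:
        host: 0.0.0.0
        port: 5000                       # assumed listening port
  rules:
    - name: Restart service when a health check reports down
      condition: event.payload.status == "down"   # assumed payload shape
      action:
        run_job_template:
          name: Restart web service      # hypothetical job template
          organization: Default
```

The webhook source places the POSTed JSON body under `event.payload`, which is why the condition reads `event.payload.status`.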
Execution environments (EEs) are container images bundling Ansible Core, Runner, collections, and all dependencies. They replace traditional virtual environments for consistent automation execution.
Use EEs when: enterprise scale, complex dependencies, team consistency needed. Use local installs for: simple setups, ad-hoc tasks, beginners.
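A minimal definition file for `ansible-builder`, using the version 3 schema, might look like this sketch. The base image and the listed packages and collections are placeholder choices, not recommendations.

```yaml
# execution-environment.yml
version: 3
images:
  base_image:
    name: quay.io/fedora/fedora:latest   # placeholder base image
dependencies:
  ansible_core:
    package_pip: ansible-core
  ansible_runner:
    package_pip: ansible-runner
  system:
    - openssh-clients
  galaxy:
    collections:
      - name: community.general          # placeholder collection
```

`ansible-builder build -t my-ee:latest` (with a hypothetical tag) would build the image from this file, and `ansible-navigator` or AAP can then run playbooks inside it.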
AAP 2.5 introduced a unified UI, Platform Gateway (single auth entry point), and containerized installer (Podman on RHEL). AAP 2.6 adds an automation dashboard (ROI tracking), self-service automation portal, and Ansible Lightspeed intelligent assistant.
RPM-based installer is deprecated as of AAP 2.5 -- containerized and operator-based deployments are the future.
See [${CLAUDE_SKILL_DIR}/references/porting-guide.md] for AAP platform changes and upgrade guidance.
- Install a collection: ansible-galaxy collection install community.general
- Pin versions in requirements.yml using open ranges:

collections:
  - name: community.general
    version: ">=7.0.0,<8.0.0"

- Use FQCN: community.general.ufw, not ufw
- Install from requirements: ansible-galaxy collection install -r requirements.yml
- Download for offline use: ansible-galaxy collection download -r requirements.yml -p ./collections/
- Run ansible-test sanity --docker default for coding standards; ansible-lint --profile production for certification
- Run galaxy-importer in CI to replicate automation hub import checks
- Declare requires_ansible in meta/runtime.yml
- Use plugin_routing in meta/runtime.yml for backward-compatible redirects

Integrate ansible-lint in CI and pre-commit hooks. For enterprise environments, add policy-as-code tools (Steampunk Spotter, Checkov) as gates before automation reaches production.
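For collection authors, the two `meta/runtime.yml` settings mentioned above could be combined as in this sketch. The version bound, namespace, and module names are hypothetical.

```yaml
# meta/runtime.yml in a collection
requires_ansible: ">=2.15.0"             # assumed minimum ansible-core version
plugin_routing:
  modules:
    old_firewall:                        # hypothetical former module name
      redirect: my_namespace.my_collection.firewall
```

Playbooks that still call the old name keep working through the redirect, which gives users time to migrate.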
Molecule is the standard role testing framework. Drivers: Docker (fast, local dev), Podman (rootless, enterprise), Vagrant (full VM),
delegated (default in Molecule 6). Run molecule test for the full lifecycle. Use multiple scenarios for different
conditions (default, HA cluster, upgrade).
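A scenario configuration for the Docker driver might look like the sketch below. It assumes the Docker driver plugin is installed alongside Molecule, and the image is a placeholder that must already contain Python for Ansible to connect.

```yaml
# molecule/default/molecule.yml
driver:
  name: docker                   # assumes the Docker driver plugin is available
platforms:
  - name: instance
    image: registry.example.com/test-images/debian-python:12   # placeholder image
    pre_build_image: true        # use the image as-is instead of building one
provisioner:
  name: ansible
verifier:
  name: ansible
```

Additional scenarios (for example `molecule/upgrade/`) get their own `molecule.yml` and are selected with `molecule test -s upgrade`.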
- Raise forks (default 5) for parallel host execution -- start at 2-4x CPU cores, monitor control node memory
- Enable SSH pipelining: pipelining = True with ControlPersist
- Consider the Mitogen strategy plugins: mitogen_linear or mitogen_free. Most impactful for playbooks with many small tasks.
- Cache facts: gathering = smart with fact_caching = jsonfile (or Redis)
- Disable gather_facts when not needed; use gather_subset to limit scope
- Prefer synchronize over copy for large file transfers
- Use serial for staged batching in rolling deployments
- Profile runs with callbacks_enabled = timer, profile_tasks

Enable inventory caching for dynamic sources (30+ seconds to under 1 second). Use constructed inventory plugin over large static groups. Flatten group hierarchies (3-4 groups per host, not 6-7). Split inventories by function/region.
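Several of these settings live in ansible.cfg, but the play-level ones can be sketched directly in a playbook. The paths are hypothetical, and the transfer task assumes the `ansible.posix` collection and rsync are available.

```yaml
- name: Rolling release with limited fact gathering
  hosts: webservers
  gather_facts: false            # skip the implicit full fact run
  serial:
    - 1                          # canary host first
    - "25%"                      # then a quarter of the group per batch
  tasks:
    - name: Gather only network facts
      ansible.builtin.setup:
        gather_subset:
          - network

    - name: Push a large release directory
      ansible.posix.synchronize:
        src: files/release/                # hypothetical local path
        dest: /opt/app/release/            # hypothetical remote path
```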
ansible-core 2.17+ requires Python 3.7+ on managed hosts. RHEL 8 environments must stay on ansible-core 2.16 (system
Python 3.6 bindings are incompatible). Key removals: yum module (redirected to dnf), include module (use
include_tasks/import_tasks), smart connection option (select explicit plugin).
ansible-core 2.18 removes old-style vars plugins (get_host_vars/get_group_vars) and deprecates plural
COLLECTIONS_PATHS. Windows Server 2012/2012 R2 support is removed.
See [${CLAUDE_SKILL_DIR}/references/porting-guide.md] for the full list of breaking changes, deprecations, and upgrade
strategy.
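As a sketch of what two of these migrations can look like in task files, the snippet below replaces a removed module and a removed action with their current equivalents. The package and file names are placeholders.

```yaml
# Before: "yum:" and a bare "include:"
# After:
- name: Install nginx
  ansible.builtin.dnf:
    name: nginx
    state: present

- name: Load platform-specific tasks
  ansible.builtin.include_tasks: "{{ ansible_facts['os_family'] | lower }}.yml"
```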
When writing Ansible automation: apply all conventions silently. If an existing codebase contradicts a convention, follow the codebase and flag the divergence.
When reviewing Ansible code: cite the specific violation and show the fix inline. Example:
copy: -> ansible.builtin.copy:
The coding skill governs workflow; this skill governs Ansible-specific conventions. Both are active simultaneously.
- no_log: true on tasks handling secrets
- become: true at task level, not play level
- ansible-sign in regulated environments

Idempotency is the highest Ansible virtue. Describe desired state, never command sequences.