npx claudepluginhub myr-aya/gouvernai-claude-code-plugin --plugin gouvernaiThis skill uses the workspace's default tool permissions.
Runtime guardrails for AI agents. Classifies every sensitive action by risk tier, enforces proportional controls, and logs a full audit trail.
Enforces runtime security policies on Claude Code actions via Pre/PostToolUse hooks, blocks dangerous shell commands/file ops/MCP calls, scans configs for OWASP ASI10 vulnerabilities, logs audit trails.
Blocks destructive commands like rm -rf, git --force-push, kubectl delete; restricts edits to specified directories for production systems or autonomous agents.
Implements enterprise security patterns for Claude Code: role-based permission hardening, secrets blocking scripts, Vault integration, audit hooks for SOC2, HIPAA, GDPR compliance.
Share bugs, ideas, or general feedback.
Runtime guardrails for AI agents. Classifies every sensitive action by risk tier, enforces proportional controls, and logs a full audit trail.
Dual enforcement: This skill provides probabilistic classification (you read and follow these instructions). Hard constraints are also enforced deterministically via PreToolUse hooks โ obfuscated commands, credential exposure, and dangerous system commands are blocked programmatically even if you skip the skill.
| Situation | Action |
|---|---|
| Read file, git status, draft message | Auto-approved โ zero gate, zero overhead |
| Write to user file, git commit | ๐ก๏ธ T2 โ auto-approved with notification, log |
| Send email, modify config, delete files, curl, npm install | ๐ก๏ธ T3 โ pause, require approval, log |
| sudo, credential transmit, purchase, public post | ๐ก๏ธ T4 โ full stop, warn, require approval, log |
| Credential transmission, obfuscated command | ๐ก๏ธ BLOCKED โ hard constraint, no override |
| Bulk operation (5+ targets) | Escalate +1 tier |
| Unfamiliar endpoint or recipient | Escalate +1 tier |
User invokes /guardrails | Show session stats |
Read these on demand โ do NOT load them every session:
| File | Read when |
|---|---|
| GUIDE.md | First gated action of the session |
| ACTIONS.md | Classifying an action (Step 3) |
| TIERS.md | Checking escalation rules or controls (Steps 4, 7) |
| POLICY.md | Checking hard constraints or conflicts (Step 6) |
All reference files are in the same directory as this SKILL.md.
Do NOT run the gate for read-only or zero-side-effect actions:
scratch/, temp/, or tmp/ onlyguardrails_log.md for guardrails logging purposesguardrails-mode.json for mode persistenceImportant limitation: The deterministic hook layer (PreToolUse) only intercepts Bash, Write, Edit, and Read tool calls. MCP tool calls bypass the hook entirely. When classifying actions that use MCP tools (database queries, API calls via MCP servers, file operations through MCP), you must apply the full gate process linguistically โ there is no hard enforcement backstop. Treat MCP actions with extra caution.
Run the gate process below before executing ANY of these action types:
In addition to gating individual actions, watch for multi-step patterns across the conversation that could indicate data exfiltration or policy circumvention. If you detect any of the following patterns, escalate the current action to Tier 4 regardless of its base classification:
Variable staging then transmission: A command stores sensitive data in a variable (e.g. SECRET=$(cat .env), TOKEN=$(grep API_KEY config)) followed by a later command that transmits a variable to the network (e.g. curl -d "$SECRET", wget --post-data="$TOKEN"). Flag the transmission command as Tier 4.
Fragmented data extraction: Multiple commands that each extract a small piece of a secret (character slicing, substring operations, split/cut on credential files) followed by any outbound network request. Flag the first network request after the extraction as Tier 4.
Credential embedding in generated files: When asked to generate config files, scripts, or templates, check whether the content being written includes interpolated credential variables (like $API_KEY, $SECRET_TOKEN, process.env.API_KEY). If the file is then transmitted, committed to a public repo, or uploaded, escalate to Tier 4.
Disguised exfiltration: A request to "health check endpoints", "test connectivity", or "validate URLs" where the URLs or request bodies contain credential-like values, encoded strings, or variable interpolations. Escalate to Tier 4.
These patterns require judgment โ not every variable assignment followed by a curl is malicious. Apply this detection when the conversation context suggests sensitive data is in play (credentials were recently accessed, .env files were read, secrets were discussed).
IMPORTANT โ Output efficiency: Every message adds cost. Combine steps into a single response. Keep reasoning brief. Do not narrate steps separately. Read GUIDE.md on the first gated action for output format templates.
Step 1 โ Identify. List the actions matching trigger criteria. State what, where, and what data is involved.
Step 2 โ Determine mode. Read guardrails-mode.json from the project root (CLAUDE_PROJECT_DIR). If the file exists, use the mode and audit_only fields. If the file does not exist, default to full gate (mode: "full-gate", audit_only: false). Mode values: full-gate (default), strict (all tiers +1), relaxed (T2 skips gate). If audit_only is true, T2 and T3 auto-proceed with logging; T4 halts without executing.
Step 3 โ Classify. Read ACTIONS.md. Assign base tier (2, 3, or 4).
Step 4 โ Escalate. Read TIERS.md. Apply escalation rules if applicable. Result is final tier.
Step 5 โ Pre-approval. If action matches a known pre-approved pattern AND final tier < 4, log with "PRE-APPROVED" tag and proceed. If escalation raised the action to Tier 4, pre-approval is void.
Step 6 โ Hard constraints. Read POLICY.md. If any NEVER rule is violated, BLOCK regardless of tier or approval.
Step 7 โ Controls. Apply the universal control from TIERS.md for the final tier ร mode.
Step 8 โ Log and execute. Append to guardrails_log.md in the project root using the Write or Edit tool (never Bash echo/redirect โ Bash triggers the obfuscation hook on $(date) substitution). Log: timestamp, tier, type, description, mode, outcome, approval, escalation reason. Execute or halt. Do NOT read back, display, or echo the log after writing. The log write is silent โ never show log entries to the user unless they explicitly request /guardrails log.
SILENT LOGGING โ NO EXCEPTIONS: After writing to the log, do NOT read the log file. Do NOT display log contents. Do NOT show "here are the latest entries." Do NOT cat, head, or tail the log. The write is completely invisible to the user. The ONLY time log entries are shown is when the user explicitly runs /guardrails log. Violating this rule wastes tokens and breaks the flow. If you find yourself about to show log entries after a gate action, STOP.
Note: Writes to guardrails_log.md are exempt from the gate.
When token_cap is set in guardrails-mode.json, estimate the token cost of your planned action before executing.
What counts toward the estimate:
How to estimate: Use ~4 characters = 1 token as a rough heuristic.
When the estimate exceeds the cap:
๐ก๏ธ **T3 โ Token cap**: Estimated ~[N] tokens (cap: [cap]). [Brief description of planned actions]. Approve? (yes/no)The hook layer also checks payload size deterministically. If the Write/Edit content or Bash command exceeds the cap, the hook prompts the user before the skill layer sees it. The skill layer adds coverage for multi-step plans where individual payloads are under the cap but the total exceeds it.
Token cap does not apply to: