From auto-claude-skills
Evaluate designs involving autonomous agents for the lethal trifecta — private data + untrusted input + outbound action
npx claudepluginhub damianpapadopoulos/auto-claude-skills

This skill uses the workspace's default tool permissions.
Architectural risk assessment for designs and implementations that involve autonomous agent behavior. Separate from security-scanner (which runs deterministic static analysis).
Invoke this skill during the DESIGN phase when the prompt involves autonomous agents, unattended operation, private data processing with external input, or outbound actions. It is also co-selected during the REVIEW phase, alongside requesting-code-review, when autonomy-related triggers match.
For the proposed design or implementation, evaluate each field:
| Field | Question | Examples |
|---|---|---|
| private_data | Does the agent access information that should not be shared with all parties? | User email, credentials, internal logs, PII, private repos, API keys, session tokens |
| untrusted_input | Can an external party inject instructions the agent will process? | Email content, web pages, user-uploaded files, API responses from third parties, webhook payloads |
| outbound_action | Can the agent send data or take actions visible outside its sandbox? | Sending emails, posting to Slack, pushing to git, making API calls, writing to shared filesystems, creating PRs |
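For each field, state its status (Present, Absent, or Unknown) and the specific evidence for it, then classify the design: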
| Fields present | Classification | Action |
|---|---|---|
| All 3 | Lethal trifecta — High risk | Require mitigation before proceeding |
| 2 of 3 | Elevated risk | Note which leg is missing. Recommend not adding the third without mitigation. |
| 0-1 | Standard risk | No special action required |
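The classification table can be sketched as a small function. This is a minimal illustration, not part of the skill itself: the `Status` enum and `classify` function are hypothetical names, and counting an Unknown field as present is an assumption made here so that missing evidence never lowers the risk level.

```python
from enum import Enum


class Status(Enum):
    PRESENT = "Present"
    ABSENT = "Absent"
    UNKNOWN = "Unknown"


def classify(private_data: Status, untrusted_input: Status, outbound_action: Status) -> str:
    """Map the three field statuses to a risk classification."""
    # Assumption (not stated in the source): Unknown counts as present,
    # so a field only reduces the count when it is explicitly Absent.
    legs = (private_data, untrusted_input, outbound_action)
    present = sum(status is not Status.ABSENT for status in legs)
    if present == 3:
        return "Lethal trifecta"
    if present == 2:
        return "Elevated"
    return "Standard"
```

For example, an agent that reads a private inbox and processes incoming email but has no way to send anything out would be called as `classify(Status.PRESENT, Status.PRESENT, Status.ABSENT)` and land in the Elevated row, with outbound_action as the missing leg.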
The primary mitigation is blast-radius control — cutting at least one leg of the trifecta. Improved detection scores are NOT proof of safety.
Cut private_data:
Cut untrusted_input:
Cut outbound_action:
Output a structured assessment:
## Agent Safety Assessment
**Design:** <what is being evaluated>
**Date:** YYYY-MM-DD
### Risk Fields
| Field | Status | Evidence |
|-------|--------|----------|
| private_data | Present/Absent/Unknown | <specific evidence> |
| untrusted_input | Present/Absent/Unknown | <specific evidence> |
| outbound_action | Present/Absent/Unknown | <specific evidence> |
### Classification
**Risk level:** Lethal trifecta / Elevated / Standard
### Mitigation (if required)
**Recommended approach:** <which leg to cut and how>
**Trade-off:** <what capability is reduced by the mitigation>
**Residual risk:** <what remains after mitigation>
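If assessments are produced programmatically, the template above could be filled in along these lines. This is a sketch only: `render_assessment` and its parameters are hypothetical, and it renders everything up to the Classification section, leaving the Mitigation section to be written by hand.

```python
from datetime import date

FIELDS = ("private_data", "untrusted_input", "outbound_action")


def render_assessment(design, fields, risk_level, on=None):
    """Render the assessment header, risk table, and classification as Markdown.

    `fields` maps each field name to a (status, evidence) pair.
    `on` is the assessment date; it defaults to today (an assumption for convenience).
    """
    lines = [
        "## Agent Safety Assessment",
        f"**Design:** {design}",
        f"**Date:** {(on or date.today()).isoformat()}",
        "### Risk Fields",
        "| Field | Status | Evidence |",
        "|-------|--------|----------|",
    ]
    # Always emit the three fields in the template's order.
    for name in FIELDS:
        status, evidence = fields[name]
        lines.append(f"| {name} | {status} | {evidence} |")
    lines += ["### Classification", f"**Risk level:** {risk_level}"]
    return "\n".join(lines)
```

A call passes one `(status, evidence)` pair per field, for instance `{"private_data": ("Present", "reads user inbox"), ...}`, and the returned string can be pasted directly into a design document or review comment.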