From agent-guardrails
Re-analyze Claude Code session logs against existing agent guardrail rules to measure effectiveness, catch false positives, identify missed anti-patterns, and refine regex patterns. Use when user asks to "update guardrails", "refine guardrail rules", "check guardrail effectiveness", "agent-guardrails update", "tune guardrails", or wants to improve existing behavioral rules based on real usage data.
npx claudepluginhub florianbuetow/claude-code --plugin agent-guardrails
This skill uses the workspace's default tool permissions.
Re-analyze session logs against installed guardrail rules to measure effectiveness and refine patterns. This is the feedback loop: install rules with `/agent-guardrails:install`, use them for a while, then run `/agent-guardrails:update` to tune.
Read the installed guardrail script:
cat .claude/hooks/stop-guardrails.sh 2>/dev/null
Parse each grep block to extract:
- the rule name (from the `# {name}` comment)
- the pattern (from the `grep -qiE` argument)
- the message (from the `blocked+=` argument)

If no script exists, tell the user to run `/agent-guardrails:install` first and stop.
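The extraction step above can be sketched in Python. This is a minimal sketch, not the plugin's own parser: the layout of `SAMPLE_SCRIPT` is a hypothetical guess at what a grep block looks like, and `parse_rules` and the three regexes are illustrative names. Adjust them to whatever the installed `stop-guardrails.sh` actually contains.

```python
import re

# Hypothetical excerpt of a hook script; the real layout may differ.
SAMPLE_SCRIPT = '''\
# no-stalling
if echo "$text" | grep -qiE 'let me first|before I start'; then
  blocked+="no-stalling: act instead of announcing. "
fi

# no-skipping
if echo "$text" | grep -qiE "skip(ping)? (this|that)"; then
  blocked+="no-skipping: finish every step. "
fi
'''

# A comment line holding a single lowercase token is treated as a rule name.
NAME_RE = re.compile(r"^#\s*([a-z][\w-]*)\s*$")
# The quoted argument that follows `grep -qiE`.
PATTERN_RE = re.compile(r"""grep\s+-qiE\s+(['"])(.*?)\1""")
# The quoted string appended to the `blocked` variable.
MESSAGE_RE = re.compile(r"""blocked\+=(['"])(.*?)\1""")


def parse_rules(script_text):
    """Return one {name, pattern, message} dict per grep block."""
    rules = []
    current = None
    for line in script_text.splitlines():
        stripped = line.strip()
        name_match = NAME_RE.match(stripped)
        if name_match:
            current = {"name": name_match.group(1), "pattern": None, "message": None}
            rules.append(current)
            continue
        if current is None:
            continue  # ignore everything before the first rule comment
        pattern_match = PATTERN_RE.search(stripped)
        if pattern_match and current["pattern"] is None:
            current["pattern"] = pattern_match.group(2)
        message_match = MESSAGE_RE.search(stripped)
        if message_match and current["message"] is None:
            current["message"] = message_match.group(2)
    return rules
```

Calling `parse_rules(SAMPLE_SCRIPT)` yields two entries, one per block. An empty result is a reasonable signal to stop and point the user at `/agent-guardrails:install`.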
Read the canonical rule definitions from the plugin's rules/ directory:
Source of truth: ${CLAUDE_PLUGIN_ROOT}/rules/no-*.md
Read each rule file's YAML frontmatter to extract name, pattern, and message fields. The six baseline categories are: no-speculative-language, no-stalling, no-preference-asking, no-false-completion, no-skipping, no-dismissing.
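Reading the three frontmatter fields could look like the sketch below. It assumes the frontmatter is a leading block delimited by `---` lines with flat `key: value` pairs; the real rule files may use richer YAML, in which case a proper YAML parser is the safer choice. `parse_frontmatter`, `load_rules`, and `SAMPLE_RULE` are hypothetical names and data for illustration.

```python
from pathlib import Path

# Hypothetical rule file content; the real files may carry more fields.
SAMPLE_RULE = """\
---
name: no-stalling
pattern: "let me first|before I start"
message: Act instead of announcing.
---
Body text of the rule.
"""


def parse_frontmatter(markdown_text):
    """Extract flat key: value pairs from a leading '---' delimited block."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Split on the first colon only, so regex patterns may contain colons.
        key, sep, value = line.partition(":")
        if not sep or not key.strip() or key[0].isspace():
            continue  # skip blank lines and indented (nested) lines
        value = value.strip()
        # Strip one pair of matching quotes; escape sequences are not handled.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        fields[key.strip()] = value
    return fields


def load_rules(rules_dir):
    """Read name, pattern, and message from every no-*.md file in rules_dir."""
    rules = []
    for path in sorted(Path(rules_dir).glob("no-*.md")):
        fields = parse_frontmatter(path.read_text(encoding="utf-8"))
        rules.append({key: fields.get(key) for key in ("name", "pattern", "message")})
    return rules
```

Fields missing from a file come back as `None`, which makes incomplete rule definitions easy to spot before comparing them with the installed script.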
Find session logs from the period since rules were installed. Default to last 7 days; accept $ARGUMENTS for custom ranges (e.g., "update last 30 days").
find ~/.claude/projects/ -name "*.jsonl" -mtime -7 ! -path "*/subagents/*" 2>/dev/null | head -30
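Inside the analysis script, the same discovery can be done without shelling out. The sketch below is a rough Python equivalent of the `find` command above; `find_session_logs` is a hypothetical helper name. One deliberate difference: it returns the newest files first, whereas `head -30` keeps the first 30 in whatever order `find` emits them.

```python
import time
from pathlib import Path


def find_session_logs(root, days=7, limit=30):
    """Return up to `limit` *.jsonl files under root modified in the last `days` days."""
    cutoff = time.time() - days * 86400
    logs = []
    for path in Path(root).expanduser().rglob("*.jsonl"):
        # Mirror `! -path "*/subagents/*"`: skip anything under a subagents directory.
        if "subagents" in path.parts:
            continue
        if path.is_file() and path.stat().st_mtime >= cutoff:
            logs.append(path)
    # Newest first, so the limit keeps the most recent sessions.
    logs.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return logs[:limit]
```

A call such as `find_session_logs("~/.claude/projects", days=30)` would cover the "update last 30 days" case. An empty list corresponds to the "no session logs exist" situation.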
Build and run a Python analysis script at /tmp/agent-guardrails-update.py that:
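- reads the `type: "assistant"` entries from each session log

Script requirements:
- accepts a `--days N` argument

For each installed rule, classify it into one of four states (working well, too aggressive, too lenient, or no data):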
Too aggressive: detected when the pattern matches legitimate assistant text.
Action: Suggest tightening the regex. Provide the specific false-positive excerpts and a refined pattern.
Too lenient: detected when the full anti-pattern scan finds instances that the installed rule misses.
Action: Suggest expanding the regex with the missed phrases. Provide the specific missed excerpts and an updated pattern.
No data: the rule produced no matches in the scanned sessions.
Action: Note as "no matches"; effectiveness cannot be determined without matches. Suggest that the user check whether the behavior was actually prevented or whether the pattern needs testing.
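The core of the analysis script and the classification above might be sketched as follows. Several things here are assumptions rather than facts from the plugin: the JSONL schema (an object with `"type": "assistant"` and a `message.content` that is either a string or a list of `{"type": "text", "text": ...}` blocks), the precedence order in `classify` when a rule has both false positives and misses, and all function names. Deciding which matches are false positives still requires reading the excerpts; the code only takes the resulting counts.

```python
import argparse
import json
import re


def build_parser():
    """Command-line interface with the --days N argument (default 7)."""
    parser = argparse.ArgumentParser(description="Re-analyze session logs against guardrail rules")
    parser.add_argument("--days", type=int, default=7, help="look back this many days")
    return parser


def assistant_texts(jsonl_lines):
    """Yield text from entries whose "type" is "assistant" (assumed schema)."""
    for raw in jsonl_lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate truncated or corrupt lines
        if not isinstance(entry, dict) or entry.get("type") != "assistant":
            continue
        message = entry.get("message")
        content = message.get("content", "") if isinstance(message, dict) else ""
        if isinstance(content, str):
            if content.strip():
                yield content
        elif isinstance(content, list):
            for block in content:
                if isinstance(block, dict) and block.get("type") == "text":
                    text = block.get("text", "")
                    if isinstance(text, str) and text.strip():
                        yield text


def count_matches(pattern, texts):
    """Count texts matched by the rule pattern, case-insensitively like grep -i."""
    regex = re.compile(pattern, re.IGNORECASE)
    return sum(1 for text in texts if regex.search(text))


def classify(matches, false_positives, missed):
    """Map counts to one of the four states; the precedence is an assumption."""
    if false_positives > 0:
        return "Too aggressive"
    if missed > 0:
        return "Too lenient"
    if matches == 0:
        return "No data"
    return "Working well"
```

With the figures from a typical report, `classify(5, 2, 0)` gives "Too aggressive" and `classify(0, 0, 0)` gives "No data". Python's `re` and `grep -E` agree on simple alternations like the ones used here, but they are different regex dialects, so a pattern tuned in Python is worth re-testing with `grep` before it is written to the hook script.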
Compare the full anti-pattern scan results against installed rules. For each anti-pattern category:
Report uncovered gaps with:
- ... (`/agent-guardrails:install` if available)

Format results inline:
## Agent Guardrails Update Report
**Rules analyzed:** {count}
**Sessions scanned:** {count}
**Time range:** {date range}
### Rule Effectiveness
| Rule | Status | Matches | Assessment |
|------|--------|---------|------------|
| no-speculative-language | enabled | 23 | Working well |
| no-stalling | enabled | 5 | Too aggressive (2 false positives) |
| no-preference-asking | enabled | 0 | No data |
| no-false-completion | enabled | 8 | Working well |
| no-skipping | enabled | 12 | Too lenient (missed 4 instances) |
### Detailed Findings
#### no-stalling — Too Aggressive
**False positive excerpts:**
> "...let me first check if the file exists before writing to it..." (line 234, session abc123)
> This is legitimate — checking before writing is correct behavior, not stalling.
**Suggested pattern refinement:**
Remove "let me first check" from the pattern since it often occurs in legitimate pre-action verification.
**Current pattern:**
`(?i)(...|let me first check|...)`
**Proposed pattern:**
`(?i)(...|let me (first )?explain|...)`
(removed "let me first check" — kept "let me first explain" which is actual stalling)
---
#### no-skipping — Too Lenient
**Missed instances:**
> "...I won't bother with the edge cases since they're unlikely..." (not matched)
> "...that's close enough for now..." (not matched)
**Suggested pattern additions:**
Add: `I won'?t bother|close enough|good enough for now|that'?ll do`
**Proposed updated pattern:**
`(?i)(...existing...|I won'?t bother|close enough|good enough for now|that'?ll do)`
---
### Uncovered Gaps
| Category | Matches Found | Has Rule? |
|----------|--------------|-----------|
| Excessive apologizing | 7 | No |
**Suggested new rule:** Add a `# no-apologizing` grep block to `.claude/hooks/stop-guardrails.sh`
### Recommendations
1. **Update no-stalling** — tighten pattern to reduce false positives
2. **Update no-skipping** — expand pattern to catch missed phrases
3. **Install no-apologizing** — new pattern detected, not yet covered
4. **Keep no-speculative-language** — working well, no changes needed
5. **Monitor no-preference-asking** — no data yet, keep enabled
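Before any change is applied, the proposed patterns in a report like the one above can be sanity-checked against the quoted excerpts. The sketch below uses only the pattern fragments and excerpts that appear in the example report; the elided parts of the patterns (`...`) are left out rather than guessed. `check_pattern` is a hypothetical helper, and Python's `re` stands in for `grep -E`, which behaves the same for these simple alternations.

```python
import re

# Fragments taken from the example report; the elided "..." parts are omitted.
NO_SKIPPING_ADDITIONS = r"I won'?t bother|close enough|good enough for now|that'?ll do"
NO_STALLING_KEPT = r"let me (first )?explain"


def check_pattern(pattern, should_match, should_not_match):
    """Return a list of (problem, text) pairs; an empty list means the pattern behaves."""
    regex = re.compile(pattern, re.IGNORECASE)
    failures = []
    for text in should_match:
        if not regex.search(text):
            failures.append(("missed", text))
    for text in should_not_match:
        if regex.search(text):
            failures.append(("false positive", text))
    return failures


skipping_failures = check_pattern(
    NO_SKIPPING_ADDITIONS,
    should_match=[
        "I won't bother with the edge cases since they're unlikely",
        "that's close enough for now",
    ],
    should_not_match=["I handled every edge case"],
)

stalling_failures = check_pattern(
    NO_STALLING_KEPT,
    should_match=["let me first explain the approach"],
    should_not_match=["let me first check if the file exists before writing to it"],
)
```

Both lists come back empty for these inputs, which is the condition for recommending the change. Note that `won'?t` only covers the straight apostrophe; text containing a typographic apostrophe would need an extra alternative.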
After presenting the report, apply the recommended changes:
- Pattern updates: edit `.claude/hooks/stop-guardrails.sh` with the updated grep pattern using the Edit tool.
- New rules: add a new grep block to `.claude/hooks/stop-guardrails.sh` following the existing pattern.

After each edit, read the file back to verify the change was applied correctly.
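The skill itself performs these edits with the Edit tool. Purely as an illustration of the replace-then-read-back logic, a Python sketch could look like this; `replace_pattern` is a hypothetical helper, and it assumes the old pattern appears exactly once in the script.

```python
from pathlib import Path


def replace_pattern(script_path, old_pattern, new_pattern):
    """Swap one grep pattern in the hook script and verify the write by reading it back."""
    path = Path(script_path)
    original = path.read_text(encoding="utf-8")
    occurrences = original.count(old_pattern)
    if occurrences != 1:
        # Refuse ambiguous edits instead of rewriting the wrong block.
        raise ValueError(f"expected exactly one occurrence of the old pattern, found {occurrences}")
    expected = original.replace(old_pattern, new_pattern)
    path.write_text(expected, encoding="utf-8")
    # Read-back verification: the file on disk must equal what was intended.
    return path.read_text(encoding="utf-8") == expected
```

Refusing to edit when the old pattern is missing or duplicated keeps a stale report from silently modifying a script the user has since customized.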
Report what was changed:
## Changes Applied
- Updated `no-stalling` pattern in `.claude/hooks/stop-guardrails.sh` — removed false-positive trigger
- Updated `no-skipping` pattern in `.claude/hooks/stop-guardrails.sh` — added 4 new phrases
- Added `no-apologizing` rule to `.claude/hooks/stop-guardrails.sh` — new grep block
Script changes are active immediately.
If all rules are working well: Report that with match counts and congratulate the user on well-tuned rules. Suggest running update again after more sessions accumulate.
If no session logs exist: Tell the user to run some sessions first, then come back to update.
If rules were recently installed (< 2 days): Note that the sample size is small and results may not be representative. Still run the analysis but flag the limitation.
If the user has custom rules not in the curated set: Respect the customization. Base refinement suggestions on the user's version, not the curated original.