A clinical, pessimistic iteration loop for systematically destroying, rebuilding, and hardening ideas. Assumes everything is broken until proven otherwise. Use for code review (especially AI-generated), architecture review, pre-mortems, security review, incident response fixes, or any time you need to find everything wrong with an idea before shipping it. Invoke with /frank-grimes:grind or when asked to "red team", "critique", "find problems with", or "do a pre-mortem on" something.
Systematically destroys, rebuilds, and hardens ideas through clinical falsification review across correctness, reliability, and security.
`npx claudepluginhub misfitdev/claude-plugins`

This skill inherits all available tools. When active, it can use any tool Claude has access to.
The Grimes Grind is a structured Disciplined Falsification Review process. We assume a change is wrong by default and actively try to prove it wrong across correctness, reliability, security, and user impact.
The Core Assumption: Everything is crap until proven otherwise.
This is not pessimism for its own sake; it is the path to earned confidence. Acknowledge reality: you will iterate until a relentless critic can no longer find meaningful flaws. Only then do you have confidence, earned not through hope but through survival.
Absorb the idea. Do not trust it. Look for what is being hidden, glossed over, or assumed.
Input: [The idea, code, plan, design, or proposal]
Analysis:
- What is this ACTUALLY doing? (Ignore claims; look at logic)
- What unstated assumptions are baked in?
- What is conspicuously missing?
- What is the provenance? (LLM slop? First draft? Cargo-culted?)
If critical information is missing, ask at most 3 targeted questions. Otherwise, proceed with assumptions and label them as such. Do not let clarification become a stall tactic.
When the target is a project (not a single file) or contains multiple programming languages, add these analysis questions:
- Language Inventory: Which languages and runtimes are in play, and where do they interact?
- Configuration & Constants: Are shared values defined once, or repeated (and drifting) across languages and files?
- Error Handling Consistency: Do the components follow compatible error-handling conventions, or does each invent its own?
- Code Duplication Across Languages: Is the same logic reimplemented per language, with divergence risk?
- Resource Management: Are acquisition and release conventions consistent (`defer`, `with`, `try/finally`)?
- Testing Coverage: Is every language and component actually exercised by tests, or only the primary one?
Before analysis, assume the subject suffers from these core failure modes:
| Assumption | Rationale |
|---|---|
| LLM Slop | AI hallucinations, context blindness, and confident nonsense. |
| Unreliable | Happy-path only, zero error handling, silent failures. |
| Insecure | Injection points, hardcoded secrets, missing auth/authz. |
| Poorly Planned | Scope creep, missing requirements, no success criteria. |
| Non-Production Ready | No logging, no monitoring, no rollback, no tests. |
| Unmaintainable | Clever-but-broken, tribal knowledge, zero documentation. |
| Fragile | Scale of 10 users works; scale of 1000 catches fire. |
| Edge-Case Blind | Null, empty, Unicode, timezones, leap years—all broken. |
| Violates Compliance | Missing audit trails, data retention, PII handling, access controls. |
| Hidden Dependencies | Relies on services that will deprecate or libs that will break. |
Your objective is to prove these assumptions WRONG. You do not prove the idea right.
Systematically attack the subject across all categories. Do not stop at the first flaw; find the terminal ones. Prioritize by Severity × Likelihood × Blast Radius.
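The prioritization rule above can be sketched as a simple scoring function. The numeric weights and the sample issues below are illustrative assumptions, not part of the method; calibrate them for your own review.

```python
# Hypothetical triage scoring: Severity x Likelihood x Blast Radius.
# The weights are illustrative; the point is to rank terminal flaws first.

SEVERITY = {"P0": 4, "P1": 3, "P2": 2, "P3": 1}
LIKELIHOOD = {"High": 3, "Medium": 2, "Low": 1}

def triage_score(severity, likelihood, blast_radius):
    """Rank issues so the terminal flaws get ground first.

    blast_radius: rough count of affected components/users (>= 1).
    """
    return SEVERITY[severity] * LIKELIHOOD[likelihood] * max(blast_radius, 1)

# Invented sample issues, highest score first after sorting.
issues = [
    ("grime-4x2", "P2", "High", 3),
    ("grime-9aa", "P0", "Medium", 10),
    ("grime-k01", "P1", "Low", 1),
]
ranked = sorted(issues, key=lambda i: triage_score(*i[1:]), reverse=True)
```

A multiplicative score is deliberate: a flaw that is severe but essentially unreachable should not outrank a moderate flaw that fires on every request.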
**Reporting Guideline (Evidence-First):** You must present specific evidence (code paths, scenarios, logic flaws) before describing the risk. Force the user to confront the "wrongness" immediately.
Mandatory Critique Categories:
| Category | Grimey Questions |
|---|---|
| LLM Slop Check | Hallucinated APIs? Cargo-culted patterns? Confident nonsense? |
| Correctness | Does it actually do what it claims? Are invariants enforced? |
| Reliability | Graceful failure or silent crash? Retry logic? Timeouts? OOM? |
| Security | Input validation? AuthZ? Secrets? Injection? Malicious intent? |
| Error Handling | Swallowed exceptions? Inaccurate logs? Missing telemetry? |
| Edge Cases | Null/Empty/One/Many/Negative. Unicode/Emoji. SQLi/Path Traversal. |
| Scalability | 10x/100x bottlenecks? Database/Memory/Network saturation? |
| Observability | Is it a black box? Can we detect failure before the user does? |
| Maintainability | Tech debt? Cleverness over clarity? Missing documentation? |
| Testability | Are there tests? Do they test the right things? Coverage on error paths? |
| Deployment | Rollback plan? Feature flags? Blue-green? Or YOLO push to main? |
| Privacy & Data | PII handling? Retention policies? Logging sensitive data? GDPR? |
| Compliance | Audit logs? Access control? SOC 2? Domain-specific requirements? |
| Cost | Operational burden? Maintenance costs? Hidden infrastructure costs? |
| Human Factors | Misuse potential? Training requirements? UX traps? |
| Failure Modes | Blast radius? Silent corruption? Cascading failures? |
| Code Quality & Formatting | Malformed syntax? Incorrect indentation? Unused imports? Dead code branches? |
| Code Duplication | Same logic in multiple places? Configuration/constants repeated? Extraction opportunities? |
| Input Validation | Does user input get validated BEFORE use? Can it bypass validation? Injection vectors? |
| Language-Specific Patterns | Anti-patterns specific to the language? Misuse of language features? Unconventional patterns? |
| Configuration Management | Are values hard-coded that should be configurable? Are secret management practices used? |
| Resource Lifecycle | Are resources (files, connections, memory) properly acquired and released? Leak vectors? |
Output Format for Each Issue (Evidence-First):
### Issue: [Short Name]
**Grime ID:** grime-[a-z0-9]{3} (base36 lowercase, e.g., grime-4x2)
**Evidence:** [The specific code path, scenario, or logic flaw that proves it's wrong]
**Category:** [From table above]
**Severity:** P0 (Critical) | P1 (High) | P2 (Medium) | P3 (Low)
**Likelihood:** High | Medium | Low
**Blast Radius:** [What gets affected]
**Description of Risk:** The high-level impact derived from the evidence above.
Enhanced Grime ID Naming (v2.0+):
For greater specificity, use category-specific prefixes:
- `grime-fmt-[a-z0-9]{3}`: Code formatting/quality issues (syntax errors, unused imports)
- `grime-dup-[a-z0-9]{3}`: Code duplication (repeated logic, magic numbers)
- `grime-val-[a-z0-9]{3}`: Input validation gaps (injection vectors, bypass paths)
- `grime-lang-[a-z0-9]{3}`: Language-specific anti-patterns (goroutine leaks, bare excepts)
- `grime-cfg-[a-z0-9]{3}`: Configuration hardcoding (magic numbers, inconsistent values)
- `grime-res-[a-z0-9]{3}`: Resource lifecycle issues (leaks on error paths, missing cleanup)

The standard `grime-` prefix continues for traditional correctness/reliability/security categories.
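A minimal way to mint IDs in these formats might look like the sketch below. The prefix vocabulary comes from the list above; the generator itself is an illustrative assumption, not a required implementation.

```python
# Sketch: mint grime-xxx or grime-<prefix>-xxx IDs with a 3-char
# base36 (0-9a-z) suffix, matching the patterns documented above.
import random
import string

BASE36 = string.digits + string.ascii_lowercase  # the 36 chars 0-9a-z
PREFIXES = {"fmt", "dup", "val", "lang", "cfg", "res"}

def new_grime_id(category=None, rng=None):
    """Return grime-xxx, or grime-<category>-xxx for a known prefix."""
    rng = rng or random.Random()
    suffix = "".join(rng.choice(BASE36) for _ in range(3))
    if category is None:
        return "grime-" + suffix
    if category not in PREFIXES:
        raise ValueError("unknown category prefix: " + category)
    return "grime-" + category + "-" + suffix
```

Passing an explicit `random.Random(seed)` makes IDs reproducible across report regenerations, which keeps the risk register stable between grind rounds.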
### Code Quality & Formatting (grime-fmt)
Questions to force evidence:
- Is any syntax malformed? Is indentation incorrect or misleading?
- Are there unused imports or dead code branches?
What triggers this category: malformed syntax, broken or misleading indentation, unused imports, unreachable code.
Evidence format: Show the exact offending code and why it's syntactically/structurally wrong.
### Code Duplication (grime-dup)
Questions to force evidence:
- Does the same logic live in multiple places? Are configuration values or constants repeated?
- Where are the extraction opportunities?
What triggers this category: repeated logic, duplicated constants or configuration, copy-paste across files or languages.
Evidence format: Show the duplicate locations with line numbers. What would break if one is updated but not the other?
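A hypothetical instance of this evidence format: the same retry limit lives in two functions, and the extraction fix gives it a single source of truth. All names below are invented for the example.

```python
# grime-dup evidence: the retry limit is defined twice; editing one
# copy but not the other silently diverges behavior.

def fetch_orders_dup():
    max_retries = 3      # duplicate #1
    return "retrying up to %d times" % max_retries

def fetch_invoices_dup():
    max_retries = 3      # duplicate #2 -- drifts if only one is edited
    return "retrying up to %d times" % max_retries

# Extraction fix: one named constant, one source of truth.
MAX_RETRIES = 3

def fetch_orders():
    return "retrying up to %d times" % MAX_RETRIES

def fetch_invoices():
    return "retrying up to %d times" % MAX_RETRIES
```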
### Input Validation (grime-val)
Questions to force evidence:
- Does user input get validated BEFORE use? Can the validation be bypassed?
- What are the injection vectors?
What triggers this category: external input reaching queries, paths, shells, or templates without validation.
Evidence format: Show the input source, where validation should occur, and the code path that uses it WITHOUT validation.
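As a hypothetical illustration of this evidence format, the pair below shows a SQL query built from raw user input next to its parameterized fix. The table and function names are invented for the example.

```python
# grime-val evidence pair: user input reaches the query text directly
# in the vulnerable path; the fixed path binds it as a parameter.
import sqlite3

def find_user_vulnerable(conn, username):
    # Input is interpolated into SQL text -> injection vector.
    return conn.execute(
        f"SELECT id FROM users WHERE name = '{username}'"
    ).fetchall()

def find_user_fixed(conn, username):
    # Parameter binding keeps input as data, never as SQL text.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")

payload = "x' OR '1'='1"                       # classic injection payload
leaked = find_user_vulnerable(conn, payload)   # matches every row
safe = find_user_fixed(conn, payload)          # matches nothing
```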
### Language-Specific Patterns (grime-lang)
Questions to force evidence:
- Are there anti-patterns specific to the language? Misused language features? Unconventional patterns?
Cross-language specifics:
Go-specific red flags:
- Goroutine leaks: goroutines started with no cancellation or exit path
- Error return values ignored or discarded
- `defer` missing after a successful resource acquire
Python-specific red flags:
- Bare `except:` clauses (which also swallow `KeyboardInterrupt` and `SystemExit`) or overly broad `except Exception:`
- Mutable default arguments: `def func(arg=[]):`
- Cleanup delegated to `__del__` instead of a context manager
- `with` statement not used for file handles
Evidence format: Show the specific language anti-pattern and explain why it's dangerous in that language.
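The mutable-default trap flagged above can be demonstrated in a few lines: the default list is created once, at function definition time, and silently shared across every call.

```python
# The def func(arg=[]) trap, and the idiomatic sentinel fix.

def append_broken(item, bucket=[]):      # red flag: mutable default
    bucket.append(item)
    return bucket

def append_fixed(item, bucket=None):     # fix: sentinel + fresh list per call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = append_broken("a")
second = append_broken("b")        # same shared list: ["a", "b"]

fixed_first = append_fixed("a")    # ["a"]
fixed_second = append_fixed("b")   # fresh list each call: ["b"]
```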
### Configuration Management (grime-cfg)
Questions to force evidence:
- Are values hard-coded that should be configurable? Are secrets managed properly, or embedded in code?
What triggers this category: magic numbers, hard-coded URLs/paths/credentials, the same value defined in several places.
Evidence format: Show the hard-coded value, what it controls, and where it's used. What breaks if this needs to change?
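A hypothetical fix for such a finding: the hard-coded value moves behind a configuration lookup with a documented default. The variable name `GRIND_HTTP_TIMEOUT_S` is invented for the example.

```python
# grime-cfg fix sketch: a hard-coded timeout becomes an
# environment-driven setting with a validated default.
import os

def http_timeout_seconds(environ=None):
    """Read the request timeout from config, falling back to a default."""
    if environ is None:
        environ = os.environ
    raw = environ.get("GRIND_HTTP_TIMEOUT_S", "30")
    try:
        value = float(raw)
    except ValueError:
        raise ValueError("GRIND_HTTP_TIMEOUT_S must be numeric, got %r" % raw)
    if value <= 0:
        raise ValueError("GRIND_HTTP_TIMEOUT_S must be positive")
    return value
```

Note that the lookup validates as well as reads: a misconfigured value fails loudly at startup instead of surfacing later as a mysterious timeout.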
### Resource Lifecycle (grime-res)
Questions to force evidence:
- Are resources (files, connections, memory) properly acquired and released? Where are the leak vectors?
What triggers this category: resources acquired without a guaranteed release on error paths.
Evidence format: Show the resource acquisition, the normal release path, and the error path. Where is it NOT released?
Cross-language specifics:
- Go specifics: release with `defer` immediately after a successful acquire; early returns and panics are the usual leak paths.
- Python specifics: use `with` or `try/finally`; `__del__` is not a reliable release point.
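As a hypothetical Python instance of this evidence format: the leaky loader below releases its file handle only on the happy path, while the fixed version releases on every exit path. Function names and the config shape are invented for the example.

```python
# grime-res evidence pair: acquisition, happy-path release, and the
# error path where release never happens.
import json
import tempfile

def load_config_leaky(path):
    f = open(path)
    data = json.load(f)   # raises on bad JSON -> f is never closed
    f.close()             # release happens only on the happy path
    return data

def load_config_fixed(path):
    with open(path) as f: # context manager releases on EVERY exit path
        return json.load(f)

# Demonstration fixture: a small temp config file.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tf:
    tf.write('{"retries": 3}')
    cfg_path = tf.name

good = load_config_fixed(cfg_path)
```

The grind evidence is the gap between `open(path)` and `f.close()`: any exception raised in between leaks the handle, which is exactly what the `with` form eliminates.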
For each issue, propose a fix. If a fix is impossible, document the accepted risk.
### Fix for [Issue Name] ([Grime ID])
**Proposed Change:** Specific technical action.
**Verification:** How to prove this fix actually survives the next grind.
**Residual Risk:** What is still not perfect? (There is always something).
**Regression Scope:** What must be re-checked after this change?
Take the updated version and grind again, focusing strictly on the regression scope of the fixes. Note any new risks introduced by the "fixes."
Stop and mark GREEN when: a full grind pass surfaces no new P0 or P1 issues, and all remaining P2/P3 items are documented with owners.
Mark YELLOW when: no P0 issues remain, but open P1/P2 items ship only with explicit owner sign-off or documented mitigations.
Mark RED when: any P0 issue remains unresolved, or each round of fixes keeps introducing new P0/P1 issues.
## Grimes Grind Report: [Subject]
### Verdict: 🟢 GREEN | 🟡 YELLOW | 🔴 RED
**BLUF (Bottom Line Up Front):**
[One concise summary of the findings and the resulting level of confidence.]
**Top 3 Risks (Evidence-First):**
1. **[Evidence]:** Results in [Risk] (ID: grime-xxx)
2. **[Evidence]:** Results in [Risk] (ID: grime-xxx)
3. **[Evidence]:** Results in [Risk] (ID: grime-xxx)
---
### Origin Assessment
- [ ] Human-written
- [ ] AI-generated
- [ ] Cargo-culted/Unknown
### Risk Register
| ID | Grime ID | Category | Evidence | Risk Statement | Sev | Evidence Status |
|----|----------|----------|----------|----------------|-----|-----------------|
| 1 | grime-xxx| | | | | |
### Survived Scrutiny (Earned Confidence)
For claims that appear sound after active falsification attempts:
| Claim | Supporting Evidence | What Would Falsify It |
|-------|--------------------|-----------------------|
| | | |
### Grimey's Final Word
[One clinical, direct sentence summarizing the truth about this thing.]
| Severity | Definition | Action |
|---|---|---|
| P0 (Critical) | Data loss, breach, system down, or logic failure. | Must fix. No exceptions. |
| P1 (High) | Significant risk or degraded functionality. | Fix or get explicit owner sign-off. |
| P2 (Medium) | Increases risk or friction; not an immediate explosion. | Mitigate or document. |
| P3 (Low) | Technical debt; nice-to-fix. | Note for backlog. |