From impeccable
Evaluates UI/UX designs by assessing visual hierarchy, information architecture, cognitive load, and emotional resonance, with scoring, persona testing, anti-pattern detection, and actionable feedback. Use for design reviews and critiques.
npx claudepluginhub pbakaus/impeccable --plugin impeccable
This skill is limited to using the following tools:
Invoke /impeccable, which contains design principles, anti-patterns, and the **Context Gathering Protocol**. Follow the protocol before proceeding. If no design context exists yet, you MUST run /impeccable teach first. Additionally gather: what the interface is trying to accomplish.
Launch two independent assessments. To avoid bias, neither assessment may see the other's output.
You SHOULD delegate each assessment to a separate sub-agent for independence. Use your environment's agent spawning mechanism (e.g., Claude Code's Agent tool, or Codex's subagent spawning). Sub-agents should return their findings as structured text. Do NOT output findings to the user yet.
If sub-agents are not available in the current environment, complete each assessment sequentially, writing findings to internal notes before proceeding.
Tab isolation: When browser automation is available, each assessment MUST create its own new tab. Never reuse an existing tab, even if one is already open at the correct URL. This prevents the two assessments from interfering with each other's page state.
Read the relevant source files (HTML, CSS, JS/TS) and, if browser automation is available, visually inspect the live page. Create a new tab for this; do not reuse existing tabs. After navigation, label the tab by setting the document title:
document.title = '[LLM] ' + document.title;
Think like a design director. Evaluate:
AI Slop Detection (CRITICAL): Does this look like every other AI-generated interface? Review against ALL DON'T guidelines in the impeccable skill. Check for AI color palette, gradient text, dark glows, glassmorphism, hero metric layouts, identical card grids, generic fonts, and all other tells. The test: If someone said "AI made this," would you believe them immediately?
Holistic Design Review: visual hierarchy (eye flow, primary action clarity), information architecture (structure, grouping, cognitive load), emotional resonance (does it match brand and audience?), discoverability (are interactive elements obvious?), composition (balance, whitespace, rhythm), typography (hierarchy, readability, font choices), color (purposeful use, cohesion, accessibility), states & edge cases (empty, loading, error, success), microcopy (clarity, tone, helpfulness).
Cognitive Load (consult cognitive-load):
Emotional Journey:
Nielsen's Heuristics (consult heuristics-scoring): Score each of the 10 heuristics 0-4. This scoring will be presented in the report.
Return structured findings covering: AI slop verdict, heuristic scores, cognitive load assessment, what's working (2-3 items), priority issues (3-5 with what/why/fix), minor observations, and provocative questions.
Run the bundled deterministic detector, which flags 25 specific patterns (AI slop tells + general design quality).
CLI scan:
npx impeccable --json [--fast] [target]
[target]: anything with markup. Do not pass CSS-only files.
--fast: regex-only, skips jsdom.
Browser visualization (when browser automation tools are available AND the target is a viewable page):
The overlay is a visual aid for the user. It highlights issues directly in their browser. Do NOT scroll through the page to screenshot overlays. Instead, read the console output to get the results programmatically.
npx impeccable live &
Note the port printed to stdout (auto-assigned). Use --port=PORT to pin it to a fixed value.
Label the tab via javascript_tool so the user can distinguish it:
document.title = '[Human] ' + document.title;
Inject the detector script via javascript_tool (replace PORT with the port from step 1):
const s = document.createElement('script'); s.src = 'http://localhost:PORT/detect.js'; document.head.appendChild(s);
Read the results with read_console_messages using the pattern impeccable. The detector logs all findings with the [impeccable] prefix. Do NOT scroll through the page to take screenshots of the overlays.
When finished, stop the live server:
npx impeccable live stop
For multi-view targets, inject on 3-5 representative pages. If injection fails, continue with CLI results only.
Return: CLI findings (JSON), browser console findings (if applicable), and any false positives noted.
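The two detector steps above can be sketched as small helpers, written in JavaScript to match the snippets in this skill. This is a minimal sketch, not part of impeccable: `buildScanArgs` and `filterDetectorLogs` are hypothetical names. The only facts taken from the text are the `--json` and `--fast` flags, the rule against CSS-only targets, and the `[impeccable]` log prefix. That console messages arrive as plain strings is an assumption.

```javascript
// Hypothetical helper: assemble the argument list for
// `npx impeccable --json [--fast] [target]`.
function buildScanArgs(target, { fast = false } = {}) {
  // The skill says not to pass CSS-only files, so reject an obvious .css target.
  if (/\.css$/i.test(target)) {
    throw new Error(`CSS-only target rejected: ${target}`);
  }
  const args = ["impeccable", "--json"];
  if (fast) args.push("--fast"); // regex-only mode, skips jsdom
  args.push(target);
  return args;
}

// Hypothetical helper: keep only console lines that carry the detector's prefix.
// Assumption: each console message is available as a plain string. `includes`
// is used rather than `startsWith` in case the tool prepends its own metadata.
function filterDetectorLogs(messages) {
  return messages.filter((line) => line.includes("[impeccable]"));
}
```

For example, `buildScanArgs("src/pages", { fast: true })` yields `["impeccable", "--json", "--fast", "src/pages"]`, which can be handed to `npx` through whatever process-spawning mechanism the environment provides.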
Synthesize both assessments into a single report. Do NOT simply concatenate. Weave the findings together, noting where the LLM review and detector agree, where the detector caught issues the LLM missed, and where detector findings are false positives.
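One way to picture the synthesis is as a set comparison between the two assessments. The sketch below is illustrative only: the `synthesize` function and the finding shape (`id`, `falsePositive`) are hypothetical, and it assumes findings from both assessments have already been normalized to a shared `id`. In practice, deciding that two findings describe the same issue takes judgment.

```javascript
// Hypothetical finding shape: { id: string, falsePositive?: boolean }
// Assumption: both assessments use the same `id` for the same underlying issue.
function synthesize(llmFindings, detectorFindings) {
  const llmIds = new Set(llmFindings.map((f) => f.id));
  // Detector findings judged to be real issues (not false positives).
  const detectorReal = detectorFindings.filter((f) => !f.falsePositive);
  const detectorIds = new Set(detectorReal.map((f) => f.id));

  return {
    // Both the LLM review and the detector flagged these.
    agreed: detectorReal.filter((f) => llmIds.has(f.id)).map((f) => f.id),
    // The detector caught these; the LLM review missed them.
    detectorOnly: detectorReal.filter((f) => !llmIds.has(f.id)).map((f) => f.id),
    // Only the LLM review raised these (often holistic, non-pattern issues).
    llmOnly: llmFindings.filter((f) => !detectorIds.has(f.id)).map((f) => f.id),
    // Detector hits dismissed after review.
    falsePositives: detectorFindings.filter((f) => f.falsePositive).map((f) => f.id),
  };
}
```

The four buckets map onto what the report should call out: agreement, detector-only catches, LLM-only observations, and false positives.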
Structure your feedback as a design director would:
Consult heuristics-scoring
Present the Nielsen's 10 heuristics scores as a table:
| # | Heuristic | Score | Key Issue |
|---|---|---|---|
| 1 | Visibility of System Status | ? | [specific finding or "n/a" if solid] |
| 2 | Match System / Real World | ? | |
| 3 | User Control and Freedom | ? | |
| 4 | Consistency and Standards | ? | |
| 5 | Error Prevention | ? | |
| 6 | Recognition Rather Than Recall | ? | |
| 7 | Flexibility and Efficiency | ? | |
| 8 | Aesthetic and Minimalist Design | ? | |
| 9 | Error Recovery | ? | |
| 10 | Help and Documentation | ? | |
| Total | | ??/40 | [Rating band] |
Be honest with scores. A 4 means genuinely excellent. Most real interfaces score 20-32.
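The total row is simply the sum of the ten 0-4 scores. A minimal sketch of that arithmetic follows; `totalHeuristicScore` is a hypothetical helper, and it assumes scores are whole numbers. The rating band is deliberately left out, since its thresholds live in the heuristics-scoring reference rather than in this document.

```javascript
// Hypothetical helper: validate ten heuristic scores and compute the total out of 40.
// Assumption: each score is an integer from 0 to 4.
function totalHeuristicScore(scores) {
  if (scores.length !== 10) {
    throw new Error(`Expected 10 heuristic scores, got ${scores.length}`);
  }
  for (const score of scores) {
    if (!Number.isInteger(score) || score < 0 || score > 4) {
      throw new Error(`Score out of range (0-4): ${score}`);
    }
  }
  const total = scores.reduce((sum, score) => sum + score, 0);
  return { total, max: 40, label: `${total}/40` };
}

// Example: scores for heuristics 1 through 10, in table order.
const example = totalHeuristicScore([3, 2, 3, 3, 2, 3, 2, 3, 2, 3]);
console.log(example.label); // prints "26/40"
```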
Start here. Does this look AI-generated?
LLM assessment: Your own evaluation of AI slop tells. Cover overall aesthetic feel, layout sameness, generic composition, missed opportunities for personality.
Deterministic scan: Summarize what the automated detector found, with counts and file locations. Note any additional issues the detector caught that you missed, and flag any false positives.
Visual overlays (if browser was used): Tell the user that overlays are now visible in the [Human] tab in their browser, highlighting the detected issues. Summarize what the console output reported.
A brief gut reaction: what works, what doesn't, and the single biggest opportunity.
Highlight 2-3 things done well. Be specific about why they work.
The 3-5 most impactful design problems, ordered by importance.
For each issue, tag with P0-P3 severity (consult heuristics-scoring for severity definitions):
Consult personas
Auto-select 2-3 personas most relevant to this interface type (use the selection table in the reference). If CLAUDE.md contains a ## Design Context section from impeccable teach, also generate 1-2 project-specific personas from the audience/brand info.
For each selected persona, walk through the primary user action and list specific red flags found:
Alex (Power User): No keyboard shortcuts detected. Form requires 8 clicks for primary action. Forced modal onboarding. High abandonment risk.
Jordan (First-Timer): Icon-only nav in sidebar. Technical jargon in error messages ("404 Not Found"). No visible help. Will abandon at step 2.
Be specific. Name the exact elements and interactions that fail each persona. Don't write generic persona descriptions; write what broke for them.
Quick notes on smaller issues worth addressing.
Provocative questions that might unlock better solutions:
Remember:
After presenting findings, use targeted questions based on what was actually found. STOP and call the AskUserQuestion tool to clarify. These answers will shape the action plan.
Ask questions along these lines (adapt to the specific findings; do NOT ask generic questions):
Priority direction: Based on the issues found, ask which category matters most to the user right now. For example: "I found problems with visual hierarchy, color usage, and information overload. Which area should we tackle first?" Offer the top 2-3 issue categories as options.
Design intent: If the critique found a tonal mismatch, ask whether it was intentional. For example: "The interface feels clinical and corporate. Is that the intended tone, or should it feel warmer/bolder/more playful?" Offer 2-3 tonal directions as options based on what would fix the issues found.
Scope: Ask how much the user wants to take on. For example: "I found N issues. Want to address everything, or focus on the top 3?" Offer scope options like "Top 3 only", "All issues", "Critical issues only".
Constraints (optional; only ask if relevant): If the findings touch many areas, ask if anything is off-limits. For example: "Should any sections stay as-is?" This prevents the plan from touching things the user considers done.
Rules for questions:
After receiving the user's answers, present a prioritized action summary reflecting the user's priorities and scope from Step 4.
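Applying the user's scope answer to the tagged issues might look like the sketch below. `applyScope` and the issue shape are hypothetical. The scope labels come from the example options above, and treating "Critical issues only" as P0 is an assumption, since severity definitions live in the heuristics-scoring reference.

```javascript
// Severity tags in priority order, most severe first.
const SEVERITY_ORDER = ["P0", "P1", "P2", "P3"];

// Hypothetical helper: order issues by severity, then narrow to the chosen scope.
// Issue shape (assumed): { title: string, severity: "P0" | "P1" | "P2" | "P3" }
function applyScope(issues, scope) {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...issues].sort(
    (a, b) => SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity)
  );
  switch (scope) {
    case "Top 3 only":
      return sorted.slice(0, 3);
    case "Critical issues only":
      // Assumption: "critical" corresponds to the P0 tag.
      return sorted.filter((issue) => issue.severity === "P0");
    case "All issues":
      return sorted;
    default:
      throw new Error(`Unknown scope: ${scope}`);
  }
}
```

The resulting list can then drive the order of the recommended commands.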
List recommended commands in priority order, based on the user's answers:
/command-name: Brief description of what to fix (specific context from critique findings)
/command-name: Brief description (specific context)
...
Rules for recommendations:
End with /polish as the final step if any fixes were recommended
After presenting the summary, tell the user:
You can ask me to run these one at a time, all at once, or in any order you prefer.
Re-run /critique after fixes to see your score improve.