From dev-tools
Audits web app UX by dogfooding as user persona: tracks emotional friction, click efficiency, resilience to back/refresh, return intent. Uses Playwright/Chrome MCP for live sites, outputs ranked reports.
Install: `npx claudepluginhub jezweb/claude-skills --plugin dev-tools`

This skill uses the workspace's default tool permissions.
Dogfood web apps by browsing them as a real user would — with their goals, their patience, and their context. Goes beyond "does it work?" to "is it good?" by tracking emotional friction (trust, anxiety, confusion), counting click efficiency, testing resilience, and asking the ultimate question: "would I come back?" Uses Chrome MCP (for authenticated apps with your session) or Playwright for browser automation. Produces structured audit reports with findings ranked by impact.
Before starting any mode, detect available browser tools:
- Chrome MCP (`mcp__claude-in-chrome__*`) — preferred for authenticated apps. Uses the user's logged-in Chrome session, so OAuth/cookies just work.
- Playwright (`mcp__plugin_playwright_playwright__*`) — for public apps or parallel sessions.

If neither is available, inform the user and suggest installing Chrome MCP or Playwright.
See references/browser-tools.md for tool-specific commands.
If the user didn't provide a URL, find one automatically. Prefer the deployed/live version — that's what real users see.
1. Check `wrangler.jsonc` for custom domains or routes:

   ```sh
   grep -E '"pattern"|"custom_domain"' wrangler.jsonc 2>/dev/null
   ```

   If found, use the production URL (e.g. `https://app.example.com`).
2. Check for a deployed URL in `CLAUDE.md`, the README, or the `package.json` `homepage` field.
3. Fall back to a local dev server — check whether one is already running:

   ```sh
   lsof -i :5173 -i :3000 -i :8787 -t 2>/dev/null
   ```

   If running, use `http://localhost:{port}`.
4. Ask the user as a last resort.
Why live over local: The live site has real data, real auth, real network latency, real CDN behaviour, and real CORS/CSP policies. Testing locally misses deployment-specific issues (missing env vars, broken asset paths, CORS errors, slow API responses). The UX audit should test what the user actually experiences.
When local is better: The user explicitly says "test localhost", or the feature isn't deployed yet.
Control how thorough the audit is. Pass as an argument — `/ux-audit quick`, `/ux-audit thorough` — or default to standard.
| Depth | Duration | Autonomy | What it covers |
|---|---|---|---|
| quick | 5-10 min | Interactive | One user flow, happy path only. Spot check after a change. |
| standard | 20-40 min | Semi-autonomous | Full walkthrough + QA sweep of main pages. Default. |
| thorough | 1-3 hours | Fully autonomous | Multiple personas, all pages, all modes combined. Overnight mode. |
| exhaustive | 4-8+ hours | Fully autonomous | Every interactive element on every page. Every button clicked, every dialog opened, every form filled, every state triggered. Leave nothing untested. |
The exhaustive mode goes beyond thorough. Thorough tests workflows and pages. Exhaustive tests every single interactive element in the application.
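A dispatcher for the depth argument might look like the sketch below — `parseDepth` and the settings shape are illustrative names, not part of the skill:

```javascript
// Map a /ux-audit argument to audit settings; unknown or missing input
// falls back to "standard", matching the table above.
const DEPTHS = {
  quick:      { autonomy: 'interactive' },
  standard:   { autonomy: 'semi-autonomous' },
  thorough:   { autonomy: 'fully autonomous' },
  exhaustive: { autonomy: 'fully autonomous' },
};

function parseDepth(arg) {
  const key = String(arg || 'standard').trim().toLowerCase();
  const depth = DEPTHS[key] ? key : 'standard';
  return { depth, ...DEPTHS[depth] };
}
```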
For each page discovered:
Progress tracking: This mode generates a LOT of findings. Write findings to the report incrementally — don't hold everything in memory. Update docs/ux-audit-exhaustive-YYYY-MM-DD.md after each page is complete.
Element inventory format (per page):
```
/clients — 47 interactive elements
[x] "Add Client" button — opens modal ✓, form submits ✓, validation ✓
[x] Search input — filters correctly ✓, clear button works ✓, empty search ✓
[x] Sort dropdown — all 4 options work ✓, persists on navigation ✗ (BUG)
[x] Client row click — navigates to detail ✓
[x] Star button — toggles ✓, persists on refresh ✓
[ ] Pagination — next ✓, prev ✓, page numbers ✓, items per page ✗ (not tested - no data)
...
```
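For the report's coverage counts, the inventory could be tallied as in this sketch — it assumes the `[x]`/`[ ]` line format above, and `parseInventory` is a made-up helper:

```javascript
// Count elements, tested elements, and ✗-flagged lines in an inventory block.
// Note: ✗ can also appear on untested lines, so "flagged" is not exactly "bugs".
function parseInventory(text) {
  const items = text.split('\n').map(l => l.trim()).filter(l => /^\[[x ]\]/.test(l));
  return {
    elements: items.length,
    tested: items.filter(l => l.startsWith('[x]')).length,
    flagged: items.filter(l => l.includes('✗')).length,
  };
}
```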
The thorough mode is designed to run unattended. Kick it off at end of day, review the report in the morning. The user should NOT need to find issues themselves — this mode catches everything.
Mindset: Don't run through a checklist. Think about the real person who will use this app every day. What are the threads of their workday? How will they move through the system? Will they understand what they're looking at? Will the app teach them how to use it through its design, or will they be guessing? Read references/workflow-comprehension.md before starting.
Outputs:

- `.jez/screenshots/ux-audit/` (numbered chronologically)
- `docs/ux-audit-thorough-YYYY-MM-DD.md` with issue counts by severity

On each page, inject JavaScript via the browser tool to programmatically detect layout issues:
```javascript
// Detect elements overflowing their parent (horizontal overflow, 1px tolerance)
document.querySelectorAll('*').forEach(el => {
  const r = el.getBoundingClientRect();
  const p = el.parentElement?.getBoundingClientRect();
  if (p && (r.left < p.left - 1 || r.right > p.right + 1)) {
    console.warn('OVERFLOW:', el.tagName, el.className, 'extends beyond parent');
  }
});

// Detect text clipped by containers
document.querySelectorAll('h1,h2,h3,h4,p,span,a,button,label').forEach(el => {
  if (el.scrollWidth > el.clientWidth + 2 || el.scrollHeight > el.clientHeight + 2) {
    console.warn('CLIPPED:', el.tagName, el.textContent?.slice(0, 50));
  }
});

// Detect rendered elements positioned entirely off-screen to the left
document.querySelectorAll('*').forEach(el => {
  const r = el.getBoundingClientRect();
  if (r.width > 0 && r.height > 0 && r.right < 0) {
    console.warn('OFF-SCREEN LEFT:', el.tagName, el.className);
  }
});

// Detect invisible text (rough check — only catches explicit colour matches and zero opacity)
document.querySelectorAll('h1,h2,h3,p,span,a,li,td,th,label,button').forEach(el => {
  const s = getComputedStyle(el);
  if (s.color === s.backgroundColor || s.opacity === '0') {
    console.warn('INVISIBLE TEXT:', el.tagName, el.textContent?.slice(0, 30));
  }
});
```
Read console output after injection. Each warning is a potential finding to screenshot and investigate.
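One way to turn that console output into findings is to bucket messages by the warning prefixes the snippets emit — a sketch, assuming the console reader returns each warning as a single string:

```javascript
// Group injected layout warnings by type so each bucket becomes a set of
// findings to screenshot and investigate; unrelated app logs are ignored.
function groupLayoutWarnings(messages) {
  const prefixes = ['OVERFLOW:', 'CLIPPED:', 'OFF-SCREEN LEFT:', 'INVISIBLE TEXT:'];
  const findings = {};
  for (const msg of messages) {
    const p = prefixes.find(pre => msg.startsWith(pre));
    if (p) (findings[p] = findings[p] || []).push(msg);
  }
  return findings;
}
```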
For each page, resize the viewport through standard breakpoints and screenshot:
| Width | What it represents | Check for |
|---|---|---|
| 1280px | Desktop (standard) | Baseline layout, sidebar + content |
| 1024px | Small desktop / tablet landscape | Nav collapse point, grid reflow |
| 768px | Tablet portrait | Sidebar behaviour, stacked layout |
| 375px | Mobile | Everything stacked, touch targets, no horizontal scroll |
If the layout changes between breakpoints (sidebar collapses, grid reduces columns), screenshot the transition point too.
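The breakpoint matrix can drive a simple Playwright loop — a sketch using standard Playwright calls (`setViewportSize`, `screenshot`); the fixed 900px height and the file-naming scheme are assumptions:

```javascript
// Resize through the standard breakpoints and capture a full-page
// screenshot of the given route at each width.
const BREAKPOINTS = [
  { width: 1280, label: 'desktop' },
  { width: 1024, label: 'small-desktop' },
  { width: 768, label: 'tablet' },
  { width: 375, label: 'mobile' },
];

async function screenshotBreakpoints(page, route, outDir) {
  for (const { width, label } of BREAKPOINTS) {
    await page.setViewportSize({ width, height: 900 }); // height is arbitrary here
    const name = `${route.replace(/\//g, '_')}-${width}-${label}.png`;
    await page.screenshot({ path: `${outDir}/${name}`, fullPage: true });
  }
}
```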
On each page, read the browser console for:
Critical: Visual browsing misses API failures that the UI hides. Data-fetching libraries (TanStack Query, SWR) swallow HTTP errors and show empty/loading states instead of error messages. A component showing "No results found" might actually be getting a 403 — but the UI looks normal.
Monitor network responses throughout the entire audit session. If using Playwright, attach a response listener before browsing starts:
```javascript
// Playwright: collect failed fetch/XHR responses for the whole session.
// Attach once, before navigation begins.
const networkErrors = [];
page.on('response', res => {
  const type = res.request().resourceType();
  if ((type === 'fetch' || type === 'xhr') && res.status() >= 400) {
    networkErrors.push({ method: res.request().method(), url: res.url(), status: res.status(), page: page.url() });
  }
});
```
If using Chrome MCP, use read_network_requests to check for failed API calls after each page visit.
What to collect: URL, HTTP status, method, the page you were on when it happened.
Severity mapping:
| Status | Severity | What it usually means |
|---|---|---|
| 500+ | Critical | Server error — something is broken |
| 403 | High | Permission error OR route collision (static route shadowed by /:param) |
| 404 | Medium | Missing endpoint — may be a renamed/removed API route |
| 401 | Low | Expected for unauthenticated probes, but flag if it happens on authenticated pages |
| CORS error | High | API endpoint missing CORS headers — feature broken in production |
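The table above, expressed as a small mapping (hypothetical `severityFor` helper):

```javascript
// Map an HTTP status (or CORS failure) to a finding severity,
// following the severity table above.
function severityFor(status, corsError = false) {
  if (corsError) return 'High';
  if (status >= 500) return 'Critical';
  if (status === 403) return 'High';
  if (status === 404) return 'Medium';
  if (status === 401) return 'Low';
  return null; // 2xx/3xx and other codes: not a finding on their own
}
```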
What this catches that visual browsing misses:
- Silent API failures hidden behind normal-looking empty states
- Route collisions (e.g. `GET /api/boards/users` shadowed by `GET /api/boards/:boardId`)

Report format: Group by status code in a "Network Errors" section:
```markdown
## Network Errors (detected during browsing)

### 403 Forbidden (2 endpoints)
- `GET /api/boards/users` on /app/boards/123 — likely route collision with /:boardId
- `POST /api/settings/theme` on /app/settings — permission check failing

### 500 Internal Server Error (1 endpoint)
- `GET /api/reports/summary` on /app/dashboard — server error
```
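A sketch of how collected errors could be rendered into that section — the `{method, url, status, page, note}` shape follows "what to collect" above; `networkErrorSection` and the status-name map are illustrative:

```javascript
// Render collected network errors as a markdown section,
// grouped by status code, worst (highest) status first.
const STATUS_NAMES = { 500: 'Internal Server Error', 404: 'Not Found', 403: 'Forbidden', 401: 'Unauthorized' };

function networkErrorSection(errors) {
  const byStatus = new Map();
  for (const e of errors) {
    if (!byStatus.has(e.status)) byStatus.set(e.status, []);
    byStatus.get(e.status).push(e);
  }
  const lines = ['## Network Errors (detected during browsing)'];
  for (const [status, group] of [...byStatus].sort((a, b) => b[0] - a[0])) {
    const name = STATUS_NAMES[status] ? ` ${STATUS_NAMES[status]}` : '';
    lines.push(`### ${status}${name} (${group.length} endpoint${group.length === 1 ? '' : 's'})`);
    for (const e of group) lines.push(`- \`${e.method} ${e.url}\` on ${e.page}${e.note ? ` — ${e.note}` : ''}`);
  }
  return lines.join('\n');
}
```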
| Action | quick | standard | thorough |
|---|---|---|---|
| Navigate pages | Just do it | Just do it | Just do it |
| Take screenshots | Key moments | Friction points | Every page + every issue |
| Fill forms with test data | Ask first | Ask first | Just do it (obviously fake data) |
| Click delete/destructive | Ask first | Ask first | Ask first (only exception) |
| Submit forms | Ask first | Brief confirmation | Just do it (test data only) |
| Write report file | Just do it | Brief confirmation | Just do it |
When: "ux walkthrough", "walk through the app", "is the app intuitive?", "ux audit", "dogfood this"
This is the highest-value mode. You are dogfooding the app — using it as a real user would, with their goals, their constraints, and their patience level. Not a mechanical checklist pass, but genuinely trying to get a job done.
Ask the user two questions:
If the user doesn't specify a persona, adopt a reasonable default: a non-technical person who is time-poor, mildly distracted, and using this app for the first time today.
Navigate to the app's entry point. From here, attempt the task with no prior knowledge of the UI. Adopt the persona's mindset:
At each screen, evaluate against the walkthrough checklist (see references/walkthrough-checklist.md). Key questions to hold in mind:
- Layout: Is all text fully visible? Nothing clipped or overlapping? Spacing consistent?
- Comprehension: Do I understand what this page is for and what I can do here? Do the labels make sense to a non-developer?
- Wayfinding: Do I know where I am in the app? Can I get back to where I came from? Does the nav show my location?
- Flow: Does this page connect naturally to the last one? Is the next step obvious without thinking?
- Trust: Do I feel confident this will do what I expect? Am I afraid I'll break something?
- Efficiency: How many clicks/steps is this taking? Is there a shorter path?
- Recovery: If I make a mistake right now, can I get back?
Track the effort required to complete the task:
After completing the main task, test what happens when things go wrong:
After completing (or failing) the task, reflect as the persona:
Take screenshots at friction points. Compile findings into a UX audit report.
Write report to docs/ux-audit-YYYY-MM-DD.md using the template from references/report-template.md
Severity levels:
When: "qa test", "test all pages", "check everything works", "qa sweep"
Systematic mechanical testing of all pages and features.
Discover pages: Read the app's router config, sitemap, or manually navigate the sidebar/menu to find all routes
Create a task list of areas to test (group by feature area)
For each page/feature:
Cross-cutting concerns:
Produce a QA sweep summary table:
| Page | Status | Issues |
|---|---|---|
| /patients | Pass | — |
| /patients/new | Fail | Form validation missing on email |
Write report to docs/qa-sweep-YYYY-MM-DD.md
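A minimal sketch for generating that summary table from per-page results (`qaSummaryTable` and the result shape are assumed, not skill-provided):

```javascript
// Build the QA sweep summary table: Pass when a page has no issues,
// Fail with the issue list otherwise.
function qaSummaryTable(results) {
  const rows = results.map(r =>
    `| ${r.page} | ${r.issues.length ? 'Fail' : 'Pass'} | ${r.issues.join('; ') || '—'} |`);
  return ['| Page | Status | Issues |', '|---|---|---|', ...rows].join('\n');
}
```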
When: "check [feature]", "test [page]", "verify [component] works"
Focused testing of a specific area.
| Scenario | Mode | Depth |
|---|---|---|
| Just changed a page, quick sanity check | Targeted Check | quick |
| After building a feature, before showing users | UX Walkthrough | standard |
| Before a release, verify nothing is broken | QA Sweep | standard |
| Periodic UX health check | UX Walkthrough | standard |
| Client demo prep | QA Sweep + UX Walkthrough | standard |
| End-of-day comprehensive test, review in morning | All modes combined | thorough |
| Pre-launch full audit with evidence | All modes combined | thorough |
| Test literally everything before a client demo | Every element on every page | exhaustive |
| Weekend-long complete app certification | Every element, state, viewport, mode | exhaustive |
Skip this skill for: API-only services, CLI tools, unit/integration tests (use test frameworks), performance testing.
Default rules (standard depth). See "Autonomy by Depth" table above for quick/thorough overrides.
| When | Read |
|---|---|
| Before starting thorough mode — understand the user's world | references/workflow-comprehension.md |
| Evaluating each screen during walkthrough | references/walkthrough-checklist.md |
| Running the six scenario tests | references/scenario-tests.md |
| Writing the audit report | references/report-template.md |
| Browser tool commands and selection | references/browser-tools.md |