Performs design audits with Nielsen's 10 heuristics, consistency inventories, accessibility checks, competitive analysis, and design debt quantification for existing products before redesigns.
`npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude`

This skill uses the workspace's default tool permissions.
> Evaluating existing design — heuristic evaluation (Nielsen's 10), consistency inventory, accessibility audit, competitive analysis, identifying and quantifying design debt
Evaluates UX/UI of websites, apps, and digital interfaces using Jakob Nielsen's 10 usability heuristics. Identifies issues in visibility of system status, system-real-world match, user control and freedom, consistency, error prevention, recognition over recall, flexibility, aesthetics, error recovery, and documentation.
Audits UI screenshots, Figma designs, or live sites for visual quality, design consistency, brand alignment, design system compliance, and pre-launch issues such as WCAG contrast failures.
Aggregates design audit findings across usability heuristics, WCAG accessibility, visual consistency, and component reusability; deduplicates issues, prioritizes by severity, and generates executive summaries with remediation roadmaps.
Define the audit scope and success criteria before starting. An audit without boundaries becomes an infinite task. Specify which screens and flows are in scope, which platforms and viewports you will test, and what deliverable the audit must produce.
Conduct a heuristic evaluation using Nielsen's 10 usability heuristics. Walk through every screen in scope and evaluate against each heuristic systematically. Do not rely on memory — open the actual product and interact with it. For each heuristic, note the violating screen, the severity, and supporting evidence; the finding card format below captures these fields.
Build a consistency inventory. A consistency inventory catalogs every visual and interaction pattern in the product (buttons, form controls, modals, typography, color usage, spacing values) and counts the variants of each.
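As a minimal sketch, the inventory can be kept as structured data so variant counts fall out automatically; the `InventoryEntry` shape and the example entries below are illustrative, not a required schema.

```typescript
// Sketch: a consistency inventory as tallied data.
// The InventoryEntry shape and example entries are illustrative.
interface InventoryEntry {
  pattern: string;     // e.g. "button", "modal", "date-picker"
  variant: string;     // what distinguishes this variant
  locations: string[]; // screens where this variant appears
}

// Each entry is one distinct variant; more than one variant per
// pattern is a consistency finding.
function tallyVariants(entries: InventoryEntry[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const { pattern } of entries) {
    counts.set(pattern, (counts.get(pattern) ?? 0) + 1);
  }
  return counts;
}

const inventory: InventoryEntry[] = [
  { pattern: 'button', variant: 'primary, 4px radius', locations: ['Settings'] },
  { pattern: 'button', variant: 'primary, 8px radius', locations: ['Billing'] },
  { pattern: 'modal',  variant: 'no focus trap',       locations: ['Checkout'] },
];

console.log(tallyVariants(inventory)); // Map { 'button' => 2, 'modal' => 1 }
```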
Measure design debt quantitatively. Design debt is the accumulation of inconsistent, outdated, or suboptimal design decisions. Quantify it by counting instances in each of the five debt categories described below: duplicate pattern variants, stale patterns, off-scale spacing values, orphaned components, and missing states.
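One countable signal is spacing drift. Here is a sketch that scans a stylesheet for pixel values off an assumed 8px grid; the file path and grid size are assumptions, and a real audit would typically lean on design-token tooling.

```typescript
// Sketch: count spacing values that fall off an assumed 8px grid.
// The file path is illustrative; a real audit would walk the whole
// stylesheet tree or use design-token tooling.
import { readFileSync } from 'node:fs';

function offGridValues(css: string, grid = 8): string[] {
  // Match pixel values in declarations like "margin: 13px" or "padding: 7px 9px".
  const pxValues = css.match(/\b\d+px\b/g) ?? [];
  return pxValues.filter((v) => parseInt(v, 10) % grid !== 0);
}

const css = readFileSync('styles/settings.css', 'utf8');
const drift = offGridValues(css);
console.log(`${drift.length} off-grid spacing values:`, drift);
```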
Perform a competitive audit for design patterns. Select 3-5 direct competitors and 2-3 best-in-class products outside your category. For each, document how it handles the patterns in your audit scope and where your product diverges.
Audit for accessibility against WCAG 2.1 AA. Use both automated tools and manual testing. Automated tools (axe, Lighthouse, WAVE) catch approximately 30-40% of accessibility issues — the remaining 60-70% require manual evaluation: keyboard-only navigation, screen reader walkthroughs, focus order, and the accuracy of alternative text.
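A sketch of the automated pass using axe-core driven through Playwright; the URL is a placeholder, and the tag filter restricts axe to WCAG 2.0/2.1 A and AA rules.

```typescript
// Sketch: automated WCAG 2.1 AA pass with axe-core via Playwright.
// The URL is a placeholder; remember this catches only the
// automatable 30-40% of issues.
import { chromium } from 'playwright';
import { AxeBuilder } from '@axe-core/playwright';

async function scan(url: string): Promise<void> {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto(url);

  // Restrict axe to WCAG 2.0/2.1 A and AA rules.
  const results = await new AxeBuilder({ page })
    .withTags(['wcag2a', 'wcag2aa', 'wcag21a', 'wcag21aa'])
    .analyze();

  for (const v of results.violations) {
    console.log(`${v.impact}: ${v.id} (${v.nodes.length} node(s))`);
  }
  await browser.close();
}

scan('https://example.com/settings').catch(console.error);
```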
Synthesize findings into a severity-prioritized report. Raw audit data is overwhelming. Synthesize into four severity tiers (critical, major, minor, cosmetic), matching the 1-4 scale used in the finding cards.
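A sketch of the synthesis step, assuming findings carry the fields from the finding card format shown later; the dedup key and severity ordering are illustrative choices.

```typescript
// Sketch: deduplicate findings and order them by severity.
// Severity uses the 1-4 scale from the finding cards:
// 4 = critical, 3 = major, 2 = minor, 1 = cosmetic.
interface Finding {
  id: string;        // e.g. "AUDIT-042"
  heuristic: string; // e.g. "H4"
  severity: 1 | 2 | 3 | 4;
  screen: string;
  summary: string;
}

function synthesize(findings: Finding[]): Finding[] {
  // Same screen + heuristic + summary is treated as a duplicate
  // (multiple evaluators often log the same issue); keep the first.
  const seen = new Set<string>();
  const unique = findings.filter((f) => {
    const key = `${f.screen}|${f.heuristic}|${f.summary}`;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
  // Highest severity first, so critical findings lead the report.
  return unique.sort((a, b) => b.severity - a.severity);
}
```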
Establish a baseline and schedule recurring audits. A single audit is a snapshot. Design debt accumulates continuously. Establish metrics from the audit, such as open findings per category and per severity, and re-measure quarterly.
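A sketch of the quarterly re-measurement, diffing open-finding counts per category between two audits; the category names and numbers are illustrative.

```typescript
// Sketch: diff open-finding counts between quarterly audits.
// Category names and counts are illustrative.
type Snapshot = Record<string, number>; // category -> open findings

function remediationDelta(previous: Snapshot, current: Snapshot): Snapshot {
  const out: Snapshot = {};
  const keys = new Set([...Object.keys(previous), ...Object.keys(current)]);
  for (const key of keys) {
    // Negative numbers mean debt was paid down since the last audit.
    out[key] = (current[key] ?? 0) - (previous[key] ?? 0);
  }
  return out;
}

console.log(remediationDelta(
  { heuristics: 87, wcag: 34, consistency: 156, missingStates: 28 },
  { heuristics: 61, wcag: 12, consistency: 140, missingStates: 9 },
));
// { heuristics: -26, wcag: -22, consistency: -16, missingStates: -19 }
```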
Each heuristic requires specific evaluation techniques:
H1: Visibility of system status. Check every state transition: loading, success, error, and empty. Does the interface tell the user what is happening at each step?
H2: Match between system and real world. Audit all labels, error messages, and instructional text. "CORS error" in a consumer product fails this heuristic. "Unable to connect — check your internet connection" passes. Check icon metaphors: does a floppy disk mean "save" to users who have never seen one?
H4: Consistency and standards. The heuristic most served by the consistency inventory: every variant the inventory surfaces, from duplicate button styles to competing modal implementations, is a candidate H4 finding.
H5: Error prevention. Audit every form and destructive action: look for confirmation or undo on destructive operations, inline validation before submission, and input constraints that make invalid entries impossible.
Design debt falls into five categories, each with different remediation strategies:
Pattern fragmentation. The same UI concept implemented differently across the product. Example: three different loading patterns (spinner, skeleton, progress bar) used interchangeably with no governing logic. Remediation: define a loading pattern decision tree and migrate all instances.
Stale patterns. UI that followed best practices when built but has not been updated. Example: a modal without focus trapping — acceptable in 2018, a WCAG violation since 2.1. Remediation: identify all stale patterns, prioritize by impact, update incrementally.
Inconsistent spacing and sizing. Values that drift from the system scale. Example: the design system defines an 8px grid, but screens use 7px, 9px, 10px, 13px because developers eyeballed values. Remediation: find-and-replace with design token audit tooling.
Orphaned components. One-off UI elements that could use a standard component. Example: a custom date range picker on analytics when the design system has a standard one. Remediation: replace with the standard component, accounting for edge cases the custom version handled.
Missing states. Components that only handle the happy path. Example: a data table with no empty state, no error state, no loading state, no pagination for large datasets. Remediation: define the full state matrix for each component and implement missing states.
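For the missing-states category, the state matrix can be made explicit in code so gaps surface at the type level. A sketch using a discriminated union; the component and state names are illustrative.

```typescript
// Sketch: an explicit state matrix for a data table component.
// A discriminated union forces every consumer to handle all states;
// the Row type and state names are illustrative.
interface Row { id: string; label: string }

type TableState =
  | { kind: 'loading' }
  | { kind: 'empty' }
  | { kind: 'error'; message: string; retry: () => void }
  | { kind: 'loaded'; rows: Row[]; page: number; pageCount: number };

function statusLine(state: TableState): string {
  // The switch is exhaustive: adding a new state without handling
  // it here becomes a compile error.
  switch (state.kind) {
    case 'loading': return 'Fetching rows…';
    case 'empty':   return 'No data yet. Import a dataset to begin.';
    case 'error':   return `Could not load rows: ${state.message}`;
    case 'loaded':  return `Page ${state.page} of ${state.pageCount}`;
  }
}
```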
Finding card format:
```
ID: AUDIT-042
Heuristic: H4 (Consistency and standards)
Severity: Major (3/4)
Screen: Settings > Notifications
Finding: The "Save" button in notification preferences is left-aligned,
  while every other settings tab places "Save" right-aligned.
Evidence: [screenshot with annotation]
Impact: Users with muscle memory for right-aligned save will miss this button.
Recommendation: Move to right-aligned position consistent with other tabs.
Effort: Low (CSS change, 1 file)
```
Summary dashboard metrics:
| Category | Count | Critical | Major | Minor |
|---|---|---|---|---|
| Heuristic violations | 87 | 4 | 23 | 60 |
| WCAG failures | 34 | 8 | 19 | 7 |
| Pattern inconsistencies | 156 | 0 | 42 | 114 |
| Missing states | 28 | 6 | 15 | 7 |
The Screenshot Safari. Collecting hundreds of screenshots with no analytical framework. The auditor spends two weeks capturing every screen, produces a 200-slide deck, and nobody reads it because there is no synthesis, no prioritization, and no recommendation. The fix: evaluate against a framework (Nielsen's heuristics, WCAG criteria) and produce a prioritized findings report. Screenshots are evidence, not the deliverable.
The Perfection Trap. Treating every inconsistency as equally urgent. An audit that flags "the border-radius is 7px instead of 8px" with the same severity as "users cannot recover from a failed payment without clearing their browser cache" has lost all sense of proportion. The fix: use a severity scale ruthlessly. Only critical findings block releases. Minor findings go to the backlog.
The Audit-and-Forget. Running a comprehensive audit, producing an excellent report, and never acting on it. Six months later the same issues persist plus new ones. The fix: tie audit findings to sprint planning. Each sprint pulls the top 3 findings from the audit backlog. Track remediation velocity. Re-audit quarterly.
Solo Evaluator Bias. A single person conducts the entire heuristic evaluation. Research shows one evaluator finds only 35% of usability problems. Three to five independent evaluators find 75-85%. The fix: use multiple evaluators who audit independently before comparing findings. The disagreements are often the most valuable discussion points.
Shopify's Polaris Audit (2017). Before building Polaris, Shopify audited their entire admin interface. They discovered: 47 unique button styles, 12 different modal implementations, inconsistent spacing ranging from 4px to 37px with no system, and color usage that had drifted so far that no source of truth existed. The audit quantified the cost: engineers spent an estimated 30% of UI development time making decisions a design system would eliminate. This data justified the multi-year Polaris investment. The methodology — screenshot every component variant, categorize, count, calculate remediation cost — became a template Atlassian, Salesforce, and GitLab later adopted.
UK Government Digital Service (GDS) Accessibility Audit. GDS audited all gov.uk services against WCAG 2.1 AA in 2019. They tested 900+ forms across 150 services. Findings: 62% had form labels not programmatically associated with inputs, 41% had insufficient color contrast on at least one critical element, 23% had keyboard traps in modal dialogs. Results were published publicly, creating accountability. Each service team received a prioritized remediation list. Within 12 months, critical violations dropped 78%.
Atlassian Design System Consolidation. When Atlassian unified Jira, Confluence, Trello, and Bitbucket under a single design system, they first audited all four products. The audit revealed: 13 date picker variants, 8 avatar implementations, 5 dropdown menus, and color palettes that overlapped but were not identical. Each variant was categorized as "keep," "merge," or "deprecate." The merged components retained the best accessibility characteristics from any variant. The audit took 6 weeks; the consolidation roadmap spanned 18 months.