From harness-claude
Implements visual regression testing with screenshot comparison, diff detection, and baseline management for UI components and pages to catch CSS regressions and layout shifts.
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude
This skill uses the workspace's default tool permissions.
> Screenshot comparison, visual diff detection, and baseline management. Catches unintended CSS regressions, layout shifts, and rendering inconsistencies before they reach production.
Scan for existing visual test infrastructure. Search for:
- .storybook/, *.stories.tsx, *.stories.ts
- percy.yml, Playwright screenshot tests
- screenshots/, __image_snapshots__/, visual-tests/
Catalog testable components. Identify UI surfaces that benefit from visual testing:
Determine the rendering strategy. Choose how screenshots are captured:
Identify viewport and theme matrix. Define the combinations to test:
Report findings. Summarize: components to cover, rendering strategy, viewport matrix, and estimated baseline count.
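The estimated baseline count follows directly from the matrix: one screenshot per component, viewport, and theme combination. The sketch below illustrates that arithmetic. The names `Viewport`, `Theme`, and `buildMatrix`, the light/dark theme pair, and the snapshot naming scheme are all hypothetical and not part of the skill itself.

```typescript
// Illustrative only: Viewport, Theme, MatrixEntry and buildMatrix are hypothetical names.
interface Viewport {
  name: string;
  width: number;
  height: number;
}

// Assumption: the project has exactly a light and a dark theme.
type Theme = "light" | "dark";

interface MatrixEntry {
  component: string;
  viewport: Viewport;
  theme: Theme;
  snapshotName: string;
}

// Build the full cross product of components x viewports x themes.
function buildMatrix(
  components: string[],
  viewports: Viewport[],
  themes: Theme[],
): MatrixEntry[] {
  const entries: MatrixEntry[] = [];
  for (const component of components) {
    for (const viewport of viewports) {
      for (const theme of themes) {
        entries.push({
          component,
          viewport,
          theme,
          snapshotName: `${component}-${viewport.name}-${theme}.png`,
        });
      }
    }
  }
  return entries;
}

const matrixViewports: Viewport[] = [
  { name: "mobile", width: 375, height: 812 },
  { name: "desktop", width: 1280, height: 720 },
];

const matrix = buildMatrix(["dashboard", "settings"], matrixViewports, ["light", "dark"]);

// 2 components x 2 viewports x 2 themes
console.log(`Estimated baseline count: ${matrix.length}`);
```

Because the count grows multiplicatively, adding one viewport or theme adds a screenshot for every component, which is worth weighing before widening the matrix.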
Configure the visual testing tool. Set up:
.gitkeep or add to .gitignore as appropriate
Stabilize rendering for deterministic screenshots. Address common sources of non-determinism:
Capture baseline screenshots. For each component in the test matrix:
Review baselines manually. Before committing, visually inspect every baseline screenshot. Confirm:
Commit baselines. Add baseline screenshots to version control with a descriptive commit message. These baselines become the source of truth for future comparisons.
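For a Playwright-based setup, the configuration and stabilization steps of this phase might be gathered into a single config object, roughly as sketched here. The directory layout, `baseURL`, and the specific values are assumptions. In a real `playwright.config.ts` the object would be the module's default export, optionally wrapped in `defineConfig` from `@playwright/test`.

```typescript
// Sketch of a Playwright configuration object for visual tests.
// All concrete values (paths, URL, locale) are assumptions for illustration.
const visualConfig = {
  testDir: "./visual-tests",

  // Where baseline screenshots are stored, relative to the test directory.
  snapshotPathTemplate: "{testDir}/__screenshots__/{testFilePath}/{arg}{ext}",

  expect: {
    toHaveScreenshot: {
      // Same tolerance as the per-test option used in the examples.
      maxDiffPixelRatio: 0.005,
      // Set explicitly so CSS animations and the text caret never
      // introduce frame-to-frame differences.
      animations: "disabled",
      caret: "hide",
    },
  },

  use: {
    baseURL: "http://localhost:3000", // assumed dev server address
    // Pin environment-dependent rendering inputs.
    locale: "en-US",
    timezoneId: "UTC",
    colorScheme: "light",
    deviceScaleFactor: 1,
  },
};

console.log(JSON.stringify(visualConfig, null, 2));
```

Centralizing these settings keeps individual tests free of repeated stabilization boilerplate, although per-test overrides remain possible.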
Execute visual comparison. Run the visual test suite, which:
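Conceptually, the comparison reduces to counting differing pixels and checking the ratio against a threshold such as `maxDiffPixelRatio`. The following is a deliberately simplified sketch of that idea, assuming same-size raw RGBA buffers. Real tools use more sophisticated, perceptually aware comparators, and `compareRgba` is a hypothetical helper, not an API of any of the tools named here.

```typescript
interface DiffResult {
  diffPixels: number;
  totalPixels: number;
  ratio: number;
  exceedsThreshold: boolean;
}

// Hypothetical helper: naive per-pixel comparison of two RGBA buffers.
// A pixel counts as different if any channel differs by more than channelTolerance.
function compareRgba(
  baseline: Uint8Array,
  current: Uint8Array,
  maxDiffPixelRatio: number,
  channelTolerance = 0,
): DiffResult {
  if (baseline.length !== current.length || baseline.length % 4 !== 0) {
    throw new Error("Buffers must be RGBA data of identical size");
  }
  const totalPixels = baseline.length / 4;
  let diffPixels = 0;
  for (let i = 0; i < baseline.length; i += 4) {
    for (let c = 0; c < 4; c++) {
      if (Math.abs(baseline[i + c] - current[i + c]) > channelTolerance) {
        diffPixels++;
        break; // count each pixel at most once
      }
    }
  }
  const ratio = totalPixels === 0 ? 0 : diffPixels / totalPixels;
  return { diffPixels, totalPixels, ratio, exceedsThreshold: ratio > maxDiffPixelRatio };
}

// Usage: a 2x2 image in which one pixel changed from white to red.
const white = [255, 255, 255, 255];
const red = [255, 0, 0, 255];
const baselineImg = Uint8Array.from([...white, ...white, ...white, ...white]);
const currentImg = Uint8Array.from([...white, ...red, ...white, ...white]);

const result = compareRgba(baselineImg, currentImg, 0.005);
console.log(result);
```

Here one of four pixels differs, so the ratio is 0.25 and the 0.005 threshold is exceeded.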
Classify each diff. For every screenshot that exceeds the threshold:
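Once a reviewer has investigated a diff, the verdict determines the follow-up. The sketch below records a verdict and maps it to a next step. The three categories (regression, intentional change, environmental noise) are inferred from the rest of this document, and the type and function names are hypothetical. It does not replace the investigation itself.

```typescript
// Hypothetical types: the category names are inferred from the surrounding text.
type DiffClass = "regression" | "intentional" | "noise";

interface ClassifiedDiff {
  snapshot: string;
  ratio: number; // fraction of pixels that differ, e.g. 0.003 for 0.3%
  classification: DiffClass; // assigned by a human after investigation
}

// Map a reviewer's verdict to the follow-up described in the workflow.
function nextAction(diff: ClassifiedDiff): string {
  switch (diff.classification) {
    case "regression":
      return `Investigate and fix the code change behind ${diff.snapshot}`;
    case "intentional":
      return `Review the new screenshot, then update the baseline for ${diff.snapshot}`;
    case "noise":
      return `Fix the source of non-determinism affecting ${diff.snapshot}`;
  }
}

const example: ClassifiedDiff = {
  snapshot: "dashboard-mobile.png",
  ratio: 0.003,
  classification: "noise",
};
console.log(nextAction(example));
```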
Investigate regressions. For each regression:
Update baselines for intentional changes. When a visual change is confirmed intentional:
Generate a diff report. Produce a summary showing:
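A diff report could be as simple as a plain-text summary built from the classified results. The fields shown here (totals, per-category counts, and per-screenshot diff percentages) are an assumption about what such a report might contain, and `summarize` is a hypothetical helper.

```typescript
// Hypothetical report shape; field names and categories are illustrative.
type Verdict = "regression" | "intentional" | "noise";

interface DiffEntry {
  snapshot: string;
  ratio: number; // fraction of pixels that differ
  verdict: Verdict;
}

// Produce a plain-text summary of a visual comparison run.
function summarize(totalCompared: number, diffs: DiffEntry[]): string {
  const counts: Record<Verdict, number> = { regression: 0, intentional: 0, noise: 0 };
  for (const d of diffs) {
    counts[d.verdict]++;
  }
  const lines = [
    `Screenshots compared: ${totalCompared}`,
    `Passed: ${totalCompared - diffs.length}`,
    `Regressions: ${counts.regression}`,
    `Intentional changes: ${counts.intentional}`,
    `Environmental noise: ${counts.noise}`,
  ];
  for (const d of diffs) {
    lines.push(`- ${d.snapshot}: ${(d.ratio * 100).toFixed(2)}% of pixels differ (${d.verdict})`);
  }
  return lines.join("\n");
}

const report = summarize(10, [
  { snapshot: "dashboard-mobile.png", ratio: 0.25, verdict: "regression" },
  { snapshot: "settings-desktop.png", ratio: 0.5, verdict: "intentional" },
]);
console.log(report);
```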
Create a visual diff summary for PR review. Include:
Integrate with CI pipeline. Configure the visual test suite to:
Define the approval workflow. Establish:
Run harness validate. Confirm the project passes all harness checks with visual testing infrastructure in place.
Document the visual testing workflow. Record:
If a knowledge graph exists at .harness/graph/, refresh it after code changes to keep graph queries accurate:
harness scan [path]
- harness validate -- Run in REPORT phase after visual testing infrastructure is complete. Confirms project health.
- harness check-deps -- Run after BASELINE phase to verify visual testing dependencies are in devDependencies.
- emit_interaction -- Used to present visual diff results and request human approval for baseline updates.
- harness validate passes with visual testing infrastructure in place
BASELINE -- Capture component screenshots:
// visual-tests/components.spec.ts
import { test, expect } from '@playwright/test';

const viewports = [
  { name: 'mobile', width: 375, height: 812 },
  { name: 'desktop', width: 1280, height: 720 },
];

for (const viewport of viewports) {
  test.describe(`${viewport.name} viewport`, () => {
    test.use({ viewport: { width: viewport.width, height: viewport.height } });

    test('dashboard renders correctly', async ({ page }) => {
      await page.goto('/dashboard');
      await page.waitForLoadState('networkidle');
      // Disable animations for deterministic screenshots
      await page.addStyleTag({
        content:
          '*, *::before, *::after { animation: none !important; transition: none !important; }',
      });
      await expect(page).toHaveScreenshot(`dashboard-${viewport.name}.png`, {
        maxDiffPixelRatio: 0.005,
        fullPage: true,
      });
    });

    test('settings page renders correctly', async ({ page }) => {
      await page.goto('/settings');
      await page.waitForLoadState('networkidle');
      // Disable animations for deterministic screenshots
      await page.addStyleTag({
        content:
          '*, *::before, *::after { animation: none !important; transition: none !important; }',
      });
      await expect(page).toHaveScreenshot(`settings-${viewport.name}.png`, {
        maxDiffPixelRatio: 0.005,
      });
    });
  });
}
DETECT output:
Storybook: v7.6 detected (.storybook/main.ts)
Stories: 47 stories across 23 components
Chromatic: not configured
Existing baselines: none
Components without stories: Modal, Toast, DatePicker (3 gaps)
BASELINE -- Configure Chromatic and run first build:
// package.json (relevant scripts)
{
  "scripts": {
    "chromatic": "chromatic --project-token=${CHROMATIC_PROJECT_TOKEN}",
    "chromatic:ci": "chromatic --project-token=${CHROMATIC_PROJECT_TOKEN} --exit-zero-on-changes --auto-accept-changes main"
  }
}
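// .storybook/preview.tsx -- stabilize rendering (the decorator uses JSX, so the file needs the .tsx extension)
import React from 'react';
import type { Preview } from '@storybook/react';

const preview: Preview = {
  parameters: {
    chromatic: {
      // Capture the final frame of any animation
      pauseAnimationAtEnd: true,
      // Viewport widths (px) at which Chromatic captures each story
      viewports: [375, 768, 1280],
    },
  },
  decorators: [
    // Pin the font stack so text renders the same across environments
    (Story) => (
      <div style={{ fontFamily: 'Arial, sans-serif' }}>
        <Story />
      </div>
    ),
  ],
};

export default preview;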
| Rationalization | Reality |
|---|---|
| "The baseline screenshots are committed locally but not pushed — CI can capture its own baselines on the first run." | CI baselines captured without human review become the source of truth for a rendering state nobody verified. Every baseline must be manually inspected before committing. A CI-generated baseline for a broken layout will pass every future comparison until someone notices the visual bug manually. |
| "The diff is only 0.3% of pixels — it's probably just font rendering noise, not a real regression." | "Probably" is not a classification. Investigate the diff to determine if it is environmental noise or a real change before dismissing it. If it is noise, fix the source (fonts, animations) and lower the threshold. If it is a real change, update the baseline with intent. Skipping investigation means regressions hide behind noise tolerance. |
| "We have 300 components — it's not practical to have baselines for all of them." | Start with shared design system components (buttons, inputs, modals) and page-level layouts. Partial coverage is better than none, and it is easier to add coverage incrementally than to audit an entire codebase for visual regressions after the fact. Prioritize high-visibility, high-change-frequency surfaces first. |
| "The visual test suite takes 20 minutes — let's just run it manually before releases instead of in CI." | Manual pre-release checks are skipped under deadline pressure. Visual regression is most valuable on every PR, where the author is still in context and the fix is immediate. A 20-minute suite is a signal to optimize (affected-story detection, parallelization), not to remove CI integration. |
| "The component changed intentionally — I'll just auto-accept the diff without reviewing the new screenshot." | Auto-accepting without review permanently resets the baseline to whatever was rendered, correct or not. Every baseline update requires a human to look at the new screenshot and confirm it matches design intent. The review is not a formality — it is the entire point of the approval workflow. |