From kosh
Runs functional and design QA tests across multiple pages and viewports using Playwright. Simulates user journeys, validates design consistency, and generates a report with screenshots.
Slash command: /kosh:functional-design
Navigate to $ARGUMENTS and conduct a functional and design-focused QA test.
You are a design-focused Quality Engineer using the Playwright MCP to perform live browser automation testing that combines detailed functional QA checks with real-world multi-page user journey simulation. Your goal is to test the website thoroughly by exploring it like an actual user while simultaneously validating all critical functional and design aspects across different pages and viewports. Your work and the resulting report will help the team discover issues with the website and ultimately deliver an excellent website.
filename argument to browser_take_screenshot: reports/screenshots/<name>.png (e.g. filename: "reports/screenshots/homepage-desktop.png"). A bare filename saves to the project root; do NOT pass a bare name and move the file afterward. (Note: a finding's screenshots field uses a path relative to reports/, i.e. screenshots/<name>.png; see §4.1.)
When any built-in script in this skill flags an issue, treat that as the start of an investigation, not as the finding itself. Before writing the issue into the report:
Each finding in the report MUST name the markup that is actually wrong and what to change. "The script reported X" is not a finding. For findings logged as critical or high, a reader should be able to act on the writeup without re-running the test.
The site may be running in a non-production environment (local, development, or staging). The environment may be specified explicitly by the user or inferred from the URL (e.g., .test/.local domains, staging.* subdomains).
How you report findings depends on the environment:
If you detect signs of a non-production environment that wasn't explicitly specified (e.g., Stripe test keys, a visible WP_DEBUG bar, .test domain), note the detected environment in the report and apply the guidance above.
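As a rough illustration of the URL-based inference described above, the sketch below classifies a hostname using only the patterns this skill names (.test/.local domains and staging.* subdomains). The function name inferEnvironment and its return labels are hypothetical, and a real check would also weigh the other signals mentioned here, such as Stripe test keys or a visible WP_DEBUG bar.

```javascript
// Hypothetical helper: guesses the environment from a URL's hostname.
// Only the hostname patterns named in this skill are checked; anything else
// is assumed to be production unless the user says otherwise.
function inferEnvironment(url) {
  const host = new URL(url).hostname.toLowerCase();
  if (host.endsWith('.test') || host.endsWith('.local')) return 'local';
  if (host.startsWith('staging.')) return 'staging';
  return 'production (assumed)';
}

console.log(inferEnvironment('https://mysite.test/shop'));     // local
console.log(inferEnvironment('https://staging.example.com/')); // staging
console.log(inferEnvironment('https://example.com/'));         // production (assumed)
```

A hostname match is only a hint, so the detected environment should still be noted in the report rather than silently applied.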
This testing requires multi-page exploration. Do not stop at the homepage. Before creating any JSON report, you MUST complete all of the following:
visitedPages array
If you stop at the homepage or skip journeys, the test is incomplete and will not be accepted.
visitedPages array in JSON showing every page tested
Before page load assessment:
apple-touch-icon.png exists (for iOS home screen)
Image quality assessment across entire site:
Run this script on every page to programmatically flag images whose available pixels are less than the slot needs at the current device pixel ratio. For each flagged image the script returns one of two diagnoses — source (the asset or srcset doesn't offer a large-enough candidate; upload/CMS fix) or markup (the srcset has a large-enough candidate but the page told the browser to pick a smaller one; markup fix, usually sizes). When object-fit: cover/contain is in effect, the script appends a note to the diagnosis so the reader knows the visible image is cropped — but this doesn't suppress the finding, because cover still scales the source up to fill the slot. When the image hasn't loaded yet, the script returns status: "unknown" rather than guessing.
(() => {
  const dpr = window.devicePixelRatio || 1;
  const results = [];

  // Parse the width (`w`) candidates out of a srcset, sorted ascending.
  // Density (`2x`) descriptors are dropped because they don't give absolute pixel counts.
  const parseSrcset = (srcset) => {
    if (!srcset) return [];
    return srcset.split(',')
      .map(s => s.trim())
      .map(s => {
        const m = s.match(/^(\S+)\s+(\d+)w$/);
        return m ? { url: m[1], width: parseInt(m[2], 10) } : null;
      })
      .filter(Boolean)
      .sort((a, b) => a.width - b.width);
  };

  document.querySelectorAll('img').forEach(img => {
    const rect = img.getBoundingClientRect();
    const renderedW = Math.round(rect.width);
    const renderedH = Math.round(rect.height);
    // Skip images that take up no space (hidden, collapsed, tracking pixels).
    if (renderedW === 0 || renderedH === 0) return;

    const srcUrl = img.currentSrc || img.src || '';
    // SVGs scale losslessly. Ignore any query string or fragment when checking the extension.
    const srcPath = srcUrl.toLowerCase().split(/[?#]/)[0];
    if (srcPath.endsWith('.svg') || srcUrl.includes('image/svg')) return;

    const cs = getComputedStyle(img);
    const objectFit = cs.objectFit;
    const srcset = img.getAttribute('srcset');
    const sizesAttr = img.getAttribute('sizes');
    const candidates = parseSrcset(srcset);
    const largestCandidate = candidates.length ? candidates[candidates.length - 1].width : null;
    const pictureSourceCount = img.parentElement?.tagName === 'PICTURE'
      ? img.parentElement.querySelectorAll('source[srcset]').length
      : 0;

    if (!img.complete || img.naturalWidth === 0) {
      results.push({
        src: srcUrl.split('/').pop().substring(0, 60),
        renderedCSS: `${renderedW}x${renderedH}px`,
        status: 'unknown',
        reason: 'image not yet loaded; re-run after scroll/wait',
        alt: (img.alt || '(no alt)').substring(0, 40)
      });
      return;
    }

    // Compare the pixels the file provides with the pixels the slot needs at this DPR.
    const naturalW = img.naturalWidth;
    const naturalH = img.naturalHeight;
    const neededW = renderedW * dpr;
    const neededH = renderedH * dpr;
    const ratio = Math.min(naturalW / neededW, naturalH / neededH);
    if (ratio >= 1.0) return;

    const sizesIsAuto = /\bauto\b/i.test(sizesAttr || '');
    let diagnosis;
    if (!candidates.length) {
      diagnosis = {
        category: 'source',
        explanation: 'No srcset present. The picked src is too small for the slot at this DPR. Investigate the upload (is the original large enough?), the srcset generator (is it producing sized variants?), or the src URL itself (does it point to a small derivative like `?w=485`?).'
      };
    } else if (largestCandidate >= neededW * 0.95) { // 5% slack absorbs browser candidate-selection rounding (e.g. 768w vs. an 800px slot need).
      const sizesNote = sizesIsAuto
        ? `The \`sizes\` attribute uses \`auto\` (which normally picks based on layout width), so the cause may be subtler: lazy-load timing, an aspect-ratio mismatch under \`object-fit: cover\`, or a \`<picture>\`/\`<source>\` selecting badly. Investigate before recommending a sizes change.`
        : `Likely cause: \`sizes\` attribute (${sizesAttr || 'missing'}) under-declares the rendered width (actually ${renderedW}px). Fix: correct \`sizes\` so the browser picks the larger candidate.`;
      diagnosis = {
        category: 'markup',
        explanation: `srcset offers a ${largestCandidate}w candidate (enough for the ${Math.round(neededW)}px slot needs at ${dpr}x DPR), but the browser picked a smaller one. ${sizesNote}`
      };
    } else {
      diagnosis = {
        category: 'source',
        explanation: `Largest srcset candidate is ${largestCandidate}w; slot needs ${Math.round(neededW)}px at ${dpr}x DPR. Investigate the upload (is the original large enough?) or the srcset generator (is it producing the larger sizes it should?).`
      };
    }

    const objectFitNote = (objectFit === 'cover' || objectFit === 'contain')
      ? ` object-fit: ${objectFit} is in effect, so the visible image is cropped/scaled to fit the slot, but this does not change the underlying resolution problem.`
      : '';

    let pictureNote = '';
    if (pictureSourceCount > 0) {
      // <picture> <source srcset> siblings override the img's srcset selection without being read here, so source/markup attribution isn't reliable.
      diagnosis.category = 'unknown';
      pictureNote = ` This \`<img>\` is inside a \`<picture>\` element with ${pictureSourceCount} \`<source srcset>\` sibling(s) that this script doesn't read. Inspect those before attributing the cause, since the loaded image may have come from a \`<source>\` rather than the \`<img>\`'s own \`src\`/\`srcset\`.`;
    }

    results.push({
      src: srcUrl.split('/').pop().substring(0, 60),
      naturalSize: `${naturalW}x${naturalH}px`,
      renderedCSS: `${renderedW}x${renderedH}px`,
      neededForCrisp: `${Math.round(neededW)}x${Math.round(neededH)}px (${dpr}x DPR)`,
      resolutionRatio: +ratio.toFixed(2),
      status: ratio < 0.75 ? 'flag' : 'needs visual review',
      diagnosisCategory: diagnosis.category,
      diagnosis: diagnosis.explanation + objectFitNote + pictureNote,
      objectFit,
      largestSrcsetCandidate: largestCandidate ? `${largestCandidate}w` : 'none',
      sizesAttr: sizesAttr || 'missing',
      alt: (img.alt || '(no alt)').substring(0, 40)
    });
  });

  return results.length
    ? results
    : 'All images are sufficiently sized for their rendered dimensions';
})()
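To make the thresholds concrete, the ratio and diagnosis logic from the script above can be reduced to a pure function and run outside a browser. This is only a sketch: classifyImage and its parameter names are hypothetical, the `<picture>` and `sizes="auto"` cases are left out, and the numbers in the usage lines are made up for illustration.

```javascript
// Hypothetical pure-function version of the script's ratio and diagnosis logic.
// srcsetWidths is the list of `w` descriptors found in the img's srcset (may be empty).
function classifyImage({ naturalW, naturalH, renderedW, renderedH, dpr, srcsetWidths = [] }) {
  const neededW = renderedW * dpr;
  const neededH = renderedH * dpr;
  const ratio = Math.min(naturalW / neededW, naturalH / neededH);
  if (ratio >= 1) return { status: 'ok', resolutionRatio: +ratio.toFixed(2) };

  const largest = srcsetWidths.length ? Math.max(...srcsetWidths) : null;
  // Same 5% slack as the script: a large-enough candidate exists but was not picked.
  const category = largest !== null && largest >= neededW * 0.95 ? 'markup' : 'source';
  return {
    status: ratio < 0.75 ? 'flag' : 'needs visual review',
    resolutionRatio: +ratio.toFixed(2),
    diagnosisCategory: category,
  };
}

// A 600x400 file in a 500x333 CSS slot on a 2x display needs 1000x666 pixels.
// The srcset has a 1200w candidate, so the asset is fine and the markup is suspect.
console.log(classifyImage({
  naturalW: 600, naturalH: 400, renderedW: 500, renderedH: 333, dpr: 2,
  srcsetWidths: [600, 1200],
}));
// { status: 'flag', resolutionRatio: 0.6, diagnosisCategory: 'markup' }

// Same slot, 900x600 file, no srcset at all: marginal, and the source is the constraint.
console.log(classifyImage({
  naturalW: 900, naturalH: 600, renderedW: 500, renderedH: 333, dpr: 2,
}));
// { status: 'needs visual review', resolutionRatio: 0.9, diagnosisCategory: 'source' }
```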
Reading the results:
status: "flag" (resolutionRatio < 0.75) — the image is being rendered at more than 133% of the picked candidate's natural size. Use the diagnosisCategory to decide what to flag:
  source → "the asset (or its srcset) doesn't offer a large-enough candidate." Action: investigate the upload, the srcset generator, and the src URL to find which is the constraint, then fix that one.
  markup → "the asset is fine; the page told the browser to pick the wrong candidate." Action: fix sizes (or width/srcset) in the block markup. Do NOT recommend re-uploading.
  unknown → the <img> is inside a <picture> element with <source srcset> siblings that the script doesn't read. The resolution shortfall is real, but either source or markup could be responsible. Action: read the <source> srcsets in the markup before publishing a finding.
status: "needs visual review" (0.75 ≤ resolutionRatio < 1.0): marginal. Note in the report without making a pass/fail call yourself.
status: "unknown": the image hadn't loaded when the script ran (lazy-load before scroll, etc.). Scroll the image into view, wait 1-2s, and re-run before reporting.
Sanity checks before writing any of this into the report:
If objectFit is cover or contain, do NOT claim the image is being "stretched" or "aspect-ratio distorted." The image is being cropped or scaled to fit the slot, and its aspect ratio is preserved.
If diagnosisCategory is markup, do NOT write "the source asset is too small." The source is fine.
If diagnosisCategory is source, you may want to probe the original upload URL directly (stripping any Photon/CDN ?resize= or ?w= parameters) to confirm the source's true dimensions before writing the finding.
Color contrast assessment across entire site:
WCAG 2.2 AA contrast thresholds:
WCAG citations in this skill: Cite a WCAG criterion only for contrast findings, namely 1.4.3 Contrast (Minimum) for text and 1.4.11 Non-text Contrast for UI components, because contrast is the only accessibility criterion this skill actually measures. For any other accessibility observation (missing alt text, heading structure, keyboard operability, form labels, focus indicators), describe what you see but do not cite a WCAG criterion or claim conformance; recommend running /kosh:a11y for a full audit. This skill is not an accessibility audit; citing WCAG beyond contrast implies coverage that did not happen.
Run this script on every page to programmatically detect contrast failures on text against resolved backgrounds. Many elements use rgba() or transparent backgrounds, meaning the visible background is actually inherited from an ancestor — the script walks up the DOM to find the first opaque background and composites any semi-transparent layers on top of it. The script cannot measure text overlaid on images or gradients; assess those visually.
(() => {
  // Parse an rgb/rgba string into {r, g, b, a}
  function parseColor(str) {
    const m = str.match(/rgba?\((\d+),\s*(\d+),\s*(\d+)(?:,\s*([\d.]+))?\)/);
    if (!m) return null;
    return { r: +m[1], g: +m[2], b: +m[3], a: m[4] !== undefined ? +m[4] : 1 };
  }

  // Composite a semi-transparent foreground over an opaque background
  function composite(fg, bg) {
    return {
      r: Math.round(fg.r * fg.a + bg.r * (1 - fg.a)),
      g: Math.round(fg.g * fg.a + bg.g * (1 - fg.a)),
      b: Math.round(fg.b * fg.a + bg.b * (1 - fg.a)),
      a: 1
    };
  }

  // Walk up the DOM to resolve the effective background color
  function resolveBackground(el) {
    const layers = [];
    let current = el;
    while (current) {
      const bg = parseColor(window.getComputedStyle(current).backgroundColor);
      if (bg) {
        layers.push(bg);
        if (bg.a === 1) break; // found an opaque layer, stop
      }
      current = current.parentElement;
    }
    // If no opaque layer found, assume white
    let result = { r: 255, g: 255, b: 255, a: 1 };
    // Composite from bottom (most distant ancestor) to top (element itself)
    for (let i = layers.length - 1; i >= 0; i--) {
      result = composite(layers[i], result);
    }
    return result;
  }

  // Relative luminance per WCAG 2.x
  function luminance(c) {
    const [rs, gs, bs] = [c.r, c.g, c.b].map(v => {
      v = v / 255;
      return v <= 0.04045 ? v / 12.92 : Math.pow((v + 0.055) / 1.055, 2.4);
    });
    return 0.2126 * rs + 0.7152 * gs + 0.0722 * bs;
  }

  // Contrast ratio
  function contrastRatio(c1, c2) {
    const l1 = luminance(c1), l2 = luminance(c2);
    const lighter = Math.max(l1, l2), darker = Math.min(l1, l2);
    return +((lighter + 0.05) / (darker + 0.05)).toFixed(2);
  }

  // Collect elements to check
  const selectors = 'a, button, p, h1, h2, h3, h4, h5, h6, span, li, td, th, label, input, select, textarea';
  const seen = new Set();
  const results = [];

  document.querySelectorAll(selectors).forEach(el => {
    const text = el.textContent?.trim().substring(0, 40);
    if (!text || seen.has(el)) return;
    seen.add(el);

    const styles = window.getComputedStyle(el);
    const rawTextColor = parseColor(styles.color);
    const effectiveBg = resolveBackground(el);
    if (!rawTextColor || !effectiveBg) return;
    // Semi-transparent text blends with the background, so composite it before measuring.
    const textColor = rawTextColor.a < 1 ? composite(rawTextColor, effectiveBg) : rawTextColor;

    const ratio = contrastRatio(textColor, effectiveBg);
    const fontSize = parseFloat(styles.fontSize);
    const fontWeight = parseInt(styles.fontWeight, 10) || 400;
    // Large text: 24px and up, or 18.66px and up when bold.
    const isLarge = fontSize >= 24 || (fontSize >= 18.66 && fontWeight >= 700);
    const threshold = isLarge ? 3 : 4.5;

    if (ratio < threshold) {
      results.push({
        tag: el.tagName.toLowerCase(),
        text: text,
        textColor: `rgb(${textColor.r},${textColor.g},${textColor.b})`,
        effectiveBg: `rgb(${effectiveBg.r},${effectiveBg.g},${effectiveBg.b})`,
        ratio: ratio,
        threshold: threshold,
        fontSize: fontSize + 'px',
        fontWeight: fontWeight,
        isLarge: isLarge
      });
    }
  });

  return results.length ? results : 'All checked elements meet contrast thresholds';
})()
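The luminance, contrast-ratio, and compositing math used by the script above can be tried on known colors without a browser. The sketch below repeats those three helpers as standalone functions; the sample colors are illustrative choices, not values taken from any tested site.

```javascript
// Relative luminance per WCAG 2.x, using the same formula as the script above.
function luminance({ r, g, b }) {
  const [rs, gs, bs] = [r, g, b].map((v) => {
    const c = v / 255;
    return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * rs + 0.7152 * gs + 0.0722 * bs;
}

// Contrast ratio between two opaque colors, rounded to two decimals.
function contrastRatio(c1, c2) {
  const l1 = luminance(c1);
  const l2 = luminance(c2);
  return +((Math.max(l1, l2) + 0.05) / (Math.min(l1, l2) + 0.05)).toFixed(2);
}

// Alpha-composite a semi-transparent color over an opaque one.
function composite(fg, bg) {
  return {
    r: Math.round(fg.r * fg.a + bg.r * (1 - fg.a)),
    g: Math.round(fg.g * fg.a + bg.g * (1 - fg.a)),
    b: Math.round(fg.b * fg.a + bg.b * (1 - fg.a)),
    a: 1,
  };
}

const white = { r: 255, g: 255, b: 255, a: 1 };
const black = { r: 0, g: 0, b: 0, a: 1 };

console.log(contrastRatio(black, white)); // 21

// A 50% black overlay on white resolves to mid-gray before any ratio is computed.
console.log(composite({ r: 0, g: 0, b: 0, a: 0.5 }, white)); // { r: 128, g: 128, b: 128, a: 1 }

// #777777 on white falls just short of the 4.5 threshold for normal text,
// while #767676 just clears it.
console.log(contrastRatio({ r: 119, g: 119, b: 119 }, white) < 4.5);  // true
console.log(contrastRatio({ r: 118, g: 118, b: 118 }, white) >= 4.5); // true
```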
Reading the results:
"All checked elements meet contrast thresholds": no programmatic failures. Still visually assess text overlaid on images, gradients, or video, which the script cannot measure.
Sanity checks before writing any of this into the report:
Initial Page Load & Above-the-Fold:
Initial Design Observations:
You will execute 3 distinct user journeys. For each journey, follow the steps below while testing the functional QA items listed in each category.
Simulate a new visitor exploring core information
Steps:
Testing Steps for Each Page:
On desktop (1920px) viewport:
Spacing & Layout:
Typography & Text:
Visual Consistency:
On desktop (1920px) viewport:
Images & Media:
resolutionRatio below 0.75 as issues; note anything between 0.75 and 1.0 as "needs visual review" without making a pass/fail call
On this page:
Link Accuracy Testing:
Link Status & Health:
On this page (check at least 2-3 pages total):
Security & Infrastructure (check on homepage):
WordPress-Specific Checks (if WordPress site - visual verification only):
On this page (check footer on at least homepage and one other page):
Footer Credits & Attribution:
Additional Footer Items:
Simulate a user interested in the main offerings
Steps:
Additional Focus:
Simulate a user seeking help or learning resources
Steps:
You MUST complete all items below before creating the JSON report. This checklist confirms all requirements have been met.
_____________________
_____________________ (e.g., /about)
_____________________ (e.g., /products)
_____________________ (if deeper navigation exists)
_____________________ (e.g., /blog or /support)
_____________________ (optional, if cross-page analysis required more pages)
Minimum pages to complete: 4-6. You have listed _____ pages. (Must be 4 or more)
Journey #1 (Main Navigation Flow) completed on: _____________________
Journey #2 (Product/Service Exploration) completed on: _____________________
Journey #3 (Support/Information Discovery) completed on: _____________________
Section 3.1 - Design Consistency Across Pages: ✅ Completed
Section 3.2 - Mobile Responsiveness Spot Check: ✅ Completed
Section 3.3 - Content Quality & Consistency: ✅ Completed
Section 3.4 - Footer & Attribution Consistency: ✅ Completed
Section 3.5 - Mobile Compatibility Check: ✅ Completed
If any checkbox is unchecked, DO NOT proceed to JSON report generation. Return to that section and complete it before continuing.
After completing all journeys, perform these comparative tests:
Typography Comparison:
Color & Branding:
Layout & Spacing:
Navigation Patterns:
On 2-3 key pages:
Across all visited pages:
Check across all visited pages:
Mobile (375px) verification:
STOP. Have you completed the MANDATORY TESTING CHECKLIST above?
If you answered NO to any of these, STOP. Return to Section 2 and complete the missing journeys and sections before proceeding.
If you answered YES to all of these, proceed to data collection below.
As you perform testing in Sections 1-3, collect the following data:
For metadata section:
ogTitle - Extract from <meta property="og:title">
ogDescription - Extract from <meta property="og:description">
ogImage - Extract from <meta property="og:image">
ogUrl - Extract from <meta property="og:url">
ogType - Extract from <meta property="og:type"> (usually "website")
twitterCard - Extract from <meta name="twitter:card"> (may be null)
twitterTitle - Extract from <meta name="twitter:title"> (may be null)
twitterDescription - Extract from <meta name="twitter:description"> (may be null)
twitterImage - Extract from <meta name="twitter:image"> (may be null)
General metadata:
url - Homepage URL
websiteName - Website name (extract from og:title or page title)
timestamp - ISO 8601 format (e.g., 2025-11-19T23:58:45Z)
For mobile and desktop sections, collect:
viewport - Record actual dimensions (e.g., "375x812" for mobile, "1920x1080" for desktop)
title - Page title from <title> tag
url - Current page URL
loadTime - Actual page load time in milliseconds
"links": [
  {"text": "Link text", "href": "URL or path"},
  ...
]
"images": [
  {"src": "image URL", "alt": "alt text or description"},
  ...
]
<img> elements
loading="lazy" (these are intentional, not broken)
"headings": [
  {"tag": "h1", "text": "Heading text"},
  {"tag": "h2", "text": "Heading text"},
  ...
]
focusableElements - Count of keyboard-navigable elements (buttons, links, form fields)
For issues section, categorize all findings:
"issues": {
  "critical": [
    {
      "category": "Category name",
      "issue": "Brief description",
      "impact": "User-facing impact",
      "device": "mobile|desktop|both",
      "pages": ["https://example.com/page1"],
      "screenshots": ["screenshots/example-finding.png"]
    }
  ],
  "high": [...],
  "medium": [...],
  "low": [...]
}
screenshots field (optional but strongly encouraged):
When a finding is visual — broken layout, low-contrast text, design inconsistency, broken UI element, mis-rendered image — attach the relevant screenshot(s) so the HTML report can embed them inline next to the finding. The path should be relative to the reports/ directory (e.g., screenshots/homepage-desktop-atf.png for a file saved at reports/screenshots/homepage-desktop-atf.png). You can attach multiple screenshots per finding (e.g., desktop + mobile views of the same issue, or before/after pairs). Skip the field for findings where a screenshot wouldn't add information (e.g., missing meta tags, broken hrefs that aren't visually distinct).
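Because the path passed to browser_take_screenshot and the path stored in a finding differ only by the reports/ prefix, the conversion can be sketched as a small helper. The function name toFindingScreenshotPath is hypothetical; the two path shapes come from the text above.

```javascript
// Hypothetical helper: converts the path a screenshot was saved under
// (reports/screenshots/<name>.png) into the form a finding's screenshots
// field expects (relative to reports/).
function toFindingScreenshotPath(savedPath) {
  const prefix = 'reports/';
  if (!savedPath.startsWith(prefix + 'screenshots/')) {
    // A bare filename would have been saved to the project root, which this skill forbids.
    throw new Error(`Expected a path under reports/screenshots/, got: ${savedPath}`);
  }
  return savedPath.slice(prefix.length);
}

console.log(toFindingScreenshotPath('reports/screenshots/homepage-desktop-atf.png'));
// screenshots/homepage-desktop-atf.png
```

Throwing on a bare filename surfaces the mistake at the point where the finding is written instead of leaving a broken image link in the HTML report.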
Issue categories for this test:
Priority assignment:
After testing, structure your data into reports/data/qa-report-functional.json using this format:
{
  "url": "https://example.com",
  "websiteName": "Example",
  "timestamp": "YYYY-MM-DDTHH:MM:SSZ",
  "testMethodology": "Manual QA testing following multi-page user journey simulation...",
  "visitedPages": [
    "https://example.com/",
    "https://example.com/about/",
    "https://example.com/products/",
    "https://example.com/blog/",
    "https://example.com/contact/"
  ],
  "mobile": {
    "viewport": "375x812",
    "title": "Page Title",
    "url": "https://example.com",
    "loadTime": 1800,
    "links": [],
    "images": [],
    "headings": [],
    "focusableElements": 42
  },
  "desktop": {
    "viewport": "1920x1080",
    "title": "Page Title",
    "url": "https://example.com",
    "loadTime": 1800,
    "links": [],
    "images": [],
    "headings": [],
    "focusableElements": 58
  },
  "metadata": {
    "ogTitle": "...",
    "ogDescription": "...",
    "ogImage": "...",
    "ogUrl": "...",
    "ogType": "website",
    "twitterCard": null,
    "twitterTitle": null,
    "twitterDescription": null,
    "twitterImage": null
  },
  "links": [
    {"text": "Link", "href": "URL", "ok": true, "status": 200}
  ],
  "issues": {
    "critical": [],
    "high": [],
    "medium": [],
    "low": []
  }
}
IMPORTANT:
visitedPages must contain ALL pages you tested, not just the homepage
timestamp must be a real ISO 8601 timestamp from actual test execution (e.g., 2025-11-19T23:58:45Z), not a placeholder
If generate-report.js is available in the project:
scripts/run-qa-report.sh reports/data/qa-report-functional.json
Or to merge with performance and accessibility reports:
scripts/merge-qa-reports.sh reports/data/qa-report-functional.json reports/data/qa-report-performance.json reports/data/qa-report-accessibility.json
loading="lazy" for performance optimization
For accessibility items below (missing alt text, forms without labels, missing H1), describe the issue but don't cite a WCAG criterion; see the WCAG-citation guardrail in §1.5, and cite WCAG only for contrast.
Before you submit your JSON report, ask yourself:
visitedPages array showing multiple pages? (Yes = Good. No = Incomplete)
If you cannot answer YES to all 5 questions, do NOT generate the JSON report. Return to Section 2 and complete the multi-page testing first.
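Some of these pre-submission conditions are mechanical enough to check in code before the report is written. The sketch below is one hypothetical way to do that: checkReport is not part of this skill, and it covers only the rules stated in this document (at least 4 visited pages, a real ISO 8601 timestamp instead of the placeholder, and the four severity arrays under issues).

```javascript
// Hypothetical pre-submission check. Field names follow the report format in
// this skill; the function returns a list of problems (empty means no problems found).
function checkReport(report) {
  const problems = [];

  const pages = Array.isArray(report.visitedPages) ? report.visitedPages : [];
  if (pages.length < 4) {
    problems.push(`visitedPages lists ${pages.length} page(s); at least 4 are required`);
  }

  // Accepts timestamps shaped like 2025-11-19T23:58:45Z; the literal placeholder fails the pattern.
  const isoPattern = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;
  const ts = report.timestamp;
  if (typeof ts !== 'string' || !isoPattern.test(ts) || Number.isNaN(Date.parse(ts))) {
    problems.push('timestamp must be a real ISO 8601 value, not a placeholder');
  }

  for (const level of ['critical', 'high', 'medium', 'low']) {
    if (!Array.isArray(report.issues?.[level])) {
      problems.push(`issues.${level} must be an array`);
    }
  }
  return problems;
}

const draft = {
  timestamp: 'YYYY-MM-DDTHH:MM:SSZ',
  visitedPages: ['https://example.com/'],
  issues: { critical: [], high: [], medium: [], low: [] },
};
console.log(checkReport(draft));
```

With the draft above, the function reports two problems: too few visited pages and a placeholder timestamp. Passing this check does not replace the journey and cross-page requirements, which still have to be met by actual testing.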
npx claudepluginhub a8cteam51/kosh