From design-system-ops
Audits design system theme coverage and consistency across semantic and component tokens, identifying missing values, propagation failures, resolver gaps, and internal violations. Use when launching themes, rebrands, or dark mode.
npx claudepluginhub murphytrueman/design-system-ops
This skill uses the workspace's default tool permissions.
A skill for auditing theme coverage and visual consistency across multiple design system themes. Identifies tokens missing from specific themes, component tier propagation failures, internal consistency violations within each theme, DTCG resolver coverage gaps, and components likely to break on theme switches. Produces a theme coverage report with severity-rated findings.
Theming is where the three-tier token architecture proves its value or reveals its failures. When a system switches themes correctly, the change ripples through every component that references the semantic tier. When it does not — when components hardcode primitives or when the semantic tier is incomplete — a theme switch becomes a hunt through hundreds of files for missed overrides.
A theme audit is not about validating a single theme's visual appearance. It is about ensuring every token consumed by every component exists and is correctly defined across every theme the system claims to support. It is about catching cases where the component tier skips the semantic tier entirely, making theme switches invisible to that component.
The audit surfaces three categories of problems: coverage gaps (token defined in Theme A but not Theme B), architectural failures (component tokens that bypass the semantic tier), and internal consistency breaks (within a single theme, visual logic is violated — e.g. in dark mode, surface colours are lighter than background).
Before producing output, check for a .ds-os-config.yml file in the project root. If present, load:
- system.theming: if false, exit early with a note that this skill applies only to systems with theming enabled. If true, proceed.
- severity.*: overrides for theme-specific findings (e.g. missing_theme_value: critical for a system about to launch dark mode)
- integrations.style_dictionary: if enabled, parse tokens via Style Dictionary v4 to auto-extract all semantic and component tokens and their resolver-defined mode values
- integrations.figma: if enabled, pull Figma variables and their modes as the theme source
- recurring.*: if this is a recurring run, load the previous theme audit for trend comparison
If no config file exists, proceed with defaults and ask for manual input.
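The config rules above can be sketched as a small function over an already-parsed config. This is a minimal illustration, not the skill's actual implementation: the `AuditSettings` class and `resolve_settings` function are hypothetical names, the config is assumed to arrive as a plain dictionary (YAML parsing is left out), and each integration is assumed to be a simple truthy flag.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AuditSettings:
    """Hypothetical container for the decisions derived from the config."""
    proceed: bool                      # False means "exit early"
    note: str = ""                     # message to surface to the user
    severity_overrides: dict = field(default_factory=dict)
    use_style_dictionary: bool = False
    use_figma: bool = False
    recurring: bool = False


def resolve_settings(config: Optional[dict]) -> AuditSettings:
    """Apply the config rules to a parsed .ds-os-config.yml, or None if absent."""
    if config is None:
        # No config file: proceed with defaults and ask for manual input.
        return AuditSettings(proceed=True, note="No config found; using defaults.")

    if config.get("system", {}).get("theming") is False:
        return AuditSettings(
            proceed=False,
            note="This skill applies only to systems with theming enabled.",
        )

    integrations = config.get("integrations", {})
    return AuditSettings(
        proceed=True,
        severity_overrides=dict(config.get("severity", {})),
        # Assumption: each integration is enabled by a truthy value.
        use_style_dictionary=bool(integrations.get("style_dictionary")),
        use_figma=bool(integrations.get("figma")),
        recurring=bool(config.get("recurring")),
    )
```

Returning a settings object instead of exiting directly keeps the early-exit note available for whatever presents it to the user.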
Before auditing, discover what themes the system actually defines:
Discover themes:
- DTCG: look for .resolver.json files and extract mode names (e.g. light, dark, brand-a, brand-b)
- CSS custom properties: look for scopes such as :root, .dark, [data-theme="light"], [data-theme="dark"], [data-brand="brand-a"]; each scope is a theme variant
- SCSS: look for $themes: (...) or separate theme files (_theme-light.scss, _theme-dark.scss)
- JavaScript: look for exported theme objects (e.g. export const lightTheme = { ... }; export const darkTheme = { ... })
- Tailwind: check tailwind.config.js or tailwind.config.ts for darkMode configuration and any theme extends
Present discovered themes to user:
Produce a brief inventory:
Themes discovered:
- Light (default, CSS root scope, Figma mode)
- Dark (CSS .dark scope, Figma mode)
- Brand A (data-theme="brand-a" scope)
- Brand B (data-theme="brand-b" scope)
Total: 4 themes
Ask: "I found these [N] themes. Should I audit all of them, or focus on specific variants?"
If no themes are discovered and theming is marked as true in config, ask the user to name the themes they intend to support.
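For the CSS part of discovery, a rough sketch is to scan a stylesheet for the scope selectors named in this step. The function name `discover_css_scopes` and the regular expression are illustrative assumptions. It recognises only :root, .dark, .light, and data-theme / data-brand attribute selectors, so any other convention would need its own pattern.

```python
import re

# Selector shapes treated as theme scopes in this sketch (an assumption,
# not an exhaustive list of theming conventions).
SCOPE_PATTERN = re.compile(
    r"""(:root|\.dark|\.light|\[data-(?:theme|brand)=["']?[\w-]+["']?\])\s*\{"""
)


def discover_css_scopes(css: str) -> list[str]:
    """Return theme scope selectors in order of first appearance."""
    seen: list[str] = []
    for match in SCOPE_PATTERN.finditer(css):
        scope = match.group(1)
        if scope not in seen:
            seen.append(scope)
    return seen


if __name__ == "__main__":
    sample = ':root { --a: 1; }\n.dark { --a: 2; }\n[data-theme="brand-a"] { --a: 3; }'
    scopes = discover_css_scopes(sample)
    print("Themes discovered:")
    for scope in scopes:
        print(f"- {scope}")
    print(f"Total: {len(scopes)} themes")
```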
Gather the semantic and component tiers across all discovered themes:
For DTCG format:
For CSS custom properties:
- Use :root (or default theme scope) as the reference set of all semantic tokens
- Collect the theme override scopes (.dark, [data-theme="dark"], etc.) and their token definitions
For SCSS variables:
For JavaScript theme objects:
For Tailwind:
- theme and darkMode blocks
Output a token inventory:
Semantic tokens: 156 total
- Defined in all themes: 144
- Defined in light only: 5
- Defined in dark only: 4
- Defined in brand-a only: 3
Component tokens: 287 total
- Defined in all themes: 278
- Coverage gaps: 9
This checkpoint reveals the scale of coverage problems before the detailed audit.
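For CSS custom properties, the gathering step and the inventory counts might look like the sketch below. It assumes flat CSS rules without nesting, and the names `tokens_by_scope` and `inventory` are hypothetical. A token that is missing from an override scope still inherits its :root value at runtime, so "missing" here means "no theme-specific definition".

```python
import re

RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")      # flat rules only; nested blocks are not handled
CUSTOM_PROPERTY = re.compile(r"(--[\w-]+)\s*:")  # declarations, not var() usages


def tokens_by_scope(css: str) -> dict[str, set[str]]:
    """Map each selector to the custom properties it declares."""
    scopes: dict[str, set[str]] = {}
    for selector, body in RULE.findall(css):
        names = set(CUSTOM_PROPERTY.findall(body))
        if names:
            scopes.setdefault(selector.strip(), set()).update(names)
    return scopes


def inventory(scopes: dict[str, set[str]]) -> dict:
    """Count tokens overall, tokens defined everywhere, and tokens unique to one scope."""
    groups = list(scopes.values())
    every_token = set().union(*groups)
    in_all = set.intersection(*groups) if groups else set()
    only_in = {}
    for scope, tokens in scopes.items():
        elsewhere = set().union(*(t for name, t in scopes.items() if name != scope))
        only_in[scope] = len(tokens - elsewhere)
    return {"total": len(every_token), "all_themes": len(in_all), "only_in": only_in}
```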
For each semantic token, verify it is defined across ALL themes:
Coverage matrix:
Build a matrix: rows are semantic tokens, columns are themes. Mark each cell as:
Identify missing semantic tokens per theme:
Light theme: 156/156 semantic tokens ✓
Dark theme: 152/156 semantic tokens (missing: --color-feedback-info, --color-feedback-warning, --spacing-dense-gap, --text-decorative-label)
Brand A: 151/156 (missing: --color-feedback-warning, --spacing-dense-gap, --text-decorative-label, --text-heading-display)
Brand B: 156/156 ✓
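The coverage matrix and the per-theme summary lines can be derived from one mapping of theme name to defined token names. The sketch below assumes that input shape; `coverage_matrix` and `summarise` are hypothetical helper names.

```python
def coverage_matrix(themes: dict[str, set[str]]) -> dict[str, dict[str, bool]]:
    """Rows are tokens, columns are themes; True means the theme defines the token."""
    all_tokens = sorted(set().union(*themes.values()))
    return {
        token: {name: token in defined for name, defined in themes.items()}
        for token in all_tokens
    }


def summarise(matrix: dict[str, dict[str, bool]], theme_names: list[str]) -> list[str]:
    """Produce one 'defined/total' line per theme, listing any missing tokens."""
    total = len(matrix)
    lines = []
    for name in theme_names:
        missing = [token for token, row in matrix.items() if not row[name]]
        line = f"{name}: {total - len(missing)}/{total} semantic tokens"
        if missing:
            line += f" (missing: {', '.join(missing)})"
        lines.append(line)
    return lines
```

Building the matrix first and summarising second means the same structure can feed both the per-theme counts and the table in the final report.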
For each missing token, flag:
Verify that component tokens correctly inherit from the semantic tier across all themes:
For each component token:
Tier leakage detection:
Flag any component token that references a primitive (rather than a semantic token):
TC-10 | Critical | Tier leakage | button.background.default references {color.blue.500} (primitive) instead of semantic tier
- Impact: Button background will not change on theme switch (dark mode will show blue on blue)
- Recommended action: Redefine as button.background.default: {color.action.primary}
Quantify the scope:
Component tokens examined: 287
- Correctly reference semantic tier: 276
- Tier leakage (reference primitives): 11
Tier leakage is the most dangerous category of theme bug — everything appears to work until someone activates a new theme.
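Tier leakage detection reduces to checking what each component token's alias points at. The sketch below assumes component tokens are available as a flat mapping from token name to a DTCG-style alias string such as {color.blue.500}, and that the set of primitive token names is already known. `find_tier_leakage` is a hypothetical name, and raw literal values are ignored here.

```python
import re

ALIAS = re.compile(r"^\{([^{}]+)\}$")  # matches a whole-value alias like {color.blue.500}


def find_tier_leakage(
    component_tokens: dict[str, str], primitive_names: set[str]
) -> list[tuple[str, str]]:
    """Return (component token, primitive) pairs where the semantic tier is bypassed."""
    leaks = []
    for name, value in component_tokens.items():
        match = ALIAS.match(value.strip())
        if match and match.group(1) in primitive_names:
            leaks.append((name, match.group(1)))
    return leaks


if __name__ == "__main__":
    component = {
        "button.background.default": "{color.blue.500}",
        "button.text.default": "{color.text.on-action}",
    }
    primitives = {"color.blue.500", "color.blue.300"}
    for token, target in find_tier_leakage(component, primitives):
        print(f"Tier leakage | {token} references {{{target}}} (primitive)")
```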
For each theme, validate internal logical consistency:
Consistency rules (apply per theme, not across themes):
For light theme:
For dark theme:
For brand variants:
Consistency violations to flag:
Run visual spot-checks on high-impact token groups:
- Does color.feedback.error have adequate contrast against color.background.default in this theme?
- Are color.surface.primary and color.background.default visually distinct (same value is sometimes OK, but should be documented as intentional)?
- Is color.action.primary visually more prominent than color.action.secondary in this theme?
Flag violations:
TC-22 | Medium | Consistency | Dark theme: color.background.default and color.surface.primary are identical (#121212)
- This may be intentional (both are neutral backgrounds), but it reduces visual hierarchy
- Recommended action: Review with design team. If intentional, document the decision. If not, adjust surface token.
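Two of the spot-checks above can be automated: contrast between the error colour and the background, and identical surface and background values. The sketch uses the WCAG 2.x relative luminance and contrast ratio formulas. The 4.5 default threshold, the six-digit hex input, and the `check_theme` helper are assumptions for illustration.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    s = channel / 255
    return s / 12.92 if s <= 0.03928 else ((s + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    h = hex_colour.lstrip("#")  # expects six hex digits, e.g. "#121212"
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(a: str, b: str) -> float:
    la, lb = relative_luminance(a), relative_luminance(b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def check_theme(tokens: dict[str, str], min_contrast: float = 4.5) -> list[str]:
    """Return consistency findings for one theme's resolved colour values."""
    findings = []
    background = tokens["color.background.default"]
    surface = tokens["color.surface.primary"]
    error = tokens["color.feedback.error"]
    if background.lower() == surface.lower():
        findings.append(
            f"color.background.default and color.surface.primary are identical ({background})"
        )
    ratio = contrast_ratio(error, background)
    if ratio < min_contrast:
        findings.append(
            f"color.feedback.error contrast against background is {ratio:.2f}:1"
        )
    return findings
```

Identical values are reported rather than treated as failures, since the decision may be intentional and only needs documenting.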
If the system uses DTCG format with resolver files:
Resolver file structure check:
- Verify that .resolver.json files exist and are well-formed JSON
- Verify that each file declares sets and modes blocks
- Verify that every token in sets has a value defined for every declared mode
Mode coverage per token:
For each semantic token:
color.action.primary:
light: {color.blue.500} ✓
dark: {color.blue.300} ✓
brand-a: {color.purple.600} ✓
brand-b: [missing] ✗
Flag missing mode values:
TC-30 | Critical | Resolver coverage | color.action.primary missing value in brand-b mode
- When brand-b theme is active, color.action.primary will resolve to light mode default (fallback)
- Recommended action: Add mode-specific value to brand-b mode in resolver
Set composition check:
Orphaned tokens in resolver files are maintenance burden — they appear in IDE autocomplete but produce no runtime value.
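Mode coverage can be checked mechanically once resolver data is flattened into a token-to-mode-to-value mapping. That flattened shape is an assumption made for this sketch and is not the resolver file schema itself. `missing_mode_values` is a hypothetical helper.

```python
def missing_mode_values(
    tokens: dict[str, dict[str, str]], modes: list[str]
) -> dict[str, list[str]]:
    """For each token, list the declared modes that have no value."""
    gaps = {}
    for name, values in tokens.items():
        missing = [mode for mode in modes if not values.get(mode)]
        if missing:
            gaps[name] = missing
    return gaps


if __name__ == "__main__":
    modes = ["light", "dark", "brand-a", "brand-b"]
    tokens = {
        "color.action.primary": {
            "light": "{color.blue.500}",
            "dark": "{color.blue.300}",
            "brand-a": "{color.purple.600}",
        },
    }
    for token, missing in missing_mode_values(tokens, modes).items():
        for mode in missing:
            print(f"Resolver coverage | {token} missing value in {mode} mode")
```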
Identify patterns in the codebase that are likely to break on theme switch:
Regression patterns:
Search for common failures:
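- Opacity hacks: patterns like rgba(var(--color-action-primary), 0.5) work only while the colour token holds raw RGB channel values, and fail if a theme switches the token value to a different format (hex, hsl)
- Calc() on tokens: patterns like padding: calc(var(--spacing-component-gap) * 2) assume the token value is a length with units; calc() accepts a length multiplied by a number, but if a theme defines the token as a unitless number the result is not a valid padding value
- Inline conditionals: style={{ backgroundColor: isDark ? darkColor : lightColor }} is not using the token system at all
Regression output: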
Flag high-risk patterns:
TC-40 | High | Regression | Found 23 instances of hardcoded hex values in component code
- These will NOT change on theme switch even though token values exist
- Recommended action: Replace hardcoded values with token references
TC-41 | Medium | Regression | Found 7 instances of opacity hacks (rgba with token + literal opacity)
- These work but are fragile if token format changes
- Recommended action: Create composite opacity tokens (e.g., color.action.primary-75%) if opacity variations are needed
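A first pass at the regression search can be done with regular expressions over component source. The patterns below are illustrative assumptions and will produce false positives (for example, an ID selector such as #abc looks like a hex colour), so the counts are a starting point for manual review.

```python
import re

PATTERNS = {
    # 3, 4, 6 or 8 hex digits after '#'
    "hardcoded hex": re.compile(r"#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})\b"),
    # rgba() whose first argument is a token, followed by a literal alpha
    "opacity hack": re.compile(r"rgba\(\s*var\(--[\w-]+\)\s*,"),
    # calc() expressions that involve a token
    "calc on token": re.compile(r"calc\([^;]*var\(--[\w-]+\)"),
}


def scan(source: str) -> dict[str, int]:
    """Count occurrences of each high-risk pattern in a source string."""
    return {label: len(pattern.findall(source)) for label, pattern in PATTERNS.items()}


if __name__ == "__main__":
    sample = (
        ".button { color: #ffffff; background: rgba(var(--color-action-primary), 0.5); }\n"
        ".card { padding: calc(var(--spacing-component-gap) * 2); }\n"
    )
    for label, count in scan(sample).items():
        print(f"{label}: {count}")
```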
Structure the report as follows:
Date: [date]
Scope: [themes audited]
System theming enabled: [yes/no from config]
Overall theme health. What is the most significant gap? Is coverage consistent across themes, or are some themes neglected? Are component tokens correctly inheriting from semantic tier?
One paragraph. Honest about severity.
List all themes discovered and confirmed in scope:
| Category | Total | Coverage |
|---|---|---|
| Semantic tokens | 156 | 152/156 in all themes |
| Component tokens | 287 | 276/287 correctly reference semantic tier |
| Tier leakage instances | — | 11 component tokens reference primitives |
Semantic token coverage matrix:
| Token | Light | Dark | Brand A | Brand B |
|---|---|---|---|---|
| color.action.primary | ✓ | ✓ | ✓ | ✓ |
| color.feedback.warning | ✓ | ✗ | ✗ | ✓ |
| spacing.dense.gap | ✓ | ✗ | ✗ | ✓ |
For each missing token, include:
Tier propagation:
Tier leakage violations:
For each violation:
Light theme consistency: [Pass/Warn/Fail]
Dark theme consistency: [Pass/Warn/Fail]
Brand variant consistency: [Pass/Warn/Fail]
List any violations:
Resolver files found: [count and paths]
Mode coverage: [summary of mode-to-token coverage]
Orphaned tokens: [count and examples]
Missing mode values: [count and severity by theme]
Include resolver-specific findings with severity ratings.
Hardcoded values in components: [count by severity]
Opacity hacks: [count and pattern examples]
Calc() on tokens: [count and contexts]
Missing theme-specific variants: [count and affected components]
Each category should include:
Tier 1 — Fix immediately:
Tier 2 — Fix before next theme launch:
Tier 3 — Address in polish phase:
If system.component_count in config is < 5, or if component count is inferred to be small from theme coverage:
"This is a small system. Component-tier propagation problems (tier leakage) have outsize impact because each component token affects user-facing surfaces directly. Prioritise tier leakage findings even if absolute violation count is low."
If recurring is configured in .ds-os-config.yml:
- Check recurring.output_directory for a previous theme audit report
- Save this report to recurring.output_directory using the recurring.naming_pattern
- Keep no more than recurring.retain_count reports
If no previous report exists, note "This is the baseline theme audit. Trend analysis will be available from the next run."