From radar-suite
Aggregates 5 companion radar skills plus 5 grep domains for unified A-F codebase grading and ship/no-ship decisions. Tracks velocity, diffs audits, and celebrates improvements.
npx claudepluginhub terryc21/radar-suite --plugin radar-suite

This skill uses the workspace's default tool permissions.
Capstone Radar is the **aggregator + gap filler** for the 5-skill radar family. It consumes findings from 4 companion skills, runs its own scans for 5 domains the companions don't cover, grades everything on one unified scale, and makes the ship/no-ship decision.
It does NOT re-run the companion audits -- it consumes their handoffs.
| Command | Description |
|---|---|
| /capstone-radar | Full analysis -- checks everything and reads companion handoffs |
| /capstone-radar quick | Quick check -- own domains only, ignores companion results |
| /capstone-radar report | No scanning -- re-grade from existing handoff files only |
| /capstone-radar diff | Compare against previous audit -- show resolved/new issues |
| --trust-all | Override staleness decay -- treat all companion handoffs as Fresh |
| --show-suppressed | Show findings suppressed by known-intentional entries |
Audit comparison -- compare current codebase against a previous audit to show what changed.
/capstone-radar diff
/capstone-radar diff 2026-03-15
Reads previous audits from .agents/research/*-capstone-audit.md.

Audit Diff: 2026-03-15 → 2026-03-29
Summary:
✅ Resolved: 8 issues fixed since last audit
🔴 Still Open: 3 issues remain
🆕 New: 2 new issues detected
📝 Changed: 4 files modified (may need re-verification)
Grade Trend:
Overall: B [83] → B+ [87] ⬆
Code Hygiene: B- → A- ⬆
Security Basics: A → A (stable)
Test Health: C → C+ ⬆
Resolved Issues:
| # | Finding | Was | File | Resolved By |
|---|---------|-----|------|-------------|
| 1 | TODO marker in production | 🟡 HIGH | CloudSyncManager.swift:142 | Commit abc123 |
Still Open:
| # | Finding | Urgency | File | Age |
|---|---------|---------|------|-----|
| 1 | Force unwrap in error path | 🟡 HIGH | BackupManager.swift:89 | 14 days |
New Issues:
| # | Finding | Urgency | Risk: Fix | Risk: No Fix | ROI | Blast Radius | Fix Effort | Status |
If 3+ previous audits exist, show trend line:
Grade Velocity (last 5 audits):
Build 21: C+ [78]
Build 22: B- [80]
Build 23: B [84]
Build 24: B+ [87]
Build 25: B+ [88] ← current
Trend: Improving (+2.5 pts/build avg)
Projection: A- by Build 28 at current rate
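For illustration, a minimal Swift sketch of this linear projection, assuming the audit history has already been parsed into (build, score) pairs; the `Snapshot` type and function name are hypothetical, not part of the skill:

```swift
import Foundation

// Illustrative sketch: linear grade projection over audit history.
// Real data comes from the GRADES_YAML blocks in previous reports.
struct Snapshot { let build: Int; let score: Double }

func projectGrade(history: [Snapshot], targetScore: Double = 90) -> String {
    guard history.count >= 3, let first = history.first, let last = history.last,
          last.build > first.build else {
        return "Not enough data for a projection (need 3+ audits)."
    }
    // Average points gained per build across the whole window.
    let rate = (last.score - first.score) / Double(last.build - first.build)
    guard rate > 0 else { return "Trend: flat or declining -- no projection." }
    let buildsNeeded = Int(ceil((targetScore - last.score) / rate))
    return "Trend: improving (+\(rate) pts/build avg). "
         + "Projection: A- around Build \(last.build + buildsNeeded)."
}

let history = [Snapshot(build: 21, score: 78), Snapshot(build: 23, score: 84),
               Snapshot(build: 25, score: 88)]
print(projectGrade(history: history))
// Trend: improving (+2.5 pts/build avg). Projection: A- around Build 26.
```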
On first invocation, ask the user three questions in a single AskUserQuestion call:
Question 1: "What's your experience level with Swift/SwiftUI?"
Question 2: "Which audit mode?"
Question 3: "Include user impact explanations?"
Can also be toggled mid-session with --explain / --no-explain. See skills/shared/rating-system.md "User Impact Explanations" for format and rules.
Experience-adapted explanations for Capstone Radar:
Beginner: "Capstone Radar is the final check before your app goes to the App Store. It combines results from 4 other audit tools (if you've run them) with its own security, testing, and code quality checks, then gives your whole app a letter grade and tells you if it's safe to ship. Think of it as the building inspector who reviews all the specialist reports plus checks the things no one else covered."
Intermediate: "Capstone Radar aggregates findings from 4 companion skills (data-model-radar, ui-path-radar, roundtrip-radar, ui-enhancer-radar) and adds its own scans for security, test health, code hygiene, dependencies, and build health. It grades all 10 domains on one scale, tracks trends across runs, and makes a ship/no-ship recommendation."
Experienced: "Aggregator + gap filler for the radar family. Consumes 4 companion handoffs, owns 5 grep-reliable domains, unified A-F grading, velocity tracking, risk heatmap, ship/no-ship decision."
Senior/Expert: "5 owned + 4 consumed domains. Velocity. Heatmap. Ship/no-ship."
Store the experience level as USER_EXPERIENCE and apply to ALL output for the session.
See radar-suite-core.md for: Tier System, Pipeline UX Enhancements, Table Format, Plain Language Communication, Work Receipts, Contradiction Detection, Finding Classification, Audit Methodology, Context Exhaustion, Progress Banner, Issue Rating Tables, Handoff YAML schema, Known-Intentional Suppression, Pattern Reintroduction Detection, Experience-Level Output Rules, Implementation Sort Algorithm, short_title requirement.
Read .radar-suite/session-prefs.yaml for the tier field. Capstone behavior adapts based on the active tier:
Tier 3 (full pipeline): All 5 companion skills are expected. If a handoff is missing, treat it as an error (not "not audited"):
ERROR: data-model-radar handoff missing. Full pipeline requires all 5 companions.
Options: [Re-run data-model-radar / Continue without it (grade will be partial)]
Before starting own scans, emit the Pre-Capstone Summary (see radar-suite-core.md Pipeline UX Enhancements #5):
List companion blockers as RS-NNN (short_title).

Tier 2 (subset): Read tier_skills from session prefs to know which skills were in the subset. Only expect handoffs for those skills. For skills not in the subset:
Weight redistribution: redistribute missing domain weights proportionally to audited domains (same as existing missing-handoff logic).
Tier 1 (standalone): Existing behavior. Missing handoffs show "Not audited -- run [skill-name] for coverage."
Known-intentional check: Read .radar-suite/known-intentional.yaml (if exists). Store as KNOWN_INTENTIONAL. Before presenting any finding from own domain scans, check it against these entries. If file + pattern match, skip silently and increment intentional_suppressed counter. Companion findings that were suppressed at the companion level are already excluded.
Pattern reintroduction check: Read .radar-suite/ledger.yaml for status: fixed findings with pattern_fingerprint and grep_pattern. For each, grep the codebase. If the pattern appears in a new file without the exclusion_pattern, report as "Reintroduced pattern" at 🟡 HIGH urgency.
Experience-level auto-apply: If USER_EXPERIENCE = Beginner, auto-set EXPLAIN_FINDINGS = true and default sort to impact. If Senior/Expert, default sort to effort. Apply all output rules from Experience-Level Output Rules table in radar-suite-core.md.
Collect:
- Glob pattern="**/*.swift" (exclude Tests, Pods, .build, DerivedData)
- wc -l for accuracy
- git shortlog -sn --no-merges | wc -l. Store as TEAM_SIZE.
- Ensure output directories exist: mkdir -p .agents/research .agents/ui-audit
Print one-line summary:
Project: {files} Swift files (~{loc} LOC) | {arch} | {persistence} | Tests: {unit}/{ui} | Build: {number} | Contributors: {count}
Glob pattern=".agents/research/*-capstone-audit.md"
Glob pattern=".agents/research/*-codebase-audit.md"
If previous reports exist, parse ALL GRADES_YAML HTML comment blocks for velocity tracking. Build an array of dated snapshots sorted by date. Store as HISTORY.
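A minimal parsing sketch for building HISTORY, assuming each report embeds exactly one GRADES_YAML comment in the flat `key: value` form shown in Step 10; the function name is illustrative:

```swift
import Foundation

// Illustrative sketch: pull the GRADES_YAML block out of a report and parse
// its flat `key: value` lines. `null` is stored as a nil score.
func parseGradesBlock(_ report: String) -> [String: Int?] {
    guard let start = report.range(of: "<!-- GRADES_YAML"),
          let end = report.range(of: "-->", range: start.upperBound..<report.endIndex)
    else { return [:] }
    var grades: [String: Int?] = [:]
    for line in report[start.upperBound..<end.lowerBound].split(separator: "\n") {
        let parts = line.split(separator: ":", maxSplits: 1)
            .map { $0.trimmingCharacters(in: .whitespaces) }
        guard parts.count == 2, !parts[0].isEmpty else { continue }
        let value: Int? = parts[1] == "null" ? nil : Int(parts[1])
        grades[parts[0]] = value  // nil value = domain not audited
    }
    return grades
}
```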
Read the unified ledger and handoff files. Which files to read depends on the active tier (see Tier Awareness above). In Tier 2, read only handoffs for skills listed in tier_skills. In Tier 3, read all 5. In Tier 1/standalone, read whatever exists.
Read .radar-suite/ledger.yaml (if exists) -- unified cross-skill finding store
Read .agents/ui-audit/data-model-radar-handoff.yaml (if exists)
Read .radar-suite/time-bomb-radar-handoff.yaml (if exists)
Read .agents/ui-audit/ui-path-radar-handoff.yaml (if exists)
Read .agents/ui-audit/roundtrip-radar-handoff.yaml (if exists)
Read .agents/ui-audit/ui-enhancer-radar-handoff.yaml (if exists)
Ledger integration: When the ledger exists, use it as the primary source of cross-skill findings. Handoff YAMLs provide per-skill detail (serialization coverage, cross-cutting patterns) that the ledger doesn't store. Capstone should reference findings by RS-NNN ID when available.
For each handoff found:
- for_capstone_radar.blockers[] → extract finding and urgency (CRITICAL or HIGH); also check the for_release_ready_radar.blockers[] key
- serialization_coverage (model field coverage stats)
- cross_cutting_patterns[] (patterns affecting multiple workflows)

Score consumed domains: Start at 95 (generous baseline -- companion ran a full audit and only escalated the worst items). Deduct:
Missing handoffs (tier-dependent messaging):
Tier 2: if the skill is in tier_skills, treat as error. If not, mark "Not in scope" (expected absence).

In all cases, weight is redistributed proportionally to audited domains.
Print companion status:
Companions: data-model-radar [found/missing] | ui-path-radar [found/missing] | roundtrip-radar [found/missing] | ui-enhancer-radar [found/missing]
Skip this step if mode = Quick.
When a companion handoff is found, check its audit_depth and _verified flags (if present):
- audit_depth: full + all targets _verified: true → Full credit (start at 95, deduct per blocker)
- audit_depth: partial or some targets _verified: false → Reduced credit (start at 85, deduct per blocker). Note in report: "Partial audit -- [domain] grade is provisional."

Do not give full credit for unverified companion work. A handoff that says "backup verified, CSV not checked" should not produce an A for the Model Layer domain. The unverified targets represent unknown risk.
After assessing quality, calculate staleness for each companion handoff:
staleness_score = (days_since_handoff * 0.3) + (commits_since_handoff * 0.1)
Where:
- days_since_handoff = calendar days since the handoff's timestamp field
- commits_since_handoff = output of git rev-list --count <handoff_commit>..HEAD (use git log --oneline --since="<timestamp>" if commit hash unavailable)

| Score | Label | Behavior |
|---|---|---|
| 0-2 | Fresh | Full trust -- use handoff as-is |
| 2-5 | Aging | Trust grades, but spot-check HIGH findings (read the file, verify pattern still exists) |
| 5-10 | Stale | Downgrade companion grades by one letter. Spot-check all CRITICAL + HIGH findings. |
| 10+ | Expired | Ignore handoff entirely. Recommend re-running the companion skill. |
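A sketch of the formula and thresholds in Swift, assuming the day and commit counts were already gathered per the commands above; the `Freshness` enum is illustrative:

```swift
// Illustrative sketch of the staleness formula and label thresholds.
enum Freshness: String {
    case fresh = "Fresh", aging = "Aging", stale = "Stale", expired = "Expired"
}

func staleness(days: Double, commits: Double) -> (score: Double, label: Freshness) {
    let score = days * 0.3 + commits * 0.1
    switch score {
    case ..<2: return (score, .fresh)
    case ..<5: return (score, .aging)
    case ..<10: return (score, .stale)
    default: return (score, .expired)
    }
}

// data-model-radar in the summary below: 1 day, 3 commits -> 0.6 -> Fresh.
// ui-path-radar: 8 days, 12 commits -> 2.4 + 1.2 = 3.6 -> Aging.
print(staleness(days: 8, commits: 12).label.rawValue) // Aging
```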
Freshness summary (print in companion status output):
Companion freshness:
data-model-radar: Fresh (1 day, 3 commits)
ui-path-radar: Aging (8 days, 12 commits)
roundtrip-radar: Stale (22 days, 47 commits) -- grades downgraded by one letter
ui-enhancer-radar: Expired (35 days, 89 commits) -- handoff ignored, re-run recommended
Spot-check protocol (for Aging and Stale handoffs):
- If the cited pattern is gone, mark possibly-resolved (stale handoff) and exclude from grade calculation
- If the pattern still exists, mark re-verified and keep the finding

--trust-all flag: Overrides staleness calculation. All companions treated as Fresh regardless of age. Use when you know nothing has changed since the last audit.
Stale/Expired companions are flagged in the "Next Steps" section of the report.
> New in v1.1 (axis framework). Every companion handoff now includes an axis_summary block and every finding includes an axis field. Capstone splits findings by axis before grading: only axis_1 findings count toward the A-F grade. axis_2 and axis_3 findings land in a separate Hygiene Backlog section.
For each companion handoff found in Step 3, read:
- axis_summary (top-level block) -- counts by axis
- checks_performed (top-level block) -- what the companion scanned
- axis (per-finding field) -- classification for each finding
- before_after_experience.audience (per-finding field) -- surface in the report

Split findings by axis into two buckets:
- Grade-affecting bucket: axis == axis_1_bug
- Hygiene Backlog bucket: axis in (axis_2_scatter, axis_3_dead_code, axis_3_smelly)

Verify schema gate compliance. For each finding in either bucket, confirm it has:
- a better_approach field
- a file:line citation matching [A-Za-z0-9_/+.-]+\.swift:\d+ in the better_approach body
- a pattern_citation_lookup entry in verification_log
- before_after_experience with audience set

Findings failing the gate are still displayed but are tagged coaching incomplete and excluded from the top-10 prioritized list. This makes bad coaching visible rather than silently dropped.
Aggregate checks_performed across all companions into a single "Audit Coverage" block:
Audit Coverage:
source_roots: Sources/
files scanned: 588 (ui-path), 588 (roundtrip), 120 (capstone own)
patterns ran: reachability_trace, whole_file_scan, branch_enumeration, pattern_citation_lookup
patterns skipped: source_root_introspection (single-root project)
Count rejected_no_citation across all handoffs. If any radar rejected findings at the gate, surface this prominently -- it means the radar had candidate findings it could not coach.
Only axis_1_bug findings affect the A-F grade. axis_2 and axis_3 findings are NOT counted for grading. The grade cap rules from radar-suite-core.md apply only to axis_1 findings:
A codebase with 0 axis_1 findings but 50 axis_2 findings can still grade A. Hygiene debt does not block shipping.
Handoffs from older radar versions (pre-v1.1) lack the axis_summary block and the per-finding axis field. For these:
- Treat all findings as axis_1_bug (the pre-v1.1 default behavior)
- Note in the report: "ui-enhancer-radar handoff is pre-axis (v1.0.x); all findings counted as axis_1"

Before running grep patterns, determine where to go deep vs quick. The default behavior is to run all grep patterns with equal effort -- but some domains have more risk than others based on context.
Signal 1: Prior findings.
Read HISTORY from Step 2. If previous audits scored a domain below B, go deep on verification for that domain this time. Also check memory for known issues (e.g., prior session found test health concerns โ verify test patterns more carefully).
Signal 2: Companion handoff gaps. If a companion handoff flagged issues that touch an owned domain (e.g., data-model-radar found security-relevant patterns), escalate that owned domain to deep verification.
Signal 3: Codebase size signals. If Step 1 found >500 Swift files, Security Basics grep will produce many candidates. Plan for heavier verification time on that domain. If test-to-source ratio is below 0.2, Test Health needs deep investigation, not just file counting.
Before Step 4, print:
Risk Ranking for Own Domains:
DEEP VERIFY:
- [domain] (reason: [signal])
- [domain] (reason: [signal])
STANDARD:
- [domain]
- [domain]
LOW RISK:
- [domain]
Deep verify means: read every grep candidate in context (20+ lines), classify each as CONFIRMED/FALSE_POSITIVE/INTENTIONAL. Don't report counts -- report classified findings.
Standard means: read a sample of grep candidates (top 5 per pattern), classify those, extrapolate. Label as (sampled) in the report.
Low risk means: report counts with spot-check of top 3 candidates. Label as (spot-checked) in the report.
Run grep-based scans for the 5 domains capstone owns. Apply Verification Rule (Step 5) to every hit. Follow the risk-ranking from Step 3.5 โ deep-verify domains go first and get full candidate classification.
Code Hygiene grep-sufficient:
Grep pattern="// TODO|// FIXME|// HACK|// XXX" glob="**/*.swift" output_mode="count"
Grep pattern="as!" glob="**/*.swift"
Grep pattern="try!" glob="**/*.swift"
Also check file sizes: Glob pattern="**/*.swift" then read line counts for files that appear large. Flag files >1000 lines.
Test Health grep-sufficient:
Grep pattern="import Testing" glob="**/*Test*.swift" output_mode="count"
Grep pattern="import XCTest" glob="**/*Test*.swift" output_mode="count"
Grep pattern="Thread\.sleep|sleep\(" glob="**/*Test*.swift"
Glob pattern="**/*Tests.swift"
Glob pattern="**/*Test.swift"
Calculate test-to-source ratio: test files / (total swift files - test files).
Security Basics grep-sufficient:
Grep pattern="(api[_-]?key|secret[_-]?key|password|token)\s*[:=]\s*[\"'][^\"']+[\"']" glob="**/*.swift" -i
Grep pattern="UserDefaults.*\.(password|token|secret|apiKey)" glob="**/*.swift" -i
Grep pattern="http://(?!localhost|127\.0\.0\.1)" glob="**/*.swift"
Dependency Health grep-sufficient:
stat -f "%Sm" -t "%Y-%m-%d" Package.resolved 2>/dev/null || echo "no Package.resolved"
Grep pattern="\.exact\(|\.upToNextMajor\(|\.upToNextMinor\(|from:" path="Package.swift" output_mode="content"
Count packages in Package.resolved.
Build Health grep-sufficient:
find . -name "*.xcscheme" -not -path "*/.build/*" | wc -l
Grep pattern="targets?:" path="Package.swift" output_mode="content"
Before grading each owned domain, produce this table:
| Pattern | Grep Run? | Hits | Classified | Confirmed | False Positive | Receipt |
|---------|-----------|------|------------|-----------|----------------|---------|
| TODO/FIXME | ? | | | | | |
| try! | ? | | | | | |
| as! | ? | | | | | |
| hardcoded secrets | ? | | | | | |
| http:// URLs | ? | | | | | |
Rules:
? means the grep wasn't run yet -- cannot grade until all are filled
| Severity | HIGH Conf | MED Conf | LOW Conf |
|---|---|---|---|
| CRITICAL | -15 | -10 | -3 |
| HIGH | -8 | -5 | -1 |
| MEDIUM | -3 | -2 | 0 |
| LOW | -1 | -1 | 0 |
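A sketch of this deduction rule, with the table transcribed into a Swift switch; the types are illustrative:

```swift
// Illustrative sketch of the per-domain deduction table above.
enum Severity { case critical, high, medium, low }
enum Confidence { case high, medium, low }

func deduction(_ severity: Severity, _ confidence: Confidence) -> Int {
    switch (severity, confidence) {
    case (.critical, .high): return 15
    case (.critical, .medium): return 10
    case (.critical, .low): return 3
    case (.high, .high): return 8
    case (.high, .medium): return 5
    case (.high, .low): return 1
    case (.medium, .high): return 3
    case (.medium, .medium): return 2
    case (.low, .high), (.low, .medium): return 1
    default: return 0 // MEDIUM and LOW severity at LOW confidence deduct nothing
    }
}

// Start each owned domain at 100 and subtract per CONFIRMED finding.
func domainScore(_ confirmed: [(Severity, Confidence)]) -> Int {
    max(0, confirmed.reduce(100) { $0 - deduction($1.0, $1.1) })
}
```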
Grep patterns produce candidates, NOT confirmed issues. Before reporting:
Common false positives:
- try! in test setup code -- acceptable in tests
- http:// in XML namespaces or comments -- not an endpoint
- sleep() in test helpers for async settling -- may be intentional
- TODO markers that reference completed work -- stale comments
- as! in exhaustive switch cases -- compiler-guaranteed safe

Collect ALL file paths from ALL findings (own 5 domains + companion handoffs). Count domain appearances per file. Report files appearing in 3+ domains:
Risk Heatmap -- Files with findings in 3+ domains:
{file1} → {domain1}, {domain2}, {domain3} ({N} domains)
{file2} → {domain1}, {domain2}, {domain3} ({N} domains)
These {N} files account for {X}% of all findings.
If TEAM_SIZE > 1:
- git log --format="%an" --since="30 days ago" -- {file} | sort -u to find recent contributors
- {file} → {N} domains, {M} contributors this month, {owner or "no clear owner"}

If the roundtrip-radar handoff includes cross_cutting_patterns[], incorporate them. Flag patterns that affect 3+ workflows.
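A minimal aggregation sketch for the heatmap, assuming findings were flattened into (file, domain) pairs; the `HeatFinding` type is hypothetical:

```swift
// Illustrative sketch of the heatmap aggregation: distinct domains per file.
struct HeatFinding { let file: String; let domain: String }

func riskHeatmap(_ findings: [HeatFinding], threshold: Int = 3)
    -> [(file: String, domains: Set<String>)] {
    var domainsByFile: [String: Set<String>] = [:]
    for finding in findings {
        domainsByFile[finding.file, default: []].insert(finding.domain)
    }
    return domainsByFile
        .filter { $0.value.count >= threshold }      // 3+ domains only
        .sorted { $0.value.count > $1.value.count }  // hottest files first
        .map { (file: $0.key, domains: $0.value) }
}
```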
Build a dependency graph across ALL findings (own + companion) for --sort implement ordering.
Scan all findings for depends_on and enables fields. These may be populated by individual skills (within-skill dependencies) or inferred here (cross-skill dependencies).
Apply these rules to infer dependencies that individual skills could not see:
- If fixing finding B is a prerequisite for fixing finding A, record that B enables A.
- When a structural finding and a behavioral finding overlap, the structural fix enables the behavioral one.
- Record each depends_on entry and each enables entry. Auto-inferred edges are marked inferred: true.

Example ordering:

Fix RS-014 first (enables RS-015, RS-016)
Fix RS-003 and RS-007 (independent, parallel-safe)
This graph is consulted when the user selects --sort implement. It also informs the "relationship-aware reporting" in Step 10 (root cause findings first, symptoms indented beneath).
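A sketch of the implement-order traversal, assuming the inferred `depends_on` edges form an acyclic graph; function and parameter names are illustrative:

```swift
// Illustrative sketch of implement-order sorting: prerequisites first.
func implementOrder(dependsOn: [String: [String]], findings: [String]) -> [String] {
    var ordered: [String] = []
    var visited = Set<String>()
    func visit(_ id: String) {
        guard visited.insert(id).inserted else { return }
        for prerequisite in dependsOn[id, default: []] { visit(prerequisite) }
        ordered.append(id)  // all prerequisites are already in the list
    }
    findings.forEach(visit)
    return ordered
}

// RS-015 and RS-016 depend on RS-014 (RS-014 enables them):
print(implementOrder(
    dependsOn: ["RS-015": ["RS-014"], "RS-016": ["RS-014"]],
    findings: ["RS-015", "RS-016", "RS-003", "RS-007"]))
// ["RS-014", "RS-015", "RS-016", "RS-003", "RS-007"]
```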
| Domain | Source | Weight |
|---|---|---|
| Code Hygiene | Own scan | 10% |
| Test Health | Own scan | 15% |
| Security Basics | Own scan | 15% |
| Dependency Health | Own scan | 5% |
| Build Health | Own scan | 5% |
| Model Layer | data-model-radar handoff | 10% |
| Navigation/UX | ui-path-radar handoff | 10% |
| Data Safety | roundtrip-radar handoff | 15% |
| Visual Quality | ui-enhancer-radar handoff | 10% |
| Cross-Domain Risk | Correlation analysis | 5% |
Weight redistribution: When domains are unaudited (missing companion handoff), divide their total weight proportionally among audited domains. Example: if Data Safety (15%) and Visual Quality (10%) are missing, remaining 75% becomes 100%. Code Hygiene's 10% becomes 10/75 = 13.3%.
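A sketch of the proportional redistribution, reproducing the worked example above; the function name is illustrative:

```swift
// Illustrative sketch of proportional weight redistribution.
func redistribute(weights: [String: Double], audited: Set<String>) -> [String: Double] {
    let auditedWeights = weights.filter { audited.contains($0.key) }
    let total = auditedWeights.values.reduce(0, +)
    guard total > 0 else { return [:] }
    return auditedWeights.mapValues { $0 / total * 100 } // renormalize to 100%
}

// Data Safety (15) and Visual Quality (10) missing: remaining 75 -> 100.
let adjusted = redistribute(
    weights: ["Code Hygiene": 10, "Test Health": 15, "Security Basics": 15,
              "Dependency Health": 5, "Build Health": 5, "Model Layer": 10,
              "Navigation/UX": 10, "Data Safety": 15, "Visual Quality": 10,
              "Cross-Domain Risk": 5],
    audited: ["Code Hygiene", "Test Health", "Security Basics",
              "Dependency Health", "Build Health", "Model Layer",
              "Navigation/UX", "Cross-Domain Risk"])
print(adjusted["Code Hygiene"]!) // 13.33...
```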
1. Label the scope. The overall grade line must state how many domains were audited:
Overall: B+ [87] (6/10 domains audited -- 4 companion domains missing)
Not just Overall: B+ [87].
2. Label the depth. Each owned domain grade must state its verification depth:
Code Hygiene: A [96] (deep-verified -- all candidates classified)
Test Health: A [94] (sampled -- top 5 per pattern classified)
Build Health: A+ [100] (spot-checked -- 3 candidates verified)
3. Distinguish hygiene from quality. The 5 owned domains are surface hygiene checks. They catch "did you leave a hardcoded password" but NOT "does the backup restore lose 7 fields." When companion domains are missing, add this disclaimer:
Note: This grade covers hygiene domains only. Logic bugs, data safety,
navigation issues, and visual quality are not assessed without companion
skill handoffs. Prior audits found [N] bugs in unaudited domains.
Check HISTORY for the count of prior findings in unaudited domains. If no history, say "unknown number of bugs may exist."
4. Partial companion credit. When a companion handoff has audit_depth: partial or _verified: false on some targets, the consumed domain grade is provisional. Mark it:
Model Layer: B+ [88] (provisional -- companion audit was partial)
Start at 100. Deduct:
| Grade | Score | Grade | Score | Grade | Score |
|---|---|---|---|---|---|
| A+ | 97-100 | B+ | 87-89 | C+ | 77-79 |
| A | 93-96 | B | 83-86 | C | 73-76 |
| A- | 90-92 | B- | 80-82 | C- | 70-72 |
| D+ | 67-69 | D | 63-66 | | |
| D- | 60-62 | F | 0-59 | | |
| I | n/a | | | | |
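A sketch of the score-to-letter mapping transcribed from the table; `I` (Incomplete) is assigned separately, never derived from a numeric score:

```swift
// Illustrative sketch of the grade scale above.
func letterGrade(_ score: Int) -> String {
    switch score {
    case 97...100: return "A+"
    case 93...96: return "A"
    case 90...92: return "A-"
    case 87...89: return "B+"
    case 83...86: return "B"
    case 80...82: return "B-"
    case 77...79: return "C+"
    case 73...76: return "C"
    case 70...72: return "C-"
    case 67...69: return "D+"
    case 63...66: return "D"
    case 60...62: return "D-"
    default: return "F" // 0-59
    }
}
```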
Overrides the numeric score for ship-blocking, irreversible risks:
| Trigger | Source |
|---|---|
| Hardcoded production credentials | Security Basics (own) |
| Unencrypted secrets in UserDefaults/plist | Security Basics (own) |
| CRITICAL blocker from any companion | Companion handoff |
| Confirmed data loss/corruption path | Companion or own |
If ANY domain is Incomplete, overall grade is I (Incomplete).
Weighted average using domain weights (after redistribution), mapped to grade scale.
If HISTORY has previous snapshots, compare current grades to each:
Velocity โ Grade Over Time:
Code Hygiene: C -> B- -> B+ (3 audits, improving)
Security Basics: A -> A -> A (stable)
Test Health: C -> C -> C+ (3 audits, slowly improving)
Overall: B -> B+ (trending up)
If 3+ data points, add projection: "At this rate, you'll hit overall A- by [estimate]."
If current and previous audits have build numbers:
Build {prev} -> Build {current}:
New issues: {N}
Resolved: {N}
Grade: {prev_grade} -> {current_grade}
Diff current vs previous audit. Highlight domains that improved by >=1 letter grade OR >=10 points:
Improvements since last audit:
Code Hygiene: B- -> A- (resolved 8 TODO markers)
Test Health: C -> B- (added 12 test files)
Data Safety: D+ -> B (roundtrip-radar found and fixed 6 data loss bugs)
Always show improvements BEFORE findings. Positive reinforcement matters.
For each finding NOT present in the previous audit:
REGRESSION: {finding description}
File: {file:line}
Introduced: commit {hash} "{message}" ({date})
If TEAM_SIZE > 1, also show author and reviewer (if from a PR):
Author: {name}
PR: #{number} (if detectable from commit message)
Run git log -1 --format="%H %s %an %ai" -- {file} for each regression file.
| Recommendation | Criteria |
|---|---|
| SHIP | No CRITICALs, <=2 HIGHs, overall B- or above, no Incompletes |
| CONDITIONAL SHIP | No CRITICALs, 3-5 HIGHs, overall C+ or above, no Incompletes |
| DO NOT SHIP | Any Incomplete, OR any CRITICAL, OR >5 HIGHs, OR overall C or below |
Any Incomplete = automatic DO NOT SHIP -- no exceptions.
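A sketch of the gate as code, assuming CRITICAL/HIGH counts were tallied from confirmed findings and the overall score already reflects redistribution and grade caps:

```swift
// Illustrative sketch of the ship gate criteria above.
func shipDecision(criticals: Int, highs: Int, overallScore: Int,
                  hasIncomplete: Bool) -> String {
    // Any Incomplete or CRITICAL, >5 HIGHs, or overall C or below (score <= 76)
    // is an automatic DO NOT SHIP.
    if hasIncomplete || criticals > 0 || highs > 5 || overallScore <= 76 {
        return "DO NOT SHIP"
    }
    // No CRITICALs past this point; score is C+ (77) or above.
    return (highs <= 2 && overallScore >= 80) ? "SHIP" : "CONDITIONAL SHIP"
}
```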
Format:
## Ship Recommendation: [SHIP / CONDITIONAL SHIP / DO NOT SHIP]
### Blockers (must fix before release)
1. [Issue] -- LOE: Xh | File: path:line
### Advisories (fix soon after release)
1. [Issue] -- LOE: Xh | File: path:line
**Estimated total blocker LOE:** X hours
If any companion skills were not run:
Coverage gaps โ these domains were NOT audited:
Model Layer: Run data-model-radar for coverage
Data Safety: Run roundtrip-radar for coverage
Ship recommendation is based on {N}/10 domains. Run missing companions for full confidence.
Before writing: Per the Artifact Lifecycle rules in radar-suite-core.md (Class 3: Dated Snapshot), list any existing .agents/research/*-capstone-audit.md files. Move them to .agents/research/archive/superseded/ (create the directory if missing) before writing the new snapshot. Only ONE live capstone audit file exists at the top level at any time.
Write to .agents/research/YYYY-MM-DD-capstone-audit.md. Write sections as generated.
Report sections:
**Overall: B+** (CodeHygiene A- [91] | Security A [95] | ...)

<!-- GRADES_YAML
build: 25
date: 2026-03-23
overall: 87
code_hygiene: 92
test_health: 78
security_basics: 95
dependency_health: 88
build_health: 90
model_layer: 85
navigation_ux: null
data_safety: 72
visual_quality: null
cross_domain_risk: 80
-->
null = domain not audited. Previous reports with old 13-category format are still parseable for velocity โ map old domain names where possible, ignore the rest.
Priority formula: (Urgency x 3) + (Risk x 2) + (ROI x 2) + (1/LOE) (axis_1 only)

Section 14 -- Fix Before Shipping (axis_1 only)
This section lists every axis_1_bug finding from all companion handoffs plus capstone's own scans. The A-F grade reflects ONLY this section. Format:
## Fix Before Shipping
> Audience legend: 👤 end_user | 📖 code_reader | 🔧 future_maintainer
> All findings in this section are user-facing bugs that should be fixed before release.
### #1 โ CSV import freezes main thread on large files
**Source:** roundtrip-radar | **Axis:** axis_1_bug | **Audience:** 👤 end_user
**File:** Sources/Features/ImportExport/ImportCSVView.swift:142
**Before:** User taps Import CSV, app freezes 10-30s with no feedback, some users force-quit thinking the app crashed.
**After:** User sees progress bar with row count, can cancel mid-import, UI stays responsive.
**Current approach:** Import call runs synchronously on @MainActor from onTapGesture. CPU-bound parsing loop blocks main thread.
**Minimum fix:** Wrap in Task { @MainActor in ... }, move parsing to Task.detached(priority: .userInitiated), add progress overlay + cancel button.
**Better approach:** Follow the pattern at Sources/Managers/CloudSyncManager.swift:104-112 which shows both the @MainActor bridge and Task.detached in adjacent lines. Extract an AsyncImportOperation type if there are 2+ similar imports.
**Tradeoffs:** Apply when the operation is >200ms or involves file I/O on large inputs. Don't apply for <50ms operations.
| Urgency | Risk: Fix | Risk: No Fix | ROI | Blast Radius | Fix Effort |
|---------|-----------|--------------|-----|--------------|------------|
| 🔴 CRITICAL | 🟢 Medium | 🔴 Critical | 🚀 Excellent | ⚪ 1 file | Small |
Each finding in this section displays the full coaching schema. The rating table stays 9-column per radar-suite-core.md.
Section 14a -- Hygiene Backlog (axis_2 + axis_3)
This section does NOT affect the grade. It is a separate list organized by axis sub-type:
## Hygiene Backlog
> Audience legend: 👤 end_user | 📖 code_reader | 🔧 future_maintainer
> These findings do not block shipping. Fix opportunistically when touching the affected files.
### Scatter (axis_2) -- correct code, reorganize for clarity
#### S1 โ Empty states handled 500 lines apart in LegacyWishesView.swift
**Source:** ui-path-radar | **Axis:** axis_2_scatter | **Audience:** 📖 code_reader
**File:** Sources/Features/LegacyWishes/Views/LegacyWishesView.swift:120,480,640
**Severity:** rolling_hygiene
**Before:** Developer reading the file has to scroll between 3 regions to trace the state machine.
**After:** Single enum ViewState at top of file, single switch at top of body.
**Better approach:** Follow pattern at Sources/Views/Navigation/NavigationTypes.swift:21 (String-backed enum with per-case computed properties).
### Dead Code (axis_3_dead_code) -- unreachable branches
#### D1 โ Unreachable empty-state branch in SomeListView.swift
**Source:** ui-path-radar | **Axis:** axis_3_dead_code | **Audience:** 🔧 future_maintainer
**File:** Sources/Views/SomeListView.swift:NNN
**Severity:** backlog_hygiene
**Before:** Future developer assumes the branch handles empty collections, wastes time tracing it.
**After:** Branch deleted, comment at upstream filter site documents why downstream views don't see empty.
### Smelly (axis_3_smelly) -- reachable but poorly justified
[similar format]
Hygiene findings use a simpler format (no 9-column rating table) since they don't block shipping and don't need severity ranking. Each finding still includes its coaching and file:line citation.
Section 14b -- Audit Coverage
## Audit Coverage
Source roots scanned: Sources/
Files scanned across radars:
- ui-path-radar: 588 Swift files
- roundtrip-radar: 588 Swift files (20 workflows traced)
- capstone-radar own domains: 120 files
Patterns executed:
- reachability_trace: 42 traces (3 findings reclassified from axis_1 to axis_3_dead_code)
- whole_file_scan: 18 scans (2 findings reclassified from axis_1 to axis_2_scatter)
- branch_enumeration: 8 #if blocks enumerated
- pattern_citation_lookup: 57 lookups (47 hits, 10 generic fallbacks)
- source_root_introspection: 1 (single-root project confirmed)
Patterns skipped: none
Findings rejected at schema gate: 0
(If >0: "3 findings dropped for missing citation -- listed under 'Coaching Incomplete' below")
This section answers "what was checked" so a clean audit run is not ambiguous.
When the unified ledger (.radar-suite/ledger.yaml) exists, section 14 presents ALL findings in a single 10-column table (the standard 9 plus Source) with the Source column showing provenance. The table is sorted by impact category (Crash Risk → Data Loss → UX Broken → UX Degraded → Polish → Hygiene), then by Urgency within each category:
| # | Finding | Source | Urgency | Risk: Fix | Risk: No Fix | ROI | Blast Radius | Fix Effort | Status |
|---|---------|--------|---------|-----------|--------------|-----|--------------|------------|--------|
| | **Crash Risk** | | | | | | | | |
| 1 | Cascade delete on archived items | time-bomb | 🔴 CRITICAL | ... | ... | ... | ... | ... | |
| 2 | Force unwrap on nil photo data | roundtrip | 🟡 HIGH | ... | ... | ... | ... | ... | |
| | **Data Loss** | | | | | | | | |
| 3 | CSV export drops Room and UPC | roundtrip | 🟡 HIGH | ... | ... | ... | ... | ... | |
| 4 | InsuranceProfile not in backup | data-model | 🟡 HIGH | ... | ... | ... | ... | ... | |
| | **UX Broken** | | | | | | | | |
| 5 | Settings > Export has no back button | ui-path | 🟡 HIGH | ... | ... | ... | ... | ... | |
Category separator rows (bold text, empty rating columns) divide the table visually without breaking the single-table rule. The Source column replaces the old (skill-name) suffix that was appended to finding text.
When the ledger contains finding relationships, the impact-organized report groups related findings:
## Crash Risk (2 findings)
RS-002 [CRITICAL] Cascade delete on archived items (time-bomb-radar)
└─ symptom: RS-014 [HIGH] Force unwrap on nil photo data (roundtrip-radar)
Fix RS-002 first -- RS-014 may resolve automatically.
Root cause findings are always listed first in their impact group. Symptoms are indented beneath their root cause with a note that fixing the root cause may resolve them.
Auto-inference: On capstone startup, scan ledger for relationship patterns per the Finding Relationships protocol in radar-suite-core.md. Create links automatically and present for user confirmation.
After the impact-organized findings, offer fix organization:
How would you like to organize fixes?
1. **By crash risk (Recommended)** -- highest severity first across all skills
2. **By blast radius** -- smallest changes first for quick wins
3. **By user journey** -- group findings affecting the same workflow
4. **By skill** -- fix all data-model issues, then time-bomb, etc. (legacy)
Map test files against source files:
- Glob **/*Tests.swift and **/*Test.swift
- Strip the Tests/Test suffix to infer the type under test (e.g., ItemViewModelTests.swift → ItemViewModel)

Test Coverage Gaps:
{type} ({N} files, 0 tests) -- appears in {domain} findings
{type} ({N} files, 0 tests) -- HIGH risk (companion flagged data issues)
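A sketch of the suffix-stripping heuristic, assuming test file names follow the `TypeNameTests.swift` convention; function names are illustrative:

```swift
import Foundation

// Illustrative sketch: infer the type under test from the test file name,
// then diff against source type names collected in Step 1.
func typeUnderTest(_ testFilePath: String) -> String? {
    let base = (testFilePath as NSString).lastPathComponent
        .replacingOccurrences(of: ".swift", with: "")
    // Check "Tests" before "Test" so the longer suffix wins.
    for suffix in ["Tests", "Test"] where base.hasSuffix(suffix) {
        return String(base.dropLast(suffix.count))
    }
    return nil
}

func coverageGaps(testFiles: [String], sourceTypes: Set<String>) -> Set<String> {
    sourceTypes.subtracting(testFiles.compactMap(typeUnderTest))
}

print(typeUnderTest("Tests/ItemViewModelTests.swift") ?? "?") // ItemViewModel
```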
Capstone-radar uses a 10-column table with a Source column added to the standard 9. This is the only skill that adds Source -- it's the aggregator, so provenance matters. All other radar skills use the standard 9-column table.
| # | Finding | Source | Urgency | Risk: Fix | Risk: No Fix | ROI | Blast Radius | Fix Effort | Status |
|---|---|---|---|---|---|---|---|---|---|
| 1 | CSV export drops Room and UPC | roundtrip | 🟡 HIGH | ⚪ Low | 🟡 High | 🚀 Excellent | 🟢 2 files | Small | |
| 2 | InsuranceProfile not in backup | data-model | 🟡 HIGH | ⚪ Low | 🟡 High | 🟢 Good | 🟢 3 files | Medium | |
| 3 | TODO marker in production path | capstone | 🟢 MEDIUM | ⚪ Low | 🟢 Medium | 🚀 Excellent | ⚪ 1 file | Trivial | |
Source column values: Use short names -- capstone, data-model, ui-path, roundtrip, ui-enhancer, time-bomb. For findings from capstone's own scans, use capstone.
Single table rule still applies. The Source column eliminates the need for separate tables per companion skill. ALL findings -- from own scans and all companion handoffs -- go into ONE table. The impact-based organization (v3.0) groups rows by impact category within this single table; the Source column shows where each finding originated.
Blast Radius: Always include the number of files the fix touches (e.g., 3 files, 1 file). Count by grepping for callers/references before rating.
| Indicator | General meaning | ROI meaning |
|---|---|---|
| CRITICAL | Critical / high concern | Poor return |
| HIGH | High / notable | Marginal return |
| MEDIUM | Medium / moderate | Good return |
| LOW | Low / negligible | -- |
| PASS | Pass / positive | Excellent return |
After the ship decision, ask: "How much time do you have for fixes?"
Sort findings by (grade impact / fix effort). Output a prioritized list that fits the time budget:
You have {N} hours. Here are the {M} fixes that move your grade the most:
1. {fix} ({domain} {old_grade} -> {new_grade}) -- {effort}, {files} file(s)
2. {fix} ({domain} {old_grade} -> {new_grade}) -- {effort}, {files} file(s)
...
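A sketch of the budget fill, assuming each candidate fix carries a grade-impact estimate and an LOE in hours; the `PlannedFix` fields are hypothetical:

```swift
// Illustrative sketch of the time-budget sort: grade impact per hour,
// then greedily fill the available hours. Effort estimates must be > 0.
struct PlannedFix { let title: String; let gradeImpact: Double; let effortHours: Double }

func fitBudget(_ fixes: [PlannedFix], budgetHours: Double) -> [PlannedFix] {
    var remaining = budgetHours
    var picked: [PlannedFix] = []
    let ranked = fixes.sorted {
        $0.gradeImpact / $0.effortHours > $1.gradeImpact / $1.effortHours
    }
    for fix in ranked where fix.effortHours <= remaining {
        picked.append(fix)
        remaining -= fix.effortHours
    }
    return picked
}
```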
Every fix includes tests. When listing fix effort, include test writing time. A "Small" fix that needs tests is "Small + tests." When the user applies fixes (directly or via companion skills), verify tests were added. Untested fixes are unverified fixes.
Write .agents/ui-audit/capstone-radar-handoff.yaml:
source: capstone-radar
date: <ISO 8601>
project: <project name>
build: <build number>
recommendation: "ship" | "conditional" | "no-ship"
overall_grade: "<letter grade>"
overall_score: <numeric>
# File timestamps -- enables staleness detection by consuming skills
file_timestamps:
<file path>: "<ISO 8601 mod date>"
# one entry per unique file referenced in findings
for_roundtrip_radar:
priority_workflows:
- domain: "<e.g., Data Safety>"
grade: "<letter>"
reason: "<why this domain scored low>"
group_hint: "<optional, e.g. 'data_safety', 'error_handling'>"
for_ui_path_radar:
priority_areas:
- domain: "<e.g., Navigation/UX>"
grade: "<letter>"
reason: "<specific issues found>"
group_hint: "<optional batching suggestion>"
for_ui_enhancer_radar:
priority_views:
- domain: "<e.g., Visual Quality>"
grade: "<letter>"
reason: "<specific gaps>"
group_hint: "<optional batching suggestion>"
for_data_model_radar:
priority_models:
- domain: "<e.g., Model Layer>"
grade: "<letter>"
reason: "<specific model issues>"
group_hint: "<optional batching suggestion>"
For each unique file path referenced across all findings, record its modification date:
stat -f "%Sm" -t "%Y-%m-%dT%H:%M:%SZ" "<file path>"
Enables consuming skills to detect staleness -- if a file changed after the audit, affected findings may need re-verification.
Optional field for batching related issues. Common hints:
- security_basics -- secrets, credentials, http URLs
- test_health -- missing tests, test gaps
- code_hygiene -- TODOs, force unwraps, large files
- build_health -- scheme issues, dependency problems

Automatic: This file is always written so other audit skills can pick up where this one left off. No user action needed.
Per the Artifact Lifecycle rules in radar-suite-core.md, capstone is the LAST phase of an audit and is responsible for final cleanup:
- Sweep .radar-suite/ and .agents/research/ for stale artifacts.
- Archive single-use prose files (RESUME_PHASE_*.md, RESUME_*.md except NEXT_STEPS.md, *-v[0-9]*.md) to .radar-suite/archive/superseded/.
- Confirm only ONE *-capstone-audit.md exists at the top of .agents/research/; older ones must already be in archive/superseded/ from the Step 10 "Write Report" archive rule.
- Delete .radar-suite/checkpoint.yaml per the Checkpoint & Resume rule.
- Write .radar-suite/NEXT_STEPS.md (overwrite, no dates, no phase numbers in filename) containing the post-capstone fix session prompt. This is the ONLY single-use handoff file capstone is allowed to create.
- Durable state files (ledger.yaml, session-prefs.yaml) are in-place rewrites -- not dated or versioned.

This prevents .radar-suite/ from accumulating stale prose artifacts across runs.
After writing the handoff YAML, also write findings to .radar-suite/ledger.yaml following the Ledger Write Rules in radar-suite-core.md:
- Set impact_category and compute file_hash for each finding.

Capstone-specific ledger behavior:
- Capstone may update a finding's impact_category if its cross-skill view reveals a more accurate classification.

After presenting the report, offer follow-up actions. The report file is written to .agents/research/ without asking.

Hands-free mode guarantees no blocking prompts. Only uses Read, Grep, Glob. No Bash, no Edit, no Write. When complete:
Hands-free audit complete. Analysis ready.
Deferred: Report file writing, git history analysis.
Reply to continue with supervised steps.
Adjust ALL output (grades, findings, recommendations, domain summaries) based on USER_EXPERIENCE:
After EVERY step and EVERY commit, your NEXT output MUST be the progress banner followed by the next-step AskUserQuestion. Do not output anything else first. Do not leave a blank prompt.
After completing each step, always print this banner:
Step [N] of 10 complete: [step name]
Next: Step [N+1] -- [step name] (~[time estimate])
Step time estimates:
| Step | Name | Est. Time |
|---|---|---|
| 1 | Project Metrics | ~1 min |
| 2 | Previous Audit Check | ~1 min |
| 3 | Companion Handoff Consumption | ~1 min |
| 3.5 | Risk-Ranking | ~1 min |
| 4 | Own Domain Scans | ~3-5 min |
| 5 | Verification | ~2-5 min |
| 6 | Cross-Domain Correlation | ~2 min |
| 7 | Grading | ~1 min |
| 8 | Velocity + Regressions | ~2 min |
| 9 | Ship Recommendation | ~1 min |
| 10 | Output + Follow-up | ~3 min |
| Skill | Unique Role |
|---|---|
| data-model-radar | Are your @Model definitions correct? |
| ui-path-radar | Can users reach every feature? |
| roundtrip-radar | Does data survive the full journey? |
| ui-enhancer-radar | Does it look and feel right? |
| capstone-radar (this skill) | Can you ship? Unified grade + decision. |
Recommended audit order:
Capstone is both the entry point ("what should I audit?") and the exit point ("can I ship?"). The other radars are the deep work in between.
After every step/commit: print progress banner → AskUserQuestion → never blank prompt.
Grade honesty: State N/10 domains audited. State verification depth (deep/sampled/spot-checked). No clean A+ from surface greps alone.
Risk-ranking: Produce Step 3.5 risk-ranking table before Step 4. Verify riskiest domains first.