From rtl-agent-team
Provides verification policies, module graduation gates, coverage targets, synthesis estimation rules, and checklists for Phase 5 RTL three-stage pipeline. Reference for module and top-level verification.
npx claudepluginhub babyworm/rtl-agent-team --plugin rtl-agent-team

This skill uses the workspace's default tool permissions.
Stage 1 (Module): Each module independently verified in parallel across 9 categories. Stage 2 (Top): System-level verification after ALL modules graduate. Stage 3 (Final): Compliance review + summary generation.
Core principle: module-level verification first, top-level only after module graduation. A module "graduates" when ALL of its verification checks PASS (PARTIAL_PASS is accepted for AC-level checks, with PARTIAL Critical/High ac_ids downgraded to WARNING). Only graduated modules participate in top-level integration; this avoids wasting top-level simulation time on modules with known bugs.
Applied at both module-level (Stage 1) and top-level (Stage 2):
V1: Lint (final comprehensive) → lint-checker
V2: SVA Completion + Formal → sva-extractor + eda-runner
V3: CDC/RDC Analysis → cdc-checker + constraint-writer
V4: Protocol Compliance → protocol-checker (if bus interfaces)
V5: Functional Regression (Tier 3/4) → testbench-dev + eda-runner + func-verifier
V6: Coverage Analysis → coverage-analyst + testbench-dev
V7: Performance Verification → perf-verifier + eda-runner
V8: Synthesizability + PPA Estimation → eda-runner + synthesis-reporter
V9: Code Review + Refactoring → rtl-critic + rtl-p4s-refactor
Parallel Group A: V1(Lint) + V2(SVA/Formal) + V3(CDC) + V4(Protocol) + V8(Synth Est.)
Sequential: V5(Functional) starts after V1 pass (lint-clean required for sim)
Incremental: V6(Coverage) starts as V5 data arrives
Sequential: V7(Performance) after V5 pass (functional correctness required)
Final: V9(Code Review) after V1-V8 results inform review scope
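The scheduling rules above can be sketched as a small dependency graph. This is a minimal illustration, not part of the toolchain: check IDs mirror V1-V9, but the wave model simplifies V6, which in practice starts incrementally as V5 data arrives rather than waiting for V5 to finish.

```python
# Hypothetical sketch of the Stage 1 check dependencies described above.
DEPS = {
    "V1": [], "V2": [], "V3": [], "V4": [], "V8": [],   # Parallel Group A
    "V5": ["V1"],                   # functional sim requires lint-clean RTL
    "V6": ["V5"],                   # coverage consumes V5 regression data
    "V7": ["V5"],                   # performance needs functional correctness
    "V9": ["V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8"],  # review last
}

def schedule_waves(deps):
    """Group checks into waves; each wave depends only on earlier waves."""
    done, waves = set(), []
    while len(done) < len(deps):
        wave = sorted(c for c, d in deps.items()
                      if c not in done and all(x in done for x in d))
        waves.append(wave)
        done.update(wave)
    return waves
```

Running `schedule_waves(DEPS)` reproduces the grouping above: Group A first, then V5, then V6 and V7, then V9.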
A module graduates when ALL nine verification categories PASS. PARTIAL_PASS is accepted for V5 AC-level checks and is recorded as a WARNING, not a FAIL.
On FAIL: invoke rtl-p4s-bugfix (feedback loop, max 2 per module). After fix, re-verify ONLY the failed categories (not all 9).
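The FAIL-handling rule above can be expressed as a small decision helper. This is an illustrative sketch only; the status strings and function name are assumptions, not part of the rtl-agent-team API.

```python
MAX_FEEDBACK_LOOPS = 2  # per-module cap stated in the policy above

def next_action(results, loops):
    """Decide what to do given per-check results {check: status}.

    Returns ("graduate", []), ("bugfix+reverify", failed_checks),
    or ("escalate", failed_checks) once the loop cap is exhausted.
    """
    failed = [c for c, s in results.items() if s == "FAIL"]
    if not failed:
        return ("graduate", [])
    if loops >= MAX_FEEDBACK_LOOPS:
        return ("escalate", failed)       # hand off: loop budget exhausted
    return ("bugfix+reverify", failed)    # re-run ONLY the failed categories
```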
When structured acceptance_criteria (with ac_id) exist in iron-requirements, they are checked at two points: module graduation (Stage 1) and the Stage 3 audit (final, pre-P6).
All top-level checks PASS → proceed to Stage 3. On FAIL → classify and fix:
{
"module": "{module}",
"status": "pending",
"checks": {
"v1_lint": "pending",
"v2_sva_formal": "pending",
"v3_cdc": "pending",
"v4_protocol": "pending|n/a",
"v5_functional": "pending",
"v6_coverage": "pending",
"v7_performance": "pending",
"v8_synth_est": "pending",
"v9_code_review": "pending"
},
"feedback_loops": 0,
"graduated": false
}
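A graduation check over this per-module record might look like the following sketch. It is illustrative only: the PARTIAL_PASS tolerance for `v5_functional` follows the AC-level rule above, and the exact status strings are assumptions.

```python
def graduated(record):
    """True when every applicable check in the record passes.

    "n/a" checks (e.g. v4_protocol without bus interfaces) are skipped;
    PARTIAL_PASS is tolerated only for v5_functional, where it is
    downgraded to a WARNING rather than a FAIL.
    """
    for check, status in record["checks"].items():
        s = status.lower()
        if s in ("pass", "n/a"):
            continue
        if check == "v5_functional" and s == "partial_pass":
            continue  # WARNING, not FAIL, per the AC-level rule
        return False  # pending, fail, or an unexpected status blocks graduation
    return True
```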
Long test suites split by scenario category:
| Category | Description | Typical Vector Count |
|---|---|---|
| basic | Normal operation, happy path | 50-100 |
| corner_case | Boundary conditions, edge cases | 100-200 |
| stress | Maximum throughput, back-to-back, full FIFO | 200-500 |
| error_handling | Invalid inputs, error injection, recovery | 50-100 |
Each scenario category runs as independent parallel agent. Multi-seed regression per scenario (5 seeds default: 1, 42, 123, 1337, 65536). Total: M modules × S scenarios × 5 seeds = massive parallelism. Early termination: >5% failure rate → halt and report.
For very large modules, further split by feature within each category.
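The M × S × seeds expansion and the >5% early-termination rule can be sketched in a few lines. Function names are illustrative; only the seed list and threshold come from the policy above.

```python
SEEDS = [1, 42, 123, 1337, 65536]  # default 5-seed set from the policy

def job_matrix(modules, scenarios, seeds=SEEDS):
    """Expand modules x scenarios x seeds into independent regression jobs."""
    return [(m, s, seed) for m in modules for s in scenarios for seed in seeds]

def should_halt(failures, completed, threshold=0.05):
    """Early termination: halt once >5% of completed jobs have failed."""
    return completed > 0 and failures / completed > threshold
```

For 6 modules and 4 scenarios this yields 6 × 4 × 5 = 120 jobs, matching the Group B count in the concurrency example below.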
Minimum 3 rounds: Draft → Strengthen → Harden.
run_syn.sh or formal scripts handle sv2v internally (Layer 2). _v2v.v files are generated by scripts, not manually.

| Metric | Target | Evaluated On |
|---|---|---|
| Line coverage | ≥ 90% | Post-exclusion |
| Toggle coverage | ≥ 80% | Post-exclusion |
| FSM coverage | ≥ 70% | Post-exclusion |
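The targets in the table above can be checked mechanically. A minimal sketch, with the metric keys as illustrative assumptions and the thresholds taken from the table:

```python
TARGETS = {"line": 90.0, "toggle": 80.0, "fsm": 70.0}  # post-exclusion targets

def coverage_gaps(measured, targets=TARGETS):
    """Return metrics below target as {metric: (measured, target)}."""
    return {m: (v, targets[m]) for m, v in measured.items()
            if m in targets and v < targets[m]}
```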
Iterative coverpoint refinement (minimum 3 rounds).
Generate additional tests for HIGH priority gaps. Re-run regression for new tests.
When convergence is detected (2 consecutive iterations with < 0.5% improvement),
apply Coverage Exclusion Protocol per rtl-p5s-coverage-policy: classify unreachable bins,
generate exclusion files, document in reviews/phase-5-verify/{module}-coverage-exclusions.md,
and report both raw and post-exclusion numbers.
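The convergence rule above (two consecutive iterations with < 0.5% improvement) can be sketched as a window check over the per-iteration coverage history. The function name and signature are illustrative.

```python
def converged(history, eps=0.5, window=2):
    """Detect coverage convergence per the rule above.

    history: coverage percentages, one per refinement iteration.
    Converged when the last `window` iteration-to-iteration deltas
    are each below `eps` percentage points.
    """
    if len(history) < window + 1:
        return False  # not enough iterations to judge
    deltas = [b - a for a, b in zip(history, history[1:])]
    return all(d < eps for d in deltas[-window:])
```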
Both Module-level (V8) and Top-level (T8):
run_syn.sh --skip-if-unavailable — handles tool selection and sv2v internally (Layer 2, per syn-tool-profiles):
Module-level (Stage 1 V8):
→ Always: synthesis estimation with NanGate45 + NAND2 gate count
→ SDC: per-module clock/IO constraints
Top-level (Stage 2 T8):
→ Always: full synthesis estimation with NanGate45 + SDC
→ User requested full synthesis? → additionally export netlist + JSON report
→ Area metric: ALWAYS NAND2-FO2 gate equivalents (NOT LUTs, NOT raw cell count)
Theoretical maximum concurrent agents for M modules, S scenarios:
Stage 1 Group A: M × 5 checks
Stage 1 Group B: M × S scenarios × 5 seeds
Stage 1 Group C: M × 2 checks
Stage 1 Group D: M × 1
Example: 6 modules, 4 scenarios
Group A: 30, Group B: 120, Group C: 12, Group D: 6
Peak: ~168 (practical limit: ~20-30 via run_in_background)
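The arithmetic behind the example can be captured in one small function; the group sizes are exactly those listed above.

```python
def peak_agents(modules, scenarios, seeds=5):
    """Theoretical Stage 1 peak if every group ran at once (upper bound)."""
    group_a = modules * 5                 # Group A: 5 static checks per module
    group_b = modules * scenarios * seeds # Group B: functional regression
    group_c = modules * 2                 # Group C: coverage + performance
    group_d = modules * 1                 # Group D: code review
    return group_a + group_b + group_c + group_d
```

For the example above, `peak_agents(6, 4)` gives 30 + 120 + 12 + 6 = 168, before the practical run_in_background cap of ~20-30 is applied.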
Modules that pass all Group A checks can start Group B immediately without waiting for other modules' Group A. Each module progresses independently.
| Failure Type | Scope | Fix Approach | Re-verify |
|---|---|---|---|
| UNIT_FIX (lint) | Single module V1 | rtl-coder fix | V1 only |
| UNIT_FIX (SVA) | Single module V2 | rtl-p4s-bugfix | V2 only |
| UNIT_FIX (CDC) | Single module V3 | rtl-coder add sync | V3 only |
| UNIT_FIX (sim) | Single module V5 | rtl-p4s-bugfix | V5 + V6 |
| INTEGRATION_FIX | Cross-module | rtl-p4s-bugfix | Affected Vx + Stage 2 |
| DESIGN_FIX | Architecture | STOP → user | All (after upper phase fix) |
Independent UNIT_FIX failures in different modules: fix in parallel. Same-module failures: fix sequentially within a single task. INTEGRATION_FIX: always sequential (cross-module dependencies).
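The parallel-vs-sequential rule above can be sketched as a grouping step. This is illustrative only: each failure is a (module, failure_type) pair, and the function name is an assumption.

```python
def plan_fixes(failures):
    """Split failures per the scheduling rule above.

    UNIT_FIX failures are grouped by module: different modules fix in
    parallel, while multiple failures in the same module form one
    sequential task. INTEGRATION_FIX and DESIGN_FIX go to a single
    sequential queue (cross-module dependencies).
    """
    parallel_by_module, sequential = {}, []
    for module, ftype in failures:
        if ftype == "UNIT_FIX":
            parallel_by_module.setdefault(module, []).append((module, ftype))
        else:
            sequential.append((module, ftype))
    return parallel_by_module, sequential
```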
When P5 verification reveals a μArch-level issue that cannot be fixed in RTL alone (e.g., pipeline balance infeasible, missing metadata in FIFO struct, architectural throughput bottleneck):
reviews/phase-5-verify/uarch-feedback-{module}.md with:
Every feedback loop iteration MUST produce a decision record at
.rat/scratch/phase-5/feedback-loop-decision-{N}.md with:
# Feedback Loop Decision #{N}
- Module: {module}
- Check: V{x} ({category})
- Classification: UNIT_FIX | INTEGRATION_FIX | DESIGN_FIX | UARCH_FIX
- Root cause: {description}
- Fix applied: {description of change}
- Alternatives considered: {rejected options with rationale}
- Affected artifacts: {list of modified files}
- Re-verification scope: {which checks to re-run}
This enables post-mortem analysis of verification efficiency and identifies recurring patterns that should become preventive rules.
When invoked from rat-auto-design, state is tracked in .rat/state/rat-auto-design-state.json:
{
"current_phase": 5,
"completed_sub_phases": ["stage-1-module-a", ...],
"pending_sub_phases": ["stage-2-integration", "stage-3-compliance"],
"fix_history": [
{"sub_phase": "stage-1-v2", "module": "module_a", "fix_count": 1, "status": "resolved"}
]
}
This enables resume: re-read state and continue from next pending sub-phase.
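Resume can be sketched as a read of the state file shown above. A minimal sketch, assuming only the `pending_sub_phases` field documented here; the helper name is illustrative.

```python
import json

def next_sub_phase(state_path):
    """Read the rat-auto-design state file and return the next pending
    sub-phase, or None when Phase 5 has no pending work left."""
    with open(state_path) as f:
        state = json.load(f)
    pending = state.get("pending_sub_phases", [])
    return pending[0] if pending else None
```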
If commercial simulator available and UVM mandated, invoke /rtl-agent-team:rtl-p5s-uvm-verify
alongside V5. UVM is NOT a replacement for cocotb regression — both provide complementary coverage.
Stage 3 includes a Formal Traceability Audit that gates P6 entry:
This ensures no Critical/High requirement ships without at least one verification artifact.