Defines quality criteria, structured interview protocols, naming conventions, artifact specs, and checklists for Phase 1 hardware research pipeline. Use for spec parsing and requirement gathering.
```
npx claudepluginhub babyworm/rtl-agent-team --plugin rtl-agent-team
```

This skill uses the workspace's default tool permissions.
Every ambiguity, design choice, or scope decision MUST be resolved via AskUserQuestion BEFORE proceeding. Do not assume — ask. The cost of asking is low; the cost of a wrong assumption cascades to all later phases.
Before analyzing spec documents, conduct a structured user interview to understand intent, priorities, and constraints. Ask one question per message — do not batch.
Interview sequence (adapt to context, skip if already clear from spec):
Record answers in docs/phase-1-research/design-intent.md. These answers become
the interpretive context for all spec parsing — ambiguous spec language is resolved
using the user's stated intent, not agent assumptions.
When the spec allows multiple implementation paths (algorithm choices, architecture options, protocol selections), present structured comparisons to the user:
## OPEN-1-NNN: {topic}
| Approach | Pros | Cons | Area Est. | Latency Est. | Recommendation |
|----------|------|------|-----------|-------------|----------------|
| A: {name} | ... | ... | ... | ... | |
| B: {name} | ... | ... | ... | ... | ★ Recommended |
| C: {name} | ... | ... | ... | ... | |
Trade-off summary: {1-2 sentences}
Ask user to select via AskUserQuestion. Record choice + rationale in open-requirements.json
resolution_rationale field.
Do NOT present all requirements at once. Group by functional area and seek approval in stages:
At each stage, the user can correct misinterpretations before they propagate. Only after all stages are approved, finalize iron-requirements.json.
Actively invoke domain-consult to acquire domain expert knowledge on algorithms, standards, coding tools, filter characteristics, and HW implementation trade-offs. Do not research in isolation. Domain experts provide knowledge; spec-analyst captures results as structured artifacts.
Present algorithm/tool candidates with trade-offs. Let the user make final selections. Architecture-level decisions (pipeline, block partitioning, memory hierarchy) are Phase 2's responsibility. Phase 1 surveys and recommends; Phase 2 designs.
Spawn the maximum number of agents in parallel to explore all solution paths. Every feasible approach must be investigated and compared before committing. Skip ONLY if the user specifies an exact algorithm + architecture (even then, explore at least 2 variants for validation).
AskUserQuestion MUST cover these areas (skip items already provided by user):
Mandatory 3 rounds, coordinated by rtl-architect (domain-agnostic default). If a domain chief exists (e.g., vcodec-chief-standard-expert for video-codec domain), invoke both rtl-architect AND domain chief for domain-specific validation:
- reviews/phase-1-research/research-review-r1.md
- reviews/phase-1-research/research-review-r2.md
- reviews/phase-1-research/research-review-r3.md

Review criteria per round:
User may override round count: "set iterations to N" → N rounds (minimum 1).
Phase 1 produces TWO requirement files instead of a single requirements.json:
Located at docs/phase-1-research/iron-requirements.json. Contains functional and
performance requirements that are binding constraints for ALL downstream phases.
Each iron requirement MUST have:
"id": "REQ-F-NNN" (functional) or "REQ-P-NNN" (performance) — unique, sequential"type": "functional" or "performance""description": what the requirement is"priority": "must" | "should" | "may""source": {"document": "...", "section": "...", "line": N} for traceability"acceptance_criteria": array of measurable criteria (reject vague terms like "should support", "adequate", "sufficient")"violation_policy": "user_escalation" (all P1 iron requirements use this)Located at docs/phase-1-research/open-requirements.json. Contains research topics
that Phase 2 must investigate and resolve into architecture decisions.
Each open item MUST have:
"id": "OPEN-1-NNN" — sequential"topic": what needs to be investigated"context": why this is an open question"candidates": array of ≥ 2 candidates (single candidate = not a research topic)"evaluation_criteria": metrics Phase 2 should use for comparison"related_iron": array of REQ-F/REQ-P IDs that constrain this research"resolution_expected": how this should be resolved in Phase 2After iron/open files are produced, verify:
FAIL conditions (must fix before exit):
WARN conditions (log and proceed):
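These checks can be automated. A minimal verification sketch, assuming the field layout above (the vague-term list and the helper name `verify_requirements` are illustrative, not prescribed by the skill):

```python
import re

# Vague terms that disqualify an acceptance criterion (illustrative list)
VAGUE_TERMS = re.compile(r"should support|adequate|sufficient", re.IGNORECASE)

def verify_requirements(iron: list, open_items: list) -> tuple[list, list]:
    """Return (failures, warnings) for the two Phase 1 requirement files."""
    failures, warnings = [], []
    for req in iron:
        if not re.fullmatch(r"REQ-[FP]-\d{3}", req.get("id", "")):
            failures.append(f'{req.get("id")}: malformed id')
        criteria = req.get("acceptance_criteria", [])
        if not criteria:
            failures.append(f'{req["id"]}: no acceptance criteria')
        elif any(VAGUE_TERMS.search(c) for c in criteria):
            failures.append(f'{req["id"]}: vague acceptance criteria')
        if req.get("violation_policy") != "user_escalation":
            failures.append(f'{req["id"]}: violation_policy must be user_escalation')
        if "source" not in req:
            warnings.append(f'{req["id"]}: missing source traceability')
    for item in open_items:
        # A single candidate is not a research topic
        if len(item.get("candidates", [])) < 2:
            failures.append(f'{item.get("id")}: fewer than 2 candidates')
    return failures, warnings
```

The split between `failures` and `warnings` mirrors the FAIL/WARN distinction: failures block exit, warnings are logged and the pipeline proceeds.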
- Inputs: i_ prefix (e.g., i_data, i_valid) — NOT suffix _i
- Outputs: o_ prefix (e.g., o_result, o_ready) — NOT suffix _o
- Bidirectional: io_ prefix (e.g., io_sda)
- Clocks: clk (single domain) or {domain}_clk (e.g., sys_clk) — NOT clk_i
- Resets: rst_n (single domain) or {domain}_rst_n (e.g., sys_rst_n) — NOT rst_ni
- Default domain names: sys_clk / sys_rst_n

Save to reviews/phase-1-research/research-review.md:
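These conventions are mechanically checkable. A sketch of such a lint (the helper name `check_port` and the direction labels are assumptions for illustration):

```python
import re

# direction -> required name pattern, per the prefix conventions above
PORT_PATTERNS = {
    "input":  re.compile(r"i_\w+"),
    "output": re.compile(r"o_\w+"),
    "inout":  re.compile(r"io_\w+"),
}
CLK_PATTERN = re.compile(r"(clk|\w+_clk)")      # clk or {domain}_clk
RST_PATTERN = re.compile(r"(rst_n|\w+_rst_n)")  # rst_n or {domain}_rst_n

def check_port(name: str, direction: str) -> bool:
    """True if a port name follows the prefix convention for its direction."""
    if CLK_PATTERN.fullmatch(name) or RST_PATTERN.fullmatch(name):
        return True  # clocks/resets follow their own naming, no i_/o_ prefix
    return bool(PORT_PATTERNS[direction].fullmatch(name))
```

Note that the suffixed forms `clk_i` and `rst_ni` fail both the clock/reset patterns and the direction patterns, so they are rejected as the convention requires.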
# Phase 1 Review: Research Completeness
- Date: YYYY-MM-DD
- Reviewer: spec-analyst
- Upper Spec: specs/
- Verdict: PASS | FAIL
## Feature Coverage Checklist
| Spec Section | Requirement ID | Status |
|--------------|----------------|--------|
## Findings
### [severity] Finding-N: ...
## Verdict
PASS | FAIL: [reason]
Phase 1 spec analysis MUST enumerate ALL features defined in the specification and track their implementation status throughout the pipeline:
Feature enumeration: Extract every algorithm, mode, format, or capability from the spec
Reference model coverage check (if ref model exists at P1 or provided externally):
Gap escalation: When feature coverage < 100%, MUST ask user via AskUserQuestion:
Documentation: Save docs/phase-1-research/feature-coverage.md:
| Feature | Spec Count | Model Count | Coverage | Status |
|---------|-----------|-------------|----------|--------|
| Intra modes | 8 | 4 | 50% | USER_APPROVED / MUST_IMPLEMENT |
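The coverage column and the escalation trigger can be computed as in this sketch (function names are illustrative):

```python
def feature_coverage(spec_count: int, model_count: int) -> float:
    """Fraction of spec-defined features covered by the reference model."""
    return model_count / spec_count if spec_count else 1.0

def needs_escalation(features: dict[str, tuple[int, int]]) -> list[str]:
    """Features below 100% coverage; each triggers an AskUserQuestion gap
    escalation before Phase 1 can complete."""
    return [name for name, (spec, model) in features.items()
            if feature_coverage(spec, model) < 1.0]
```

For the example row above, 4 of 8 intra modes gives 50% coverage, so "Intra modes" would be escalated to the user for a USER_APPROVED or MUST_IMPLEMENT decision.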
Every Phase 1 completion MUST include an ambiguity assessment:
- Ambiguity_Assessment with per-axis scores
- docs/phase-1-research/ambiguity-assessment.md

This is inspired by Ouroboros's AmbiguityScorer pattern:
Scoring: ambiguity_score = weighted_average(goal, constraint, ac) — higher = worse
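A sketch of the scoring formula; the 0.4/0.3/0.3 axis weights are an assumption for illustration, not values prescribed by the skill:

```python
def ambiguity_score(goal: float, constraint: float, ac: float,
                    weights: tuple[float, float, float] = (0.4, 0.3, 0.3)) -> float:
    """Weighted average of per-axis ambiguity scores; higher = worse.
    Axis weights here are illustrative defaults."""
    wg, wc, wa = weights
    return (wg * goal + wc * constraint + wa * ac) / (wg + wc + wa)

def verdict(score: float) -> str:
    """Map the overall score to the gate verdicts used below."""
    if score <= 0.3:
        return "PASS"
    if score <= 0.5:
        return "CONDITIONAL"
    return "BLOCK"
```

The `verdict` thresholds match the ambiguity-gate bands in the combined decision matrix (≤0.3 PASS, 0.3–0.5 CONDITIONAL, >0.5 BLOCK).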
After ambiguity gate (Step 7.5a) and iron/open verification (Step 7.5b) pass, run adversarial reinterpretation to surface ambiguities the initial analysis missed.
Challenges must reference the spec via source.section, not the REQ ID.
Output: challenge-report.json in .rat/scratch/stability/phase-1/.
Schema: skills/p1-spec-research/templates/challenge-report-schema.json.
Budget: max 30 challenges per pass.

genuine = (HIGH + MEDIUM challenges) - NOT_GENUINE
resolved = RESOLVED + DOCUMENTED
resolution_ratio = resolved / genuine (if genuine == 0: pass)
gate_pass = (all HIGH resolved) AND (resolution_ratio ≥ 0.8)
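The gate formulas above can be sketched as one function; the record fields `severity` and `status` are assumptions about the challenge-report schema:

```python
def adversarial_gate(challenges: list[dict]) -> bool:
    """Apply the gate math above to a list of challenge records.
    Each record is assumed to carry 'severity' (HIGH/MEDIUM/LOW) and
    'status' (RESOLVED/DOCUMENTED/NOT_GENUINE/OPEN)."""
    hi_med = [c for c in challenges if c["severity"] in ("HIGH", "MEDIUM")]
    # genuine = (HIGH + MEDIUM challenges) - NOT_GENUINE
    genuine = len(hi_med) - sum(c["status"] == "NOT_GENUINE" for c in hi_med)
    # resolved = RESOLVED + DOCUMENTED
    resolved = sum(c["status"] in ("RESOLVED", "DOCUMENTED") for c in hi_med)
    all_high_resolved = all(c["status"] in ("RESOLVED", "DOCUMENTED")
                            for c in hi_med if c["severity"] == "HIGH")
    if genuine == 0:
        return True  # nothing genuine to resolve
    return all_high_resolved and (resolved / genuine) >= 0.8
```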
Gate failure: list unresolved HIGH challenges, loop back to Step 7.7 (max 1 re-loop). After 2nd failure: escalate to user with full divergence report.
| Ambiguity Score | Adversarial Gate | Decision |
|---|---|---|
| PASS (≤0.3) | PASS | Proceed |
| PASS (≤0.3) | FAIL | BLOCK |
| CONDITIONAL (0.3-0.5) | PASS | Proceed with WARNING |
| CONDITIONAL (0.3-0.5) | FAIL | BLOCK |
| BLOCK (>0.5) | PASS | BLOCK |
| BLOCK (>0.5) | FAIL | BLOCK |
Rule: Either gate can block; neither can unblock the other.
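The matrix reduces to a small function (the verdict strings are taken from the table; the function name is illustrative):

```python
def combined_decision(ambiguity_verdict: str, adversarial_pass: bool) -> str:
    """Combine the two gates: either can block, neither can unblock the other."""
    if not adversarial_pass or ambiguity_verdict == "BLOCK":
        return "BLOCK"
    if ambiguity_verdict == "CONDITIONAL":
        return "PROCEED_WITH_WARNING"
    return "PROCEED"
```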
| Severity | Criterion | Example |
|---|---|---|
| HIGH | Different RTL behavior | Signed vs unsigned arithmetic |
| HIGH | Different interface | 32-bit vs 64-bit datapath |
| MEDIUM | Different parameterization | Fixed depth vs configurable |
| MEDIUM | Different timing | 3-stage vs 4-stage pipeline |
| LOW | Cosmetic only | Block naming differences |
Boundary rule: alternative interpretation would cause different RTL module → HIGH. Same module but different parameters → MEDIUM. Same module, same parameters → LOW.
50% of items at HIGH severity → spec is fundamentally under-specified; escalate to user.
- docs/phase-1-research/iron-requirements.json exists and is valid JSON
- docs/phase-1-research/open-requirements.json exists and is valid JSON
- Every requirement has an "id": "REQ-NNN" field
- docs/phase-1-research/io_definition.json exists and is valid JSON
- All ports use i_/o_/io_ prefix (NOT suffix)
- Clocks use {domain}_clk, resets use {domain}_rst_n
- docs/phase-1-research/timing_constraints.json exists with per-block timing targets (rough estimates)
- docs/phase-1-research/domain-analysis.md exists with cross-block dependency matrix and per-block timing targets
- reviews/phase-1-research/research-review.md saved (consolidated)
- docs/phase-1-research/solution-tree.json exists (structured JSON)
- docs/phase-1-research/candidate-comparison.md exists
- docs/phase-1-research/selected-approach.md exists
- docs/phase-1-research/literature-survey.md exists
- docs/phase-1-research/ambiguity-assessment.md saved with per-axis scores and overall ambiguity_score
- reviews/phase-1-research/stability-report.md saved
- All iron requirements carry "violation_policy": "user_escalation"