From rtl-agent-team
Rate-Distortion evaluation automation for codec algorithm comparison. Builds the reference C model encoder, runs parallel encoding simulations across multiple sequences and QP points, computes BD-PSNR/BD-rate (VCEG-M33 methodology), and generates comparison reports. Supports N-candidate comparison, a configurable encoder CLI, and opt-in SSIM/VMAF metrics.
npx claudepluginhub babyworm/rtl-agent-team --plugin rtl-agent-team

This skill uses the workspace's default tool permissions.
<Purpose>
This skill automates the full Rate-Distortion evaluation pipeline: encoder build, parallel encoding simulation across sequences and QP points, BD-PSNR/BD-rate computation, and comparison report generation.
Scope: Encoder RD evaluation only.
This skill evaluates encoder quality metrics (BD-PSNR, BD-rate, optional SSIM/VMAF).
For decoder conformance testing against JVET/JCTVC bitstreams, use /rtl-agent-team:codec-conformance-eval.
Phase-agnostic: While commonly used during rat-dse Step 3b, this skill can be invoked at any Phase where quantitative RD comparison of encoder configurations is needed — Phase 1 (algorithm exploration), Phase 2 (architecture validation), Phase 4 (fixed-point precision impact), or standalone evaluation outside the pipeline.
Execution modes: local (--mode local) and aws-batch (--mode aws-batch); see run_eval.py for invocation.
Key features: N-candidate comparison, configurable encoder CLI template, opt-in SSIM/VMAF quality metrics.
CLI template placeholders: {encoder}, {cfg}, {input}, {width}, {height}, {fps}, {frames}, {qp}, {bitstream}, {recon}, {bit_depth}, {chroma_format}
</Purpose>
<Use_When>
</Use_When>
<Do_Not_Use_When>
- RTL vs. reference-model consistency checking (use /rtl-agent-team:rtl-model-consistency instead)
- Decoder conformance testing (use /rtl-agent-team:codec-conformance-eval)
- RTL conformance testing (use /rtl-agent-team:rtl-conformance-test)
</Do_Not_Use_When>
<Why_This_Exists>
In codec design, algorithm selection has the highest impact on final quality and area. Theoretical complexity analysis (operations/pixel, gate estimates) provides useful guidance but cannot capture the full picture — actual RD performance on representative sequences is the definitive metric.
BD-PSNR/BD-rate (VCEG-M33) is the universally accepted method in the video coding community for comparing codec configurations. It normalizes across different operating points (QP values) to produce a single, meaningful comparison metric.
Without this skill, teams either skip quantitative RD evaluation (risking suboptimal algorithm selection) or manually set up evaluation infrastructure (time-consuming and error-prone). </Why_This_Exists>
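The VCEG-M33 calculation mentioned above can be sketched in a few lines: fit a cubic polynomial to PSNR versus log10(bitrate) for the anchor and test curves, then average the horizontal (rate) gap over the overlapping PSNR range. This is a minimal illustration of the method, not the skill's actual bd_rate.py; the function name and sample RD points are invented for the example.

```python
# Minimal sketch of BD-rate (VCEG-M33 style): cubic fit of
# log10(bitrate) as a function of PSNR, integrated over the
# common PSNR interval of the two curves.
import numpy as np

def bd_rate(anchor_rates, anchor_psnrs, test_rates, test_psnrs):
    la, lt = np.log10(anchor_rates), np.log10(test_rates)
    # Fit log-rate as a cubic function of PSNR (inverted RD curve).
    pa = np.polyfit(anchor_psnrs, la, 3)
    pt = np.polyfit(test_psnrs, lt, 3)
    lo = max(min(anchor_psnrs), min(test_psnrs))
    hi = min(max(anchor_psnrs), max(test_psnrs))
    # Integrate each fitted curve over the overlapping PSNR range.
    ia = np.polyval(np.polyint(pa), hi) - np.polyval(np.polyint(pa), lo)
    it = np.polyval(np.polyint(pt), hi) - np.polyval(np.polyint(pt), lo)
    avg_diff = (it - ia) / (hi - lo)      # mean log-rate gap
    return (10 ** avg_diff - 1) * 100     # percent bitrate change

# Illustrative data: test curve spends ~5% less bitrate at equal PSNR.
anchor = ([1000, 2000, 4000, 8000], [34.0, 36.5, 38.8, 40.6])
test = ([950, 1900, 3800, 7600], [34.0, 36.5, 38.8, 40.6])
print(round(bd_rate(*anchor, *test), 1))  # → -5.0
```

A negative BD-rate means the test configuration needs less bitrate than the anchor for the same quality.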
<Execution_Policy>
python3 skills/codec-rd-eval/scripts/bd_rate.py --test runs built-in unit tests
</Execution_Policy>

Encoder build (build_encoder.sh)
bash skills/codec-rd-eval/scripts/build_encoder.sh <src> <binary> [extra_cflags]
Compiles with gcc -std=c11 -O2 -Wall -Wextra -lm (C11 standard per CLAUDE.md).
Simulation execution (run_eval.py)
python3 skills/codec-rd-eval/scripts/run_eval.py <config.hjson> --mode local
python3 skills/codec-rd-eval/scripts/run_eval.py <config.hjson> --mode aws-batch
Results are written to .rat/scratch/rd-eval/results.json.
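A local-mode run plausibly fans out one encode per (sequence, QP) point; a minimal sketch of that pattern, where the function names and result shape are illustrative assumptions rather than run_eval.py's real API:

```python
# Fan out encodes over the (sequence, QP) grid with a bounded worker
# pool. Threads suffice here because real encodes would be
# subprocess-bound, not CPU-bound in Python.
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def encode_point(point):
    seq, qp = point
    # A real implementation would launch the encoder CLI via subprocess
    # and parse bitrate/PSNR from its output; here we just echo the point.
    return {"sequence": seq, "qp": qp}

def run_local(sequences, qps, max_workers=4):
    points = list(product(sequences, qps))
    with ThreadPoolExecutor(max_workers=max_workers) as ex:
        return list(ex.map(encode_point, points))

results = run_local(["BasketballDrill", "BQTerrace", "RaceHorses"],
                    [22, 27, 32, 37])
print(len(results))  # → 12 (3 sequences x 4 QPs)
```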
BD-PSNR/BD-rate calculation (bd_rate.py)
python3 skills/codec-rd-eval/scripts/bd_rate.py .rat/scratch/rd-eval/results.json --output .rat/scratch/rd-eval/bd-metrics.json
Metrics are written to .rat/scratch/rd-eval/bd-metrics.json.

Report generation
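The report step turns bd-metrics.json into a markdown comparison table. A minimal sketch of that transformation, using the BD figures from Example 1 below; the JSON schema shown is an assumption, not the skill's actual format:

```python
# Render an N-candidate comparison matrix as a markdown table from a
# bd-metrics-style dict (schema invented for illustration).
metrics = {"anchor": "SAD", "candidates": [
    {"name": "Hadamard", "bd_rate": -3.2, "bd_psnr": 0.15},
    {"name": "SATD+RDOQ", "bd_rate": -5.1, "bd_psnr": 0.24},
]}

lines = ["| Candidate | BD-rate (%) | BD-PSNR (dB) |",
         "|-----------|-------------|--------------|"]
for c in metrics["candidates"]:
    lines.append(f"| {c['name']} | {c['bd_rate']:+.1f} | {c['bd_psnr']:+.2f} |")
report = "\n".join(lines)
print(report)
```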
<Tool_Usage>
# ============================================================
# Step 1: Prerequisite validation
# ============================================================
Glob("refc/*.c") # Verify encoder source exists
Read("<test-config.hjson>") # Read test configuration
Bash("python3 -c 'import numpy; import hjson; print(\"OK\")'") # Check dependencies
# ============================================================
# Step 2: Encoder build (for each unique encoder_src)
# ============================================================
Bash("bash skills/codec-rd-eval/scripts/build_encoder.sh refc .rat/scratch/rd-eval/anchor_encoder")
Bash("bash skills/codec-rd-eval/scripts/build_encoder.sh refc .rat/scratch/rd-eval/test_encoder")
# ============================================================
# Step 3: Simulation execution (parallel)
# ============================================================
Bash("python3 skills/codec-rd-eval/scripts/run_eval.py <config.hjson> --mode local",
timeout=600000) # Up to 10 min for large evaluations
# ============================================================
# Step 4: BD metric calculation
# ============================================================
Bash("python3 skills/codec-rd-eval/scripts/bd_rate.py .rat/scratch/rd-eval/results.json --output .rat/scratch/rd-eval/bd-metrics.json")
# ============================================================
# Step 5: Report generation
# ============================================================
# Read bd-metrics.json and generate markdown report
Read(".rat/scratch/rd-eval/bd-metrics.json")
# Write report to configured output path
</Tool_Usage>
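The steps above drive each encoder through the configurable CLI template built from the placeholders listed in <Purpose>. A minimal sketch of how such a template might expand; the flag names and the substitution mechanism are assumptions, only the placeholder names come from the skill:

```python
# Hypothetical encoder command template using the skill's placeholders;
# flag names are invented for illustration.
template = ("{encoder} -c {cfg} -i {input} -w {width} -h {height} "
            "-f {fps} -n {frames} -q {qp} -b {bitstream} -r {recon} "
            "-d {bit_depth} --chroma {chroma_format}")

params = {
    "encoder": ".rat/scratch/rd-eval/test_encoder",
    "cfg": "cfg/intra_main.cfg",
    "input": "seq/BasketballDrill_832x480_50.yuv",
    "width": 832, "height": 480, "fps": 50, "frames": 64, "qp": 27,
    "bitstream": "out/bd_qp27.bin",
    "recon": "out/bd_qp27_rec.yuv",
    "bit_depth": 8, "chroma_format": "420",
}

cmd = template.format(**params)
print(cmd)
```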
**Example 1: Algorithm comparison during DSE**

```
User: "Compare RD performance of 3 H.264 intra prediction algorithm candidates"
→ Invoke /rtl-agent-team:codec-rd-eval
→ Step 1: Verify refc/ exists, generate test-config.hjson (candidates[] mode)
→ Step 2: Build 3 encoders: SAD, Hadamard, SATD+RDOQ
→ Step 3: Simulate BasketballDrill, BQTerrace, RaceHorses × QP{22,27,32,37}
→ Step 4: vs SAD (anchor): Hadamard BD-rate = -3.2%, SATD+RDOQ BD-rate = -5.1%
→ Step 5: Generate docs/phase-1-research/rd-eval-report.md (with N-candidate matrix)
→ "SATD+RDOQ achieves the best BD-rate, -5.1% over SAD. Hadamard gives -3.2%."
```

**Example 2: Fixed-point precision evaluation with SSIM**
User: "Measure quality difference between 12-bit vs 16-bit internal paths including SSIM"
→ quality_metrics: ["psnr", "ssim"] configured
→ anchor: 16-bit internal path encoder
→ test: 12-bit internal path encoder
→ BD-PSNR = -0.02 dB, BD-rate = +0.5%, SSIM delta = -0.0001
→ "12-bit path shows BD-rate +0.5% vs 16-bit, negligible SSIM difference (-0.0001). Recommend 12-bit considering gate count savings."
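For reference, the PSNR figures quoted in these examples follow the standard definition over reconstructed pixels. A minimal sketch of that formula (not the skill's internal metric code; the sample frames are synthetic):

```python
# PSNR for 8-bit video: 10*log10(MAX^2 / MSE) between reference and
# reconstructed frames.
import numpy as np

def psnr(ref, rec, max_val=255.0):
    mse = np.mean((ref.astype(np.float64) - rec.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

ref = np.full((8, 8), 128, dtype=np.uint8)
rec = ref.copy()
rec[0, 0] = 130              # one pixel off by 2 -> MSE = 4/64
print(round(psnr(ref, rec), 1))  # → 60.2
```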
**Example 3: No encoder source available**
User: "Run BD-rate comparison"
→ Step 1: refc/*.c not found
→ "No ref C model encoder source found. Run /rtl-agent-team:ref-model first to generate the reference model."
<Escalation_And_Stop_Conditions>
<Final_Checklist> Before reporting completion, verify ALL of the following:
If ANY item is unchecked → DO NOT report completion. Fix the issue first. </Final_Checklist>