Skill

tldrs-interbench-sync

Sync interbench eval coverage with tldrs capabilities. Run when tldrs gains new formats, flags, or commands. Reads the tldrs manifest as ground truth and generates minimal targeted edits to 4 interbench files.

npx claudepluginhub mistakeknot/interagency-marketplace --plugin tldr-swinton

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/tldr-swinton:tldrs-interbench-sync

Tool Access

This skill is limited to the following tools:

Bash, Read, Edit

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Sync interbench's eval coverage with the current tldrs capabilities.

SKILL.md

113 lines · ~866 tokens

Stats

Language: Python

Stars: 2

Maintenance: Excellent

Last Commit: Mar 24, 2026

interbench Sync Protocol

Sync interbench's eval coverage with the current tldrs capabilities.

Step 1: Get ground truth

tldrs manifest --pretty

Save this output — it is the single source of truth for all tldrs capabilities.

Step 2: Run gap detection

tldrs manifest | python3 /root/projects/Interverse/infra/interbench/scripts/check_tldrs_sync.py

If exit 0: report "interbench is in sync" and stop. If exit 1: continue with the gaps listed.
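The exit-code branch above can be sketched in Python. This is only an illustration: `run_pipeline` and `next_action` are hypothetical helper names, and the handling of exit codes other than 0 and 1 is an assumption, since the protocol only defines those two.

```python
import subprocess
import sys


def run_pipeline(producer: list[str], consumer: list[str]) -> int:
    """Feed the producer's stdout to the consumer and return the consumer's exit code."""
    produced = subprocess.run(producer, capture_output=True, check=True)
    checked = subprocess.run(consumer, input=produced.stdout)
    return checked.returncode


def next_action(exit_code: int) -> str:
    """Map the sync checker's exit code to the next protocol step."""
    if exit_code == 0:
        return "report: interbench is in sync"
    if exit_code == 1:
        return "continue: generate edits for the listed gaps"
    # Assumption: any other code is outside the protocol and treated as a failure.
    return f"error: unexpected exit code {exit_code}"
```

With tldrs installed, calling `run_pipeline(["tldrs", "manifest"], ["python3", "/root/projects/Interverse/infra/interbench/scripts/check_tldrs_sync.py"])` and passing the result to `next_action` would mirror the shell pipeline from this step.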

Step 3: Read target files

Read ALL 4 files to understand existing patterns before editing:

/root/projects/Interverse/infra/interbench/scripts/regression_suite.json
/root/projects/Interverse/infra/interbench/scripts/ab_formats.py
/root/projects/Interverse/infra/interbench/demo-tldrs.sh
/root/projects/Interverse/infra/interbench/scripts/score_tokens.py

Step 4: Generate edits for each gap

regression_suite.json patterns

Each query entry follows this pattern:

{
    "name": "{command}_{qualifier}",
    "description": "Human-readable description",
    "command": ["command", "entry_or_flags...", "--format", "fmt"],
    "metadata": {"tool": "tldrs", "command": "cmd", ...}
}

For (command, format) gaps: add a query using truncate_output as the default entry
For boolean flag gaps: add a query combining the flag with --format ultracompact
For zoom level gaps: add a query with --zoom Lx --format ultracompact
Use command_raw (with a {project} placeholder) only for slice, since it needs absolute paths
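As a rough sketch of the rules above, a small helper can build query entries in the documented shape. `make_query` is a hypothetical name, not part of interbench. The metadata here carries only the two keys shown in the pattern, because the remaining keys are elided in the source. Placing the entry before any extra flags is also an assumption.

```python
import json


def make_query(command, qualifier, description, extra_args=(),
               fmt="ultracompact", entry="truncate_output"):
    """Build one regression_suite.json query entry in the documented shape."""
    return {
        "name": f"{command}_{qualifier}",
        "description": description,
        # Assumption: the entry comes first, then extra flags, then --format.
        "command": [command, entry, *extra_args, "--format", fmt],
        # Only the keys shown in the pattern; real entries may carry more.
        "metadata": {"tool": "tldrs", "command": command},
    }


# A (command, format) gap: truncate_output is the default entry.
format_query = make_query(
    "context", "packed_json", "context in packed-json format", fmt="packed-json"
)

# A zoom-level gap; "L2" is a placeholder level chosen for illustration.
zoom_query = make_query(
    "context", "zoom_L2", "context at zoom L2", extra_args=["--zoom", "L2"]
)

print(json.dumps(format_query, indent=4))
```

Check the generated entry against neighbouring entries in regression_suite.json before adding it, since the existing file is the authority on naming and metadata.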

ab_formats.py patterns

The DEFAULT_FORMATS list should contain all formats from the context command. Add missing formats to the list in the existing style:

DEFAULT_FORMATS = ["ultracompact", "text", "cache-friendly", "packed-json", "columnar-json"]
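One way to work out which formats to append is sketched below. `merge_formats` is a hypothetical helper, and it assumes the context command's formats have already been pulled out of the manifest into a plain list. "new-format" is a made-up name used only for the example.

```python
def merge_formats(existing: list[str], manifest_formats: list[str]) -> list[str]:
    """Append manifest formats missing from `existing`, keeping the existing order."""
    seen = set(existing)
    merged = list(existing)
    for fmt in manifest_formats:
        if fmt not in seen:
            merged.append(fmt)
            seen.add(fmt)
    return merged


DEFAULT_FORMATS = ["ultracompact", "text", "cache-friendly", "packed-json", "columnar-json"]

# "new-format" stands in for whatever the manifest reports that the list lacks.
updated = merge_formats(DEFAULT_FORMATS, ["text", "new-format", "ultracompact"])
print(updated)
```

Appending rather than re-sorting keeps the diff to ab_formats.py minimal, in line with the protocol's goal of targeted edits.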

demo-tldrs.sh patterns

Each demo run block follows this pattern:

# -- Run N: Description --
echo "-- Run N: description --"
"$ASHPOOL" run \
  -m tool=tldrs \
  -m command=context \
  -m entry=truncate_output \
  -m format=FORMAT_NAME \
  -- $TLDRS context truncate_output \
       --project "$TLDRS_PROJECT" \
       --format FORMAT_NAME
echo

Add new run blocks before the # -- Summary -- section. Increment the run number.
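The run-block template and the placement rule can be sketched as two small Python helpers. Both function names are hypothetical. The block text mirrors the pattern shown above, and the marker string is the `# -- Summary --` comment named in this section.

```python
def render_run_block(n: int, description: str, fmt: str) -> str:
    """Render one demo run block for the context command in the documented layout."""
    lines = [
        f"# -- Run {n}: {description} --",
        f'echo "-- Run {n}: {description} --"',
        '"$ASHPOOL" run \\',
        "  -m tool=tldrs \\",
        "  -m command=context \\",
        "  -m entry=truncate_output \\",
        f"  -m format={fmt} \\",
        "  -- $TLDRS context truncate_output \\",
        '       --project "$TLDRS_PROJECT" \\',
        f"       --format {fmt}",
        "echo",
    ]
    return "\n".join(lines) + "\n"


def insert_before_summary(script: str, block: str,
                          marker: str = "# -- Summary --") -> str:
    """Place `block` directly before the summary marker, followed by a blank line."""
    idx = script.find(marker)
    if idx == -1:
        raise ValueError("summary marker not found")
    return script[:idx] + block + "\n" + script[idx:]
```

The caller still has to pick `n` by reading the last run number in demo-tldrs.sh and adding one.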

score_tokens.py patterns

For scoring hints with metrics, add a parse_* function following the existing pattern:

def parse_FORMAT_NAME(context: str) -> dict | None:
    """Extract SIGNAL from tldrs FORMAT output."""
    # Parse the signal from context
    ...

Only add parsers for formats listed in scoring_hints that have non-empty metrics.
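The filter described above might look like the following sketch. It assumes, without confirmation from the manifest schema, that `scoring_hints` maps each format name to a dict with a `metrics` list, and that parser names are `parse_` plus the format name with hyphens turned into underscores. `formats_needing_parsers` is a hypothetical helper.

```python
def formats_needing_parsers(scoring_hints: dict[str, dict],
                            existing_parsers: set[str]) -> list[str]:
    """Return formats that have non-empty metrics but no parse_* function yet."""
    needed = []
    for fmt, hint in scoring_hints.items():
        if not hint.get("metrics"):
            continue  # empty or missing metrics: no parser wanted
        # Assumed naming convention: hyphens are not valid in Python identifiers.
        parser_name = "parse_" + fmt.replace("-", "_")
        if parser_name not in existing_parsers:
            needed.append(fmt)
    return needed


# Illustrative data only; the metric names here are invented for the example.
hints = {
    "ultracompact": {"metrics": ["symbol_count"]},
    "text": {"metrics": []},
    "packed-json": {"metrics": ["entry_count"]},
}
print(formats_needing_parsers(hints, {"parse_ultracompact"}))
```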

Step 5: Verify

After all edits, re-run the sync check:

tldrs manifest | python3 /root/projects/Interverse/infra/interbench/scripts/check_tldrs_sync.py

Report the result. Exit 0 means all gaps are covered.

Common Errors

Do NOT add entries for non-eval commands (tree, search, imports, etc.)
Do NOT modify manifest.py — it is the source of truth, not a target
Do NOT remove existing entries — only add missing ones
Note that diff-context supports the json and json-pretty formats, which context does not
