Skill

agents-meet-rl

Troubleshoots agentic-RL training, evaluation, and experiment design for LLM agents. Routes symptoms to fixes anchored in a corpus of RL methods and frameworks.

ai-ml

research

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/agents-meet-rl:agents-meet-rl

User invocable

Model invocable

Inline context

Default effort

Supporting Files

SKILL.md

67 lines · ~870 tokens

Stats

Language: HTML

Stars: 1,623

Forks: 63

Maintenance: Excellent

Last Commit: Jun 20, 2026

What this is

A corpus-anchored handbook for diagnosing problems and selecting methods. It supplies knowledge only: it does not read or run your training, and it cannot inspect your logs, wandb, or live metrics. You bring the symptom; it returns likely causes, checks, and cited fixes for you to apply.

Where things are

problems/_INDEX.md: symptom → file routing, grouped under training/, evaluation/, and research-workflow/. Start here.
problems/<cat>/<file>.md: per-symptom files. Most follow Symptoms → Root causes → Diagnosis → Fixes → References; knob, decision, modality, eval-checklist, and research-workflow files use task-oriented structures instead.
references/_INDEX.md + references/<cat>.md: per-category project lists with full metadata. Each entry carries an Idea: line, one sentence on its distinctive contribution, grounded in the paper or repo. Use these files to select a framework or benchmark, to look up project names not routed via problems/_INDEX.md, and to answer "what's the idea behind X" by quoting the entry's Idea: line.
database.json: machine-readable, 312 entries (each with a takeaway field mirroring the Idea: line) plus 3 paper-only algorithms (DAPO, Dr.GRPO, VAPO) whitelisted in scripts/lint_skill.py.
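
A lookup against database.json might be sketched as below. The source does not document the file's schema beyond the takeaway field, so the top-level list layout and the name key are assumptions, and the sample entries are placeholders rather than real corpus rows.

```python
import json

# Hypothetical sample standing in for database.json.
# Assumed shape: a list of objects with "name" and "takeaway" keys.
SAMPLE_JSON = """
[
  {"name": "example-project-a", "takeaway": "Placeholder idea sentence A."},
  {"name": "example-project-b", "takeaway": "Placeholder idea sentence B."}
]
"""


def find_entries(entries, name):
    """Return every entry whose name matches, ignoring case."""
    wanted = name.strip().lower()
    return [e for e in entries if e.get("name", "").lower() == wanted]


def describe(entries, name):
    """Quote the takeaway for a name, or say plainly that it is missing."""
    matches = find_entries(entries, name)
    if not matches:
        return f"{name} is not in the corpus."
    return " / ".join(f"{e['name']}: {e['takeaway']}" for e in matches)


entries = json.loads(SAMPLE_JSON)
print(describe(entries, "Example-Project-A"))
print(describe(entries, "unknown-project"))
```

Returning an explicit "not in the corpus" message, instead of an empty result, keeps the lookup aligned with the rule against fabricating entries.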

Citing fixes

Name the algorithm or idea, then anchor it with whatever canonical URLs exist for that entry. The typical anchor is github + arxiv + org + date. Paper-only algorithms (those in the whitelist) get the paper URL + org + date, and tools or environments without papers get github + org + date.

Examples:

Project with paper (typical): Adapt Search-R1's outcome-only reward — code · paper · UIUC/Google · 2025.3.

Paper-only algorithm (whitelist): Try DAPO's clip-higher — paper · ByteDance Seed · 2025.3.

Tool / environment without paper: Run rollouts in atropos — code · Nous Research · 2025.4.

Cite at the idea level, not at the level of paper sections or file paths inside repos, which go stale. If an entry isn't in the corpus, say so; don't fabricate one.
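
The three citation shapes above could be produced by one helper that emits only the anchors an entry actually has. This is a sketch under assumptions: the github, arxiv, org, and date keys are hypothetical field names, the Markdown link syntax is one possible rendering, and the URLs are placeholders. Only the org and date values are taken from the examples above.

```python
def anchors(entry):
    """Build the anchor part of a citation from whichever fields exist."""
    parts = []
    if entry.get("github"):  # projects, tools, environments
        parts.append(f"[code]({entry['github']})")
    if entry.get("arxiv"):  # projects with papers, paper-only algorithms
        parts.append(f"[paper]({entry['arxiv']})")
    parts.append(entry["org"])
    parts.append(entry["date"])
    return " · ".join(parts)


# Placeholder URLs; field names are assumptions, not the real schema.
search_r1 = {
    "github": "https://example.com/repo",
    "arxiv": "https://example.com/paper",
    "org": "UIUC/Google",
    "date": "2025.3",
}
dapo = {"arxiv": "https://example.com/paper", "org": "ByteDance Seed", "date": "2025.3"}
atropos = {"github": "https://example.com/repo", "org": "Nous Research", "date": "2025.4"}

for entry in (search_r1, dapo, atropos):
    print(anchors(entry))
```

Because a missing URL is skipped instead of guessed, a paper-only algorithm never receives a code link and a tool without a paper never receives a paper link.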

If two corpus entries share a name (e.g. ARPO appears as both a reasoning RL method and a GUI-agent training method), they are different works, so disambiguate by including the org and paper URL.
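
Name collisions like this can be detected mechanically before citing. The sketch below groups entries by lowercased name and reports any name with more than one entry. The name, category, and org keys are assumed, and the org values for the two ARPO rows are placeholders because the source does not state them.

```python
from collections import defaultdict


def ambiguous_names(entries):
    """Map each lowercased name that occurs more than once to its entries."""
    by_name = defaultdict(list)
    for entry in entries:
        by_name[entry["name"].lower()].append(entry)
    return {name: group for name, group in by_name.items() if len(group) > 1}


# Hypothetical rows; the two ARPO orgs are placeholders.
entries = [
    {"name": "ARPO", "category": "reasoning RL method", "org": "Org A (placeholder)"},
    {"name": "ARPO", "category": "GUI-agent training method", "org": "Org B (placeholder)"},
    {"name": "DAPO", "category": "paper-only algorithm", "org": "ByteDance Seed"},
]

clashes = ambiguous_names(entries)
for name, group in clashes.items():
    for entry in group:
        print(f"{entry['name']} ({entry['category']}, {entry['org']})")
```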

Staleness

Snapshot date: 2026-06-20. If the user mentions a project or paper released after that, flag explicitly that this skill's corpus may not cover it.
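
The staleness rule can be expressed as a date comparison. The sketch below assumes release dates use the YYYY.M form seen in the citation examples. Because that form has only month precision, a release in the snapshot month itself is flagged as well, which is a conservative choice and not something the source specifies.

```python
from datetime import date

SNAPSHOT = date(2026, 6, 20)  # snapshot date stated above


def may_postdate_snapshot(release, snapshot=SNAPSHOT):
    """True if a 'YYYY.M' release could fall after the snapshot date."""
    year_text, month_text = release.split(".")
    year, month = int(year_text), int(month_text)
    # Month precision only: the snapshot month itself counts as uncertain.
    return (year, month) >= (snapshot.year, snapshot.month)


for release in ("2025.3", "2026.6", "2026.7"):
    if may_postdate_snapshot(release):
        print(release, "flag: the corpus may not cover it")
    else:
        print(release, "released before the snapshot month")
```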
