Name: research
Author: borda

🏠 Borda's AI-Rig

Specialist-agent infrastructure for Python/ML OSS — the scaffolding that lets you maintain at scale without becoming a full-time reviewer.

16 specialist roles across Claude and Codex · 20+ Claude workflows · 12 Codex-native skills · 5 domain plugins — opinionated Claude Code + Codex CLI configuration for Python/ML OSS maintainers, version-controlled and self-calibrating.

Contents

🚀 What This Setup Enables
⚡ Quick Start
🔁 Daily OSS Workflow
🎯 Why
💡 Design Principles
🧩 Agents
🤖 Claude Code
🤖 Codex CLI
🤝 Claude + Codex Integration
🛠 Recommended Add-ons
📦 What's Here
🔌 Plugin Management

🚀 What This Setup Enables

Things not possible with vanilla Claude Code:

Parallel multi-specialist PR review with convergence callouts. /oss:review fans out to six specialist agents (architecture, tests, perf, docs, lint, security) plus an independent Codex pre-pass, all running in parallel. The consolidator flags every finding that two or more reviewers raised independently. You see both the per-dimension analysis and the overlap in one report.
Codex-native specialist orchestration. Codex skills now split broad work into specialist-owned context packs instead of flooding every agent with the whole repo or PR thread. code-review, develop, code-remediate, and investigate can fan out to focused QA, architecture, docs, CI, security, data, performance, research, and challenge passes, then consolidate evidence into one decision.
PR code-review-to-remediation without report-path hunting. Inside Codex, $code-review #123 writes the review artifact. Later, $code-remediate #123 +review finds the newest matching report, re-collects current PR evidence, checks out the PR locally, fetches the target branch, pre-stages merge/conflict context, asks which findings to resolve, then applies only the selected work.
Remediation workplans that assign selected findings to the right owner. Before editing, Codex groups selected findings by root cause, closure type, affected files, verification command, or merge risk. Each group gets a primary owner, verifier, context pack, and expected closure evidence, so several related review comments become one coherent fix instead of scattered edits.
Feature development that cannot skip the demo test. /develop:feature requires a failing demo test to exist and pass review before a single line of production code is written. The gate is structural — the workflow does not proceed to implementation without it.
Metric-driven experiment loops that auto-rollback on regression. /research:run proposes a change, applies it, measures the target metric, and automatically reverts if the metric regresses — then tries the next hypothesis. The loop runs unattended; you set the goal and the guard, and review the committed result.
Agent and skill calibration that measures overconfidence and workflow leaks. /foundry:calibrate and .codex/calibration/run.py score recall, precision, confidence accuracy, stale assumptions, missing gates, fake fan-out claims, and unsafe PR-remediation behavior. The offline Codex harness runs those checks in CI without contacting any LLM.
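
The convergence callout described for /oss:review can be pictured with a small sketch. This is not the plugin's consolidator: the `Finding` fields, the `convergent_findings` helper, and the choice to match findings by an exact location string are all assumptions made for illustration, and a real consolidator would likely need fuzzier matching.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # All three fields are hypothetical; the real report schema is not shown in this README.
    reviewer: str   # e.g. "security"
    location: str   # e.g. "loader.py:42"
    summary: str


def convergent_findings(findings, min_reviewers=2):
    """Map each location to the reviewers who flagged it, keeping only
    locations raised by at least `min_reviewers` distinct reviewers."""
    reviewers_by_location = defaultdict(set)
    for finding in findings:
        reviewers_by_location[finding.location].add(finding.reviewer)
    return {
        location: sorted(reviewers)
        for location, reviewers in reviewers_by_location.items()
        if len(reviewers) >= min_reviewers
    }


findings = [
    Finding("security", "loader.py:42", "unvalidated path"),
    Finding("tests", "loader.py:42", "no test for a bad path"),
    Finding("docs", "README.md:10", "stale install command"),
]
print(convergent_findings(findings))
# → {'loader.py:42': ['security', 'tests']}
```

Using a set per location means one reviewer repeating itself never counts as convergence; only distinct reviewers do.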
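
The propose, apply, measure, revert cycle described for /research:run can be sketched as a loop over an in-memory state. Everything here is a stand-in: `run_experiments`, the dict that plays the role of the working tree, the toy `measure` function, and the reading of "guard" as a boolean check on the state are assumptions, not the plugin's actual mechanism.

```python
import copy


def run_experiments(state, hypotheses, measure, guard=None):
    """Apply each hypothesis in turn and keep it only if the metric does not
    regress and the optional guard still holds; otherwise restore the snapshot.

    state      -- dict standing in for the working tree (hypothetical)
    hypotheses -- list of (name, apply_fn); apply_fn mutates state in place
    measure    -- callable(state) -> number, higher is better
    guard      -- optional callable(state) -> bool that must stay true
    """
    best = measure(state)
    log = []
    for name, apply_change in hypotheses:
        snapshot = copy.deepcopy(state)      # checkpoint before the change
        apply_change(state)
        score = measure(state)
        guard_ok = guard is None or guard(state)
        if score < best or not guard_ok:
            state.clear()
            state.update(snapshot)           # roll back to the checkpoint
            log.append((name, score, "reverted"))
        else:
            best = score
            log.append((name, score, "kept"))
    return best, log


def measure(state):
    # Toy metric: distance from an arbitrary target, negated so higher is better.
    return -(abs(state["depth"] - 4) + abs(state["width"] - 16))


state = {"depth": 2, "width": 8}
hypotheses = [
    ("deeper", lambda s: s.update(depth=4)),
    ("narrower", lambda s: s.update(width=4)),
    ("wider", lambda s: s.update(width=16)),
]
best, log = run_experiments(state, hypotheses, measure)
print(best, state)
# → 0 {'depth': 4, 'width': 16}
```

The "narrower" change lowers the metric from -8 to -12, so it is rolled back before "wider" is tried; the log records each attempt with its score and outcome.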

vs. vanilla Claude Code
