Autonomous multi-round research review loop. Repeatedly reviews via Codex MCP, implements fixes, and re-reviews until positive assessment or max rounds reached. Use when user says "auto review loop", "review until it passes", or wants autonomous iterative improvement.
# Auto Review Loop: Autonomous Research Improvement

Autonomously iterate: review → implement fixes → re-review, until the external reviewer gives a positive assessment or MAX_ROUNDS is reached.

## Context: $ARGUMENTS

## Constants

All constants (MAX_ROUNDS, POSITIVE_THRESHOLD, REVIEWER_MODEL) are defined in the project's **CLAUDE.md**. Read them from there before proceeding.

- REVIEW_DOC: `AUTO_REVIEW.md` in project root (cumulative log)

## State Persistence (Compact Recovery)

Long-running loops may hit the context window limit, triggering automatic compaction. To survive this, persist state to `REVIEW_STATE.json` after each round:
```json
{
  "round": 2,
  "threadId": "019cd392-...",
  "status": "in_progress",
  "last_score": 5.0,
  "last_verdict": "not ready",
  "pending_experiments": ["screen_name_1"],
  "timestamp": "2026-03-13T21:00:00"
}
```
Write this file at the end of every Phase E (after documenting the round). Overwrite each time — only the latest state matters.
On completion (positive assessment or max rounds), set "status": "completed" so future invocations don't accidentally resume a finished loop.
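A minimal sketch of this write step in Python (`write_state` is a hypothetical helper name; the field names follow the schema above):

```python
import json
from datetime import datetime

STATE_FILE = "REVIEW_STATE.json"

def write_state(round_num, thread_id, score, verdict, pending, status="in_progress"):
    """Overwrite REVIEW_STATE.json with the latest round's state (end of Phase E)."""
    state = {
        "round": round_num,
        "threadId": thread_id,
        "status": status,
        "last_score": score,
        "last_verdict": verdict,
        "pending_experiments": pending,
        "timestamp": datetime.now().isoformat(timespec="seconds"),
    }
    with open(STATE_FILE, "w") as f:
        json.dump(state, f, indent=2)  # overwrite: only the latest state matters
    return state
```

Passing `status="completed"` at the end of the loop covers the finished case.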
## Startup: Check for Prior State

Look for `REVIEW_STATE.json` in project root:

- If `status` is "completed": fresh start (previous loop finished normally)
- If `status` is "in_progress" AND `timestamp` is older than 24 hours: fresh start (stale state from a killed/abandoned run — delete the file and start over)
- If `status` is "in_progress" AND `timestamp` is within 24 hours: resume
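The fresh-start/resume decision can be sketched as follows (`decide_startup` is a hypothetical name; assumes the ISO timestamp format shown in the state schema):

```python
import json
import os
from datetime import datetime, timedelta

def decide_startup(state_path="REVIEW_STATE.json", now=None):
    """Return 'fresh' or 'resume'; deletes a stale state file."""
    if not os.path.exists(state_path):
        return "fresh"
    with open(state_path) as f:
        state = json.load(f)
    if state.get("status") == "completed":
        return "fresh"  # previous loop finished normally
    now = now or datetime.now()
    if now - datetime.fromisoformat(state["timestamp"]) > timedelta(hours=24):
        os.remove(state_path)  # stale state from a killed/abandoned run
        return "fresh"
    return "resume"  # in-progress and recent
```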
When resuming:

- Restore `round`, `threadId`, `last_score`, and `pending_experiments` from the state file
- Read `AUTO_REVIEW.md` to restore full context of prior rounds
- If `pending_experiments` is non-empty, check whether they have completed (e.g., check screen sessions)

## Phase A: Review

On round 1, create `AUTO_REVIEW.md` with header and timestamp. Send comprehensive context to the external reviewer:
```yaml
mcp__codex__codex:
  model: REVIEWER_MODEL
  config: {"model_reasoning_effort": "xhigh"}
  prompt: |
    [Round N/MAX_ROUNDS of autonomous review loop]
    [Full research context: claims, methods, results, known weaknesses]
    [Changes since last round, if any]

    Please act as a senior ML reviewer (NeurIPS/ICML level).
    1. Score this work 1-10 for a top venue
    2. List remaining critical weaknesses (ranked by severity)
    3. For each weakness, specify the MINIMUM fix (experiment, analysis, or reframing)
    4. State clearly: is this READY for submission? Yes/No/Almost

    Be brutally honest. If the work is ready, say so clearly.
```
If this is round 2+, use mcp__codex__codex-reply with the saved threadId to maintain conversation context.
CRITICAL: Save the FULL raw response from the external reviewer verbatim (store in a variable for Phase E). Do NOT discard or summarize — the raw text is the primary record.
Then extract structured fields: the numeric score, the verdict (ready / almost / not ready), and the ranked action items.
STOP CONDITION: If POSITIVE_THRESHOLD is met (see CLAUDE.md for threshold definition) → stop loop, document final state.
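The extraction and stop check could be sketched like this, assuming the reviewer follows the requested response format (the helper names and regexes are illustrative, not part of the command):

```python
import re

def extract_fields(review_text):
    """Pull score and verdict out of the reviewer's raw response.

    Assumes the requested format ("Score: X/10", "READY ...: Yes/No/Almost");
    returns None for any field that is absent.
    """
    score_m = re.search(r"Score:\s*(\d+(?:\.\d+)?)\s*/\s*10", review_text, re.I)
    verdict_m = re.search(r"READY[^:\n]*:\s*(Yes|No|Almost)", review_text, re.I)
    score = float(score_m.group(1)) if score_m else None
    verdict = verdict_m.group(1).capitalize() if verdict_m else None
    return score, verdict

def should_stop(score, verdict, threshold):
    """Stop when the score meets POSITIVE_THRESHOLD or the verdict is a clear Yes."""
    return (score is not None and score >= threshold) or verdict == "Yes"
```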
For each action item (highest priority first): implement the minimum fix the reviewer specified (experiment, analysis, or reframing).

Prioritization rules: address weaknesses in the severity order the reviewer ranked them.

If experiments were launched: record their identifiers in `pending_experiments` (e.g., screen session names) so the next round can check whether they have completed.
Append to AUTO_REVIEW.md:
```markdown
## Round N (timestamp)

### Assessment (Summary)
- Score: X/10
- Verdict: [ready/almost/not ready]
- Key criticisms: [bullet list]

### Reviewer Raw Response
<details>
<summary>Click to expand full reviewer response</summary>

[Paste the COMPLETE raw response from the external reviewer here — verbatim, unedited.
This is the authoritative record. Do NOT truncate or paraphrase.]

</details>

### Actions Taken
- [what was implemented/changed]

### Results
- [experiment outcomes, if any]

### Status
- [continuing to round N+1 / stopping]
```
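The append step might look like this sketch (`log_round` is a hypothetical helper; only the round header and raw-response parts of the template are shown):

```python
from datetime import datetime

def log_round(round_num, summary_md, raw_response, path="AUTO_REVIEW.md"):
    """Append one round's entry; the raw response goes inside a <details> block
    so the cumulative log stays skimmable but keeps the verbatim record."""
    entry = (
        f"\n## Round {round_num} ({datetime.now():%Y-%m-%d %H:%M})\n\n"
        f"{summary_md}\n\n"
        "### Reviewer Raw Response\n<details>\n"
        "<summary>Click to expand full reviewer response</summary>\n\n"
        f"{raw_response}\n\n</details>\n"
    )
    with open(path, "a") as f:  # append mode: the log is cumulative
        f.write(entry)
```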
Write REVIEW_STATE.json with current round, threadId, score, verdict, and any pending experiments.
Increment round counter → back to Phase A.
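The overall round structure can be sketched as a loop over phase callables (a sketch only; the callables stand in for the phases described above and their names are illustrative):

```python
def auto_review_loop(review, apply_fixes, document, max_rounds, threshold, start_round=1):
    """Round skeleton: review(n) -> (score, ready), apply_fixes(n), document(n, final)."""
    for n in range(start_round, max_rounds + 1):
        score, ready = review(n)       # Phase A: external review + field extraction
        if ready or score >= threshold:
            document(n, True)          # final round log, then stop
            return ("completed", n)
        apply_fixes(n)                 # implement the minimum fixes
        document(n, False)             # Phase E: append round log, save state
    return ("max_rounds_reached", max_rounds)
```

Passing `start_round` from the saved state covers the resume case.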
When loop ends (positive assessment or max rounds):
- Set `REVIEW_STATE.json` to `"status": "completed"`
- Write the final summary to `AUTO_REVIEW.md`

Reviewer call notes:

- Always pass `config: {"model_reasoning_effort": "xhigh"}` for maximum reasoning depth
- Use `mcp__codex__codex-reply` for subsequent rounds so the reviewer keeps the conversation context:

```yaml
mcp__codex__codex-reply:
  threadId: [saved from round 1]
  model: REVIEWER_MODEL
  config: {"model_reasoning_effort": "xhigh"}
  prompt: |
    [Round N update]

    Since your last review, we have:
    1. [Action 1]: [result]
    2. [Action 2]: [result]
    3. [Action 3]: [result]

    Updated results table:
    [paste metrics]

    Please re-score and re-assess. Are the remaining concerns addressed?
    Same format: Score, Verdict, Remaining Weaknesses, Minimum Fixes.
```