By llv22
Autonomous ML research pipeline: idea discovery → experiment → review → paper writing
Autonomous multi-round research review loop. Repeatedly reviews via Codex MCP, implements fixes, and re-reviews until positive assessment or max rounds reached. Use when user says "auto review loop", "review until it passes", or wants autonomous iterative improvement.
Workflow 1: Full idea discovery pipeline. Orchestrates research-lit → idea-creator → novelty-check → research-reviewer to go from a broad research direction to validated, pilot-tested ideas. Use when user says "idea discovery pipeline" or wants the complete idea exploration workflow.
Workflow 3: Full paper writing pipeline. Orchestrates paper-plan → paper-figure → paper-write → paper-compile → paper-improver to go from a narrative report to a polished, submission-ready PDF. Use when user says "write paper pipeline", "paper writing", or wants the complete paper generation workflow.
Full research pipeline: Workflow 1 (idea discovery) → implementation → Workflow 2 (auto review loop). Goes from a broad research direction to validated, reviewed research. Use when user says "full pipeline", "end-to-end research", or wants the complete autonomous research lifecycle. Does NOT include paper writing (Workflow 3) — invoke /autor.paper-writing separately after this completes.
Download and set up venue-specific LaTeX templates. Supports iclr2026, neurips2026, icml2026, emnlp2026, or a custom venue. Use when user says "download template", "setup venue", or wants to configure a new conference template.
Use this agent when the paper-writing pipeline needs to iteratively improve a compiled paper. Runs REVIEWER_MODEL xhigh review, implements fixes, and recompiles for MAX_IMPROVEMENT_ROUNDS rounds to polish writing quality, fix theoretical inconsistencies, and soften overclaims.
Use this agent when the idea-discovery pipeline needs external critical feedback on research ideas, papers, or experimental results. Invokes REVIEWER_MODEL via Codex MCP with xhigh reasoning to act as a senior ML reviewer.
Analyze ML experiment results, compute statistics, generate comparison tables and insights. Use when user says "analyze results", "compare", or needs to interpret experimental data.
Generate and rank research ideas given a broad direction. Use when user says "找idea", "brainstorm ideas", "generate research ideas", "what can we work on", or wants to explore a research area for publishable directions.
Monitor running experiments, check progress, collect results. Use when user says "check results", "is it done", "monitor", or wants experiment output.
Verify research idea novelty against recent literature. Use when user says "查新", "novelty check", "有没有人做过", "check novelty", or wants to verify a research idea is novel before implementing.
Compile LaTeX paper to PDF, fix errors, and verify output. Use when user says "编译论文", "compile paper", "build PDF", "生成PDF", or wants to compile LaTeX into a submission-ready PDF.
Uses power tools
Uses Bash, Write, or Edit tools
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
Let Claude Code do research while you sleep. Wake up to find your paper scored, weaknesses identified, experiments run, and narrative rewritten — autonomously.
A Claude Code plugin for autonomous ML research workflows. Orchestrates cross-model collaboration — Claude Code drives the research while an external LLM (via Codex MCP) acts as a critical reviewer. Also supports alternative model combinations (e.g., GLM + GPT, GLM + MiniMax) — no Claude API required.
Why cross-model? A single model reviewing its own output creates blind spots. Two complementary models — Claude Code for fast execution, GPT-5.4 xhigh for rigorous critique — produce better outcomes than either alone. Going from 1 to 2 models is the biggest gain; adding more gives diminishing returns.
# 1. Clone and enter the project
git clone https://github.com/llv23/AutoResearchWithEyes.git
cd AutoResearchWithEyes
# 2. Set up Codex MCP (for cross-model review)
npm install -g @openai/codex
codex auth login
# Codex MCP auto-configures from .mcp.json when running in the project directory
# 3. Launch Claude Code — skills and commands are auto-discovered
claude
# Run any workflow command:
> /autor.idea-discovery "your research direction" # Plain text → ./autor.idea_discovery/
> /autor.idea-discovery path/to/autor.idea_discovery/task_spec.md # Spec file → path/to/autor.idea_discovery/
> /autor.auto-review-loop # Review → fix → re-review overnight
> /autor.paper-writing "NARRATIVE_REPORT.md" # Narrative → polished PDF
> /autor.research-pipeline "your research direction" # Full end-to-end pipeline
Each command produces a self-contained output folder — the input spec, intermediate results, final reports, and live status tracking are all organized in one place:
path/to/
└── autor.idea_discovery/ # Self-contained output folder
├── task_spec.md # Copy of input (portability & reproducibility)
├── task_status.md # Live status, decisions, linked idea rankings
├── LITERATURE_SURVEY.md # Phase 1: landscape summary with paper table
├── IDEA_REPORT.md # Phase 5: ranked ideas + experiment plan
└── REVIEW_<topic>_<date>.md # Phase 4: external reviewer feedback
task_status.md is updated after each phase — it tracks pipeline progress, key decisions, and includes a ranked idea table where each idea links to its full description in IDEA_REPORT.md.
See click-through/case-0/ for a complete worked example.
All workflow commands follow the same self-contained output pattern: the output folder includes task_spec.md (input), task_status.md (live status with linked rankings), and all generated results — making it portable, shareable, and reproducible.
| Command | Input | Output Folder |
|---|---|---|
/autor.idea-discovery path/to/spec.md | spec.md | path/to/autor.idea_discovery/ |
/autor.idea-discovery "plain text" | (inline) | ./autor.idea_discovery/ |
/autor.idea-discovery spec.md — output: custom/ | spec.md | custom/ |
Override the output folder with
— output: path/to/output/appended to any command.
See Setup for full details.
/autor.idea-discovery, /autor.auto-review-loop, /autor.paper-writing, /autor.research-pipeline)research-reviewer (senior ML reviewer via Codex MCP) and paper-improver (2-round auto-improvement)CLAUDE.md, override per-invocation with inline argumentsTEMPLATE_DIR/VENUE/ → bundled → error with instructionsnpx claudepluginhub llv22/autoresearchwitheyesOh My Paper research harness: memory system, Codex delegation, and pipeline commands for academic research projects.
Three AI models, one synthesis — multi-model research workflow for scientific domains
Autonomous, personalized research loops for Claude Code. Set a topic, walk away, come back to a quality-gated report adapted to your projects.
Research-team agents for Claude Code: supervisor, analysis-implementer, paper-writer, figure-descriptor, reviewer, literature-curator.
Strategic research thinking agents — idea evaluation, project triage, and structured brainstorming inspired by Carlini's research methodology
Autonomous research loops with 10 commands. Generalizes Karpathy's autoresearch loop to any domain with mechanical evaluation, overnight persistence, and zero dependencies.