From aradotso-trending-skills-37
Automates ML research workflows—idea discovery, experiments, paper writing, reviews, rebuttals—using Markdown skills with Claude Code and cross-model LLM reviewers like Codex.
npx claudepluginhub joshuarweaver/cascade-ai-ml-agents-misc-1 --plugin aradotso-trending-skills-37This skill uses the workspace's default tool permissions.
```markdown
Guides Next.js Cache Components and Partial Prerendering (PPR) with cacheComponents enabled. Implements 'use cache', cacheLife(), cacheTag(), revalidateTag(), static/dynamic optimization, and cache debugging.
Guides building MCP servers enabling LLMs to interact with external services via tools. Covers best practices, TypeScript/Node (MCP SDK), Python (FastMCP).
Generates original PNG/PDF visual art via design philosophy manifestos for posters, graphics, and static designs on user request.
---
name: aris-autonomous-ml-research
description: Autonomous ML research workflows using ARIS (Auto-Research-In-Sleep) — Markdown-only skills for cross-model paper review, idea discovery, experiment automation, and paper writing with Claude Code, Codex, or any LLM agent.
triggers:
- "set up ARIS for autonomous research"
- "run research pipeline while I sleep"
- "automate ML paper writing with Claude Code"
- "cross-model review loop for my paper"
- "use ARIS to find research ideas"
- "run experiment automation with ARIS"
- "set up auto paper review workflow"
- "write rebuttal with ARIS"
---
# ARIS — Auto-Research-In-Sleep
> Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection.
ARIS is a **zero-dependency, Markdown-only** autonomous ML research system. Every "skill" is a plain `SKILL.md` file that any LLM agent can read and execute. It orchestrates cross-model collaboration — one model executes research (Claude Code, Codex, etc.) while another acts as adversarial reviewer (GPT-5.4, Gemini, GLM, MiniMax, etc.) to break self-play blind spots.
**Core value**: going from research direction → paper ideas → experiments → written paper → rebuttal, autonomously, overnight.
---
## Installation
### 1. Clone the Repository
```bash
git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep.git
cd Auto-claude-code-research-in-sleep
No pip install, no Docker, no daemon. The entire system is Markdown files.
npm install -g @anthropic-ai/claude-code
npm install -g @openai/codex
Configure Claude Code to use the Codex MCP server by adding to your Claude Code config (~/.claude/settings.json):
{
"mcpServers": {
"codex": {
"command": "codex",
"args": ["mcp"],
"env": {
"OPENAI_API_KEY": "$OPENAI_API_KEY"
}
}
}
}
# Copy all skills to Claude Code's custom skills directory
cp -r skills/claude-code/ ~/.claude/skills/
# Or symlink to stay up to date
ln -s $(pwd)/skills/claude-code ~/.claude/skills/aris
# Required for Claude Code
export ANTHROPIC_API_KEY=your_anthropic_key
# Required for cross-model review (GPT-5.4 as reviewer)
export OPENAI_API_KEY=your_openai_key
# Optional: alternative reviewer models (no OpenAI needed)
export LLM_REVIEWER_BASE_URL=https://api.minimax.chat/v1
export LLM_REVIEWER_API_KEY=your_minimax_key
export LLM_REVIEWER_MODEL=MiniMax-M2.7
ARIS works with any OpenAI-compatible API. Configure the llm-chat MCP server:
{
"mcpServers": {
"llm-chat": {
"command": "node",
"args": ["mcp-servers/llm-chat/index.js"],
"env": {
"LLM_BASE_URL": "$LLM_REVIEWER_BASE_URL",
"LLM_API_KEY": "$LLM_REVIEWER_API_KEY",
"LLM_MODEL": "$LLM_REVIEWER_MODEL"
}
}
}
}
Tested combinations:
| Executor | Reviewer | Config |
|---|---|---|
| Claude Code | GPT-5.4 xhigh | Default |
| Codex CLI | Gemini | Guide |
| Claude Code | MiniMax-M2.7 | LLM_BASE_URL=https://api.minimax.chat/v1 |
| Claude Code | GLM-5 | LLM_BASE_URL=https://open.bigmodel.cn/api/paas/v4 |
| MiniMax-M2.7 | GLM-5 | Guide |
| Codex CLI | Claude | Swap executor/reviewer |
/research-pipeline "factorized gap in discrete diffusion LMs"
With a reference paper and base repo:
/research-pipeline "improve method X" — ref paper: https://arxiv.org/abs/2406.04329, base repo: https://github.com/org/project
ARIS will:
Parameters:
/research-pipeline "topic"
— ref paper: <arxiv_url> # Optional: paper to improve
— base repo: <github_url> # Optional: codebase to build on
— venue: ICML # Target venue (default: ICML)
— compact: true # Lean summaries for short-context models
/idea-discovery "discrete diffusion language models"
Scans literature, identifies gaps, generates novel research directions, scores each idea for novelty/feasibility, and outputs a ranked proposal list.
/experiment-bridge "run ablation on temperature scaling" — code review: true
Cross-model code review before GPU deployment (enabled by default). Catches bugs, confirms experimental validity, then runs.
# Example: what experiment-bridge automates
# 1. Claude Code writes training script
# 2. GPT-5.4 reviews the code (code review gate)
# 3. If approved → submits to GPU cluster
# 4. Monitors via W&B API
import wandb
api = wandb.Api()
runs = api.runs("your-entity/your-project")
for run in runs:
print(run.name, run.summary.get("val_loss", None))
/paper-writing "results/" — venue: NeurIPS
Generates LaTeX paper from experiment results. Anti-hallucination enforced: every citation verified via DBLP → CrossRef → [VERIFY] tag if unconfirmed.
Venue templates available: ICML, NeurIPS, ICLR, CVPR, ACL, AAAI, ACM MM
/auto-review "paper.pdf"
The core ARIS loop:
Score progression: 5.2 → 6.1 → 7.3 → 8.0 ✓
/rebuttal "paper/ + reviews" — venue: ICML, character limit: 5000
Parameters:
| Parameter | Default | Description |
|---|---|---|
venue | ICML | Target venue |
character limit | required | Hard limit for submission |
quick mode | false | Stop after parsing + strategy (no draft) |
auto experiment | false | Auto-run supplementary experiments |
max stress test rounds | 1 | GPT-5.4 stress-test iterations |
max followup rounds | 3 | Per-reviewer follow-up limit |
Three safety gates (rebuttal won't finalize if any fails):
Outputs:
PASTE_READY.txt — exact char count, paste directly to venueREBUTTAL_DRAFT_rich.md — extended version for manual editing# Conference presentation
/paper-slides "paper/" # → Beamer PDF + PPTX + speaker notes + Q&A prep
# Conference poster
/paper-poster "paper/" # → A0/A1 poster PDF + editable PPTX + SVG
These skills can be invoked independently or are integrated into the core workflows:
| Skill | Command | Description |
|---|---|---|
| Research Refine | /research-refine | Turn vague ideas into anchored proposals |
| Experiment Plan | /experiment-plan | Claim-driven experiment roadmaps |
| Training Check | /training-check | Validate training runs before full launch |
| Result to Claim | /result-to-claim | Convert raw results to paper claims |
| Ablation Planner | /ablation-planner | Design ablation study structure |
| Formula Derivation | /formula-derivation | Research formula development and verification |
| Grant Proposal | /grant-proposal | Write grant proposals from research |
| Paper Illustration | /paper-illustration | Generate figures (Gemini-powered) |
| Citation Claw | /citation-claw | Verify and format citations |
For short-context models or after interruption:
/research-pipeline "topic" — compact: true
Generates lean summary files at each checkpoint. Resume after interruption:
/research-refine — resume: true
ARIS auto-checkpoints the research-refine workflow and resumes from last completed phase.
Full skill set available for OpenAI Codex without Claude Code:
cd skills/skills-codex/
codex "run idea-discovery on discrete diffusion"
The llm-chat MCP server bridges any OpenAI-compatible API as a reviewer. Start it manually for debugging:
cd mcp-servers/llm-chat/
node index.js
Environment variables:
export LLM_BASE_URL=https://api.openai.com/v1 # Any OpenAI-compatible endpoint
export LLM_API_KEY=$OPENAI_API_KEY
export LLM_MODEL=gpt-4o # Any model name
Zero-cost option — no API key required:
# See full guide: docs/MODELSCOPE_GUIDE.md
export MODELSCOPE_API_KEY=your_modelscope_token
export LLM_BASE_URL=https://api-inference.modelscope.cn/v1
export LLM_MODEL=Qwen/Qwen2.5-72B-Instruct
Templates for every workflow live in templates/:
ls templates/
# idea-discovery.md
# experiment-bridge.md
# paper-writing.md
# auto-review.md
# rebuttal.md
# research-refine.md
Use them to structure your inputs:
cat templates/rebuttal.md
# Fill in: paper path, review text, venue, character limit
# Then: /rebuttal [filled template]
Auto-claude-code-research-in-sleep/
├── skills/
│ ├── claude-code/ # Claude Code SKILL.md files
│ ├── skills-codex/ # Codex CLI native skills
│ ├── idea-discovery/
│ ├── experiment-bridge/
│ ├── paper-writing/
│ ├── auto-review/
│ ├── rebuttal/ SKILL.md ← each is a single readable file
│ ├── paper-slides/
│ ├── paper-poster/
│ ├── research-refine/
│ ├── formula-derivation/
│ └── ...
├── mcp-servers/
│ └── llm-chat/ # Universal reviewer bridge
├── templates/ # Input templates for every workflow
├── docs/
│ ├── CURSOR_ADAPTATION.md
│ ├── TRAE_ARIS_RUNBOOK_EN.md
│ ├── ANTIGRAVITY_ADAPTATION.md
│ ├── MODELSCOPE_GUIDE.md
│ ├── MiniMax-GLM-Configuration.md
│ └── CODEX_GEMINI_REVIEW_GUIDE.md
└── README.md
Cross-model review not triggering:
codex mcp or node mcp-servers/llm-chat/index.jsOPENAI_API_KEY or LLM_API_KEY is set~/.claude/settings.jsonW&B metrics not loading:
import wandb
# Ensure you're logged in
wandb.login(key=os.environ["WANDB_API_KEY"])
api = wandb.Api()
# Use full entity/project path
runs = api.runs("your-entity/your-project")
Context window exceeded mid-workflow:
/research-pipeline "topic" — compact: true
Then resume with — resume: true on the next interrupted skill.
Citation hallucination warnings ([VERIFY] tags):
These are intentional — ARIS flags unverified citations rather than silently hallucinating. Manually verify flagged citations before submission.
Rebuttal exceeds character limit:
Increase max stress test rounds — each round trims the draft:
/rebuttal "paper/ + reviews" — character limit: 5000, max stress test rounds: 3
ModelScope free tier rate limits: Add delay between skill calls or switch to a paid endpoint for overnight runs.
Claude Code = fast fluid execution. GPT-5.4/Gemini/GLM = slower, more deliberate critique. Speed × Rigor = better outcomes than either model alone.
@software{aris2026,
title = {ARIS: Auto-Research-In-Sleep},
author = {wanshuiyin},
year = {2026},
url = {https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep}
}
Join the community: GitHub Discussions
Papers accepted using ARIS: CS Conference (8/10 "clear accept"), AAAI 2026 Main Technical (7/10 "good paper, accept").