From autoresearch
Scans the codebase for files and functions with tunable parameters, magic numbers, scoring logic, or prompt templates that can be optimized via autoresearch against a metric. Use it to discover candidates before running `/autoresearch`.
`npx claudepluginhub pjhoberman/autoresearch --plugin autoresearch-discover`

This skill uses the workspace's default tool permissions.
Scan a codebase to find where autoresearch experiments would be most valuable. Outputs a ranked list of candidates with suggested metrics, so the user can pick one and run `/autoresearch <file>`.
Sets up Karpathy-style autoresearch experiments to autonomously optimize code in one constrained file via iterative evals against a numerical metric, generating instructions.md, eval script, test data, and launch prompt.
Guides interactive setup of optimization goals, metrics, and scope; runs autonomous git-committed experiment loops (change code, test, measure, keep improvements or revert). For performance tuning in git repos.
Orchestrates autonomous experiments to optimize measurable metrics like build time, latency, accuracy, or configs via git branches and .lab/ logging.
Search the codebase for these patterns, roughly in order of how likely they are to benefit from autoresearch (a sketch of a typical candidate follows the list):
- **Scoring/ranking logic**: functions that compute scores, rank items, or sort results.
- **Magic numbers and thresholds**: hardcoded numeric values that control behavior.
- **LLM prompts with downstream metrics**: prompt templates where output quality is measurable.
- **Algorithm parameters**: configuration that controls algorithm behavior.
- **Regex patterns and parsing rules**: patterns that extract or match data.
- **Feature engineering**: code that transforms raw data into signals.
- **Filtering and selection logic**: code that decides what to include or exclude.
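To make the top patterns concrete, here is a minimal sketch of the kind of function the scan should flag. Everything in it (the name `score_result`, the weights, the threshold) is hypothetical, not taken from any real codebase:

```python
# Hypothetical candidate: a ranking helper full of tunables.
# Every literal below (0.7, 0.3, 1.5, 0.2) is a magic number that
# autoresearch could vary against a relevance metric.
def score_result(text_match: float, recency_days: int, is_featured: bool) -> float:
    score = 0.7 * text_match + 0.3 / (1 + recency_days)  # hand-picked weights
    if is_featured:
        score *= 1.5  # boost factor
    return score if score > 0.2 else 0.0  # inclusion threshold
```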
Search the codebase for tunable patterns. Use a combination of searches; at minimum, grep for keywords such as:

`weight`, `threshold`, `score`, `boost`, `penalty`, `factor`, `coefficient`, `alpha`, `beta`, `gamma`, `lambda`, `decay`, `damping`, `scaling`, `top_k`, `top_n`, `max_`, `min_`, `num_`, `temperature`, `prompt`

For each candidate file/region, assess whether an eval already exists, how hard one would be to build, and how much tuning could plausibly improve the metric.
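Not part of the skill itself, but as a rough illustration, a keyword scan along these lines takes only a few lines of Python; the file glob and the keyword regex are assumptions to adapt per codebase:

```python
import re
from pathlib import Path

# Illustrative keyword scan; adjust the glob and keyword list as needed.
KEYWORDS = re.compile(
    r"\b(weight|threshold|score|boost|penalty|factor|coefficient"
    r"|alpha|beta|gamma|lambda|decay|damping|scaling|top_k|top_n"
    r"|max_|min_|num_|temperature|prompt)"
)

for path in Path(".").rglob("*.py"):
    for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
        if KEYWORDS.search(line):
            print(f"{path}:{lineno}: {line.strip()}")
```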
Present findings as a ranked list. For each candidate:
### Candidate N: [short description]
**File:** `path/to/file.py` (lines X-Y)
**Tunables:** [list of specific parameters/logic that could be optimized]
**Suggested metric:** [specific, measurable metric]
**Eval exists:** Yes / Partial / No
**Eval difficulty:** Easy / Medium / Hard
**Potential:** High / Medium / Low
**Why:** [1-2 sentences on why this is a good autoresearch target]
To run: `/autoresearch path/to/file.py`
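For instance, a filled-in entry might look like the following; every name, path, and number here is invented for illustration:

```
### Candidate 1: Search result scoring weights
**File:** `search/ranker.py` (lines 40-75)
**Tunables:** text-match weight, recency decay, featured boost, inclusion threshold
**Suggested metric:** NDCG@10 on a held-out set of labeled queries
**Eval exists:** Partial
**Eval difficulty:** Medium
**Potential:** High
**Why:** The ranking weights were set by hand and never validated; a labeled query set already exists in the test fixtures.

To run: `/autoresearch search/ranker.py`
```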
Rank by eval feasibility × potential impact. A high-potential target with no eval path is less actionable than a medium-potential target where you can start tonight.
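One way to operationalize that ranking is a sketch like the one below; the numeric mappings are assumptions chosen only to make the product ordering sensible, not prescribed by the skill:

```python
# Map the report's qualitative labels to rough numbers and multiply.
FEASIBILITY = {"Yes": 1.0, "Partial": 0.6, "No": 0.2}    # from "Eval exists"
DIFFICULTY = {"Easy": 1.0, "Medium": 0.6, "Hard": 0.3}   # from "Eval difficulty"
POTENTIAL = {"High": 3, "Medium": 2, "Low": 1}           # from "Potential"

def rank_score(eval_exists: str, eval_difficulty: str, potential: str) -> float:
    eval_feasibility = FEASIBILITY[eval_exists] * DIFFICULTY[eval_difficulty]
    return eval_feasibility * POTENTIAL[potential]
```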
After the list, recommend which candidate to start with and why. Prefer candidates where an eval already exists or is easy to build, and where the metric is fast and unambiguous to measure.
If nothing looks like a good autoresearch target, say so. Not every codebase has tunable code — some are pure CRUD, some have already been well-optimized, some need architectural changes rather than parameter tuning.