Runs an autonomous experiment loop that iteratively optimizes measurable metrics such as code performance, ML loss, or build size, using git branches, code changes, verify commands, and guards.
npx claudepluginhub maragudk/fabrik --plugin fabrik

This skill uses the workspace's default tool permissions.
An autonomous experiment loop. You make a change, measure it, keep it if it's better, discard it if it's not, and repeat. The idea is simple: given a clear metric and a way to measure it, you can run experiments indefinitely while the user sleeps, eats, or touches grass.
This works for anything with a measurable outcome: code performance, ML model quality, bundle size, test coverage, response latency -- if you can extract a number from a command, you can optimize it.
There are two phases: plan (interactive setup with the user) and loop (autonomous experimentation).
Before the loop starts, gather the configuration through a short conversation. Ask these questions one at a time, waiting for the user's answer before proceeding.
Ask: "What are you trying to optimize?"
This is a free-text description of the objective. Examples: "Reduce API p95 response time", "Lower validation loss on the language model", "Minimize Docker image size".
Ask: "What command should I run to measure the metric? The output should contain the metric value as a number."
This is a shell command whose output contains the metric. Examples:
npm run bench | grep "p95"
uv run train.py 2>&1 | grep "val_bpb"
du -sh dist/ | awk '{print $1}'

Based on the goal description, infer whether lower or higher is better, then confirm with the user. For example: "From your goal, I'm assuming lower is better -- is that right?"
Ask: "Is there a command that must always pass? For example, a test suite. Leave blank if not needed."
A guard is a safety net: a command that must exit with code 0 for an experiment to be kept. The guard prevents the optimization from breaking things. For example, if you're optimizing response time, npm test as a guard ensures you don't accidentally break functionality in the process.
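As a sketch, assuming npm test from the example above is the guard, the check reduces to an exit-code test:

# Hedged sketch: the guard can be any command; only its exit code matters.
if npm test; then
    echo "guard: pass"
else
    echo "guard: fail"
fi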
Ask: "Do you want to restrict which files I can modify, or is everything fair game?"
If the user provides a list of files or directories, only modify those. If they say everything is fair game, use your judgment based on the goal.
Create and push a dedicated autoresearch branch from the current branch (typically main). All experimentation happens on or from this branch -- main stays clean.
git checkout -b autoresearch
git push -u origin autoresearch
Before starting the loop, run both the verify command and the guard command (if set) once. Confirm that:

- The verify command runs and its output contains the metric as an extractable number
- The guard command exits with code 0
If either fails, ask the user to fix the command before proceeding. This catches misconfigurations before the loop wastes time on them.
Record the metric value from the dry run as the baseline.
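For example, with the hypothetical verify command npm run bench | grep "p95" and guard npm test from the earlier examples, the dry run amounts to:

npm run bench | grep "p95"   # output must contain the metric as a number
npm test                     # must exit with code 0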
Once the plan is confirmed, the loop runs autonomously. Never stop. Never ask. Each iteration follows these steps:
Read the current state to inform your next experiment:
- Check the results log (autoresearch-results.tsv) to see what's been tried
- List the experiment branches (git branch -a | grep autoresearch-) to recall past attempts

This is your memory. The TSV tells you what worked and what didn't. The branches tell you what approaches have been explored. Use this to avoid repeating failed ideas and to build on successful ones.
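A minimal way to consult both, using the log file and branch naming described above:

tail -n 20 autoresearch-results.tsv    # the most recent experiments
git branch -a | grep autoresearch-     # every approach explored so far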
Pick one idea to try. Favor approaches that haven't been tried yet, that build on what has already improved the metric, and that are likely to move it the most.
When stuck, don't just make tiny variations of the same idea. Try something radically different -- a different algorithm, a different data structure, a completely different approach to the problem.
Create a new branch from the autoresearch branch with a descriptive name:
autoresearch-<short-description>
Examples: autoresearch-increase-batch-size, autoresearch-inline-hot-path, autoresearch-switch-to-radix-sort
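In git terms, with an illustrative branch name:

git checkout autoresearch
git checkout -b autoresearch-inline-hot-path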
Make one atomic change. One idea, one experiment. Keep changes small and focused so the metric delta is clearly attributable to the change. If equal results can be achieved with less code, prefer the simpler version.
Respect the scope if one was defined. Never modify the guard command's test files or the verify command's evaluation harness -- the measurement infrastructure is sacred.
Commit with a descriptive message and push the branch:
git add <specific files>
git commit -m "autoresearch: <description of the change>"
git push -u origin autoresearch-<short-description>
Run the verify command and extract the metric value from the output. If the command crashes or fails, read the error output, note it in the log, and treat it as a failed experiment.
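One possible extraction, sketched under the assumption that the verify command is npm run bench | grep "p95" and that the first number in its output is the metric -- adapt the pattern to the actual output format:

metric=$(npm run bench | grep "p95" | grep -oE '[0-9]+(\.[0-9]+)?' | head -n 1)
echo "measured: $metric"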
If a guard was configured, run the guard command. It either passes (exit code 0) or fails.
Compare the metric to the current best (starting from the baseline established in the dry run):
- If the metric improved (and the guard passed, if one is configured): check out the autoresearch branch, merge the experiment branch, and push autoresearch. This is now the new baseline.
- If it did not improve (or the guard failed): check out the autoresearch branch. Leave the experiment branch as-is -- it serves as a record of what was tried.
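Keeping an experiment, in git terms (branch name illustrative):

git checkout autoresearch
git merge autoresearch-inline-hot-path
git push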
Append a row to autoresearch-results.tsv in the project root. Create the file with headers on the first iteration if it doesn't exist. Columns:
- iteration: the experiment number, starting at 1
- branch: the experiment branch name
- metric: the measured value
- delta: change relative to the previous best (e.g. -0.03 or +12.5)
- guard: pass, fail, or - (no guard configured)
- status: kept (merged to autoresearch) or discarded (branch left as-is)
- description: a one-line summary of the change

Commit and push this file to the autoresearch branch after each iteration so the log is always up to date.
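A hypothetical log after two iterations, assuming lower is better (all values illustrative):

iteration  branch                            metric  delta  guard  status     description
1          autoresearch-increase-batch-size  1.42    -0.03  pass   kept       increase batch size from 32 to 64
2          autoresearch-inline-hot-path      1.45    +0.03  pass   discarded  inline the hot path in the parser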
Go back to step 1. Do not stop. Do not ask for confirmation. Keep running experiments until the user interrupts.