optimize-anything
LLM-guided optimization for text artifacts using an iterative propose-evaluate-reflect loop with a bring-your-own evaluator.
Quickstart (v2)
# 1) Install
curl -fsSL https://raw.githubusercontent.com/ASRagab/optimize-anything/main/install.sh | bash
# 2) Create a seed artifact
echo "Write a concise support prompt" > seed.txt
# 3) Generate a starter evaluator (default: judge/python template)
optimize-anything generate-evaluator seed.txt \
--objective "Score clarity, actionability, and specificity" \
> eval.py
# 4) Optimize
optimize-anything optimize seed.txt \
--judge-model openai/gpt-4o-mini \
--objective "Improve clarity and specificity" \
--model openai/gpt-4o-mini \
--budget 20 \
--parallel --workers 4 \
--cache \
--run-dir runs \
--output result.txt
The CLI prints a JSON summary to stdout; see Result Contract for the full shape.
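Because the summary is plain JSON on stdout, it is easy to consume programmatically. A minimal parsing sketch; the field names below (`best_score`, `output_path`, `iterations`) are illustrative assumptions, not the documented contract, so check the Result Contract section for the real keys:

```python
import json

# Hypothetical summary payload: the keys shown here are assumptions,
# stand-ins for whatever the Result Contract actually defines.
raw = '{"best_score": 0.91, "output_path": "result.txt", "iterations": 20}'

summary = json.loads(raw)
print(summary["best_score"], summary["output_path"])
```

In practice you would capture the optimizer's stdout (for example with a shell pipe or `subprocess`) and feed it to `json.loads` the same way.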
How It Works
optimize-anything runs a GEPA (Guided Evolutionary Prompt Algorithm) loop: propose → evaluate → reflect, repeating until budget is exhausted or early stopping kicks in.
seed.txt ──► [Propose] ──► candidates
                 ▲              │
                 │              ▼
                 │         [Evaluate]
                 │              │
             [Reflect] ◄── scores + diagnostics
- Propose — The optimizer generates candidate artifacts from your seed (or from scratch in seedless mode).
- Evaluate — Each candidate is scored by your evaluator. Three evaluator types are supported: a command evaluator (any executable that reads JSON on stdin and writes a score on stdout), an HTTP evaluator (a service that accepts POST requests), or the built-in LLM judge (no evaluator script required; just pass --judge-model).
- Reflect — Scores and diagnostics feed back into the next proposal round. The loop continues, progressively improving the artifact toward your objective.
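The command evaluator contract above (JSON in on stdin, score out on stdout) can be sketched in a few lines of Python. The `artifact` payload key is an assumption here; run generate-evaluator to see the exact input shape the optimizer sends:

```python
#!/usr/bin/env python3
"""Minimal command evaluator sketch: JSON in on stdin, score out on stdout."""
import json
import sys

def score_artifact(payload: dict) -> float:
    # "artifact" as the payload key is an assumption for illustration.
    text = payload.get("artifact", "")
    # Toy scoring rule: reward brevity, clamped into [0, 1]
    # (the default --score-range).
    return max(0.0, min(1.0, 1.0 - len(text) / 1000))

if __name__ == "__main__":
    raw = sys.stdin.read()
    if raw:
        print(score_artifact(json.loads(raw)))
```

Saved as `eval.sh`'s Python equivalent and passed via --evaluator-command, the optimizer would invoke it once per candidate and read the printed float back.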
The evaluator is the only thing you bring. Everything else — proposal strategy, reflection, early stopping, caching, parallelism — is handled by the optimizer.
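Conceptually, the loop can be sketched as follows. This is a toy simplification under stated assumptions: the real proposal and reflection strategies are internal to the optimizer, and `propose` here is a stand-in for the LLM proposal step:

```python
import random

def propose(current: str, feedback: str) -> str:
    # Stand-in for the LLM proposal step: mutate the current artifact.
    return current + random.choice([" Be concise.", " Be specific."])

def optimize(seed: str, evaluate, budget: int) -> tuple[str, float]:
    """Toy propose -> evaluate -> reflect loop over a fixed budget."""
    best, best_score = seed, evaluate(seed)
    feedback = ""
    for _ in range(budget):
        candidate = propose(best, feedback)      # Propose
        score = evaluate(candidate)              # Evaluate
        feedback = f"last score {score:.3f}"     # Reflect (diagnostics)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

The `evaluate` callable is the bring-your-own part; everything else in the sketch corresponds to machinery the optimizer provides.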
Runtime Modes
Dataset / Valset modes
Use --dataset for multi-task optimization (one evaluator call per example). Add --valset for generalization validation.
optimize-anything optimize prompt.txt \
--judge-model openai/gpt-4o-mini \
--objective "Generalize across customer request types" \
--dataset data/train.jsonl \
--valset data/val.jsonl \
--model openai/gpt-4o-mini \
--budget 120 --parallel --workers 6 --cache --run-dir runs
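Each line of the dataset file is one JSON example. A sketch of building such a file, assuming a hypothetical `{"input": ..., "expected": ...}` schema; the real fields depend on what your evaluator reads from each example:

```python
import json

# Hypothetical example schema -- the actual fields are whatever
# your evaluator expects per example.
examples = [
    {"input": "Customer asks for a refund", "expected": "refund policy steps"},
    {"input": "Customer reports a login bug", "expected": "triage questions"},
]

# JSON Lines: one JSON object per line, no enclosing array.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

Build `data/val.jsonl` the same way from held-out examples so --valset measures generalization rather than memorization.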
Multi-provider validation
Cross-check one artifact with multiple judge providers:
optimize-anything validate result.txt \
--providers openai/gpt-4o-mini anthropic/claude-sonnet-4-5 google/gemini-2.0-flash \
--objective "Score clarity, constraints, and robustness" \
--intake-file intake.json
Seedless mode
No seed file is required; GEPA bootstraps candidates from the objective alone.
optimize-anything optimize --no-seed \
--objective "Draft a concise, testable API prompt" \
--model openai/gpt-4o-mini \
--judge-model openai/gpt-4o-mini
--no-seed requires both --objective and --model.
Early stopping and cache reuse
- Early stopping is auto-enabled when --budget > 30 (or force it with --early-stop).
- Reuse a prior evaluator cache with --cache-from (requires --cache and --run-dir).
optimize-anything optimize seed.txt \
--evaluator-command bash eval.sh \
--model openai/gpt-4o-mini \
--budget 150 \
--cache --cache-from runs/run-20260303-120000 \
--run-dir runs \
--early-stop --early-stop-window 12 --early-stop-threshold 0.003
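One plausible reading of the window/threshold pair is: stop once the best score has improved by less than the threshold over the last N evaluations. A sketch of that rule; the exact semantics the optimizer uses are internal, so treat this as an assumption:

```python
def should_stop(scores: list[float],
                window: int = 12,
                threshold: float = 0.003) -> bool:
    """Assumed early-stop rule: halt when the best score has improved
    by less than `threshold` over the last `window` evaluations."""
    if len(scores) <= window:
        return False                      # not enough history yet
    recent_best = max(scores[-window:])   # best inside the window
    prior_best = max(scores[:-window])    # best before the window
    return (recent_best - prior_best) < threshold
```

Under this reading, a larger --early-stop-window tolerates longer plateaus before stopping, and a smaller --early-stop-threshold demands less improvement to keep going.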
Score range options
For command/HTTP evaluators:
- --score-range unit (default): enforce scores in [0, 1]
- --score-range any: allow any finite float
optimize-anything optimize seed.txt \
--evaluator-command bash eval.sh \
--model openai/gpt-4o-mini \
--score-range any
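The check this flag implies can be pictured as a small validation step on the evaluator's stdout. A sketch under assumptions; the optimizer's actual parsing and error handling are internal:

```python
import math

def validate_score(raw: str, score_range: str = "unit") -> float:
    """Assumed validation: parse evaluator stdout and enforce the range."""
    value = float(raw.strip())
    if not math.isfinite(value):
        raise ValueError(f"score must be a finite float, got {raw!r}")
    if score_range == "unit" and not 0.0 <= value <= 1.0:
        raise ValueError(f"score {value} outside [0, 1]")
    return value
```

With `--score-range any`, only the finiteness check applies, so evaluators can return raw losses or unnormalized rewards.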
CLI Subcommands
- optimize
- generate-evaluator
- intake
- explain
- budget
- score
- analyze
- validate
Claude Code Plugin
optimize-anything is also a Claude Code plugin with guided slash commands and skills.
Plugin Regression Workflow
Use the regression harness when you want to verify that Claude can actually invoke the plugin correctly end-to-end, not merely that the CLI itself still works.
# Direct plugin regression run
uv run python scripts/plugin_regression.py
# Full repo validation including plugin regression
uv run python scripts/check.py --with-plugin
Requirements:
- claude CLI installed and authenticated
- OPENAI_API_KEY set in the shell that launches the command
- ANTHROPIC_API_KEY set in the shell that launches the command
The harness runs three real scenarios (analyze, validate, quick), saves Claude JSON outputs plus stderr logs, and fails if Claude does not execute the expected workflow or the optimized artifact is not written.
Installation