# optimize-anything
Guide for running, configuring, and interpreting `optimize-anything` and `gepa` optimization workflows. Use when asked how to optimize a prompt, artifact, config, or skill, or when troubleshooting evaluator feedback, budget, or score interpretation.
Install:

```shell
npx claudepluginhub asragab/optimize-anything
```
End-to-end guide for optimizing text artifacts with `optimize-anything` and `gepa`.
Start with your current best version of the artifact. gepa evolves from here.
- No seed? Provide an `objective` and let gepa bootstrap one from the description.
- Use a mapping such as `{"system_prompt": "...", "examples": "..."}` for multi-component artifacts (e.g., system_prompt + few-shot examples).
- Use the generate-evaluator skill to create an evaluator matched to your objective. The evaluator is the most critical piece: gepa's optimization quality is bounded by your evaluator's feedback quality.
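As a hedged illustration, an evaluator can be a plain function that returns a numeric score plus textual feedback for gepa to reflect on. The `(score, feedback)` return shape and the scoring rules below are assumptions for illustration; match the signature expected by your installed `optimize-anything`/`gepa` version.

```python
def eval_fn(candidate: str) -> tuple[float, str]:
    """Score a candidate artifact and explain the score in prose.

    Illustrative sketch: rich, specific feedback is what lets gepa's
    reflection step propose targeted mutations.
    """
    issues = []
    if len(candidate) > 500:
        issues.append("too long; trim to under 500 chars")
    if "TODO" in candidate:
        issues.append("contains unresolved TODO markers")
    score = 1.0 - 0.25 * len(issues)
    feedback = "; ".join(issues) or "looks good"
    return max(score, 0.0), feedback
```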
**Single-task (no dataset).** Optimize one artifact against one evaluator:
```json
{"seed": "...", "evaluator_command": ["bash", "evaluators/eval.sh"]}
```
**Multi-task (with dataset).** Optimize across multiple examples for cross-task transfer:
```python
result = optimize_anything(
    seed_candidate="...",
    evaluator=eval_fn,
    dataset=[{"input": "q1", "expected": "a1"}, ...],
)
```
**Generalization (train + validation split).** Ensure the artifact transfers to unseen examples:
```python
result = optimize_anything(
    seed_candidate="...",
    evaluator=eval_fn,
    dataset=train_examples,
    valset=val_examples,
)
```
Use the `budget` subcommand for a starting point, then adjust:
| Seed length | Recommended budget | Rationale |
|---|---|---|
| < 100 chars | 50 | Short artifact, fewer mutations needed |
| 100-499 | 100 | Moderate exploration |
| 500-1999 | 200 | More search space to cover |
| 2000+ | 300 | Extensive exploration recommended |
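The table above can be expressed as a small helper. This is only a sketch of the heuristic; the `budget` subcommand remains the authoritative source:

```python
def recommended_budget(seed: str) -> int:
    """Starting budget by seed length, mirroring the table above."""
    n = len(seed)
    if n < 100:
        return 50
    if n < 500:
        return 100
    if n < 2000:
        return 200
    return 300
```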
Configure options via `GEPAConfig`:
```python
from gepa.optimize_anything import GEPAConfig, EngineConfig

config = GEPAConfig(
    engine=EngineConfig(
        max_metric_calls=150,  # budget
        parallel=True,         # parallel evaluation
        max_workers=8,         # worker count
    ),
)
```
Via CLI:
```shell
optimize-anything optimize seed.txt \
  --evaluator-command bash evaluators/eval.sh \
  --budget 100 \
  --objective "maximize clarity" \
  -o result.txt
```
Via Python API:
```python
from optimize_anything import optimize_anything, command_evaluator
from gepa.optimize_anything import GEPAConfig, EngineConfig

result = optimize_anything(
    seed_candidate=open("seed.txt").read(),
    evaluator=command_evaluator(["bash", "evaluators/eval.sh"]),
    objective="maximize clarity",
    config=GEPAConfig(engine=EngineConfig(max_metric_calls=100)),
)
print(result.best_candidate)
```
Use plateau-based early stopping to avoid wasting budget after convergence:
```shell
optimize-anything optimize seed.txt \
  --evaluator-command bash evaluators/eval.sh \
  --budget 120 \
  --early-stop \
  --early-stop-window 10 \
  --early-stop-threshold 0.005
```
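The plateau rule these flags suggest can be sketched as follows. The actual stopping logic inside `optimize-anything` may differ; this is only an illustration of how the window and threshold interact:

```python
def plateaued(best_so_far: list[float], window: int = 10,
              threshold: float = 0.005) -> bool:
    """best_so_far: running best score per iteration (non-decreasing).

    Stop when the best score improved by less than `threshold`
    over the last `window` iterations.
    """
    if len(best_so_far) <= window:
        return False
    return best_so_far[-1] - best_so_far[-1 - window] < threshold
```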
Notes:
- `--early-stop` is auto-enabled when `--budget` > 30.
- Increase `--early-stop-window` and `--early-stop-threshold` for noisier evaluators.
- The result reports `early_stopped` and `stopped_at_iteration` when a run exits early.

For cache reuse across runs, copy prior disk cache entries into a new run directory:
```shell
optimize-anything optimize seed.txt \
  --evaluator-command bash evaluators/eval.sh \
  --run-dir runs \
  --cache \
  --cache-from runs/run-20260303-120000
```
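What `--cache-from` does can be approximated in a few lines of Python. The directory name `fitness_cache/` comes from this guide's notes; everything else here is an illustrative sketch, not the tool's actual implementation:

```python
import shutil
from pathlib import Path

def seed_cache(prev_run: str, new_run: str) -> None:
    """Copy a prior run's fitness_cache/ into a new run directory."""
    src = Path(prev_run) / "fitness_cache"
    dst = Path(new_run) / "fitness_cache"
    if src.is_dir():
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copytree(src, dst, dirs_exist_ok=True)
```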
Notes:
- `--cache-from` requires `--cache` and `--run-dir`.
- `--cache-from` copies `fitness_cache/` from the previous run before optimization starts.

The result contains:
- `best_candidate`: the optimized artifact.
- `val_aggregate_scores`: score progression across iterations.
- `total_metric_calls`: how many evaluator invocations were used.

Signs of a good run:
- `total_metric_calls` < budget (converged early).
- Diffing `best_candidate` against `seed.txt` (or your in-memory seed) shows targeted differences.

Signs of problems:
- Score stays flat at the seed level: add richer feedback, increase budget, or refine the objective.

Tips:
- Run with a small budget (20-50) first to validate your evaluator on `seed.txt` and confirm that scores change meaningfully.
- Evaluator feedback text is what drives gepa's reflection.
- Use the `objective` string, which is injected into gepa's reflection prompt, to specify constraints like token limits or format requirements.
- Use `background` for domain knowledge, constraints, or strategies, such as "Target audience is non-technical users. Never use jargon."
- Increase the `budget` if optimization results on `seed.txt` are poor.
- Set `evaluator_cwd` to an absolute project path next to `seed.txt` and `evaluators/eval.sh` when `evaluators/eval.sh` or other evaluator commands use repo-relative files or scripts.
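To review what changed, a unified diff of the seed against `result.best_candidate` works well. A minimal sketch using only the standard library:

```python
import difflib

def show_diff(seed: str, best: str) -> str:
    """Unified diff of the seed artifact vs. the optimized candidate."""
    return "\n".join(difflib.unified_diff(
        seed.splitlines(), best.splitlines(),
        fromfile="seed.txt", tofile="best_candidate", lineterm=""))
```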