From eidos
Enables direct invocation of the Experiment skill via /Experiment. Auto-loads when the conversation context matches 'experiment.md' during Claude Code skill testing.
npx claudepluginhub agenticnotetaking/eidos --plugin eidos

This skill uses the workspace's default tool permissions.
experiment.md
Tests and benchmarks Claude Code skills empirically via evaluation-driven development. Compares skill vs. baseline performance using pass rates, timing, and token metrics, in either a quick workflow or a 7-phase full pipeline.
Manages the ML experiment lifecycle via a YAML registry: register experiments, record benchmarks, compare runs, and track status. Intended for Python ML research metadata tracking without databases or job launching.
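A YAML registry of this kind might look like the sketch below. The field names (`id`, `status`, `runs`, `metrics`) are illustrative assumptions, not the skill's actual schema:

```yaml
# Hypothetical registry entry; field names are assumptions for illustration.
experiments:
  - id: exp-001
    name: baseline-vs-finetune
    status: completed          # e.g. registered | running | completed
    runs:
      - run_id: r1
        benchmark: held-out-eval
        metrics:
          pass_rate: 0.87
          wall_time_s: 412
```

A flat file like this keeps experiment metadata diffable and version-controlled, which is the stated point of avoiding a database.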
Creates new Claude Code skills from scratch, modifies and improves existing ones, evaluates with test cases and benchmarks including variance analysis, and optimizes descriptions for triggering accuracy.