Set up a new autoresearch experiment interactively. Collects domain, target file, eval command, metric, direction, and evaluator.
    npx claudepluginhub ciciliaeth/claude-skills --plugin autoresearch-agent
Slash command
    /autoresearch-agent:setup
Set up a new autoresearch experiment with all required configuration.
    /ar:setup                    # Interactive mode
    /ar:setup engineering api-speed src/api.py "pytest bench.py" p50_ms lower
    /ar:setup --list             # Show existing experiments
    /ar:setup --list-evaluators  # Show available evaluators
If the user supplies all parameters up front, pass them directly to the setup script:
    python {skill_path}/scripts/setup_experiment.py \
        --domain {domain} --name {name} \
        --target {target} --eval "{eval_cmd}" \
        --metric {metric} --direction {direction} \
        [--evaluator {evaluator}] [--scope {scope}]
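The setup script's internals aren't shown here, but the flag surface above can be sketched with argparse. This is a hedged illustration only: the flag names come from the command above, while the `choices` validation on `--direction` is an assumption.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of the CLI described above; validation details are assumptions."""
    p = argparse.ArgumentParser(prog="setup_experiment.py")
    p.add_argument("--domain", required=True)      # experiment domain, e.g. engineering
    p.add_argument("--name", required=True)        # experiment name
    p.add_argument("--target", required=True)      # file the experiment optimizes
    p.add_argument("--eval", required=True)        # shell command that produces the metric
    p.add_argument("--metric", required=True)      # e.g. p50_ms, size_bytes
    p.add_argument("--direction", required=True, choices=["lower", "higher"])
    p.add_argument("--evaluator")                  # optional, see the evaluator table
    p.add_argument("--scope")                      # optional
    return p

# The positional /ar:setup example above maps onto these flags:
args = build_parser().parse_args([
    "--domain", "engineering", "--name", "api-speed",
    "--target", "src/api.py", "--eval", "pytest bench.py",
    "--metric", "p50_ms", "--direction", "lower",
])
```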
If parameters are missing, collect each one at a time (domain, name, target file, eval command, metric, direction, and optionally evaluator), then run setup_experiment.py with the collected values.
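The one-at-a-time collection step can be sketched as a simple prompt loop. The parameter order matches the flags above; the prompt wording and the injectable `ask` callable are assumptions for illustration, not the skill's actual implementation.

```python
# Parameters to collect, in the order the setup flow asks for them.
PARAMS = ["domain", "name", "target", "eval", "metric", "direction"]

def collect(ask=input):
    """Prompt for each parameter in turn and return the collected mapping.

    `ask` defaults to input(); passing another callable makes this testable.
    """
    answers = {}
    for name in PARAMS:
        answers[name] = ask(f"{name}: ").strip()
    return answers
```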
    # Show existing experiments
    python {skill_path}/scripts/setup_experiment.py --list

    # Show available evaluators
    python {skill_path}/scripts/setup_experiment.py --list-evaluators
| Name | Metric | Use Case |
|---|---|---|
| benchmark_speed | p50_ms (lower) | Function/API execution time |
| benchmark_size | size_bytes (lower) | File, bundle, Docker image size |
| test_pass_rate | pass_rate (higher) | Test suite pass percentage |
| build_speed | build_seconds (lower) | Build/compile/Docker build time |
| memory_usage | peak_mb (lower) | Peak memory during execution |
| llm_judge_content | ctr_score (higher) | Headlines, titles, descriptions |
| llm_judge_prompt | quality_score (higher) | System prompts, agent instructions |
| llm_judge_copy | engagement_score (higher) | Social posts, ad copy, emails |
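To make the table's first metric concrete: a speed evaluator like benchmark_speed reports a median (p50) wall-clock time in milliseconds, where lower is better. The sketch below shows the kind of measurement involved; the real evaluator's sampling strategy and run count are assumptions.

```python
import statistics
import time

def p50_ms(fn, runs=20):
    """Time fn over several runs and return the median (p50) in milliseconds.

    Illustrates the p50_ms metric from the table above; lower is better.
    The run count and timing approach are assumptions, not benchmark_speed's
    actual implementation.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```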
Report to the user:
"Use /ar:run {domain}/{name} to start iterating, or /ar:loop {domain}/{name} for autonomous mode."