From its-hub
Detects the inference-scaling environment, runs inference-time scaling on a prompt, and presents results with vote counts or scores.
npx claudepluginhub red-hat-ai-innovation-team/its_hub --plugin its-hub
How this skill is triggered (by the user, by Claude, or both):
Slash command
/its-hub:inference-scaling
This skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Execute scaling on a prompt. For algorithm guidance, budget tuning, and troubleshooting, consult the `inference-scaling-guide` skill.
Execute scaling on a prompt. For algorithm guidance, budget tuning, and troubleshooting, consult the inference-scaling-guide skill.
"${CLAUDE_PLUGIN_ROOT}/scripts/its_detect.sh"
If library=missing or config=missing: invoke the setup-guide skill.
Otherwise (library=installed, config=found): proceed to Step 2.
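The branching on the detection result can be sketched as a small shell helper. This is only an illustration: `next_step` is a hypothetical function, and it assumes the detect script prints `key=value` tokens such as `library=installed` and `config=found` somewhere in its output, which the source does not spell out.

```shell
# Hypothetical helper: map detection output to the next action.
# Assumption: the output contains key=value tokens like
# "library=installed" and "config=found" (exact format not confirmed).
next_step() {
  case "$1" in
    *library=missing*|*config=missing*)
      # Anything missing: hand off to the setup-guide skill.
      echo "setup-guide" ;;
    *library=installed*config=found*|*config=found*library=installed*)
      # Both present (in either order): continue to Step 2.
      echo "step-2" ;;
    *)
      # Unrecognized output: do not guess.
      echo "unknown" ;;
  esac
}

# Possible usage (not executed here):
#   next_step "$("${CLAUDE_PLUGIN_ROOT}/scripts/its_detect.sh")"
```

The missing-state patterns are checked first, so a partially configured environment always routes to setup rather than to scaling.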
Run the scaling script with the user's prompt and any overrides:
"${CLAUDE_PLUGIN_ROOT}/scripts/its_scale.sh" --metadata $ARGUMENTS
If the user provides a file path (e.g., "scale all prompts in data/eval.jsonl"), invoke the batch-scaling skill instead.
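As a rough sketch of this routing rule, the helper below decides between the two skills from a single argument. `route_request` is a hypothetical name, and the extension list (`.jsonl`, `.csv`, `.txt`) and the requirement that the file exists are assumptions made for illustration, not behavior confirmed by the plugin.

```shell
# Hypothetical helper: choose a skill based on whether the argument
# looks like an existing prompt file.
# Assumption: batch inputs end in .jsonl, .csv, or .txt.
route_request() {
  case "$1" in
    *.jsonl|*.csv|*.txt)
      if [ -f "$1" ]; then
        echo "batch-scaling"
      else
        # Looks like a path but no such file: treat it as a plain prompt.
        echo "inference-scaling"
      fi ;;
    *)
      echo "inference-scaling" ;;
  esac
}
```

A real request is usually a sentence that mentions a path rather than a bare path, so in practice the path would first have to be extracted from the user's message.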
If the scaling failed, consult the inference-scaling-guide skill for troubleshooting.
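One way to make the failure path explicit is to wrap the command and check its exit status. The wrapper below is a minimal sketch: `run_scaling` is a hypothetical name, and it assumes the scaling script signals failure with a non-zero exit code, which the source does not state.

```shell
# Hypothetical wrapper: run any command and point to the troubleshooting
# skill when it fails.
# Assumption: failure is reported through a non-zero exit status.
run_scaling() {
  if "$@"; then
    return 0
  else
    rc=$?  # exit status of the command that just failed
    echo "scaling failed (exit $rc): consult the inference-scaling-guide skill" >&2
    return "$rc"
  fi
}

# Possible usage (not executed here):
#   run_scaling "${CLAUDE_PLUGIN_ROOT}/scripts/its_scale.sh" --metadata "your prompt"
```

The wrapper preserves the original exit code, so callers can still tell different failures apart.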