From agentic-usability
Initializes a new agentic-usability benchmark pipeline project with interactive wizard or direct config.json creation. Use when setting up SDK benchmarks or evaluation projects.
Install the plugin:
npx claudepluginhub pspdfkit-labs/agentic-usability --plugin agentic-usability
Slash command:
/agentic-usability:init [project-directory]
Set up a new agentic-usability benchmark pipeline in the given project directory.
echo "Project directory: $ARGUMENTS"
You have two approaches:
1. Interactive wizard: run agentic-usability init -p $ARGUMENTS for a step-by-step interactive setup.
2. Direct config creation: if the user has described their SDK, create config.json directly. This is faster and allows you to tailor the config to their exact setup.
After init, the project should have:
<project>/
  config.json     # Configuration (you create this)
  suite.json      # Test suite (created by generate)
  results/        # Run results (created by eval/execute)
  cache/repos/    # Git repo cache (created automatically)
config.json has the following top-level keys:
{
  "privateInfo": [],
  "publicInfo": [],
  "agents": {},
  "targets": [],
  "sandbox": {}
}
privateInfo (required, non-empty array)
SDK source code and internal docs. Visible to generator and judge, never to executor. Each entry is a SourceConfig with a type discriminator:
Local source — filesystem path:
{ "type": "local", "path": "./src", "subpath": "packages/core", "additionalContext": "Focus on the Builder API" }
Fields: path (required), subpath (optional), additionalContext (optional)
Git source — clone a repository:
{ "type": "git", "url": "https://github.com/org/sdk.git", "branch": "main", "subpath": "src", "sparse": ["src/api"], "additionalContext": "..." }
Fields: url (required), branch, subpath, sparse (sparse checkout paths), additionalContext (all optional)
URL source — fetch documentation:
{ "type": "url", "url": "https://internal-docs.example.com/api-ref", "additionalContext": "..." }
Fields: url (required), additionalContext (optional)
Package source — metadata about the SDK package:
{ "type": "package", "name": "@example/sdk", "installCommand": "npm install @example/sdk", "language": "typescript", "additionalContext": "..." }
Fields: name (required), installCommand, language, additionalContext (all optional)
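The required fields for the four source types can be checked mechanically before running the pipeline. The sketch below is only an illustration built from the field lists above: `REQUIRED_FIELDS` and `check_source` are hypothetical names, not part of the agentic-usability CLI, and the tool's own validation may be stricter.

```python
# Required fields per SourceConfig type, copied from the field lists above.
# Optional fields (subpath, branch, sparse, additionalContext, ...) are not checked.
REQUIRED_FIELDS = {
    "local": ["path"],
    "git": ["url"],
    "url": ["url"],
    "package": ["name"],
}


def check_source(entry: dict) -> list[str]:
    """Return the problems found in one SourceConfig entry (empty list if none)."""
    source_type = entry.get("type")
    if source_type not in REQUIRED_FIELDS:
        return [f"unknown type: {source_type!r}"]
    return [
        f"{source_type} source is missing required field '{field}'"
        for field in REQUIRED_FIELDS[source_type]
        if not entry.get(field)
    ]
```

Running `check_source` over every entry in privateInfo and publicInfo gives a quick sanity check before handing the config to the tool.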
publicInfo (optional array)
Public docs and package info visible to both executor and judge. Same SourceConfig types as above. Typically includes:
- package source so executors know what to install
- url source for public documentation
agents (optional object)
| Role | Type | Runs in sandbox? | Secret required? |
|---|---|---|---|
| generator | AgentConfig | No (host) | No |
| executor | SandboxAgentConfig | Yes | Yes |
| judge | SandboxAgentConfig | Yes | Yes |
| insights | AgentConfig | No (host) | No |
AgentConfig fields (generator, insights):
- command (required): "claude", "codex", "gemini", or custom CLI name
- systemPrompt (optional): supports {{packageName}} and {{docsUrl}} placeholders
SandboxAgentConfig extends AgentConfig with a required secret:
{
  "command": "claude",
  "secret": { "value": "$ANTHROPIC_API_KEY" }
}
AgentSecretConfig fields:
- value (required): raw API key or "$ENV_VAR" reference
- envVar: env var name for key inside sandbox, auto-detected for known agents
- baseUrl: API base URL, auto-detected for known agents
- baseUrlEnvVar: env var for base URL override, auto-detected for known agents
Known agent defaults (auto-filled, user only needs value):
| command | envVar | baseUrl | baseUrlEnvVar |
|---|---|---|---|
| claude | ANTHROPIC_API_KEY | https://api.anthropic.com | ANTHROPIC_BASE_URL |
| codex | CODEX_API_KEY | https://api.openai.com/v1 | OPENAI_BASE_URL |
| gemini | GEMINI_API_KEY | https://generativelanguage.googleapis.com | GEMINI_API_BASE_URL |
Custom agents must explicitly set envVar and baseUrl in the secret.
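To make the auto-fill rule concrete, here is a minimal sketch of how a secret could be completed from the defaults table. `KNOWN_AGENT_DEFAULTS` and `resolve_secret` are hypothetical names, and the merge order (explicit fields win over defaults) is an assumption rather than documented behavior of the tool.

```python
# Defaults copied from the "Known agent defaults" table above.
KNOWN_AGENT_DEFAULTS = {
    "claude": {
        "envVar": "ANTHROPIC_API_KEY",
        "baseUrl": "https://api.anthropic.com",
        "baseUrlEnvVar": "ANTHROPIC_BASE_URL",
    },
    "codex": {
        "envVar": "CODEX_API_KEY",
        "baseUrl": "https://api.openai.com/v1",
        "baseUrlEnvVar": "OPENAI_BASE_URL",
    },
    "gemini": {
        "envVar": "GEMINI_API_KEY",
        "baseUrl": "https://generativelanguage.googleapis.com",
        "baseUrlEnvVar": "GEMINI_API_BASE_URL",
    },
}


def resolve_secret(command: str, secret: dict) -> dict:
    """Fill in envVar/baseUrl/baseUrlEnvVar for known agents; reject incomplete custom agents."""
    if not secret.get("value"):
        raise ValueError("secret.value is required")
    defaults = KNOWN_AGENT_DEFAULTS.get(command, {})
    # Assumption: fields set explicitly in the config override the defaults.
    resolved = {**defaults, **secret}
    missing = [key for key in ("envVar", "baseUrl") if key not in resolved]
    if missing:
        raise ValueError(f"custom agent '{command}' must set: {', '.join(missing)}")
    return resolved
```

For a known agent such as claude, passing only `{"value": "$ANTHROPIC_API_KEY"}` yields a fully populated secret. A custom command with the same input raises, which mirrors the rule that custom agents must set envVar and baseUrl themselves.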
targets (required, non-empty array)
Docker images for sandboxed execution:
{ "name": "node-20", "image": "node:20-slim", "timeout": 1200, "additionalContext": "Node.js 20 with npm" }
Fields: name (required), image (required), timeout (optional, seconds), additionalContext (optional, included in generator prompt)
sandbox (required object, can be {})
{
  "concurrency": 3,
  "defaultTimeout": 600,
  "memoryMib": 2048,
  "cpus": 2,
  "secrets": {
    "EXTRA_API_KEY": {
      "value": "$EXTRA_KEY",
      "allowHosts": ["api.extra-service.com"],
      "allowHostPatterns": ["*.extra-service.com"]
    }
  },
  "env": {
    "LICENSE_KEY": "$MY_LICENSE_KEY"
  }
}
- secrets: TLS-injected secrets that never enter the VM. Each needs value and non-empty allowHosts.
- env: Plain env vars passed directly into sandbox. Values can use $VAR to reference host env.
workspace (optional)
{ "template": "./workspace-template", "setupScript": "./setup.sh" }
For the full schema with all validation rules, see config-schema.md.
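The sandbox rules for secrets and env can be illustrated with a short sketch. It assumes that a reference is a value consisting entirely of `$NAME`, and that an unset host variable should be treated as an error. Neither detail is stated in this document, and `resolve_value` and `check_sandbox_secrets` are hypothetical helpers, not functions of the tool.

```python
import os
from typing import Mapping, Optional


def resolve_value(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Resolve a "$VAR" reference against the host environment; return other values unchanged."""
    env = os.environ if env is None else env
    if value.startswith("$"):
        name = value[1:]
        if name not in env:
            # Assumption: a missing host variable is an error rather than an empty string.
            raise KeyError(f"host environment variable {name} is not set")
        return env[name]
    return value


def check_sandbox_secrets(sandbox: dict) -> list[str]:
    """Check that each sandbox secret has a value and a non-empty allowHosts list."""
    problems = []
    for name, secret in sandbox.get("secrets", {}).items():
        if not secret.get("value"):
            problems.append(f"secrets.{name}: missing value")
        if not secret.get("allowHosts"):
            problems.append(f"secrets.{name}: allowHosts must be a non-empty list")
    return problems
```

An empty sandbox object passes the check, which matches the note that sandbox can be {}.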
Example config.json:
{
  "privateInfo": [
    { "type": "local", "path": "./sdk-source", "additionalContext": "Main SDK source code" }
  ],
  "publicInfo": [
    { "type": "package", "name": "my-sdk", "installCommand": "npm install my-sdk", "language": "typescript" },
    { "type": "url", "url": "https://docs.my-sdk.io/getting-started" }
  ],
  "agents": {
    "generator": { "command": "claude" },
    "executor": { "command": "claude", "secret": { "value": "$ANTHROPIC_API_KEY" } },
    "judge": { "command": "claude", "secret": { "value": "$ANTHROPIC_API_KEY" } }
  },
  "targets": [
    { "name": "node-20", "image": "node:20-slim", "timeout": 1200 }
  ],
  "sandbox": {
    "concurrency": 3,
    "defaultTimeout": 600
  }
}
After creating config.json, run agentic-usability generate -p <project> to create the test suite.
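When config.json is written by hand, a quick pre-flight check of the top-level rules stated above can catch mistakes before generate runs. This is a sketch under those stated rules only. `validate_config` is a hypothetical helper, it does not replace the validation in config-schema.md, and it checks executor and judge secrets only when those roles are configured.

```python
def validate_config(config: dict) -> list[str]:
    """Return the top-level problems in a parsed config.json (empty list if none)."""
    problems = []
    if not isinstance(config.get("privateInfo"), list) or not config["privateInfo"]:
        problems.append("privateInfo must be a non-empty array")
    if not isinstance(config.get("targets"), list) or not config["targets"]:
        problems.append("targets must be a non-empty array")
    else:
        for index, target in enumerate(config["targets"]):
            for field in ("name", "image"):
                if not target.get(field):
                    problems.append(f"targets[{index}] is missing required field '{field}'")
    if not isinstance(config.get("sandbox"), dict):
        problems.append("sandbox must be an object (it may be empty)")
    # executor and judge run in the sandbox, so they need a secret with a value.
    agents = config.get("agents", {})
    for role in ("executor", "judge"):
        agent = agents.get(role)
        if agent is not None and not agent.get("secret", {}).get("value"):
            problems.append(f"agents.{role} requires secret.value")
    return problems
```

A typical use is to load the file with `json.load` and print whatever `validate_config` returns. An empty list means the top-level shape matches the rules listed in this document.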