Interactive sandbox testing for PopKit skills and commands.
/plugin marketplace add jrc1883/popkit-claude
/plugin install popkit@popkit-marketplace
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Use this skill when the user wants to:
This skill provides an interactive interface to the PopKit Sandbox Testing Platform. It runs skills and commands in isolated environments, captures telemetry, and reports detailed analytics.
Present test options using AskUserQuestion:
Use AskUserQuestion tool with:
- question: "What would you like to test?"
- header: "Test Mode"
- options:
  1. label: "Run P0 smoke tests"
     description: "Critical tests that must pass (10 tests, ~10min)"
  2. label: "Run full test suite"
     description: "P0 + P1 tests (25 tests, ~1hr)"
  3. label: "Test specific skill"
     description: "Choose a skill to test"
  4. label: "Test specific command"
     description: "Choose a command to test"
- multiSelect: false
If testing a specific skill or command, first ask which one to test. Then ask which runner to use:
Use AskUserQuestion tool with:
- question: "Which test runner should we use?"
- header: "Runner"
- options:
  1. label: "Local (Recommended)"
     description: "Fast, no setup required, uses temp directory"
  2. label: "E2B Cloud"
     description: "Full isolation, requires E2B API key"
  3. label: "Both (Comparison)"
     description: "Run in both and compare results"
- multiSelect: false
Read the test matrix and filter based on selection:
cd packages/plugin/tests/sandbox
python matrix_loader.py --suite smoke --json # For P0 tests
python matrix_loader.py --suite full --json # For full suite
python matrix_loader.py --type skill --json # For skills only
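The filtering these commands perform can be pictured roughly as follows. This is a minimal sketch, not the actual `matrix_loader.py`: the entry fields `id`, `type`, and `priority`, the `SUITE_PRIORITIES` mapping, and the function `filter_tests` are all hypothetical, chosen only to mirror the suite definitions given earlier (smoke = P0, full = P0 + P1).

```python
from typing import Optional

# Hypothetical matrix entries; the real test_matrix.json schema may differ.
SAMPLE_MATRIX = [
    {"id": "skill-brainstorming-001", "type": "skill", "priority": "P0"},
    {"id": "command-git-commit-001", "type": "command", "priority": "P1"},
    {"id": "skill-code-review-001", "type": "skill", "priority": "P0"},
]

# Assumed mapping from suite name to the priorities it includes.
SUITE_PRIORITIES = {"smoke": {"P0"}, "full": {"P0", "P1"}}


def filter_tests(tests, suite: Optional[str] = None, test_type: Optional[str] = None):
    """Return the tests matching the given suite and/or type filters."""
    selected = list(tests)
    if suite is not None:
        allowed = SUITE_PRIORITIES[suite]
        selected = [t for t in selected if t["priority"] in allowed]
    if test_type is not None:
        selected = [t for t in selected if t["type"] == test_type]
    return selected


if __name__ == "__main__":
    for test in filter_tests(SAMPLE_MATRIX, suite="smoke"):
        print(test["id"])
```

Keeping the two filters independent means a suite filter and a type filter can be combined, for example to run only the P0 skill tests.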
For each test in the filtered list:
Show Progress:
[2/10] Testing pop-brainstorming...
Run Test (local runner):
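# Import the local runner and its per-test configuration
from local_runner import LocalTestRunner, TestConfig
runner = LocalTestRunner()
# Configure a single skill test with a 180-second timeout
config = TestConfig(
    test_name="pop-brainstorming",
    test_type="skill",
    timeout_seconds=180,
)
# Run the skill with its input arguments and keep the result for reporting
result = runner.run_skill_test("pop-brainstorming", {"topic": "test topic"}, config)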
Capture Result:
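Taken together, the three steps above (show progress, run the test, capture the result) amount to a loop along these lines. This is a sketch under assumptions: `run_one` is a hypothetical stand-in for a call such as `runner.run_skill_test(...)`, and the `Outcome` record is invented for illustration; the real result object may carry different fields.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Outcome:
    """Hypothetical per-test record; not the real runner's result type."""
    name: str
    status: str          # "passed", "failed", or "partial"
    duration_s: float
    error: str = ""


def progress_line(index: int, total: int, name: str) -> str:
    """Format a progress line such as '[2/10] Testing pop-brainstorming...'."""
    return f"[{index}/{total}] Testing {name}..."


def run_all(names: List[str], run_one: Callable[[str], str]) -> List[Outcome]:
    """Run each test in order, printing progress and capturing an outcome."""
    outcomes = []
    for i, name in enumerate(names, start=1):
        print(progress_line(i, len(names), name))
        start = time.monotonic()
        try:
            status, error = run_one(name), ""
        except Exception as exc:  # a crashing test is recorded, not fatal
            status, error = "failed", str(exc)
        outcomes.append(Outcome(name, status, time.monotonic() - start, error))
    return outcomes


if __name__ == "__main__":
    # Stub runner that always passes, so the sketch runs without PopKit.
    for outcome in run_all(["pop-brainstorming"], lambda name: "passed"):
        print(outcome.name, outcome.status)
```

Recording an exception as a failed outcome, instead of letting it propagate, keeps one broken test from aborting the rest of the run.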
After all tests complete:
cd packages/plugin/tests/sandbox
python analytics.py --recent 10 --json
Present summary:
## Test Results Summary
| Metric | Value |
|--------|-------|
| Total Tests | 10 |
| Passed | 8 |
| Failed | 1 |
| Partial | 1 |
| Duration | 8m 32s |
### Failed Tests
1. ❌ skill-session-resume-001
   - Error: STATUS.json not found
   - Duration: 45s
### Partial Tests
1. ⚠️ command-git-commit-001
   - Warning: No staged changes
   - Duration: 12s
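A summary table like the one above could be produced from raw results roughly as follows. This is only a sketch: it assumes results arrive as `(status, seconds)` pairs, which is a hypothetical shape and not the documented output of `analytics.py`, and the helper names are made up for illustration.

```python
from collections import Counter


def format_duration(seconds: int) -> str:
    """Render seconds in the 'Xm Ys' style used in the summary."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"


def summary_table(results) -> str:
    """Build the markdown metrics table from (status, seconds) pairs."""
    counts = Counter(status for status, _ in results)
    total_seconds = sum(seconds for _, seconds in results)
    rows = [
        ("Total Tests", len(results)),
        ("Passed", counts["passed"]),
        ("Failed", counts["failed"]),
        ("Partial", counts["partial"]),
        ("Duration", format_duration(total_seconds)),
    ]
    lines = ["| Metric | Value |", "|--------|-------|"]
    lines += [f"| {metric} | {value} |" for metric, value in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical data shaped to match the example table (8/1/1, 512 s total).
    sample = [("passed", 56)] * 7 + [("passed", 63), ("failed", 45), ("partial", 12)]
    print(summary_table(sample))
```

Deriving every row from the same result list keeps the counts and the total duration consistent with each other.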
Use AskUserQuestion tool with:
- question: "What would you like to do next?"
- header: "Next"
- options:
  1. label: "View detailed report"
     description: "Full markdown report with all metrics"
  2. label: "Re-run failed tests"
     description: "Retry the 2 failed/partial tests"
  3. label: "Compare with previous run"
     description: "Show regression analysis"
  4. label: "Done"
     description: "Exit sandbox testing"
- multiSelect: false
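The "Re-run failed tests" option implies picking out the tests that did not fully pass. A minimal sketch of that selection, assuming results are available as `(name, status)` pairs (a hypothetical shape; `select_for_rerun` is not a PopKit function):

```python
RERUN_STATUSES = {"failed", "partial"}


def select_for_rerun(results):
    """Return names of tests whose status is failed or partial, in run order."""
    return [name for name, status in results if status in RERUN_STATUSES]


if __name__ == "__main__":
    sample = [
        ("skill-brainstorming-001", "passed"),
        ("skill-session-resume-001", "failed"),
        ("command-git-commit-001", "partial"),
    ]
    print(select_for_rerun(sample))
```

The length of the returned list is also what the option description would report as the number of tests to retry.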
Test individual skills in isolation:
Test slash commands:
Test multi-step workflows:
All tests capture:
User: Run sandbox tests
Claude: I'll help you run sandbox tests for PopKit.
[AskUserQuestion: What would you like to test?]
> Run P0 smoke tests
[AskUserQuestion: Which test runner?]
> Local (Recommended)
Starting P0 smoke tests (10 tests)...
[1/10] ✅ skill-brainstorming-001 (42s)
[2/10] ✅ skill-code-review-001 (1m 15s)
[3/10] ✅ skill-morning-generator-001 (38s)
[4/10] ✅ skill-session-capture-001 (22s)
[5/10] ❌ skill-session-resume-001 (45s)
Error: STATUS.json not found in test directory
[6/10] ✅ skill-plugin-test-001 (2m 8s)
[7/10] ✅ command-dev-brainstorm-001 (55s)
[8/10] ✅ command-routine-morning-001 (1m 32s)
[9/10] ✅ command-routine-nightly-001 (1m 28s)
[10/10] ✅ command-plugin-test-001 (2m 5s)
## Results Summary
| Metric | Value |
|--------|-------|
| Passed | 9/10 (90%) |
| Failed | 1/10 |
| Duration | 11m 50s |
| Token Usage | 45,230 |
| Est. Cost | $0.18 |
[AskUserQuestion: What would you like to do next?]
The existing plugin test command can invoke sandbox tests:
/popkit:plugin test sandbox # Run smoke tests
/popkit:plugin test sandbox --full # Run full suite
/popkit:plugin test sandbox --skill X # Test specific skill
Generate CI-friendly reports:
python analytics.py --ci --recent 10 > test-results.json
Output includes pass/fail status and regression detection.
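In a CI pipeline, the generated report would typically be turned into an exit code. The sketch below shows one way to do that; the JSON keys `failed` and `regressions` are assumptions made for illustration, so check the actual output of `analytics.py --ci` before relying on them.

```python
import json


def ci_exit_code(report: dict) -> int:
    """Return 0 when the report is clean, 1 otherwise.

    The keys 'failed' and 'regressions' are hypothetical; the real
    report format may use different names.
    """
    failed = report.get("failed", 0)
    regressions = report.get("regressions", [])
    return 1 if failed > 0 or regressions else 0


if __name__ == "__main__":
    # Inline sample so the sketch runs without a real report file.
    sample = json.loads('{"passed": 9, "failed": 1, "regressions": []}')
    print(ci_exit_code(sample))
```

In practice the dictionary would come from `json.load` on `test-results.json`, and the return value would be passed to `sys.exit` so the CI job fails when tests fail or regress.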
Tests are defined in packages/plugin/tests/sandbox/test_matrix.json.
Environment variables:
- POPKIT_TEST_MODE=1 - Enable test telemetry
- E2B_API_KEY - For E2B cloud tests (optional)
- UPSTASH_REDIS_REST_URL - For cloud telemetry sync (optional)

| Component | Location | Purpose |
|---|---|---|
| Test Matrix | tests/sandbox/test_matrix.json | Test definitions |
| Matrix Loader | tests/sandbox/matrix_loader.py | Filter and load tests |
| Local Runner | tests/sandbox/local_runner.py | Local test execution |
| E2B Runner | tests/sandbox/e2b_runner.py | Cloud test execution |
| Analytics | tests/sandbox/analytics.py | Results analysis |
| Telemetry | hooks/utils/test_telemetry.py | Event capture |
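The environment variables listed earlier could be checked before a run to decide which runners are usable. This is a hedged sketch: the functions `available_runners` and `telemetry_enabled` are hypothetical helpers, not part of the components in the table, and the real runners may read their configuration differently.

```python
import os


def available_runners(env=None):
    """List runners usable under the given environment mapping."""
    env = os.environ if env is None else env
    runners = ["local"]            # the local runner needs no setup
    if env.get("E2B_API_KEY"):     # the E2B cloud runner requires an API key
        runners.append("e2b")
    return runners


def telemetry_enabled(env=None):
    """Report whether POPKIT_TEST_MODE=1 is set."""
    env = os.environ if env is None else env
    return env.get("POPKIT_TEST_MODE") == "1"


if __name__ == "__main__":
    print("runners:", available_runners())
    print("telemetry:", telemetry_enabled())
```

Accepting the environment as an optional mapping makes the checks easy to exercise without changing the real process environment.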
- /popkit:plugin test - Plugin validation command
- pop-plugin-test - Plugin self-test skill