# beagle-testing

Executes YAML test plans: parses setup with prerequisites, env vars, builds, services, and health checks; runs tests sequentially, stopping on the first failure with rich debug output. Ideal for automated testing setups.

Install:

```bash
npx claudepluginhub existential-birds/beagle --plugin beagle-testing
```

This skill uses the workspace's default tool permissions.
Execute a YAML test plan, run setup commands, health checks, and each test sequentially. Stop on first failure with rich debug output.
Requires the agent-browser skill to be available.

Arguments:
- --plan <path>: Path to test plan (default: docs/testing/test-plan.yaml)
- --skip-setup: Skip setup commands and health checks (for re-running after failure)

Read and validate the test plan:
```bash
# Check file exists
ls docs/testing/test-plan.yaml || { echo "Error: Test plan not found"; exit 1; }

# Validate YAML
python3 -c "import yaml; yaml.safe_load(open('docs/testing/test-plan.yaml'))" || { echo "Error: Invalid YAML"; exit 1; }
```
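For orientation, here is a hedged sketch of a plan in the structured format. Only the key layout (setup.prerequisites, setup.env, setup.build, setup.services, tests) comes from this document; every project name, command, and URL is hypothetical:

```bash
# Write an illustrative plan to a scratch path; all commands, URLs,
# and test content below are invented for demonstration.
cat > /tmp/example-test-plan.yaml << 'EOF'
setup:
  prerequisites:
    - name: node
      check: command -v node
  env:
    BASE_URL: http://localhost:3000
  build:
    - npm install
    - npm run build
  services:
    - command: npm run start
      health_check:
        url: http://localhost:3000/health
        timeout: 30
tests:
  - id: TC-01
    name: Homepage responds
    context: Verifies the landing page after recent routing changes
    steps:
      - run: curl -s http://localhost:3000/
      - agent-browser open http://localhost:3000
      - agent-browser screenshot docs/testing/evidence/TC-01.png
    expected: Returns HTML containing the site title and the page renders
EOF
```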
Extract from the YAML:
- setup.commands: List of setup commands
- setup.health_checks: List of URLs to poll
- tests: Array of test cases

If setup.prerequisites exists, verify each one:
```bash
# For each prerequisite in setup.prerequisites
<prerequisite.check> || { echo "Prerequisite not met: <prerequisite.name>"; exit 1; }
```
If setup.env exists, export each variable. Variables using ${VAR} syntax should be resolved from the current environment:
```bash
# For each key/value in setup.env
export <key>="<value>"
```
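As a hedged sketch of that ${VAR} resolution, one option is to expand references with envsubst (from GNU gettext) before exporting; the variable names here are hypothetical:

```bash
# Hypothetical setup.env entry: DATABASE_URL: "postgres://app@${DB_HOST}/app"
# envsubst replaces ${DB_HOST} with its value from the current environment
# (or an empty string if it is unset).
value='postgres://app@${DB_HOST}/app'
export DATABASE_URL="$(printf '%s' "$value" | envsubst)"
echo "$DATABASE_URL"
```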
If setup.build exists, execute build commands sequentially:
```bash
# For each command in setup.build
<command> || { echo "Build failed: <command>"; exit 1; }
```
If setup.services exists, start long-running processes and wait for health checks:
```bash
# For each service in setup.services
mkdir -p .beagle   # ensure the log/pid directory exists
nohup <service.command> > .beagle/service-<index>.log 2>&1 &
echo $! > .beagle/service-<index>.pid
```
For each service with a health_check, poll until ready:
```bash
timeout=<service.health_check.timeout or 30>
url=<service.health_check.url>
elapsed=0
while [ $elapsed -lt $timeout ]; do
  if curl -s -o /dev/null -w "%{http_code}" "$url" | grep -qE "^(200|301|302)"; then
    echo "✓ Health check passed: $url"
    break
  fi
  sleep 2
  elapsed=$((elapsed + 2))
done
if [ $elapsed -ge $timeout ]; then
  echo "✗ Health check timeout: $url"
  exit 1
fi
```
If the plan uses the older flat format (setup.commands + setup.health_checks instead of prerequisites/build/services), fall back to executing setup.commands sequentially and polling setup.health_checks as before.
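A hedged sketch of that older flat format (the commands and URL are illustrative; only the setup.commands and setup.health_checks keys come from this document):

```bash
# Illustrative flat-format plan; tests are declared the same way as in
# the structured format.
cat > /tmp/example-flat-plan.yaml << 'EOF'
setup:
  commands:
    - npm install
    - npm run start &
  health_checks:
    - http://localhost:3000/health
tests:
  - id: TC-01
    name: Homepage responds
    steps:
      - run: curl -s http://localhost:3000/
    expected: Returns HTML containing the site title
EOF
```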
For each test in the plan, announce it:
## Running: TC-XX - <test.name>
Context: <test.context>
For each step in test.steps, determine the step type and execute accordingly:
Shell commands (run: steps):
The most common step type. Execute the command via Bash and capture stdout, stderr, and exit code:
```bash
# Execute the command, capture output and exit code
<command> 2>&1
echo "EXIT_CODE: $?"
```
All captured output feeds the expected-outcome evaluation in step 4c. Shell steps cover:
- CLI invocations (e.g., ./target/debug/myapp status --all)
- Database queries (e.g., psql "${DATABASE_URL}" -c "SELECT ...")
- Filesystem checks (e.g., ls -la /path/to/expected/output)
- Commands allowed to fail or run long (e.g., timeout 5 ./myapp 2>&1 || true)

curl actions (action: curl steps):
```bash
curl -X <method> \
  -H "Content-Type: application/json" \
  <additional headers> \
  -d '<body>' \
  "<url>" \
  -o response.json \
  -w "%{http_code}" > status_code.txt

# Capture response for evaluation
cat response.json
cat status_code.txt
```
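For instance, a hypothetical login step might expand the template like this (endpoint and payload are invented for illustration):

```bash
# Illustrative instantiation of the curl template above.
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"username": "test", "password": "secret"}' \
  "http://localhost:3000/api/login" \
  -o response.json \
  -w "%{http_code}" > status_code.txt

cat status_code.txt   # e.g. 200
cat response.json
```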
agent-browser CLI actions:
Steps starting with agent-browser are browser automation commands:
```bash
# Navigate
agent-browser open <url>

# Snapshot interactive elements (always do before interacting)
agent-browser snapshot -i

# Interact using refs from snapshot output (@e1, @e2, etc.)
agent-browser fill @<ref> "<value>"
agent-browser click @<ref>

# Wait for conditions
agent-browser wait --url "<pattern>"
agent-browser wait --text "<text>"
agent-browser wait --load networkidle

# Capture evidence
agent-browser screenshot docs/testing/evidence/<test.id>.png
```
Important: Always run agent-browser snapshot -i before interacting with elements to get valid refs, and re-snapshot after navigation or significant DOM changes.
Save screenshots to docs/testing/evidence/<test.id>.png
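As a hedged end-to-end sketch, a login test might use the commands above like this (the URL, @e refs, and expected text are hypothetical; real refs must be read from the snapshot output):

```bash
# Illustrative login flow; refs such as @e1 come from the snapshot.
agent-browser open http://localhost:3000/login
agent-browser snapshot -i            # e.g. reveals @e1 (user), @e2 (password), @e3 (submit)
agent-browser fill @e1 "testuser"
agent-browser fill @e2 "secret"
agent-browser click @e3
agent-browser wait --text "Welcome"
agent-browser screenshot docs/testing/evidence/TC-02.png
```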
Using agent reasoning, compare the actual outcome against test.expected:
- If the test passed, report `✓ TC-XX PASSED: <test.name>` and continue to the next test.
- If the test failed, stop immediately and go to Step 6.

When all tests pass, output the final report:
## Test Results: ALL PASSED
| ID | Name | Result |
|----|------|--------|
| TC-01 | <name> | ✓ PASS |
| TC-02 | <name> | ✓ PASS |
| ... | ... | ... |
**Total:** N/N tests passed
### Evidence
Screenshots saved to `docs/testing/evidence/`
### Cleanup
Stopping background services...
Clean up:
```bash
# Kill background services
for pidfile in .beagle/service-*.pid .beagle/dev-server.pid; do
  if [ -f "$pidfile" ]; then
    kill $(cat "$pidfile") 2>/dev/null
    rm "$pidfile"
  fi
done
```
### 6a. Generate Debug Report

When a test fails, generate rich debug output:
```bash
# Get changed files relevant to the failure
git diff --name-only $(git merge-base HEAD origin/main)..HEAD

# Get recent changes in files mentioned in test.context
git diff $(git merge-base HEAD origin/main)..HEAD -- <relevant_files>
```
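To fill in the "(lines X-Y)" detail used in the report below, one hedged option is to read hunk headers from a zero-context diff:

```bash
# Hunk headers (@@ -a,b +c,d @@) give changed line ranges per file.
git diff -U0 $(git merge-base HEAD origin/main)..HEAD -- <relevant_files> \
  | grep -E '^(\+\+\+|@@)'
```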
## Test Failure: TC-XX - <test.name>
### What Failed
**Test:** <test.name>
**Expected:**
<test.expected>
**Actual:**
<Describe what actually happened - response code, error message, screenshot description>
### Relevant Changes in This PR
<For each file mentioned in test.context or related to the failure:>
- `<file>` (lines X-Y) - <brief description of changes>
### Evidence
<If screenshot exists:>
- Screenshot: `docs/testing/evidence/<test.id>.png`
<If API response:>
- Status code: <code>
- Response body:
```json
<response>
```
### Suggested Checks
<Based on the error, suggest 2-3 specific things to check>
### 6b. Debug Prompt

Copy this to start a new Claude session:

```
I'm debugging a test failure in branch <branch>.
Test: <test.name>
Error: <error message>
Relevant files: <file list>
```
### 6c. Preserve Evidence
```bash
# Ensure evidence directory exists
mkdir -p docs/testing/evidence
# Save failure context
cat > docs/testing/evidence/<test.id>-failure.md << 'EOF'
# Failure Report: <test.id>
<Full debug report content>
EOF
# Kill background services
for pidfile in .beagle/service-*.pid .beagle/dev-server.pid; do
  if [ -f "$pidfile" ]; then
    kill $(cat "$pidfile") 2>/dev/null
    rm "$pidfile"
  fi
done
```
Always output a summary table showing progress:
## Test Results
| ID | Name | Result |
|----|------|--------|
| TC-01 | <name> | ✓ PASS |
| TC-02 | <name> | ✗ FAIL |
| TC-03 | <name> | - SKIP |
**Passed:** 1/3
**Failed:** TC-02
Tests after a failure are marked as SKIP (not executed).
Before completing:
```bash
# Verify evidence directory exists
ls -la docs/testing/evidence/

# List captured evidence
ls docs/testing/evidence/*.png docs/testing/evidence/*.md 2>/dev/null
```
Verification Checklist:
- Evidence captured in docs/testing/evidence/
- On failure, use the --skip-setup flag to re-run after fixing issues