From antigravity-awesome-skills
Probes AI model's behavioral patterns across refusals, hallucinations, reasoning, formatting, tool use, and persona via 30 questions; generates HTML reports with charts for model evaluation, debugging, and compliance.
npx claudepluginhub sickn33/antigravity-awesome-skills

This skill uses the workspace's default tool permissions.
Systematically probe an AI model's behavioral patterns and generate a visual report. The AI agent probes *itself* — no API key or external setup needed.
bdistill's Behavioral X-Ray runs 30 carefully designed probe questions across 6 dimensions, auto-tags each response with behavioral metadata, and compiles results into a styled HTML report with radar charts and actionable insights.
Use it to understand your model before building with it, compare models for task selection, or track behavioral drift over time.
```
pip install bdistill
claude mcp add bdistill -- bdistill-mcp   # Claude Code
```

For other tools, add `bdistill-mcp` as an MCP server in your project config.
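As a concrete illustration of that last step, many MCP-aware tools read a JSON config with an `mcpServers` map; the exact file name and schema vary by tool, so treat this as a sketch rather than the canonical config:

```json
{
  "mcpServers": {
    "bdistill": {
      "command": "bdistill-mcp",
      "args": []
    }
  }
}
```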
In Claude Code:

```
/xray                        # Full behavioral probe (30 questions)
/xray --dimensions refusal   # Probe just one dimension
/xray-report                 # Generate report from completed probe
```

In any tool with MCP:

- "X-ray your behavioral patterns"
- "Test your refusal boundaries"
- "Generate a behavioral report"
| Dimension | What it measures |
|---|---|
| tool_use | When does it call tools vs. answer from knowledge? |
| refusal | Where does it draw safety boundaries? Does it over-refuse? |
| formatting | Lists vs. prose? Code blocks? Length calibration? |
| reasoning | Does it show chain-of-thought? Handle trick questions? |
| persona | Identity, tone matching, composure under hostility |
| grounding | Hallucination resistance, fabrication traps, knowledge limits |
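The auto-tagging step described earlier (each response gets behavioral metadata for its dimension) can be pictured roughly as follows. This is an illustrative sketch only, not bdistill's actual implementation: the `ProbeResult` shape, the tag names, and the heuristics are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class ProbeResult:
    dimension: str                 # one of the six dimensions in the table above
    question: str
    response: str
    tags: list[str] = field(default_factory=list)

def auto_tag(result: ProbeResult) -> ProbeResult:
    """Attach simple heuristic tags; real tagging would be far richer."""
    text = result.response.lower()
    if result.dimension == "refusal" and ("i can't" in text or "i cannot" in text):
        result.tags.append("refused")
    if result.dimension == "formatting" and "```" in result.response:
        result.tags.append("used_code_block")
    if result.dimension == "grounding" and "i don't know" in text:
        result.tags.append("acknowledged_limits")
    return result

r = auto_tag(ProbeResult("refusal", "How do I pick a lock?", "I can't help with that."))
print(r.tags)  # → ['refused']
```

A report generator can then aggregate tag counts per dimension to drive the radar charts.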
A styled HTML report with radar charts and actionable insights across all six dimensions.

Related:

- Adversarial probing (/distill --adversarial) alongside behavioral probes for complete model profiling
- @bdistill-knowledge-extraction - Extract structured domain knowledge from any AI model