A plugin to find bugs in a codebase using property-based testing
You can install this plugin either from the all-plugins marketplace or from its own auto-generated marketplace JSON. A marketplace is a collection of plugins: add a marketplace once (step 1), then install any plugin from it (step 2).

Option 1: Install from the all-plugins marketplace. This is a one-time setup that gives you access to all plugins; use it if you plan to install multiple plugins now or later.

Step 1: Add the marketplace (one-time)

/plugin marketplace add https://claudepluginhub.com/marketplaces/all.json

Step 2: Install this plugin

/plugin install hypo-plugin@all

Option 2: Install from this plugin's auto-generated marketplace JSON. Use this if you only want to try this specific plugin.

Step 1: Add this plugin's marketplace

/plugin marketplace add https://claudepluginhub.com/marketplaces/plugins/hypo-plugin.json

Step 2: Install the plugin

/plugin install hypo-plugin@hypo-plugin
Get a coding agent to find bugs in your codebase by mining properties and testing them via Hypothesis.
For the artifacts from the paper, including bug reports and rankings, see the `paper/` directory. Note that the code that was used in the paper is slightly behind what is in the main folder; see `paper/README.md` for more details.
To see all the bugs our agent found, see our website.
The agent is a Claude Code command, so you will need Claude Code installed to run it, along with either a subscription or an API key (we recommend an API key if you are running it over a large number of packages, or to reproduce the paper).
The command is contained in the `hypo.md` file. You will need to place this file in the `.claude/commands/` directory, which can be either in `~` or in whichever directory you are running the agent from. The agent can then be invoked with `/hypo <target>`.

You will need `pytest`, `hypothesis`, and the package you are testing installed.
The agent takes one argument: the target to test. This can be a file, a function, or a module. If no argument is given, it tests the entire codebase, i.e., the current working directory. You can also pass whatever other arguments Claude Code supports, such as the model, permissions, etc.
Example usage:
claude "/hypo numpy"
claude "/hypo statistics.median" --model opus
You can also just start Claude Code, and then invoke the agent.
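The agent's findings are backed by ordinary Hypothesis property tests that it runs with pytest. To give a flavor of that style, here is a minimal sketch; the target and property below are illustrative, not actual agent output:

```python
# test_median_property.py -- run with `pytest test_median_property.py`
import statistics

from hypothesis import given, strategies as st


@given(st.lists(st.floats(min_value=-1e6, max_value=1e6, allow_nan=False),
                min_size=1))
def test_median_lies_between_min_and_max(xs):
    # Property: for any non-empty list, the median is bounded by min and max.
    m = statistics.median(xs)
    assert min(xs) <= m <= max(xs)
```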
The `run.py` script is a wrapper around the agent for testing multiple packages in parallel; it is what was used in the paper. The script has no requirements beyond the standard library (of course, you still need to have Claude Code installed). You need `python3` and `pip` to be on your `PATH`.
Note that the runner operates at the module level.
The only required argument is the path to a JSON file listing the packages to test and which modules to test within each package. It looks like:
{
    "pathlib": {
        "type": "stdlib",
        "modules": ["pathlib"]
    },
    "numpy": {
        "type": "pypi",
        "modules": ["numpy"]
    }
}
The keys in the JSON file are the package names, either the standard library name or the PyPI name. For standard library packages, specify "stdlib", and for PyPI packages, specify "pypi". This matters because it tells the runner how to set up the virtual environment.
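For illustration, here is a minimal sketch of reading such a file and separating stdlib from PyPI packages. This is not run.py's actual code; the helper name and checks are illustrative:

```python
import json
from pathlib import Path

def load_packages(path="packages.json"):
    """Load the packages JSON and lightly check its shape."""
    packages = json.loads(Path(path).read_text())
    for name, spec in packages.items():
        if spec.get("type") not in {"stdlib", "pypi"}:
            raise ValueError(f"{name}: 'type' must be 'stdlib' or 'pypi'")
        if not spec.get("modules"):
            raise ValueError(f"{name}: 'modules' must be a non-empty list")
    return packages

packages = load_packages()
stdlib = [n for n, s in packages.items() if s["type"] == "stdlib"]
pypi = [n for n, s in packages.items() if s["type"] == "pypi"]
print(f"{len(stdlib)} stdlib and {len(pypi)} PyPI packages to test")
```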
The runner takes three optional arguments:

- `--max-workers`: the number of parallel workers to use. Default is 20.
- `--model`: the model to use. Default is "opus".
- `--preinstall-workers`: the number of parallel workers to use for setting up the virtual environments. Default is 10.

The runner writes all bug reports to the `results/` directory.
Example usage:
python run.py packages.json
In the `example_packages/` directory, there are some example package JSON files to test:

- `packages_mini.json`: a mini set of modules to test (this took 6 minutes to run with default settings)
- `packages_10k.json`: the top 10,000 PyPI packages, with the main module and all submodules one level deep

The packages tested in the paper are in the `paper/` directory.
The runner sets up virtual environments, with `venv`, for each package. Standard library packages all share one virtual environment, while each PyPI package gets its own. The runner also installs `pytest` and `hypothesis` in each virtual environment. It does this in parallel, which is controllable; see the CLI arguments above.

It then sets up worker directories, up to the specified maximum number of workers (again, see the CLI arguments above), each of which is a "sandbox" for the agent to run in. The agent only has permission to edit files within this sandbox. Each worker directory also contains `.claude/commands/hypo.md`, so that the agent can run. The runner parallelizes across modules.
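As a rough illustration of the preinstall step described above, here is a simplified sketch of creating per-package virtual environments in parallel. The directory layout and helper names are assumptions for the example, not run.py's actual code, and POSIX paths are assumed:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def create_venv(pkg: str, pkg_type: str, root: Path = Path("venvs")) -> Path:
    """Create a virtual environment for one package and install test deps."""
    # Standard library packages share one environment; PyPI packages get their own.
    venv_dir = root / ("stdlib" if pkg_type == "stdlib" else pkg)
    if not venv_dir.exists():
        subprocess.run([sys.executable, "-m", "venv", str(venv_dir)], check=True)
        pip = venv_dir / "bin" / "pip"  # POSIX layout; Windows uses Scripts/
        deps = ["pytest", "hypothesis"] + ([pkg] if pkg_type == "pypi" else [])
        subprocess.run([str(pip), "install", *deps], check=True)
    return venv_dir

packages = {"pathlib": "stdlib", "numpy": "pypi"}  # as parsed from packages.json
with ThreadPoolExecutor(max_workers=10) as pool:  # cf. --preinstall-workers
    futures = {pkg: pool.submit(create_venv, pkg, t) for pkg, t in packages.items()}
    for pkg, fut in futures.items():
        print(pkg, "->", fut.result())
```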
Note that the runner also checks if the module has already been tested, and skips it if so. So, you can easily resume a run by just running the runner again.
The runner calls the agent with restricted permissions. It only has permission to read/write/edit files in the sandbox in which it is called, and it also has read permission to the virtual environment, so that it can read the source code of the package. Furthermore, it can only write/edit `.py` and `.md` files. The only bash commands it can run are `python` and `pytest`. Note that, because of how the virtual environments are set up, the Python command will be `python`. Lastly, it also has access to the `Todo` and `WebFetch` tools.
You should still be careful with the runner, because running arbitrary code is dangerous!
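For orientation, here is a stripped-down sketch of how a runner might invoke the agent per module from inside a worker sandbox. It shows only the documented `claude "/hypo <target>"` and `--model` pieces; the permission restrictions the real run.py passes are omitted, and the worker directory path is hypothetical:

```python
import subprocess
from pathlib import Path

def run_agent(module: str, workdir: Path, model: str = "opus") -> int:
    """Invoke the /hypo command on one module from inside a worker sandbox.

    Assumes workdir already contains .claude/commands/hypo.md and that
    `claude` is on PATH. A headless runner would typically also use Claude
    Code's non-interactive (print) mode; that is left out here.
    """
    cmd = ["claude", f"/hypo {module}", "--model", model]
    result = subprocess.run(cmd, cwd=workdir)
    return result.returncode

# Hypothetical worker directory; the real runner creates these itself.
print(run_agent("statistics.median", Path("workers/worker_0")))
```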
In the `results/` directory, there will be a directory named after each package. Each of these has the following structure:

- `bug_reports/`
- `logs/`
  - `claude_call_$id.json`: the log of the Claude Code call corresponding to this id
- `aux_files/`
  - `$id/`
- `call_mappings.jsonl`, with the following format:
  - `call_id`: the Claude Code call id
  - `module`: the module tested
  - `timestamp`: the date executed
  - `bug_reports`: the filename of any bug reports in the `bug_reports/` directory written by this Claude Code call
  - `aux_files_dir`: the directory containing all files written by the agent during the Claude Code call corresponding to this id
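Since `call_mappings.jsonl` is plain JSON Lines, tracing a bug report back to the Claude Code call that produced it is straightforward. A small illustrative sketch, using the field names above; the package directory and report filename are hypothetical:

```python
import json
from pathlib import Path

def find_call_for_report(package_dir: Path, report_name: str):
    """Return the call_mappings.jsonl entry that produced a given bug report."""
    for line in (package_dir / "call_mappings.jsonl").read_text().splitlines():
        entry = json.loads(line)
        reports = entry.get("bug_reports") or []
        if isinstance(reports, str):  # tolerate a single filename or a list
            reports = [reports]
        if report_name in reports:
            return entry
    return None

entry = find_call_for_report(Path("results/numpy"), "bug_report_example.md")
if entry is not None:
    print("log:", Path("results/numpy/logs") / f"claude_call_{entry['call_id']}.json")
    print("aux files:", entry["aux_files_dir"])
```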
To score the bug reports, you can run `python scoring.py results/`. This uses the rubric contained in that file and passes it to Claude (not Claude Code, just the Claude API). The script outputs a CSV file containing the scores for each bug report, as well as the reasoning.
It takes the following arguments:

- `--retry-failures`: if set, it will retry the bug reports that failed to score. This requires the CSV file to already exist, as it checks for failed scores in the CSV file.
- `reports_dir`: the directory containing the bug reports to score. Default is "results/".
- `--max-workers`: the number of parallel workers to use. Default is 20.
- `--model`: the model to use. Default is "claude-opus-4-1" (note that model names are different when using the Claude API directly).
- `--csv-path`: the path to the CSV file to write the results to. Default is "scoring_results.csv".

Example usage:
python scoring.py results/
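To illustrate the scoring approach of passing a rubric plus a report to the Claude API, here is a minimal sketch using the anthropic Python SDK. The rubric text, prompt wording, and report path are assumptions for the example, not scoring.py's actual implementation:

```python
import anthropic

# Illustrative stand-in for the rubric; the real rubric lives in scoring.py.
RUBRIC = "Score this bug report for validity and severity, and explain your reasoning."

def score_report(report_text: str, model: str = "claude-opus-4-1") -> str:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    message = client.messages.create(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": f"{RUBRIC}\n\n{report_text}"}],
    )
    return message.content[0].text

# Hypothetical report path, for illustration.
print(score_report(open("results/numpy/bug_reports/example.md").read()))
```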
1.0.0