From harness-claude
Guides property-based testing with fast-check (JS/TS), hypothesis (Python), and proptest (Rust) to verify invariants such as round-trip and idempotence on pure functions, serializers, and large input spaces.
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude

This skill uses the workspace's default tool permissions.
> Property-based and generative testing with fast-check, hypothesis, and automatic shrinking. Discovers edge cases that example-based tests miss by generating thousands of random inputs and verifying invariants hold for all of them.
Catalog candidate functions. Search for functions that exhibit testable properties:
- `decode(encode(x)) === x` (round-trip property)

Identify properties for each candidate. Common property categories:
- Round-trip: `deserialize(serialize(x)) === x` for any valid `x`
- Idempotence: `f(f(x)) === f(x)` (applying the function twice gives the same result)
- Commutativity: `f(a, b) === f(b, a)` for operations where order should not matter
- Monotonicity: if `a <= b`, then `f(a) <= f(b)` for order-preserving functions
- Equivalence: `fastImpl(x) === referenceImpl(x)` for optimized implementations

Define input domains. For each property, specify:
Prioritize by risk. Focus property tests on:
Report findings. List candidate functions, their properties, and the expected generator configuration.
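The property categories above can be sketched without any framework. The following is a minimal illustration using only the Python standard library, not Hypothesis or fast-check; `first_failure`, `gen_ints`, `gen_pair`, and `insertion_sort` are hypothetical names introduced for this example, and the functions under test (`json`, `sorted`, `max`) are stand-ins for real project code.

```python
import json
import random


def first_failure(prop, gen, runs=200, seed=0):
    """Run `prop` on `runs` generated inputs; return the first failing input, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        value = gen(rng)
        if not prop(value):
            return value
    return None


def gen_ints(rng):
    """Hypothetical generator: a list of 0 to 20 small integers."""
    return [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 20))]


def gen_pair(rng):
    """Hypothetical generator: a pair of small integers."""
    return (rng.randint(-1000, 1000), rng.randint(-1000, 1000))


def insertion_sort(xs):
    """Stand-in for an optimized implementation checked against a reference."""
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def is_monotonic(pair):
    """If a <= b, then f(a) <= f(b), here with f(x) = 2x + 1."""
    a, b = sorted(pair)
    return 2 * a + 1 <= 2 * b + 1


failures = {
    "round_trip": first_failure(lambda xs: json.loads(json.dumps(xs)) == xs, gen_ints),
    "idempotence": first_failure(lambda xs: sorted(sorted(xs)) == sorted(xs), gen_ints),
    "commutativity": first_failure(lambda p: max(p[0], p[1]) == max(p[1], p[0]), gen_pair),
    "monotonicity": first_failure(is_monotonic, gen_pair),
    "equivalence": first_failure(lambda xs: insertion_sort(xs) == sorted(xs), gen_ints),
}

# A property that does NOT hold: subtraction is not commutative.
not_commutative = first_failure(lambda p: p[0] - p[1] == p[1] - p[0], gen_pair)
```

Every entry in `failures` stays `None` because the properties hold, while `not_commutative` captures a concrete counterexample pair. A real framework adds shrinking and smarter generation on top of this loop.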
Select the property testing framework. Based on the project's language:
Define custom generators (arbitraries) for domain types. For each domain model:
- Compose generators with `map`, `flatMap`, and `filter`

Write property test specifications. For each property identified in Phase 1:
Configure shrinking. Ensure the framework's automatic shrinking is enabled:
- Prefer `map` over `filter` where possible, since `filter` can break shrinking

Write seed values for reproducibility. Configure:
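The generator and seed guidance in the steps above can be illustrated with a small standard-library sketch. It is not framework code: `gen_range_by_filter`, `gen_range_by_construction`, and `sample` are hypothetical helpers, and the `(start, end)` range type is an assumed domain model. The first generator mimics `filter` by rejecting invalid draws; the second mimics `map` by building only valid values.

```python
import random


def gen_range_by_filter(rng, max_attempts=1000):
    """Rejection sampling (filter-style): draw any pair, discard invalid ones."""
    discarded = 0
    for _ in range(max_attempts):
        start, end = rng.randint(0, 100), rng.randint(0, 100)
        if start <= end:
            return (start, end), discarded
        discarded += 1
    raise RuntimeError("too many rejected samples")


def gen_range_by_construction(rng):
    """Construction (map-style): every draw is valid, nothing is discarded."""
    start = rng.randint(0, 100)
    length = rng.randint(0, 100 - start)
    return (start, start + length)


def sample(gen, seed, n=50):
    """Draw n values from a generator using a fixed seed."""
    rng = random.Random(seed)
    return [gen(rng) for _ in range(n)]


# Same seed, same inputs: a failing run can be replayed exactly.
first = sample(gen_range_by_construction, seed=42)
second = sample(gen_range_by_construction, seed=42)

# Count how many draws the filter-style generator wastes over 500 samples.
rng = random.Random(42)
total_discarded = sum(gen_range_by_filter(rng)[1] for _ in range(500))
```

`first` and `second` are identical, which is what a recorded seed buys you. `total_discarded` shows the cost of rejection sampling, even for a constraint that about half of all random pairs satisfy.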
Run property tests with verbose output. Execute the test suite and observe:
Analyze counterexamples. For each failing property:
Reproduce counterexamples deterministically. For each counterexample:
Handle flaky property tests. If a property test fails intermittently:
Iterate on generator quality. If the generator frequently produces uninteresting inputs:
- Use `filter` sparingly (it discards inputs, wasting iterations)
- Use `map` and `flatMap` to construct valid inputs directly

Fix bugs exposed by counterexamples. For each real bug found:
Strengthen property specifications. After fixing bugs:
Measure property test effectiveness. Evaluate:
Integrate property tests into CI. Configure:
Run harness validate. Confirm the project passes all harness checks with property tests in place.
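One way to keep fixed bugs fixed and to vary the iteration count between PR and nightly runs is sketched below with the standard library only. The `PROPERTY_RUNS` environment variable, the `dedupe` function, and the pinned inputs are all hypothetical and chosen for illustration. In Hypothesis, the comparable tools are the `@example` decorator and `settings(max_examples=...)`.

```python
import os
import random


def property_runs(default=100):
    """Read the iteration count from a hypothetical PROPERTY_RUNS variable."""
    return int(os.environ.get("PROPERTY_RUNS", default))


# Hypothetical counterexamples from earlier runs, pinned so they are always re-checked.
PINNED_COUNTEREXAMPLES = [[], [0], [2, 1], [1, 1, 0]]


def dedupe(xs):
    """Illustrative function under test: drop duplicates, keep first occurrences."""
    seen = set()
    out = []
    for x in xs:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out


def prop_dedupe_idempotent(xs):
    return dedupe(dedupe(xs)) == dedupe(xs)


def run_suite(seed=1234):
    """Check pinned inputs first, then random inputs; return the random run count."""
    for xs in PINNED_COUNTEREXAMPLES:
        assert prop_dedupe_idempotent(xs), f"pinned input regressed: {xs!r}"
    rng = random.Random(seed)
    runs = property_runs()
    for _ in range(runs):
        xs = [rng.randint(0, 5) for _ in range(rng.randint(0, 10))]
        # Include the seed in the message so a CI failure can be replayed locally.
        assert prop_dedupe_idempotent(xs), f"seed={seed} input={xs!r}"
    return runs


completed = run_suite()
```

A PR job could leave `PROPERTY_RUNS` unset and take the default, while a nightly job sets it much higher. The pinned inputs run in both cases.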
If a knowledge graph exists at .harness/graph/, refresh it after code changes to keep graph queries accurate:
harness scan [path]
- `harness validate` -- Run in the ANALYZE phase after property tests are written and bugs are fixed. Confirms project health.
- `harness check-deps` -- Run after the DEFINE phase to verify that the property testing framework is in `devDependencies`.
- `emit_interaction` -- Used to present counterexample analysis and property specification decisions to the human.

`harness validate` passes with property tests in place.

IDENTIFY -- Properties of a URL parser:
Function: parseUrl(input: string): ParsedUrl
Properties:
1. Round-trip: formatUrl(parseUrl(url)) === url for any valid URL
2. No-crash: parseUrl(arbitrary_string) never throws (returns Result type)
3. Invariant: parsed.protocol is always lowercase
4. Invariant: parsed.host never contains a trailing slash
DEFINE -- Custom generator and property tests:
// tests/property/url-parser.prop.test.ts
// `describe` and `it` are assumed to be globals provided by the test runner.
import fc from 'fast-check';
import { parseUrl, formatUrl } from '../../src/url-parser';

// Custom generator for valid URLs.
// Note: fc.stringOf exists in fast-check v3; newer major versions may need a different call.
const urlArb = fc
  .record({
    protocol: fc.constantFrom('http', 'https', 'ftp'),
    host: fc.domain(),
    port: fc.option(fc.integer({ min: 1, max: 65535 }), { nil: undefined }),
    path: fc
      .array(
        fc.stringOf(fc.constantFrom(...'abcdefghijklmnopqrstuvwxyz0123456789-_'.split('')), {
          minLength: 1,
        })
      )
      .map((segments) => '/' + segments.join('/')),
  })
  .map(({ protocol, host, port, path }) => `${protocol}://${host}${port ? ':' + port : ''}${path}`);

describe('URL parser properties', () => {
  it('round-trips valid URLs', () => {
    fc.assert(
      fc.property(urlArb, (url) => {
        const parsed = parseUrl(url);
        // Fail the property: the generator should only produce valid URLs
        if (!parsed.ok) return false;
        return formatUrl(parsed.value) === url;
      }),
      { numRuns: 1000, seed: 42 }
    );
  });

  it('never throws on arbitrary string input', () => {
    fc.assert(
      fc.property(fc.string(), (input) => {
        const result = parseUrl(input);
        // Must return a Result, never throw
        return result.ok === true || result.ok === false;
      }),
      { numRuns: 5000 }
    );
  });

  it('always produces lowercase protocol', () => {
    fc.assert(
      fc.property(urlArb, (url) => {
        const parsed = parseUrl(url.toUpperCase());
        if (!parsed.ok) return true; // skip inputs the parser rejects
        return parsed.value.protocol === parsed.value.protocol.toLowerCase();
      })
    );
  });
});
DEFINE -- Property tests with hypothesis:
# tests/property/test_sort_properties.py
from hypothesis import given, settings
from hypothesis import strategies as st

from myapp.sorting import merge_sort


@given(st.lists(st.integers()))
def test_sort_preserves_length(xs):
    """Sorted output has the same length as input."""
    assert len(merge_sort(xs)) == len(xs)


@given(st.lists(st.integers()))
def test_sort_preserves_elements(xs):
    """Sorted output contains exactly the same elements as input."""
    assert sorted(merge_sort(xs)) == sorted(xs)


@given(st.lists(st.integers(), min_size=1))
def test_sort_produces_ordered_output(xs):
    """Every element is less than or equal to the next."""
    result = merge_sort(xs)
    for i in range(len(result) - 1):
        assert result[i] <= result[i + 1]


@given(st.lists(st.integers()))
def test_sort_is_idempotent(xs):
    """Sorting an already-sorted list produces the same result."""
    once = merge_sort(xs)
    twice = merge_sort(once)
    assert once == twice


@settings(max_examples=5000)
@given(st.lists(st.floats(allow_nan=False, allow_infinity=False)))
def test_sort_handles_floats(xs):
    """Sort works correctly with floating-point numbers."""
    result = merge_sort(xs)
    for i in range(len(result) - 1):
        assert result[i] <= result[i + 1]
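The automatic shrinking that this workflow relies on can be pictured with a deliberately simple greedy shrinker. This is a standard-library sketch of the idea only, not how Hypothesis or fast-check implement shrinking; `shrink_candidates`, `minimize`, and the buggy property are hypothetical.

```python
def shrink_candidates(xs):
    """Yield simpler lists: drop one element, halve one, or step one toward zero."""
    for i in range(len(xs)):
        yield xs[:i] + xs[i + 1:]
    for i, x in enumerate(xs):
        if x != 0:
            yield xs[:i] + [int(x / 2)] + xs[i + 1:]
    for i, x in enumerate(xs):
        if x != 0:
            yield xs[:i] + [x - 1 if x > 0 else x + 1] + xs[i + 1:]


def minimize(prop, failing):
    """Greedy shrinking: keep any simpler candidate that still fails the property."""
    current = failing
    improved = True
    while improved:
        improved = False
        for candidate in shrink_candidates(current):
            if not prop(candidate):
                current = candidate
                improved = True
                break
    return current


def prop_all_below_50(xs):
    """Hypothetical property that is violated by any element of 50 or more."""
    return all(x < 50 for x in xs)


original = [57, -3, 80, 12, 99]
minimal = minimize(prop_all_below_50, original)
```

Starting from a five-element failing list, the loop ends at `[50]`, the smallest input that still violates the property. That is the kind of counterexample that makes the underlying bug obvious.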
| Rationalization | Reality |
|---|---|
| "We already have example-based tests that cover the edge cases — property tests would just be redundant." | Example-based tests cover the cases the author thought of. Property tests cover the cases they did not. The entire value of generative testing is that it explores regions of the input space that human intuition misses — off-by-one errors, Unicode combining characters, signed integer overflow at boundaries. |
| "The generator keeps producing rejected inputs, so I'll just filter more aggressively to make the test pass faster." | Heavy filter usage is a symptom of a broken generator, not a solution. Each rejected sample wastes an iteration, and filter destroys the shrinking chain, leaving you with an unhelpful counterexample when a bug is found. Rewrite the generator using map and flatMap to construct valid inputs directly. |
| "The counterexample is too strange to be a real-world case — I'll just increase the iteration count so it appears less often." | A shrunk counterexample that triggers a property failure is a real bug by definition. "Unlikely in practice" is not a property of correctness — the question is whether the invariant holds. If the counterexample is a valid input the function might receive, fix the function. If it is not a valid input, constrain the generator. |
| "This function has too many invariants to specify — I'll just skip property testing and trust the unit tests." | Complex functions with many invariants are exactly the functions most in need of property testing. High complexity means a larger bug-hiding surface. Start with the most important invariants (no-crash, round-trip, idempotence) rather than attempting to encode all properties at once. |
| "Property tests are too slow — they'll block CI for 10 minutes." | Run 100 iterations on PR, 10,000 iterations nightly. The CI time argument justifies reducing iteration count, never eliminating property tests entirely. A suite that runs 0 property tests found 0 edge cases. |
- If generators cannot shrink (for example, because they rely on `filter`), counterexamples will be unhelpfully large. Fix the generator to support shrinking.
- A property that returns `true` for every input is useless. Check that properties make substantive assertions. If a property has a `return true` fallback for most inputs, the generator is producing too many invalid inputs.
- Use `flatMap` to build constrained structures incrementally. If the domain constraints are too complex for a generator, consider whether the function's API needs simplification.
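To check whether a property is mostly taking its fallback path, you can count how many generated inputs actually reach the assertion. The sketch below uses only the standard library; `parse_positive_int`, both generators, and `exercised_ratio` are hypothetical names for illustration.

```python
import random


def parse_positive_int(text):
    """Illustrative function under test: returns an int, or None for invalid input."""
    if text.isascii() and text.isdigit():
        return int(text)
    return None


def gen_weak(rng):
    """Mostly invalid inputs: hex-like strings that often contain letters."""
    return "".join(rng.choice("0123456789abcdef") for _ in range(rng.randint(1, 6)))


def gen_strong(rng):
    """Valid by construction: digits only."""
    return "".join(rng.choice("0123456789") for _ in range(rng.randint(1, 6)))


def exercised_ratio(gen, runs=1000, seed=7):
    """Fraction of runs in which the property made a real assertion."""
    rng = random.Random(seed)
    exercised = 0
    for _ in range(runs):
        text = gen(rng)
        value = parse_positive_int(text)
        if value is None:
            continue  # the "return true" fallback: nothing is checked
        # Round-trip modulo leading zeros.
        assert str(value) == (text.lstrip("0") or "0")
        exercised += 1
    return exercised / runs


weak = exercised_ratio(gen_weak)
strong = exercised_ratio(gen_strong)
```

A low `weak` ratio means that most iterations verify nothing. The constructed generator exercises the assertion on every run.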