From nw
Guides TLA+ formal verification of concurrent/distributed designs using PlusCal and TLC, complementing PBT for implementation. Use for protocol correctness and state risks.
npx claudepluginhub nwave-ai/nwave --plugin nwThis skill uses the workspace's default tool permissions.
When and how to use TLA+ for design verification. Complements PBT (which verifies implementation).
Guides TLA+ and PlusCal use for verifying distributed system invariants, with decision trees, patterns for consensus/locking/CRDTs, and heuristics on when formal methods add value.
Writes and refines TLA+ specs (.tla) and TLC configs (.cfg) from natural-language designs; runs model checking; summarizes results, counterexamples, and assumptions. For protocol/state machine validation.
Provides TLA+ templates, syntax guidance, type invariants, and best practices for specifying distributed systems and concurrent algorithms. Useful for formal design and verification with TLC.
Share bugs, ideas, or general feedback.
When and how to use TLA+ for design verification. Complements PBT (which verifies implementation).
Is the risk in the DESIGN or the IMPLEMENTATION?
|
+-- Design risk (protocol correctness, distributed coordination, concurrency)
| -> Does the system involve concurrent or distributed state?
| Yes -> Use TLA+ for design verification
| Then use PBT to verify implementation matches design
| No -> PBT alone is likely sufficient
|
+-- Implementation risk (edge cases, serialization, data transforms)
| -> Use PBT alone
|
+-- Both
-> TLA+ validates design, PBT validates implementation
TLA+ describes what a system should do, not how. A specification consists of:
TLC model checker exhaustively explores all reachable states within bounded model. If invariant violated, TLC provides counterexample trace -- shortest path from initial state to violation.
PlusCal is imperative-looking language transpiling to TLA+. Most programmers find it easier to learn first.
(* --algorithm Transfer
variables accounts = [a |-> 100, b |-> 100];
process Transfer \in 1..2
variables from, to, amount;
begin
Pick:
from := "a"; to := "b"; amount := 50;
Check:
if accounts[from] >= amount then
Debit:
accounts[from] := accounts[from] - amount;
Credit:
accounts[to] := accounts[to] + amount;
end if;
end process;
end algorithm; *)
\* Invariant: total money is conserved
MoneyConserved == accounts["a"] + accounts["b"] = 200
Labels define atomicity boundaries. Everything between two labels executes atomically. Concurrency interleavings happen at label boundaries.
PlusCal limitation: can't express all TLA+ patterns (complex fairness, some non-determinism forms). Start with PlusCal, switch to raw TLA+ when needed.
SPECIFICATION Spec
CONSTANT
Nodes = {n1, n2, n3}
MaxMessages = 4
INVARIANT
MoneyConserved
NoDoubleDelivery
State space grows O(constants^variables). 3 nodes = ~1,000 states; 4 nodes = ~100,000. Most bugs manifest with 2-3 nodes.
Mitigation:
-simulate) for large models (random sampling instead of exhaustive)Network as set of in-flight messages. Send adds to set, receive removes. Unreliable network: messages removed without receipt (loss) or kept after receipt (duplication).
Variables represent memory locations. Locks modeled as variables indicating holding process.
Node crash: non-deterministic removal from active set. Volatile state lost, persistent retained. Restart: rejoin with fresh volatile state.
Nodes propose, vote, become leader when majority achieved. Safety: at most one leader per term.
Two-phase commit: resource managers prepare, transaction manager commits when all prepared. Safety: no partial commits.
Safety ("nothing bad happens"): Expressed as invariants. Violated by finite counterexample trace. Example: "Two processes never hold the same lock."
Liveness ("something good eventually happens"): Requires fairness assumptions. Cannot be violated by finite prefix. Example: "Every request eventually gets a response."
Start with safety. Add liveness after safety established.
Key integration point between formal verification and testing:
Properties discovered during TLA+ become PBT test oracles:
assert sum(all_accounts) == initial_totalTLA+ invariants are exhaustive. PBT samples. If TLA+ proves A, B, and C, your PBT must check all three.
| TLA+ Catches (Design) | PBT Catches (Implementation) |
|---|---|
| Protocol deadlocks | Off-by-one errors |
| Consistency violations under failure | Incorrect serialization |
| Missing error handling paths | Memory management issues |
| Race conditions in algorithms | Edge cases in data transforms |
Neither catches: performance bugs | usability issues | integration with external systems.
alygin.vscode-tlaplus): Primary IDE. Syntax highlighting, model checking, counterexample visualization. Requires Java 11+.java -jar tla2tools.jar -config Spec.cfg Spec.tla