Use when verifying claims through Docker experimentation. Applies the TBV (To Be Verified) principle to test claims before relying on them. Trigger with experiment setup or claim verification.
npx claudepluginhub emasoft/emasoft-plugins --plugin emasoft-architect-agent

This skill uses the workspace's default tool permissions.
Patterns for **personally verifying claims** through controlled Docker experimentation. Use this skill when you need to test whether a claim (from docs, researchers, or developers) is actually true.
- README.md
- references/docker-experimentation.md
- references/experiment-scenarios.md
- references/multiplicity-rule.md
- references/op-archive-prototype.md
- references/op-classify-result.md
- references/op-cleanup-containers.md
- references/op-design-multiplicity-experiment.md
- references/op-document-findings.md
- references/op-execute-experiment.md
- references/op-setup-docker-experiment.md
- references/output-templates.md
- references/researcher-vs-experimenter.md
TBV Principle: Everything is "To Be Verified" until you personally test it. Claims from any source require experimental confirmation before relying on them for decisions.
Copy this checklist and track your progress:
- experiments/<claim-name>/
- experiments/<claim-name>/data/
- experiments/<claim-name>/REPORT.md
- prototypes/<claim-name>/

| Artifact | Location | Purpose |
|---|---|---|
| Experimentation Report | experiments/<claim-name>/REPORT.md | Documents hypothesis, approaches tested, measurements, and classification |
| Status Classification | Report header | VERIFIED / UNVERIFIED / PARTIALLY VERIFIED / TBV |
| Measurement Data | experiments/<claim-name>/data/ | Raw metrics, logs, benchmark results |
| Prototype Archive (if valuable) | prototypes/<claim-name>/ | Working code with README explaining findings |
| Docker Cleanup Log | Terminal output | Confirms containers removed after experiment |
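The layout in the table above can be scaffolded before an experiment starts. The following is a minimal sketch, not the skill's own tooling: the function name `scaffold_experiment` and the section headings in `REPORT_TEMPLATE` are assumptions chosen to mirror the table (the actual template is described in output-templates.md, which is not reproduced here).

```python
from pathlib import Path

# Hypothetical report skeleton; headings mirror the "Purpose" column above.
REPORT_TEMPLATE = """\
# Experiment: {claim}

Status: TBV

## Hypothesis

## Approaches tested

## Measurements

## Classification
"""


def scaffold_experiment(root: Path, claim_name: str) -> Path:
    """Create experiments/<claim-name>/ with a data/ folder and a REPORT.md stub."""
    exp_dir = root / "experiments" / claim_name
    (exp_dir / "data").mkdir(parents=True, exist_ok=True)
    report = exp_dir / "REPORT.md"
    if not report.exists():  # never overwrite findings that were already written
        report.write_text(REPORT_TEMPLATE.format(claim=claim_name), encoding="utf-8")
    return exp_dir
```

Calling `scaffold_experiment(Path("."), "redis-vs-dict")` would create the directories under the current working directory. Every new report starts as TBV, so the status only changes once measurements exist.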
For Docker container setup and experiment infrastructure, see docker-experimentation.md.
For understanding the critical distinction between roles, see researcher-vs-experimenter.md.
For when to invoke the experimenter, see experiment-scenarios.md.
For the evidence-based selection process, see multiplicity-rule.md.
For experiment documentation and prototype archiving, see output-templates.md.
| Status | Meaning | Safe to Rely On? |
|---|---|---|
| VERIFIED | Experimentally confirmed | YES |
| UNVERIFIED | Tested but failed to match claim | NO (dangerous) |
| PARTIALLY VERIFIED | True under specific conditions | YES (with conditions) |
| TBV | Not yet tested | NO (unknown risk) |
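If the classification needs to be checked programmatically, the table can be encoded directly. This is a sketch under assumptions: the names `Status` and `safe_to_rely_on`, and the `conditions_met` flag used for the "YES (with conditions)" row, are illustrative and not part of the skill.

```python
from enum import Enum


class Status(Enum):
    VERIFIED = "VERIFIED"
    UNVERIFIED = "UNVERIFIED"
    PARTIALLY_VERIFIED = "PARTIALLY VERIFIED"
    TBV = "TBV"


def safe_to_rely_on(status: Status, conditions_met: bool = False) -> bool:
    """Mirror the table: only VERIFIED is unconditionally safe."""
    if status is Status.VERIFIED:
        return True
    if status is Status.PARTIALLY_VERIFIED:
        return conditions_met  # safe only under the documented conditions
    return False  # UNVERIFIED and TBV are never safe to rely on
```

Defaulting `conditions_met` to `False` keeps the conservative reading: a partially verified claim is treated as unsafe until someone confirms its conditions apply.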
| Implementation Code | Experimental Code |
|---|---|
| Permanent (committed) | Ephemeral (deleted after) |
| Production-ready | Throwaway testbed |
| Follows specifications | Generates specifications |
| One chosen solution | Multiple solutions compared |
| Part of delivery | Part of decision-making |
| Workflow | Trigger | Experimenter Action |
|---|---|---|
| BUILD | Architecture decision needs validation | Validates with testbeds |
| DEBUG | Root cause unclear or fix uncertain | Reproduces in isolation, tests fixes |
| REVIEW | Performance concerns or architectural questions | Benchmarks alternatives |
Claim: "Redis caches API responses 10x faster than in-memory dict"
Status: TBV
1. Create Docker container with Redis and Python
2. Implement the approaches to compare:
   - Approach A: In-memory dict cache
   - Approach B: Redis cache
   - Approach C: Redis with connection pooling
3. Run 1000 iterations of each approach and measure average latency
4. Results:
   - Dict: 0.001 ms avg
   - Redis: 0.15 ms avg
   - Redis pooled: 0.08 ms avg
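5. Classification: UNVERIFIED (Redis was about 150x slower than the dict, and about 80x slower with pooling)
6. Conditions: Redis is a candidate only for distributed scenarios where a process-local dict cannot be shared; that case was not measured here and remains TBV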
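The measurement and classification steps above can be sketched as a small harness. The helper names and the simple threshold rule in `classify_speed_claim` are assumptions for illustration, not the skill's prescribed method. Only the dict approach is timed here; a Redis approach would be passed in as another callable when run inside the container.

```python
import statistics
import time
from typing import Callable, Dict


def mean_latency_ms(op: Callable[[], object], iterations: int = 1000) -> float:
    """Average wall-clock latency of op() in milliseconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        op()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.fmean(samples)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline (<1 means slower)."""
    return baseline_ms / candidate_ms


def classify_speed_claim(baseline_ms: float, candidate_ms: float, claimed: float) -> str:
    """Assumed rule: VERIFIED only if the measured speedup reaches the claimed one."""
    return "VERIFIED" if speedup(baseline_ms, candidate_ms) >= claimed else "UNVERIFIED"


# Approach A: in-memory dict cache, measured locally.
cache: Dict[str, str] = {"user:1": "cached response"}
dict_ms = mean_latency_ms(lambda: cache.get("user:1"))

# Approach B would be measured the same way with a Redis client callable,
# e.g. mean_latency_ms(lambda: client.get("user:1")); not run in this sketch.

# Applying the rule to the averages reported in the example (dict vs. Redis):
print(classify_speed_claim(baseline_ms=0.001, candidate_ms=0.15, claimed=10.0))
```

With the example's averages the rule yields UNVERIFIED for both the plain and the pooled Redis figures, since neither comes close to a 10x speedup over the dict.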
Claim: "Library X works with Python 3.12"
Status: TBV
1. Docker container with Python 3.12
2. Install library X
3. Run test suite
4. Result: Import error on async module
5. Classification: UNVERIFIED
6. Action: Use Python 3.11 or wait for library update
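A compatibility check like the one above can be driven from a short script. This sketch assumes the official `python:3.12-slim` image and uses placeholder package and module names, since "Library X" is not named in the example. It covers only the install and import steps, not a full test suite run.

```python
import subprocess
from typing import List


def compat_check_command(package: str, module: str, python_tag: str = "3.12-slim") -> List[str]:
    """Build a `docker run` command that installs a package and imports a module."""
    script = f"pip install --quiet {package} && python -c 'import {module}'"
    # --rm removes the container as soon as the check exits.
    return ["docker", "run", "--rm", f"python:{python_tag}", "sh", "-c", script]


def run_compat_check(package: str, module: str) -> bool:
    """Return True when install and import both succeed inside the container."""
    result = subprocess.run(
        compat_check_command(package, module),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        print(result.stderr.strip())  # keep the failure as evidence for the report
    return result.returncode == 0
```

A call such as `run_compat_check("libraryx", "libraryx.aio")` (hypothetical names) requires a running Docker daemon. A `False` result, together with the captured stderr, is the kind of evidence that would justify an UNVERIFIED classification.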
| Error | Cause | Solution |
|---|---|---|
| Docker not available | Docker daemon not running | Start Docker Desktop or the Docker service |
| Container cleanup failed | Orphaned containers | Run docker system prune |
| Experiment inconclusive | Insufficient test iterations | Increase sample size, reduce variables |
| Conflicting results | Environment differences | Standardize container configuration |
| Resource exhaustion | Too many containers | Clean up between experiments |
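The first two rows of the table can be handled with a pre-flight check and a targeted cleanup. This is a sketch under one stated assumption: experiment containers are started with a label such as `experiment=<claim-name>` (a convention introduced here, not taken from the skill), so that cleanup can target them without touching unrelated containers.

```python
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], subprocess.CompletedProcess]


def _run(cmd: List[str]) -> subprocess.CompletedProcess:
    return subprocess.run(cmd, capture_output=True, text=True)


def docker_available(run: Runner = _run) -> bool:
    """True when the docker CLI exists and the daemon answers `docker info`."""
    try:
        return run(["docker", "info"]).returncode == 0
    except FileNotFoundError:  # docker CLI is not installed
        return False


def cleanup_experiment(label: str, run: Runner = _run) -> List[str]:
    """Force-remove every container carrying the given label; return their IDs."""
    listed = run(["docker", "ps", "-aq", "--filter", f"label={label}"])
    ids = listed.stdout.split()
    if ids:
        run(["docker", "rm", "-f", *ids])
    return ids
```

The list returned by `cleanup_experiment` can be written to the terminal as the cleanup log. Injecting the runner keeps both functions testable on machines without Docker.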