Adversarial validation of the second-brain plugin. Questions architectural layers from a position of maximum skepticism, drilling from general to granular. Rotates focus across runs using git-change priority, history rotation, and perspective variety so different questions are asked each time.
Install: `npx claudepluginhub cain-ish/claude-code-plugin --plugin second-brain`

Usage: `doubt [--layer <name> | --changed | --full | <path-or-topic>]`
Systematically question whether the second-brain plugin actually works. Start from doubt ("this probably doesn't work"), read the actual code, and prove or disprove each claim with line references.
Each layer maps to specific files. Pick layers to doubt based on the selection algorithm below.
| ID | Layer | Key Files |
|---|---|---|
| hooks | Hook system | hooks/hooks.json, scripts/ensure-dirs.sh, scripts/session-load.sh |
| stop-extract | Stop-hook extraction | scripts/stop-extract.sh, scripts/lib.sh |
| learning | Learning pipeline | skills/improve/SKILL.md |
| mcp | MCP server | mcp/src/server.ts |
| quality | Quality gate | scripts/quality-gate.sh, agents/quality-reviewer.md |
| compact | Compaction handling | scripts/pre-compact.sh, scripts/lib.sh |
| platform | Cross-platform | All scripts (date, mktemp, find, sed, flock) |
| bootstrap | Setup/upgrade flow | skills/setup/SKILL.md, skills/upgrade/SKILL.md, scripts/ensure-dirs.sh |
| wiki | Wiki maintenance | agents/knowledge-maintainer.md, skills/lint/SKILL.md |
Each perspective asks fundamentally different questions about the same layer.
| ID | Perspective | Core Question |
|---|---|---|
| correctness | Does it work? | Trace the happy path end-to-end. Does the output match what the code claims? |
| failure | What breaks? | Missing files, empty inputs, timeouts, permission errors, corrupt data |
| race | Concurrency issues | Multiple hooks writing the same file, stale reads, lock contention |
| evolution | Does it degrade? | Unbounded file growth, stale entries never cleaned, counter drift |
| fresh-install | Day-zero experience | What happens before any state files exist? Silent failures? |
| adversarial | Can it be gamed? | Skipping quality gates, ignoring learnings, malformed input |
| integration | Do layers connect? | Does the output of layer X feed correctly into the input of layer Y? |
| observability | Can you tell what happened? | Logging, error surfaces, silent failures, debugging affordances |
Before selecting layers, scan for CLAUDE.md files that describe the project's ground truth:
```bash
# Project-level CLAUDE.md (most specific)
cat .claude/CLAUDE.md 2>/dev/null

# Repo-root CLAUDE.md
cat CLAUDE.md 2>/dev/null

# User-level CLAUDE.md (global preferences)
cat ~/.claude/CLAUDE.md 2>/dev/null
```
Read whatever is there and identify any claims, constraints, conventions, goals, or stated truths that are relevant to the layers you're about to doubt. CLAUDE.md content varies wildly by project domain — a marketing repo might state brand guidelines and approval workflows; a game project might list performance budgets and platform targets; a backend service might declare SLA requirements and data retention policies. Don't look for a fixed list of categories — read the file, understand the project's domain, and extract whatever would make your doubt questions sharper.
The most valuable doubt questions come from the gap between what CLAUDE.md says should be true and what the code actually does. Any stated claim is a target: "we always do X" → does the code actually do X? "Y is critical" → is Y actually tested/enforced?
If no CLAUDE.md files exist, proceed without — the layer taxonomy provides enough structure.
Read ~/.second-brain/doubt-history.jsonl (create if missing). Each line records a past run:
{"timestamp":"ISO8601","layers":["hooks","stop-predicate"],"perspectives":["failure","race"],"findings":2,"issues":1,"fragile":1}
Parse arguments:
- `--layer <name>`: Force a specific layer. Pick the least-used perspective for it.
- `--changed`: Only doubt layers whose files changed recently (`git log --since="30 days ago" --name-only`).
- `--full`: Shallow scan of ALL 9 layers, correctness perspective only, 1 question each. Produces a coverage map.
- `<path-or-topic>` (free text): Target arbitrary code outside the predefined layers. See Ad-hoc focus below.
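A minimal sketch of the dispatch (hypothetical shell; the skill itself parses these arguments in-context):

```bash
#!/usr/bin/env bash
# Hypothetical argument dispatch for the doubt skill.
case "${1:-}" in
  --layer)   mode=forced;  layer="${2:?--layer needs a name}" ;;
  --changed) mode=changed ;;               # git-driven layer selection
  --full)    mode=full ;;                  # shallow scan of all 9 layers
  "")        mode=rotate ;;                # default: rotation scoring
  *)         mode=adhoc;   target="$*" ;;  # free text: ad-hoc focus
esac
```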
When the argument doesn't match any flag, treat it as an ad-hoc focus target.

Resolve the target:
- If it looks like a path (e.g., `src/ui/`, `components/ActionDots.tsx`): list the files there (`find <path> -name '*.ts' -o -name '*.tsx' -o -name '*.js' -o -name '*.sh' -o -name '*.md'`).
- If it looks like a topic keyword (e.g., `actiondots`, `authentication`, `caching`): search the codebase (`grep -rl <keyword> . --include='*.ts' --include='*.tsx' --include='*.sh'`).
- Both can be combined: `doubt src/ui/ actiondots` finds files in `src/ui/` related to "actiondots".

Generate an ad-hoc layer: Create a temporary layer definition from discovered files:
```
Ad-hoc layer: "<target>"
Key files: [discovered files, up to 10 most relevant]
```
Select perspectives: Pick 2-3 perspectives from the taxonomy that best fit the discovered code (e.g., if it's UI code, correctness + failure + adversarial; if it's a data pipeline, race + evolution + integration).
Run the standard doubt protocol (step 3) against the ad-hoc layer. All other steps (runtime state checking, conversational drilling, cross-layer chains, critic validation) apply unchanged.
History: Log ad-hoc runs to doubt-history.jsonl with layers: ["adhoc:<target>"] so they appear in rotation tracking.
Ad-hoc mode works in any repository — the predefined layers are only for the second-brain plugin itself.
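For illustration, an ad-hoc run's history line could look like this (all values invented):

```json
{"timestamp":"2025-06-01T12:00:00Z","layers":["adhoc:src/ui/"],"perspectives":["correctness","failure"],"findings":1,"issues":0,"fragile":1}
```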
Score each combination:
- Git-change priority: run `git log --since="30 days ago" --name-only --pretty=format:` in the plugin root. Map changed files to layers. Recently changed layers get +3.
- History rotation: prefer layers and perspectives that appear least often in recent doubt-history.jsonl entries.
- Perspective variety: avoid repeating a (layer, perspective) pair from the last few runs.

Report which pairs were selected and why before proceeding.
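To make the git-change bonus concrete, a sketch (the layer-to-file mapping is abridged from the taxonomy table above):

```bash
# Files changed in the last 30 days, deduplicated.
changed=$(git log --since="30 days ago" --name-only --pretty=format: | sort -u)

# Award +3 to a layer when one of its key files changed.
declare -A bonus
echo "$changed" | grep -qx 'hooks/hooks.json'        && bonus[hooks]=3
echo "$changed" | grep -qx 'scripts/stop-extract.sh' && bonus[stop-extract]=3
echo "$changed" | grep -qx 'mcp/src/server.ts'       && bonus[mcp]=3
# ...and so on for the remaining layers.
```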
For each (layer, perspective):
1. Check live runtime state, not just code (`ls ~/.second-brain/`, `ls ~/.second-brain/projects/`, `ls ~/.second-brain/wiki/`).
2. Sample recent artifacts (`tail -5 ~/.second-brain/learnings.md`, `wc -l ~/.second-brain/.session-baseline-*.md`).
3. Generate doubt branches and score each as sure (likely to expose a real bug), maybe (worth a quick check), or impossible (the code clearly defends against it — drop it). To score a branch impossible, you must cite the specific file:line that defends against it. No impossible labels without a code-grounded reason. This blocks the failure mode where you optimistically prune the hardest doubts and let the post-hoc critic only see what survived.
4. Log every branch, drilled or abandoned (`branches: [{vector, score, drilled|abandoned, reasoning, citation}]`). The step-4 critic receives the full log, not just findings, so it can flag systematic over-pruning.
5. Drill the sure and maybe branches (e.g., "output stale", "race condition"); the impossible one needs its file:line cite before it is dropped.
6. Stop only when every remaining branch is impossible (with a citation) — not after a fixed count.
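A branch-log entry might look like this (values and the file:line cite are invented for illustration; `status` is one reading of the `drilled|abandoned` slot):

```json
{"branches": [
  {"vector": "baseline read is stale after compaction", "score": "sure",
   "status": "drilled", "reasoning": "no freshness check before the read", "citation": null},
  {"vector": "two hooks append to learnings.md concurrently", "score": "impossible",
   "status": "abandoned", "reasoning": "appends are serialized", "citation": "scripts/lib.sh:17"}
]}
```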
If doubt-history.jsonl shows previous runs that found ISSUE or FRAGILE findings, pick 1-2 of the most recent and verify they were actually fixed:

- Confirm a fix actually landed in the file's history (`git log --oneline --all -- <file>`).

This prevents the pattern where a fix is claimed but never lands, or where a fix introduces a new problem.
For any finding classified as ISSUE or FRAGILE, spawn a single quality-reviewer subagent:
```
Agent(subagent_type: "second-brain:quality-reviewer")

Prompt: "Validate these doubt findings independently. For each, read the cited file and line, and classify as CONFIRMED (real issue), DISPUTED (doubt session is wrong), or NEEDS-TESTING (can't tell from code). Be specific about why.

Also review the branch log (drilled + abandoned). For each branch labeled `impossible`, verify the citation actually defends against the angle described. Flag any branch where the `impossible` reasoning is hand-wavy or the citation is unrelated — this is where same-context pruning bias hides. Report SYSTEMATIC OVER-PRUNING if more than one abandoned branch has weak justification."
```
Include the finding details, file paths, line numbers, AND the branch log from step 3.4. One subagent call total — bundle all findings together. The branch log is what makes the post-hoc critic able to catch ICRH at branch-selection time.
Output a structured report:
```markdown
# Doubt Session Report (YYYY-MM-DD)

## Focus
- Layers: [selected layers]
- Perspectives: [selected perspectives]
- Selection rationale: [why these were chosen]

## Findings
| # | Layer | Finding | Severity | File:Line | Critic |
|---|-------|---------|----------|-----------|--------|

## Validated (no issues)
- [layer]: [what was checked and passed]

## Recommendations
1. [For ISSUE findings]: specific fix
2. [For FRAGILE findings]: hardening suggestion
```
Append to ~/.second-brain/doubt-history.jsonl:
```bash
jq -nc \
  --arg t "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --argjson layers '["hooks","stop-extract"]' \
  --argjson perspectives '["failure","race"]' \
  --argjson findings 2 \
  --argjson issues 1 \
  --argjson fragile 1 \
  --argjson confirmed 2 \
  --argjson disputed 0 \
  --argjson branches_generated 7 \
  --argjson branches_drilled 4 \
  --argjson branches_abandoned 3 \
  --argjson over_pruned 0 \
  '{timestamp:$t, layers:$layers, perspectives:$perspectives, findings:$findings, issues:$issues, fragile:$fragile, confirmed:$confirmed, disputed:$disputed, branches_generated:$branches_generated, branches_drilled:$branches_drilled, branches_abandoned:$branches_abandoned, over_pruned:$over_pruned}' \
  >> ~/.second-brain/doubt-history.jsonl
```
The confirmed and disputed counts from the critic gate track calibration over time — a skill that's always disputed is asking bad questions; one that's always confirmed is well-calibrated. The branches_* counts track the new branching pattern: a healthy run drills more than it abandons (drilled / generated > 0.5); an over_pruned count from the critic ≥ 1 means same-context bias is creeping back in. Old log entries lacking these fields are valid — readers must default missing branch fields to 0.
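As a sketch, a backward-compatible read (jq's `//` operator supplies the 0 default for pre-branching entries):

```bash
# Per-run drill ratio; entries without branch fields are skipped.
jq -r 'select((.branches_generated // 0) > 0)
       | "\(.timestamp)  drilled/generated = \((.branches_drilled // 0) / .branches_generated)"' \
  ~/.second-brain/doubt-history.jsonl
```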
If ~/.second-brain/doubt-history.jsonl has previous entries, review this run's quality against past performance:
- Compare issues + fragile counts against previous runs. Is the hit rate improving, declining, or flat?

Output a brief `## Self-Assessment` section at the end of the report with 2-3 sentences on what worked, what didn't, and whether the skill definition should be updated. If concrete improvements are identified, offer to apply them.
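One way to eyeball the trend (a sketch; missing counts default to 0):

```bash
# Latest run's hits vs. the average across all prior runs.
jq -s 'map((.issues // 0) + (.fragile // 0)) as $hits
       | {latest: $hits[-1],
          prior_avg: (if ($hits | length) > 1
                      then ($hits[0:-1] | add / length)
                      else null end)}' \
  ~/.second-brain/doubt-history.jsonl
```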
After the report, offer but do NOT auto-execute:
- "Run `/second-brain:improve`?"
- "Add a regression check under `~/.second-brain/regressions/` for finding #N?"

Notes:
- `--full` mode: 1 question per layer, reads only the primary file. Broad coverage map.
- Deep sessions read many files; start from a fresh context with `/clear` first.