Queries external AI agents (Codex, Gemini, OpenCode, Claude Code headless) in parallel for code reviews, bug investigations, architecture choices, security audits, and consensus on complex decisions.
Query external AI agents for independent, unbiased expert opinions. Each agent has a distinct thinking role and responds in a structured format for easy comparison.
Different frontier models see different things. Each has a slightly different training distribution, tool-use style, and failure mode — so they latch onto different aspects of the same problem.
The skill keeps each agent independent (no debate, no cross-contamination) and lets the caller adjudicate — you get raw parallel perspectives, not a homogenized committee answer.
# 1. See what's configured (XML plan — dry-run, no agents run).
scripts/consensus-query.sh --list-agents
# 2. Ask the consensus (human-readable markdown).
scripts/consensus-query.sh "Should we use Postgres or SQLite for this CLI tool?"
# 3. Agent-friendly output (stable XML, escaped via CDATA).
scripts/consensus-query.sh --xml "Review this function" < src/auth.py
# 4. Code review mode (2 specialists, quoted-code validated, XML or markdown).
scripts/code-review.sh path/to/file.py
git diff HEAD | scripts/code-review.sh --xml --diff
Edit config.json to enable/disable agents or swap models. See config.example.json for a fuller template with multiple backends.
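As a sketch, an agents section might look like the following (the field names match the agent-field table later in this document; the exact top-level shape of config.json is an assumption here, so check config.example.json for the authoritative template):

```json
{
  "agents": {
    "codex": {
      "enabled": true,
      "backend": "codex-cli",
      "model": "gpt-5.5",
      "role": "analyst"
    },
    "opencode-go-kimi": {
      "enabled": true,
      "backend": "opencode",
      "model": "opencode-go/kimi-k2.6",
      "role": "analyst",
      "effort": "high"
    }
  }
}
```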
The prompt is a positional argument. There are three ways to pass it; pick whichever is convenient:
# (a) Inline string — best for short prompts.
scripts/consensus-query.sh --xml "review this design"
# (b) From a file via stdin — best for long multi-line prompts.
# With NO positional argument, stdin is treated as the prompt.
scripts/consensus-query.sh --xml < prompt.txt
cat prompt.txt | scripts/consensus-query.sh --xml
# (c) From a file via flag — same as (b) but uses RAW mode
# (no role/principles/template wrapping; agents see the file verbatim).
# Use this for benchmarks/evals where wrapper differences would skew results.
scripts/consensus-query.sh --xml --prompt-file prompt.txt
When BOTH a positional prompt and stdin are given, stdin is appended to the prompt as --- Input --- context. That is the existing pattern for piping a file under review:
cat src/auth.py | scripts/consensus-query.sh "review this code"
# └── stdin = context ────┘ └── positional = the prompt ─┘
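A minimal Python sketch of that combination rule (the function name is illustrative; the real logic lives in the shell dispatcher):

```python
def build_prompt(positional, stdin_text):
    """Sketch of the documented prompt/stdin combination rules."""
    if positional and stdin_text:
        # Both given: stdin becomes "--- Input ---" context under the prompt.
        return f"{positional}\n\n--- Input ---\n{stdin_text}"
    if positional:
        return positional   # (a) inline prompt only
    return stdin_text       # (b) stdin alone IS the prompt
```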
Intellectual independence: Agents are instructed to think from first principles, challenge the framing of questions, and propose alternatives not mentioned in the query. They are free thinkers within the given context, not yes-men.
Role differentiation (set per agent in config.json): analyst agents go deep and precise; lateral agents go broad and creative.
Structured output: All agents respond using a common template (Assessment, Key Findings, Blind Spots, Alternatives, Recommendation with confidence level), making synthesis straightforward.
When formulating queries for consilium, follow these rules to maximize the value of independent opinions:
Agents are spawned in the caller's current working directory with their native agentic toolchain intact. They can:
- Read, Grep, Glob, find_references, git log/blame across the real repository
- Consult CLAUDE.md, AGENTS.md, README, config files, tests, call sites, neighboring modules

What they cannot do (enforced per backend):
| Backend | Read-only guard |
|---|---|
| Codex | --sandbox read-only --ask-for-approval never |
| Claude Code | --permission-mode plan |
| OpenCode | --agent plan (plan is opencode's built-in read-only agent) |
| Gemini CLI | --approval-mode plan |
No Edit, Write, Bash(git commit ...), Bash(rm ...), or any write-back tool is authorized. Implementation of recommendations is the caller's job. If a backend tries to escalate (e.g. needs to run a command that violates read-only), the call fails rather than silently escalating.
Agents are declared in config.json at the skill root. Each agent has:
| Field | Purpose |
|---|---|
| enabled | Whether it participates in consensus-query |
| backend | CLI that actually runs: codex-cli, gemini-cli, opencode, claude-code |
| model | Model id passed to that CLI |
| role | analyst (deep/precise) or lateral (broad/creative) |
| label | Display name in reports (optional) |
| effort | Reasoning effort. opencode: maps to opencode run --variant (e.g. low, medium, high, max) — provider-specific, see Discovering reasoning variants below. claude-code: maps to claude --effort (low, medium, high, xhigh, max). Other backends ignore it. |
Default config (config.json):
- codex (backend=codex-cli, model=gpt-5.5, role=analyst) — enabled
- gemini-cli (backend=gemini-cli, model=gemini-3.1-pro-preview, role=lateral) — disabled
- opencode (backend=opencode, model=opencode/gemini-3.1-pro, role=lateral, effort=high) — enabled
- claude-code (backend=claude-code, model=opus, effort=max, role=analyst) — disabled
- opencode-go-minimax (backend=opencode, model=opencode-go/minimax-m2.7, role=lateral, effort=high) — enabled
- opencode-go-deepseek (backend=opencode, model=opencode-go/deepseek-v4-pro, role=analyst, effort=max) — enabled
- opencode-go-mimo (backend=opencode, model=opencode-go/mimo-v2.5-pro, role=lateral, effort=high) — enabled
- opencode-go-kimi (backend=opencode, model=opencode-go/kimi-k2.6, role=analyst, effort=high) — enabled
- opencode-go-glm (backend=opencode, model=opencode-go/glm-5.1, role=lateral, effort=high) — enabled
- opencode-openai (backend=opencode, model=openai/gpt-5.5, role=analyst, effort=high) — disabled (reference entry; flip on if you want OpenAI direct)

Effort policy: max is used wherever the model exposes it (claude-code, opencode-go/deepseek-v4-pro); high is the fallback for models that top out at high (opencode/gemini-3.1-pro, opencode-go/mimo-v2.5-pro, openai/gpt-5.5 if you don't want xhigh) or expose no variants at all (minimax, kimi, glm — effort is set but ignored by the provider).
Multiple agents can share one backend — the dispatcher passes the entry id through CONSILIUM_AGENT_ID, so each backend script reads its own slice of config.json.
Edit config.json to flip agents on/off or change models. Set CONSILIUM_CONFIG=/path/to/custom.json to use an override file.
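A Python sketch of the per-agent config lookup described above (the `{"agents": {...}}` top-level shape and the function name are assumptions for illustration; the real backend scripts do this in bash):

```python
import json
import os


def my_agent_entry(config_path="config.json"):
    """Read this backend's own slice of config.json.

    Mirrors the documented dispatch pattern: the dispatcher exports
    CONSILIUM_AGENT_ID, and each backend script reads only its entry.
    CONSILIUM_CONFIG, when set, overrides the config path.
    """
    agent_id = os.environ["CONSILIUM_AGENT_ID"]
    config_path = os.environ.get("CONSILIUM_CONFIG", config_path)
    with open(config_path) as f:
        return json.load(f)["agents"][agent_id]
```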
The opencode backend works with any provider/model that OpenCode supports. For Gemini 3.1 Pro you have two options:

- "model": "opencode/gemini-3.1-pro" — goes through OpenCode Zen. Works out of the box once opencode providers login opencode (or a valid Zen credential) is configured.
- "model": "google/gemini-3.1-pro-preview" — goes straight to Google's v1beta API. Requires GOOGLE_GENERATIVE_AI_API_KEY (OpenCode does not pick up GEMINI_API_KEY for this provider).

For OpenAI flagship models (GPT-5.5, GPT-5.4, etc.) there's a third path:

- "model": "openai/gpt-5.5" — goes straight to OpenAI's API via the openai/* provider in OpenCode. Requires either an opencode auth login session for OpenAI (oauth) or OPENAI_API_KEY in the environment. The default config ships an opencode-openai entry disabled as a reference; flip enabled=true if you want a GPT-5.5 voice in the consilium. Variants: none / low / medium / high / xhigh — pick xhigh if you want the heaviest reasoning, otherwise high is the safe default.

Flip between providers by editing the model field; the rest of the config stays the same.
opencode run --variant <effort> is provider-specific — each model exposes its own set (or none). Don't guess: enumerate them from the CLI before setting effort in config.json.
One-liner — list every model with its supported variants:
opencode models opencode --verbose 2>&1 | python3 -c '
import sys, json

lines = sys.stdin.read().split("\n")
i = 0
while i < len(lines):
    line = lines[i].strip()
    if line.startswith("opencode/") or line.startswith("opencode-go/"):
        # Model id found; collect the JSON blob that follows it by
        # tracking brace depth until the object closes.
        model_id, json_lines, depth, started = line, [], 0, False
        i += 1
        while i < len(lines):
            s = lines[i]; json_lines.append(s)
            for c in s:
                if c == "{": depth += 1; started = True
                elif c == "}": depth -= 1
            i += 1
            if started and depth == 0: break
        try:
            v = list(json.loads("\n".join(json_lines)).get("variants", {}).keys())
            print(f"{model_id}\t{v}")
        except Exception: pass
    else:
        i += 1
'
Swap opencode for opencode-go (or any other provider id) to scan a different namespace; drop the provider arg to scan everything opencode models knows.
Interpreting the result:
- A non-empty list (e.g. ['low', 'medium', 'high', 'max']) → set effort to the highest one you want.
- [] → the model has no reasoning variants. --variant is silently ignored; setting effort in config is harmless but does nothing.
- An effort that is not in the list → opencode run rejects the call. Re-enumerate after upgrading opencode — providers add/remove tiers between releases.

Snapshot of the currently configured opencode models (re-run the one-liner if you change the set):
| Model | Variants | effort in default config |
|---|---|---|
| opencode/gemini-3.1-pro | low, medium, high | high (no max) |
| opencode-go/deepseek-v4-pro | low, medium, high, max | max |
| opencode-go/mimo-v2.5-pro | low, medium, high | high (no max) |
| opencode-go/minimax-m2.7 | — | high (ignored) |
| opencode-go/kimi-k2.6 | — | high (ignored) |
| opencode-go/glm-5.1 | — | high (ignored) |
| openai/gpt-5.5 | none, low, medium, high, xhigh | high (entry disabled by default) |
The claude-code backend shells out to claude -p (headless mode, see docs). Useful when you want a second Claude in the consilium — e.g. Opus as analyst cross-checking Codex.
- model: a shortname (opus, sonnet, haiku) or a full id (claude-opus-4-7, claude-sonnet-4-6).
- effort: maps to claude --effort — accepts low, medium, high, xhigh, max. The default config sets max for opus; omit the field to fall back to the skill's default of max for the claude-code backend.
- Read-only guard: --permission-mode plan — Claude can freely Read/Grep/Glob/Bash read-only across the project, but cannot Edit/Write. Override with CLAUDE_PERMISSION_MODE only if you know what you're doing.
- Auth: requires a logged-in Claude Code session (claude /login).

Note: claude-code is disabled in the default config to avoid spawning another Claude session accidentally. Flip enabled to true in config.json (or CONSILIUM_CONFIG) when you want it in the consensus run.
All scripts live in the scripts/ directory. The skill auto-detects its install location.
Per-agent scripts always execute when invoked. The enabled field in config.json is consulted only by consensus-query.sh to build the default agent set (when neither -a nor -x is given). Direct invocation of a per-agent script ignores enabled — that's by design (single source of truth for the run/skip decision lives in the dispatcher).
When -a/-x causes consensus-query.sh to run an enabled=false agent, the dispatcher emits a stderr line like [<Label>] forced via --agents (enabled=false in config) so the override is visible.
# Codex (analyst by default)
scripts/codex-query.sh "question" [context_file]
cat file.py | scripts/codex-query.sh "review this"
# Gemini CLI (lateral by default; disabled in default config)
scripts/gemini-query.sh "question" [context_file]
cat file.py | scripts/gemini-query.sh "review this"
# OpenCode (lateral by default, model per config.json)
scripts/opencode-query.sh "question" [context_file]
cat file.py | scripts/opencode-query.sh "review this"
# Claude Code (analyst by default; disabled in default config)
scripts/claude-query.sh "question" [context_file]
cat file.py | scripts/claude-query.sh "review this"
scripts/consensus-query.sh "architecture question"
cat file.py | scripts/consensus-query.sh "review this code"
scripts/consensus-query.sh --xml "review this" # XML report for agent consumers
scripts/consensus-query.sh --list-agents # dry-run: dump plan, don't query
consensus-query.sh reads config.json, launches every agent with enabled=true in parallel, and prints their responses grouped by label. Add/remove agents permanently by editing the config; for ad-hoc runs use -a/--agents and -x/--exclude (see below).
All scripts accept -h / --help. Both consensus-query.sh and code-review.sh accept:
| Flag | Effect |
|---|---|
| --xml | Emit <consilium-report> (or <code-review-report>) with each agent wrapped in <agent>…<response><![CDATA[…]]></response></agent>. Stable for agent consumers (no markdown-heading collision). |
| --list-agents (consensus only) | Print <consilium-plan> (every configured agent, enabled/disabled, with backend-available) and exit. No queries are run — use this as an inspection / dry-run. |
| -a, --agents <ID|GLOB> | Override the active agent set with this id or glob (e.g. 'opencode-go-*'). Repeatable; comma-separated values also accepted (-a codex,opencode-go-kimi). When given, the per-agent enabled flag in config.json is ignored — only matched agents run. Falls back to env CONSILIUM_AGENTS. |
| -x, --exclude <ID|GLOB> | Subtract matching agents from the active set. Repeatable. Combine with --agents for include-then-exclude composition. Falls back to env CONSILIUM_EXCLUDE. |
Ad-hoc agent selection examples:
# Single agent
scripts/consensus-query.sh -a opencode-go-kimi "Q"
# All OC-Go models (glob)
scripts/consensus-query.sh -a 'opencode-go-*' "Q"
# Everything-except-codex
scripts/consensus-query.sh -x codex "Q"
# Composition: only OC-Go but skip MiniMax
scripts/consensus-query.sh -a 'opencode-go-*' -x opencode-go-minimax "Q"
# Same via env (scriptable)
CONSILIUM_AGENTS='codex,opencode-go-kimi' scripts/consensus-query.sh "Q"
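The include-then-exclude composition above can be sketched with shell-style globs (illustrative only; the function name is made up and the real matching is done in the bash dispatcher):

```python
from fnmatch import fnmatch


def select_agents(configured, include=(), exclude=()):
    """Sketch of -a/--agents and -x/--exclude composition over agent ids."""
    # With no --agents patterns, start from all configured ids;
    # otherwise start from the ids matching any include glob.
    active = [a for a in configured
              if not include or any(fnmatch(a, p) for p in include)]
    # --exclude subtracts matches from the active set.
    return [a for a in active if not any(fnmatch(a, p) for p in exclude)]


ids = ["codex", "opencode-go-kimi", "opencode-go-minimax"]
select_agents(ids, include=["opencode-go-*"], exclude=["opencode-go-minimax"])
# → ["opencode-go-kimi"]
```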
Exit codes (stable across all scripts):
| Code | Meaning |
|---|---|
| 0 | Success (all queried agents replied; or, for consensus-query.sh, the active agent set may be smaller than the configured set if some are disabled or filtered) |
| 2 | Consensus only: partial failure (≥1 succeeded, ≥1 failed) |
| 3 | Consensus only: every queried agent failed |
| 4 | Config error (missing CLI, invalid config, unknown role/agent id) |
| 5 | Usage error (missing prompt, unknown flag) |
| other | Propagated from the backend CLI (e.g. 124 on timeout) |
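A caller-side sketch of consuming those codes (the mapping mirrors the table; the function name is illustrative):

```python
def triage_exit(code):
    """Map a consilium script exit code to a caller action (per the table)."""
    actions = {
        0: "success: use all responses",
        2: "partial failure: keep successful responses, log the failed agents",
        3: "total failure: check backend CLIs and credentials, then retry",
        4: "config error: fix config.json / CONSILIUM_CONFIG",
        5: "usage error: fix flags or supply a prompt",
    }
    # Anything else is propagated from the backend CLI (e.g. 124 on timeout).
    return actions.get(code, f"backend error {code} (124 usually means timeout)")
```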
scripts/code-review.sh is a focused pipeline for reviewing a single file or a unified diff. It runs exactly two specialist passes — security and correctness — in parallel, then validates each finding's quoted-code against the real source.
Design choices are grounded in the 2024-2026 multi-agent code review literature:

- Each finding quotes the source verbatim in <quoted-code>, and the validator cross-checks it against the source file (quote-valid="true|false").

# File on disk (quoted-code validated against the file)
scripts/code-review.sh path/to/file.py
scripts/code-review.sh --xml path/to/file.py
# Unified diff piped on stdin (quoted-code validation is skipped)
git diff HEAD | scripts/code-review.sh --diff
git diff HEAD | scripts/code-review.sh --xml --diff
<finding index="N" severity="critical|high|medium|low" category="security|correctness"
file="..." line-start="N" line-end="N" confidence="0.0..1.0"
source-agent="..." source-role="security|correctness"
quote-valid="true|false">
<title>...</title>
<rationale><![CDATA[includes one reason this might be a false positive]]></rationale>
<suggested-fix><![CDATA[...]]></suggested-fix>
<quoted-code><![CDATA[verbatim source at line-start..line-end]]></quoted-code>
</finding>
Findings are sorted severity desc, confidence desc. No severity filtering by default — triage is the caller's job.
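A downstream consumer might parse and re-sort findings like this (attribute names come from the schema above; the hard-coded severity order and function name are illustrative):

```python
import xml.etree.ElementTree as ET

SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def load_findings(report_xml):
    """Parse <finding> elements, drop invalid quotes, sort as documented."""
    root = ET.fromstring(report_xml)
    # Findings whose quoted-code failed validation are likely hallucinations.
    findings = [f for f in root.iter("finding")
                if f.get("quote-valid") == "true"]
    # Sorted severity desc (critical first), then confidence desc.
    findings.sort(key=lambda f: (SEVERITY_ORDER[f.get("severity")],
                                 -float(f.get("confidence"))))
    return findings
```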
Unified across security + correctness. Specialists score each finding on two axes (worst-case impact × likelihood/reachability) and pick the tier that matches. Synthesized from CVSS v4, OWASP Risk Rating, GitHub Advisory DB, Chromium, MSRC, SEI CERT, SonarQube, Semgrep.
| Severity | Action horizon | Operational definition | Security examples | Correctness examples |
|---|---|---|---|---|
| critical | Merge blocker | RCE / trust-boundary bypass / data loss / guaranteed outage, with a concrete exploit or dataflow trace | SQLi on public endpoint with concatenated query; unsafe deserialization of untrusted input; hardcoded prod credential | Payment/ledger math silently corrupts balances; unconditional null deref on hot request path; race on shared mutable state under prod load |
| high | Fix before release | Critical-tier impact gated by a non-trivial precondition (auth, specific config), OR moderate impact with high reachability | Stored XSS in authenticated admin view; CSRF on state-changing endpoint; path traversal behind login; missing authz on tenant resource | Unhandled exception on documented error path crashing a worker; file/DB-handle leak exhausting pools; retry logic that double-charges |
| medium | Schedule | Limited impact (info disclosure, localized incorrectness, degraded-but-recoverable), OR critical impact gated by implausible preconditions | Stack traces leaked to end users; missing HttpOnly/Secure on non-session cookie; weak-but-not-broken crypto parameter | Incorrect edge-case handling in non-critical helper; missing input validation that callers already satisfy; N+1 query degrading a list endpoint |
| low | Optional / backlog | Cosmetic, stylistic, defense-in-depth; minimal real-world impact | Missing nosniff header where CSP already mitigates; Math.random() for non-security id | Dead code; inconsistent naming; redundant null check after non-null assertion |
Adjustments: downgrade one level on mitigating factors (auth required, non-default config, unusual interaction). Speculative findings stay at the lower tier — upgrade only with a working PoC or trace.
You are the adjudicator. Specialists emit independent findings — your job is to select and synthesize, not re-review (RovoDev 2601.01129, RevAgent 2511.00517).

- Discard findings with quote-valid="false" — likely hallucinations.
- Act by severity: critical = block the merge, high = fix before release, medium = track, low = optional.
- For broad or open-ended questions, use consensus-query.sh; specialists will be too narrow.

Pick by role, not by vendor. The default config has Codex (analyst) + OpenCode/Gemini-3.1-Pro (lateral) enabled; flip claude-code or gemini-cli on in config.json when you want an additional voice.
| Situation | Script | Role(s) involved |
|---|---|---|
| Code review, security audit | per-agent analyst script (codex-query.sh or claude-query.sh) | analyst — precision, edge cases |
| Architecture decision, design choice | consensus-query.sh | analyst + lateral — depth + breadth |
| "Are we solving the right problem?" | per-agent lateral script (opencode-query.sh or gemini-query.sh) | lateral — challenges premises |
| Bug investigation, root cause analysis | per-agent analyst script | analyst — goes deep into implementation |
| Exploring alternatives, brainstorming | per-agent lateral script | lateral — cross-domain analogies |
| High-stakes or irreversible decision | consensus-query.sh | all enabled — reduce blind spots |
| Agent-to-agent integration (downstream parser) | consensus-query.sh --xml | any — stable structured output |
Agents respond with a shared structure. Compare section by section:
When comparing the two responses, classify the pattern and act accordingly:
scripts/consensus-query.sh "We need real-time updates for ~100 concurrent users.
Updates are server-initiated only. Current stack: [describe your stack].
Latency target: under 500ms from event to UI update.
What approach would you recommend and why?"
cat src/services/auth.py | scripts/codex-query.sh \
"Review this authentication service. Focus on whatever concerns you most."
scripts/codex-query.sh "Database query returns empty result.
Direct query with same filter returns 5 documents.
[paste query here]
What's happening?"
- CONSILIUM_CONFIG: Path to a custom JSON config (default: <skill>/config.json)
- CODEX_MODEL: Override Codex model at runtime (default: value from config)
- GEMINI_MODEL: Override Gemini CLI model at runtime (default: value from config)
- OPENCODE_MODEL: Override OpenCode model at runtime (default: value from config)
- OPENCODE_AGENT: Override OpenCode built-in agent (default: plan, read-only)
- OPENCODE_EFFORT: Override OpenCode reasoning effort (default: config effort field, or high)
- CLAUDE_MODEL: Override Claude Code model at runtime (alias like opus or full id)
- CLAUDE_PERMISSION_MODE: Override Claude Code permission mode (default: plan)
- CLAUDE_EFFORT: Override Claude Code reasoning effort (default: config effort field, or max if both unset). Levels: low, medium, high, xhigh, max.
- CODEX_EFFORT: Override Codex reasoning effort (default: config effort field, or high if both unset). Levels: minimal, low, medium, high, xhigh.
- GEMINI_API_KEY: Required for the gemini-cli backend (v1beta model access)
- GOOGLE_GENERATIVE_AI_API_KEY: Required if the opencode backend uses google/... models
- OPENAI_API_KEY: Required if the opencode backend uses openai/... models and OpenCode is not already logged in via opencode auth login
- AGENT_TIMEOUT: Timeout seconds (default: 1200)

Prerequisites:

- Codex CLI (codex --version) — for the codex-cli backend
- OpenCode CLI (opencode --version) — for the opencode backend. For Zen models (opencode/...) run opencode providers login opencode once; for Google direct models (google/...) set GOOGLE_GENERATIVE_AI_API_KEY; for OpenAI direct models (openai/...) either run opencode auth login and pick OpenAI, or set OPENAI_API_KEY.
- Gemini CLI (gemini --version) — for the gemini-cli backend (optional; falls back to direct API)
- Claude Code CLI (claude --version, claude /login) — for the claude-code backend
- GEMINI_API_KEY environment variable — required only when the gemini-cli backend is enabled (get a key at https://ai.google.dev/gemini-api/docs/api-key)