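Coordinates multiple AI models (ChatGPT, Gemini, Codex, QWEN, Claude) for triangulated verification, ensuring that "Specification == Program == Test" stay consistent. The report is produced after false positives are filtered out, substantially reducing the time spent on manual intervention.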
/plugin marketplace add knowlet/skills
/plugin install knowlet-skills@knowlet/skills
This skill inherits all available tools. When active, it can use any tool Claude has access to.
scripts/multi_model_review.py
Multi-model cross-validation ensures:
        ┌─────────────────┐
        │  Specification  │
        │  (YAML Specs)   │
        └────────┬────────┘
                 │
    ┌────────────┼────────────┐
    │            │            │
    ▼            ▼            ▼
┌───────┐    ┌───────┐    ┌───────┐
│ Spec  │    │ Spec  │    │ Test  │
│  ==   │    │  ==   │    │  ==   │
│Program│    │ Test  │    │Program│
└───────┘    └───────┘    └───────┘
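| # | Model | Invocation | Strengths |
|---|---|---|---|
| 1 | ChatGPT 5.2 | API (OpenAI) | Semantic understanding, logical reasoning |
| 2 | Gemini | CLI (local) | Multimodal input, long context |
| 3 | Codex | CLI (local) | Code generation and comprehension |
| 4 | QWEN 32B | Local LLM | Chinese-language understanding, fast inference |
| 5 | Claude | CLI (local) | Spec analysis, false-positive filtering |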
┌─────────────────────────────────────────────────────────────────┐
│                   Multi-Model Review Pipeline                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│             ┌──────────┐   ┌──────────┐   ┌──────────┐          │
│             │   Spec   │   │ Program  │   │   Test   │          │
│             │  (YAML)  │   │  (Code)  │   │  (BDD)   │          │
│             └────┬─────┘   └────┬─────┘   └────┬─────┘          │
│                  │              │              │                │
│                  └──────────────┼──────────────┘                │
│                                 ▼                               │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                      Parallel Review                      │  │
│  │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌──────┐  │  │
│  │ │ ChatGPT │ │ Gemini  │ │  Codex  │ │  QWEN   │ │Claude│  │  │
│  │ │   5.2   │ │   CLI   │ │   CLI   │ │   32B   │ │ CLI  │  │  │
│  │ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └──┬───┘  │  │
│  │      │           │           │           │         │      │  │
│  │      └───────────┴───────────┼───────────┴─────────┘      │  │
│  │                              │                            │  │
│  └──────────────────────────────┼────────────────────────────┘  │
│                                 ▼                               │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │               Claude: False Positive Filter               │  │
│  │                                                           │  │
│  │  • Cross-check the findings from each model               │  │
│  │  • Filter false positives (≥3 models agree = real issue)  │  │
│  │  • Classify severity                                      │  │
│  └──────────────────────────────┬────────────────────────────┘  │
│                                 │                               │
│                                 ▼                               │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                       Final Report                        │  │
│  │                                                           │  │
│  │             ✅ PASS / ⚠️ WARNINGS / ❌ ERRORS             │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
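The fan-out stage of the pipeline above could be sketched roughly as follows. This is only an illustration, not the actual contents of `scripts/multi_model_review.py`: the `run_parallel_review` helper and the stub reviewer callables are hypothetical stand-ins for the real API, CLI, and local-LLM calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical reviewer type: receives the combined spec/program/test text
# and returns a list of finding identifiers.
Reviewer = Callable[[str], List[str]]


def run_parallel_review(reviewers: Dict[str, Reviewer], payload: str) -> Dict[str, List[str]]:
    """Run every reviewer concurrently and collect the findings per model name."""
    with ThreadPoolExecutor(max_workers=max(1, len(reviewers))) as pool:
        futures = {name: pool.submit(fn, payload) for name, fn in reviewers.items()}
        return {name: future.result() for name, future in futures.items()}


# Stub reviewers standing in for real model calls (assumption for the example).
stub_reviewers = {
    "chatgpt": lambda text: ["metadata-missing-in-spec"],
    "gemini": lambda text: ["metadata-missing-in-spec"],
    "codex": lambda text: ["event-publication-untested"],
}
findings = run_parallel_review(stub_reviewers, "spec + program + test")
```

Threads are a reasonable fit here under the assumption that each reviewer mostly waits on network or subprocess I/O.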
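Check that everything defined in the specification has a corresponding implementation:
checks:
  - id: SP1
    name: "Domain Events implementation completeness"
    rule: "domain_events defined in frame.yaml must have a corresponding implementation in the program"
    example: "WorkflowCreated event defined but not published in CreateWorkflowUseCase"
  - id: SP2
    name: "Use Case implementation completeness"
    rule: "The input/output defined in use-case.yaml must match the actual Service"
  - id: SP3
    name: "Invariants implementation"
    rule: "Invariants in aggregate.yaml must have validation in the Aggregate matching their enforced_in"
  - id: SP4
    name: "Pre/Post Conditions"
    rule: "pre_conditions defined in contracts must be checked in the program"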
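As a rough illustration of an SP1-style check, the sketch below assumes the event names have already been extracted from the spec and uses a naive whole-word text search over the source. The `find_unimplemented_events` helper is hypothetical; the real reviewers are AI models and may reason quite differently.

```python
import re
from typing import Iterable, List


def find_unimplemented_events(spec_events: Iterable[str], source_code: str) -> List[str]:
    """Return spec-defined event names that never appear as identifiers in the source."""
    missing = []
    for event in spec_events:
        # Whole-word match, so 'WorkflowCreated' does not match 'WorkflowCreatedV2'.
        if not re.search(rf"\b{re.escape(event)}\b", source_code):
            missing.append(event)
    return missing


source = "record WorkflowCreated(WorkflowId workflowId, BoardId boardId, String name) {}"
missing = find_unimplemented_events(["WorkflowCreated", "WorkflowRenamed"], source)
```

A textual match only shows that the name exists somewhere; it cannot tell whether the event is actually published, which is the kind of gap the SP1 example describes.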
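Check that every implemented feature is covered by tests:
checks:
  - id: PT1
    name: "Use Case test coverage"
    rule: "Every Service class must have a corresponding test class"
  - id: PT2
    name: "Domain Event tests"
    rule: "Domain Events published by the program must be verified in tests"
  - id: PT3
    name: "Error Path tests"
    rule: "Every kind of Exception the program throws must have a corresponding test case"
  - id: PT4
    name: "Invariant tests"
    rule: "Every Aggregate invariant must have a test for its violation"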
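Check that everything the tests verify is defined in the specification:
checks:
  - id: TS1
    name: "Test traceability"
    rule: "Every test case must be traceable to an AC in acceptance.yaml"
  - id: TS2
    name: "Frame Concerns coverage"
    rule: "All frame_concerns must be covered by tests"
  - id: TS3
    name: "Implicit behavior"
    rule: "Behavior verified by a test but not defined in the spec is flagged as a potential omission"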
╔═══════════════════════════════════════════════════════════════════╗
║               DOMAIN EVENT STANDARD UPDATE SUMMARY                ║
╠═══════════════════════════════════════════════════════════════════╣
║ #  │ Use Case              │ Event                │ Status        ║
╠════╪═══════════════════════╪══════════════════════╪═══════════════╣
║ 1  │ create-workflow       │ WorkflowCreated      │ ✅ DONE       ║
║ 2  │ create-stage          │ StageCreated         │ ✅ DONE       ║
║ 3  │ create-swimlane       │ SwimLaneCreated      │ ✅ DONE       ║
║ 4  │ copy-lane             │ LaneCopied           │ ✅ DONE       ║
║ 5  │ move-lane             │ LaneMoved            │ ✅ DONE       ║
║ 6  │ delete-lane           │ LaneDeleted          │ ✅ DONE       ║
║ 7  │ rename-lane           │ LaneRenamed          │ ✅ DONE       ║
║ 8  │ rename-workflow       │ WorkflowRenamed      │ ✅ DONE       ║
║ 9  │ delete-workflow       │ WorkflowDeleted      │ ✅ DONE       ║
║ 10 │ move-workflow         │ WorkflowMoved        │ ✅ DONE       ║
║ 11 │ set-wip-limit         │ WipLimitSet          │ ✅ DONE       ║
║ 12 │ change-workflow-note  │ WorkflowNoteChanged  │ ✅ DONE       ║
╠════╧═══════════════════════╧══════════════════════╧═══════════════╣
║                      TOTAL: 12/12 (100%) ✅                       ║
╚═══════════════════════════════════════════════════════════════════╝
Change summary:
The domain_events block in every aggregate.yaml is now standardized:
1. Added includes_standard: true
2. Added standard_ref: "../../../../shared/domain-event-standard.yaml"
3. Removed the duplicated id and occurredOn properties
4. Added a comment describing the metadata property
5. Moved workflowId to be the first property (consistent ordering)
Shared standard file:
- /.dev/problem-frames/ezkanban/board-management/shared/domain-event-standard.yaml
This resolves the metadata spec mismatch found by the multi-model review;
all 12 Workflow aggregate domain events now conform to the standard.
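One way a reviewer might resolve `includes_standard: true` when comparing a spec against the program is sketched below. The contents of `STANDARD_PROPERTIES` are an assumption inferred from the change summary (`id`, `occurredOn`, `metadata`); the actual `domain-event-standard.yaml` may define something different, and `effective_properties` is a hypothetical helper.

```python
from typing import Dict, List

# Assumed contents of the shared standard; the real file may differ.
STANDARD_PROPERTIES = ["id", "occurredOn", "metadata"]


def effective_properties(event: Dict) -> List[str]:
    """List an event's own properties, plus the standard ones when it opts in."""
    props = list(event.get("properties", []))
    if event.get("includes_standard"):
        # Append only the standard properties the event does not already declare.
        props += [p for p in STANDARD_PROPERTIES if p not in props]
    return props


workflow_created = {
    "name": "WorkflowCreated",
    "includes_standard": True,
    "properties": ["workflowId", "boardId", "name"],
}
props = effective_properties(workflow_created)
```

With the standard expanded this way, the spec side of `WorkflowCreated` would include `metadata`, matching the `EventMetadata metadata` field in the program.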
review_report:
  timestamp: "2025-12-31T10:30:00Z"
  spec_dir: "docs/specs/create-workflow/"
  summary:
    total_checks: 24
    passed: 22
    warnings: 1
    errors: 1
  issues:
    - id: ISSUE-001
      severity: error
      type: "spec_program_mismatch"
      description: "Domain event 'WorkflowCreated' missing 'metadata' property in spec"
      detected_by: ["chatgpt", "gemini", "claude"]  # 3/5 models
      confidence: high
      spec_location: "aggregate.yaml#domain_events.WorkflowCreated"
      program_location: "WorkflowEvents.java#WorkflowCreated"
      spec_definition: |
        properties:
          - workflowId
          - boardId
          - name
      program_implementation: |
        record WorkflowCreated(
            WorkflowId workflowId,
            BoardId boardId,
            String name,
            EventMetadata metadata  // ← Missing in spec
        )
      suggested_fix: |
        Add 'metadata' property to aggregate.yaml:
        ```yaml
        domain_events:
          - name: WorkflowCreated
            includes_standard: true
            standard_ref: "../shared/domain-event-standard.yaml"
        ```
    - id: ISSUE-002
      severity: warning
      type: "test_coverage_gap"
      description: "No test for 'WorkflowCreated' event publication"
      detected_by: ["codex", "qwen"]  # 2/5 models - warning level
      confidence: medium
      test_location: "CreateWorkflowAcceptanceTest.java"
      suggestion: "Add assertion for event publication in ThenSuccess block"
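The report's summary counts could be reduced to the PASS / WARNINGS / ERRORS verdict along these lines. The precedence used here (any error wins over warnings) is an assumption, and `overall_status` is a hypothetical helper rather than part of the script.

```python
def overall_status(summary: dict) -> str:
    """Map summary counts to a single verdict; errors take precedence over warnings."""
    if summary.get("errors", 0) > 0:
        return "ERRORS"
    if summary.get("warnings", 0) > 0:
        return "WARNINGS"
    return "PASS"


# Counts taken from the example report above.
summary = {"total_checks": 24, "passed": 22, "warnings": 1, "errors": 1}
status = overall_status(summary)

# Sanity check: the three buckets should account for every check.
counts_consistent = (
    summary["passed"] + summary["warnings"] + summary["errors"] == summary["total_checks"]
)
```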
consensus_rules:
  error:
    threshold: 3  # ≥3 models agree = confirmed error
    action: "report as error"
  warning:
    threshold: 2  # 2 models agree = warning
    action: "report as warning"
  ignored:
    threshold: 1  # only 1 model = likely false positive
    action: "log but don't report"
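The consensus rules above amount to counting how many distinct models reported each issue. A minimal sketch follows, assuming findings from different models have already been normalized to a shared issue key (in practice that matching step is the hard part); `classify_by_consensus` and `ISSUE-003` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def classify_by_consensus(
    findings: Dict[str, List[str]],
    error_threshold: int = 3,
    warning_threshold: int = 2,
) -> Dict[str, str]:
    """Classify each issue by the number of distinct models that reported it."""
    votes = defaultdict(set)
    for model, issues in findings.items():
        for issue in issues:
            votes[issue].add(model)  # a set, so a model counts once per issue

    levels = {}
    for issue, models in votes.items():
        if len(models) >= error_threshold:
            levels[issue] = "error"
        elif len(models) >= warning_threshold:
            levels[issue] = "warning"
        else:
            levels[issue] = "ignored"  # logged but not reported
    return levels


findings = {
    "chatgpt": ["ISSUE-001"],
    "gemini": ["ISSUE-001"],
    "claude": ["ISSUE-001"],
    "codex": ["ISSUE-002"],
    "qwen": ["ISSUE-002", "ISSUE-003"],  # ISSUE-003 is a made-up single-model finding
}
levels = classify_by_consensus(findings)
```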
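As the final reviewer, Claude is responsible for cross-checking the findings from each model, filtering false positives, and classifying severity.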
# Run the multi-model review
python ~/.claude/skills/multi-model-reviewer/scripts/multi_model_review.py \
  --spec-dir docs/specs/create-workflow/ \
  --program-dir src/application/workflow/ \
  --test-dir tests/acceptance/workflow/ \
  --output review-report.yaml
# Verify only the spec against the program
python ~/.claude/skills/multi-model-reviewer/scripts/multi_model_review.py \
  --spec-dir docs/specs/create-workflow/ \
  --program-dir src/application/workflow/ \
  --check spec-program
# Use a specific subset of models
python ~/.claude/skills/multi-model-reviewer/scripts/multi_model_review.py \
  --spec-dir docs/specs/create-workflow/ \
  --models chatgpt,claude,gemini
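The flags used in the commands above could be parsed with `argparse` roughly as shown here. Only the flag names come from the usage examples; which flags are required, their defaults, and the comma-splitting of `--models` are assumptions, and the script's real parser may differ.

```python
import argparse
from typing import List


def parse_model_list(value: str) -> List[str]:
    """Split a comma-separated model list, dropping empty entries."""
    return [name.strip() for name in value.split(",") if name.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="multi_model_review.py")
    parser.add_argument("--spec-dir", required=True)   # assumed required
    parser.add_argument("--program-dir")
    parser.add_argument("--test-dir")
    parser.add_argument("--output")
    parser.add_argument("--check")                     # e.g. spec-program
    parser.add_argument("--models", type=parse_model_list)
    return parser


args = build_parser().parse_args(
    ["--spec-dir", "docs/specs/create-workflow/", "--models", "chatgpt,claude,gemini"]
)
```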
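spec-compliance-validator
    │
    ├── Validates the completeness of a single spec
    │
    └── Provides spec data to →
            │
            ▼
multi-model-reviewer (this Skill)
    │
    ├── Coordinates 5 AI agents
    ├── Cross-validates Spec == Program == Test
    ├── Claude filters false positives
    │
    └── Sends the report to →
            │
            ▼
code-reviewer
    │
    └── Developer confirms → AI revises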
# Configuration in the project root
models:
  chatgpt:
    enabled: true
    api_key_env: "OPENAI_API_KEY"
    model: "gpt-5.2"
  gemini:
    enabled: true
    cli_command: "gemini"
  codex:
    enabled: true
    cli_command: "codex"
  qwen:
    enabled: true
    endpoint: "http://localhost:11434/api/generate"
    model: "qwen2.5:32b"
  claude:
    enabled: true
    cli_command: "claude"
    role: "final_arbiter"  # final arbiter
paths:
  specs: "docs/specs/"
  source: "src/"
  tests: "tests/"
shared_standards:
  domain_events: "shared/domain-event-standard.yaml"
consensus:
  error_threshold: 3
  warning_threshold: 2
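How the `enabled` flags in this configuration might combine with a `--models` subset is sketched below. The config is written as a plain Python dict to keep the example dependency-free, and `select_models` is a hypothetical helper; the skill's real selection logic is not shown in this document.

```python
from typing import Dict, List, Optional


def select_models(models_cfg: Dict[str, dict], requested: Optional[List[str]] = None) -> List[str]:
    """Return enabled model names, optionally narrowed to a requested subset."""
    enabled = [name for name, cfg in models_cfg.items() if cfg.get("enabled", False)]
    if requested is None:
        return enabled
    return [name for name in enabled if name in requested]


# Mirrors the `models:` section above, with one model disabled for illustration.
models_cfg = {
    "chatgpt": {"enabled": True, "model": "gpt-5.2"},
    "gemini": {"enabled": True, "cli_command": "gemini"},
    "codex": {"enabled": False, "cli_command": "codex"},
    "claude": {"enabled": True, "cli_command": "claude", "role": "final_arbiter"},
}
all_enabled = select_models(models_cfg)
subset = select_models(models_cfg, ["chatgpt", "codex", "claude"])
```

Note that running fewer than three models makes the default `error_threshold: 3` unreachable, so a reduced model set would presumably need lower thresholds.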
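1. Development complete
       │
       ▼
2. Run multi-model-review
       │
       ▼
3. Receive the report
       ├── ✅ PASS → Submit PR
       │
       └── ❌ ISSUES FOUND
               │
               ▼
4. Developer confirms
       ├── Real issue → Ask the AI to revise
       │       │
       │       ▼
       │   5. AI fixes automatically
       │       │
       │       ▼
       │   6. Re-verify → Back to step 2
       │
       └── False positive → Add an ignore rule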