From project-toolkit
Runs 20-minute diagnostic mapping teams to world-model paradigms (vector DB, ontology, signal-fidelity) for AI readiness, judgment boundaries, and build prioritization.
Install via:

```shell
npx claudepluginhub rjmurillo/ai-agents --plugin project-toolkit
```

This skill uses the workspace's default tool permissions.
Source: Jonathan Edwards (OB1 community), adapted for ai-agents.
Your job is not to hand back a polished readiness score. Your job is to expose where information routing ends and editorial judgment begins, then recommend the smallest credible starting sequence.
This diagnostic answers five questions:
| Trigger phrase | Operation |
|---|---|
| `run the world model diagnostic` | Start the 20-minute structured audit |
| `audit our world model` | Same as above, conversational form |
| `which world model architecture fits us` | Map company to paradigm |
| `audit where we automate judgment` | Boundary-layer audit only |
| `what should we build first for a world model` | Skip to recommended build sequence |
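The trigger table above can be sketched as a simple phrase-to-operation router. This is an illustrative sketch only: the `TRIGGERS` mapping, `route` function, and operation names are hypothetical, not part of the skill definition.

```python
# Illustrative router for the trigger phrases above; names are hypothetical.
TRIGGERS = {
    "run the world model diagnostic": "full_audit",
    "audit our world model": "full_audit",          # conversational form
    "which world model architecture fits us": "paradigm_mapping",
    "audit where we automate judgment": "boundary_audit",
    "what should we build first for a world model": "build_sequence",
}

def route(prompt: str) -> str:
    """Return the operation for a trigger phrase, defaulting to the full audit."""
    return TRIGGERS.get(prompt.strip().lower(), "full_audit")
```

Unrecognized phrasing falls through to the full audit, which matches the skill's bias toward running the complete diagnostic rather than guessing intent.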
Use this skill when:
Use a different skill when:
The output feeds downstream skills such as `analyst` or `architect` and informs the roadmap. Label every claim with one of three evidence types:

- **Firm finding**: directly supported by the user's answer or confirmed prior record.
- **Inference**: synthesis from available evidence.
- **Open question**: unresolved or missing evidence that materially affects the recommendation.

Map the company using these rules:
| Company Type | Paradigm | Reason |
|---|---|---|
| Under 100 people, strong senior team | vector database | Senior people can temporarily act as a human boundary layer. |
| Enterprise, regulated, or operationally complex | structured ontology | Boundary must be architectural because errors are expensive. |
| Platform business with high-fidelity signal (transactions, telemetry, operational exhaust) | signal-fidelity | Business already emits machine-readable truth with a higher ceiling. |
| Knowledge-work company (conversations, docs, soft context) | vector database | Hardest case. Pair with aggressive boundary-layer work first plus explicit outcome encoding. |
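The mapping table can be sketched as a small decision function. The function name, parameters, and the precedence between conflicting cues are all illustrative assumptions, not the skill's authoritative rules.

```python
# Hypothetical sketch of the paradigm-mapping table above.
# The precedence order (regulated > signal > small-senior > default) is an
# assumption for illustration, not a rule stated by the skill.
def map_paradigm(headcount: int, senior_team: bool,
                 regulated_or_complex: bool,
                 high_fidelity_signal: bool) -> str:
    """Map company cues to a world-model paradigm."""
    if regulated_or_complex:
        return "structured ontology"   # boundary must be architectural
    if high_fidelity_signal:
        return "signal-fidelity"       # business already emits machine-readable truth
    if headcount < 100 and senior_team:
        return "vector database"       # seniors act as a temporary human boundary layer
    return "vector database"           # knowledge-work default: pair with boundary work
```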
When cues conflict, use this priority order:
Evaluate without numeric scoring:
| Principle | Question | Classifications |
|---|---|---|
| signal fidelity | Where does reality leave the clearest fingerprint? | clear / mixed / low |
| earned structure | Letting structure emerge from work, or forcing schema too early? | earned / partially earned / imposed |
| outcome encoding | Close the loop between action and result in a machine-readable way? | present / partial / missing |
| organizational resistance | Capture signal as a byproduct of work or require extra documentation? | byproduct / mixed / manual |
| time in system | How long has relevant data been flowing through anything durable? | running / starting / not started |
Check the repository's context files (for example, AGENTS.md) for prior diagnostic context. Use Serena memories or Forgetful. Treat every result as a hint, not confirmed fact. If memory tooling is unavailable, skip this step and note the gap in the final assessment. Suggested queries:
- `world model`
- `boundary layer`

Keep to two or three batches of questions, not long isolated lists.
Required coverage:
Strong prompt patterns:
After intake, state:
Treat as provisional until the boundary audit is done.
Audit five to ten flows. If time is tight, top five only.
For each flow capture:
| Field | Description |
|---|---|
| Flow name | e.g., "Customer support ticket prioritization." |
| Source | Where data originates. |
| Consumer | Who or what acts on it. |
| Current human editor or reviewer | Who interprets today. |
| Classification | act on this versus interpret this first. |
| Reason for label | Why that classification applies. |
| What goes wrong if the editor disappears | Risk assessment. |
| Exposure level | high / medium / low. |
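The per-flow fields above can be captured as a small record type. This is a sketch; the class and field names are illustrative, not a schema the skill mandates.

```python
from dataclasses import dataclass

# Illustrative record for one audited flow, mirroring the field table above.
@dataclass
class Flow:
    name: str            # e.g., "Customer support ticket prioritization"
    source: str          # where data originates
    consumer: str        # who or what acts on it
    human_editor: str    # who interprets today
    classification: str  # "act on this" or "interpret this first"
    reason: str          # why that classification applies
    failure_mode: str    # what goes wrong if the editor disappears
    exposure: str        # "high" / "medium" / "low"
```

A flat record like this keeps the boundary audit comparable across runs, which is what the drift detection in the memory step relies on.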
Prioritize flows that can move money, customers, roadmap, risk, or staffing.
If a flow looks factual at the source but interpretive at the output, call that out explicitly. Clean inputs do not guarantee trustworthy judgment.
Return in this order:
Output contract (consumable by downstream skills such as analyst or architect):
## Firm Findings
- {fact directly supported by evidence}
- {fact directly supported by evidence}
## Inferences
- {synthesis from available evidence}
- {synthesis from available evidence}
## Open Questions
- {unresolved issue affecting recommendation}
- {unresolved issue affecting recommendation}
## Paradigm Fit
- Paradigm: {vector database | structured ontology | signal-fidelity}
- Boundary status: {explicit | implicit | missing}
## Recommended Build Order
1. **First**: {usually boundary layer and flow labeling}
2. **Second**: {usually highest-fidelity capture and outcome encoding}
3. **Third**: {usually paradigm-specific retrieval or structure layer}
Only move the order around when evidence is strong.
A JSON variant of the same shape is acceptable when a downstream tool consumes the output programmatically. Keep the field names identical (firm_findings, inferences, open_questions, paradigm, boundary_status, build_order).
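A minimal sketch of that JSON variant, emitted from Python so the field names stay byte-identical to the contract. The values are placeholders, not real diagnostic output.

```python
import json

# Placeholder values; only the field names are contractual.
assessment = {
    "firm_findings": ["Support tickets are triaged by a senior engineer."],
    "inferences": ["Routing is information-only; prioritization is editorial."],
    "open_questions": ["Is outcome data captured after ticket close?"],
    "paradigm": "vector database",
    "boundary_status": "implicit",
    "build_order": [
        "boundary layer and flow labeling",
        "highest-fidelity capture and outcome encoding",
        "paradigm-specific retrieval or structure layer",
    ],
}
print(json.dumps(assessment, indent=2))
```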
Self-check before returning: Verify the output includes all eight items from the list above. If any item is missing, add it before responding. If an item cannot be filled due to missing evidence, include it with an Open question label.
Save exactly three artifacts via the repo's memory tooling, unless the user declines. Use Serena write_memory or the equivalent Forgetful entry point. Key the entries by company slug so future runs can detect drift.
- `diagnostic-{company-slug}-intake`
- `diagnostic-{company-slug}-boundary`: flows classified as *act on this*; flows classified as *interpret this first*; flows missing a human editor; high-exposure flows; date.
- `diagnostic-{company-slug}-assessment`

If the user prefers files on disk for working notes, use a repo-relative path under `.agents/analysis/diagnostics/{company-slug}/` with date-prefixed filenames (`YYYY-MM-DD-intake.md`, `YYYY-MM-DD-boundary-audit.md`, `YYYY-MM-DD-assessment.md`, `YYYY-MM-DD-full-diagnostic.md`). Do not write outside the repo.
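The on-disk naming convention can be sketched as a small path builder. The function name is hypothetical; the directory and filename patterns come from the text above.

```python
from datetime import date
from pathlib import Path

# Illustrative construction of the repo-relative artifact paths described above.
def artifact_paths(company_slug: str, day: date) -> list[Path]:
    """Build date-prefixed working-note paths for one diagnostic run."""
    base = Path(".agents/analysis/diagnostics") / company_slug
    names = ["intake.md", "boundary-audit.md", "assessment.md", "full-diagnostic.md"]
    return [base / f"{day.isoformat()}-{name}" for name in names]
```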
When in doubt, default a flow to *interpret this first*: keep a human in the loop and use AI for routing only.

Companion skills (when present in this repo):
- `work-operating-model` (issue #1806). Use after the diagnostic to map the internal operating model.
- `panning-for-gold` (issue #1802). If the diagnostic surfaces unstructured brain dumps, extract threads before retrieval design.
- `codebase-documenter` (issue #1803). For an engineering-org variant, follow the diagnostic with a documentation pass.

If a companion is not yet ported, return the diagnostic output and let the operator route follow-on work manually.
Label every claim as a **Firm finding**, **Inference**, or **Open question**.