Builds and maintains ARCHITECTURE.md and DETAILED_DESIGN.md incrementally with coverage tracking. Principal mode analyzes vision, bottlenecks, gaps, and alternatives.
`npx claudepluginhub activememory/ctx --plugin ctx`
Build and maintain two architecture documents incrementally:
ARCHITECTURE.md (succinct project map, loaded at session start)
and DETAILED_DESIGN.md (deep per-module reference, consulted
on-demand). Coverage is tracked in map-tracking.json so each run
extends the map rather than re-analyzing everything.
When time or context budget runs short, work in priority-tier order; never skip a tier to do a lower one.
Read the invocation for a mode keyword:
- (default) → run Default mode (Phases 0-5 below)
- principal → run Principal mode (Phases 0-5 + Principal phases P1-P3)

Examples:
/ctx-architecture
/ctx-architecture principal
/ctx-architecture (principal)
Skip this skill when drift detection is what's needed (use /ctx-drift instead) or when the project has opted out (opted_out: true in map-tracking.json).

Read .context/map-tracking.json. If it exists and opted_out: true, say:

Architecture mapping is opted out for this project. Delete .context/map-tracking.json to re-enable.

Then stop.
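A minimal sketch of this gate, assuming `jq` is available (the skill is free to read the JSON any other way):

```bash
# Opt-out gate: a missing tracking file means the project has not opted out.
if [ -f .context/map-tracking.json ] && \
   [ "$(jq -r '.opted_out // false' .context/map-tracking.json)" = "true" ]; then
  echo "Architecture mapping is opted out for this project." \
       "Delete .context/map-tracking.json to re-enable."
  exit 0
fi
```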
Check if Gemini Search MCP is available by attempting a simple query. Gemini is used for upstream documentation, design rationale, KEPs, peer-project patterns - anything outside the local codebase that helps understand why the code is shaped the way it is.
If available: note it silently. Use Gemini throughout the analysis for upstream lookups. Prefer it over built-in web search.
If not available: ask the user once:
Gemini Search is not connected. It helps me look up upstream
design docs, KEPs, and peer-project patterns during analysis.
Want to set it up now, or proceed without it?
Respect the answer and continue either way.
Important: Gemini is for upstream and external context only. Do not use it to understand the local codebase - read the code directly. The depth of analysis comes from forced reading, not from search shortcuts.
Before any deep analysis, do a lightweight structural survey to discover what the project actually contains. This takes seconds and makes the focus-area question concrete instead of open-ended.
Scan steps (no file reads - structure only):
# Detect ecosystem
ls go.mod package.json Cargo.toml pyproject.toml 2>/dev/null
# List top-level source directories / packages
# Go:
go list ./... 2>/dev/null | sed 's|.*/||' | sort -u | head -40
# or: ls internal/ cmd/ pkg/ 2>/dev/null
# Node/other: ls src/ lib/ packages/ 2>/dev/null
# Large monorepo guard: if >100 packages, limit to top 2 levels only
find . -mindepth 1 -maxdepth 2 -type d \
! -path './.git/*' ! -path './vendor/*' ! -path './node_modules/*' \
| sort | head -60
Then ask (present the discovered package/module names):
I found these top-level packages/modules:
[list from scan]
Any specific areas you'd like me to go deep on? You can name
packages from the list above, describe subsystems (e.g. "the
reconciler loop", "auth handling"), or say "all" for a uniform
pass.
Skip or press enter to do a standard uniform pass.
If focus areas are given, carry them forward through the rest of the analysis.
If "all" or no answer, proceed with standard uniform analysis.
Determine if this is a first run or a subsequent run: it is a subsequent run when .context/map-tracking.json exists.

For subsequent runs, identify the frontier - modules that need analysis:
- Read map-tracking.json for coverage state
- `git log --oneline --since="<last_analyzed>" -- <module_path>/`
- New or changed modules (since last_analyzed) plus low-confidence modules (confidence < 0.7)

First run: full survey:
Run ctx deps to bootstrap the dependency graph:
ctx deps
Auto-detects the ecosystem (Go, Node.js, Python, Rust) from manifest files. Use its output as the starting point for the "Package Dependency Graph" section: verify and enrich it with semantic context.
Read the project manifest for project identity (name, version, description); ctx deps already covers the dependency tree.
Explore directory structure:
ctx status
Read key files in each package: exported types, functions, imports
Trace data flow through main entry points
Identify architectural patterns (dependency injection, interfaces, registries)
Subsequent run: targeted analysis:
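Start from the frontier identified above. A rough sketch of the changed-module query, assuming `jq` is installed (it is not required by the skill) and that `last_run` from the tracking file works as a `--since` value:

```bash
# List directories touched by commits since the last analysis run.
last="$(jq -r '.last_run' .context/map-tracking.json)"
git log --since="$last" --name-only --pretty=format: \
  | grep -v '^$' | xargs -r -n1 dirname | sort -u
```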
ARCHITECTURE.md: update ONLY if module boundaries, dependency graph, data flow, or key patterns changed. Internal implementation changes do NOT warrant updates. Target: under 4000 tokens (~16KB) so ARCHITECTURE.md loads within the session-start context budget.
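A rough way to keep an eye on that budget (path assumed; adjust to wherever ARCHITECTURE.md is written):

```bash
# ~4000 tokens is on the order of 16 KB of markdown.
wc -c ARCHITECTURE.md
```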
Required sections include a "Package Dependency Graph" rendered as a mermaid diagram (`graph TD`).

DETAILED_DESIGN.md: update per-module sections using this format:
## <module_path>
**Purpose**: One-line description.
**Key types**: List main structs/interfaces.
**Exported API**:
- `FuncName()`: what it does
- `Type.Method()`: what it does
**Data flow**: Entry → Processing → Output
Include an ASCII sequence diagram when there are 3+ actors or
non-obvious ordering:
Caller          Scheduler          Worker
  |--schedule()-->|                   |
  |               |--dispatch()------>|
  |               |<--result----------|
  |<--done--------|                   |
Include an ASCII state diagram when the module manages lifecycle
or status transitions:
[Init] --configure()--> [Ready] --start()--> [Running]
                                                |   |
                                    error()-----|   |--stop()--> [Stopped]
                                       |
                                   [Failed]

[Stopped] --reset()--> [Ready]
Use plain ASCII (not mermaid) for DETAILED_DESIGN.md - it renders
in any terminal, editor, or raw file view without a renderer.
Reserve mermaid for ARCHITECTURE.md only.
**Edge cases**:
- Condition → behavior
**Performance considerations**:
- Known or likely bottlenecks (hot paths, allocation pressure,
lock contention, I/O bound operations)
- Scale assumptions baked into the design (e.g. "assumes <1000
items", "single-threaded reconcile loop")
- What breaks first under load
**Danger zones** (top 3 riskiest modification points):
1. `<symbol or area>` - why it's dangerous (hidden coupling,
ordering assumption, shared mutable state, etc.)
2. ...
3. ...
**Control loop & ownership** (if the module participates in
reconciliation or state management):
- What owns the reconciliation for this module's resources?
- What is source of truth vs. derived/cached state?
- What triggers re-reconciliation?
**Extension points** (where features would naturally attach):
- `<symbol or pattern>` - what kind of extension fits here
**Improvement ideas** (1-3 concrete suggestions, not generic):
- `<specific change>` - what it fixes and why it's feasible
**Dependencies**: list of internal packages used
Splitting DETAILED_DESIGN.md when it grows large:
When DETAILED_DESIGN.md exceeds ~600 lines or covers 3+ natural domains, split into domain files and keep a shallow index:
- DETAILED_DESIGN.md - index only (domain name, file pointer, module list, one-line domain purpose)
- DETAILED_DESIGN-<domain>.md - full module sections for that domain

Domains are natural groupings, not arbitrary splits (e.g. storage or auth, as in the index below).
Index format:
# Detailed Design Index
| Domain | File | Modules | Summary |
|---------|----------------------------|----------------------|-------------------|
| storage | DETAILED_DESIGN-storage.md | pkg/store, pkg/cache | Persistence layer |
| auth | DETAILED_DESIGN-auth.md | pkg/authn, pkg/authz | Identity + policy |
> See individual files for module-level detail.
Update map-tracking.json to record which domain file each module
lives in:
"pkg/store": {
"domain_file": "DETAILED_DESIGN-storage.md",
...
}
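A sketch of that bookkeeping, assuming `jq` and that module entries sit under the `coverage` key of the tracking schema shown below:

```bash
# Record which domain file pkg/store moved to (example module; adjust as needed).
jq '.coverage["pkg/store"].domain_file = "DETAILED_DESIGN-storage.md"' \
  .context/map-tracking.json > .context/map-tracking.json.tmp \
  && mv .context/map-tracking.json.tmp .context/map-tracking.json
```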
Each section is self-contained. The agent reads specific sections when working on a module, not the entire file.
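For example, a targeted read could pull a single module's section instead of the whole file - a sketch that assumes the `## <module_path>` heading format above, with pkg/store as a stand-in module and the file path assumed:

```bash
# Print only the pkg/store section, stopping at the next module heading.
awk -v m="pkg/store" '$0 == "## " m {f=1; print; next} /^## / {f=0} f' \
  DETAILED_DESIGN.md
```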
CHEAT-SHEETS.md: write (or update) short mental models for key lifecycle flows. One cheat sheet per major lifecycle or flow identified in the codebase. Format:
## <Lifecycle or Flow Name>
Steps:
1. <event or trigger>
2. <what happens next>
3. ...
Key invariants:
- <thing that must always be true>
Common failure modes:
- <condition> → <outcome>
Flow (ASCII - include when sequence or state is non-obvious):
[Trigger] --> [Step A] --> [Step B] --> [Done]
                  |
               [Error] --> [Retry] --> [Dead Letter]
Aim for cheat sheets that fit on one screen. If a flow needs more than ~15 steps, split it. Write a cheat sheet for, at minimum, each major lifecycle or flow identified in the codebase.
Skip if the project has no meaningful lifecycles (e.g. a pure library with no runtime behavior).
Write .context/map-tracking.json with:
{
"version": 1,
"opted_out": false,
"opted_out_at": null,
"last_run": "<ISO-8601 timestamp>",
"coverage": {
"<module_path>": {
"last_analyzed": "<ISO-8601 timestamp>",
"confidence": <0.0-1.0>,
"files_seen": ["file1.go", "file2.go"],
"notes": "Brief summary of understanding"
}
}
}
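The same data drives the next frontier: low-confidence modules can be listed straight from the tracking file. A sketch, assuming `jq`:

```bash
# Modules below the 0.7 confidence threshold.
jq -r '.coverage | to_entries[]
       | select(.value.confidence < 0.7)
       | "\(.key)\t\(.value.confidence)"' .context/map-tracking.json
```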
Print a structured convergence report AND write it to
.context/CONVERGENCE-REPORT.md. The printed version is the
primary output the user reads. The file version is the artifact
that /ctx-architecture-enrich and future sessions consume.
The source of truth for confidence scores is map-tracking.json.
CONVERGENCE-REPORT.md is a human-readable view of that data -
if they ever conflict, map-tracking.json wins.
Format:
## Convergence Report
### By Module
| Module | Confidence | Status | Blocker |
|--------|------------|--------|---------|
| pkg/foo | 0.9 | ✅ Converged | - |
| pkg/bar | 0.6 | 🔶 Shallow | Internal flow unclear |
| pkg/baz | 0.2 | 🔴 Stubbed | Not analyzed |
### By Domain (if natural groupings exist)
Group related modules and show aggregate coverage:
e.g. "Auth layer: 2/3 modules converged (avg 0.72)"
### Overall
- Total modules: N
- Converged (≥ 0.9): N ✅
- Solid (0.7-0.89): N 🟡
- Shallow (0.4-0.69): N 🔶
- Stubbed (< 0.4): N 🔴
### What Would Help Next
For each non-converged module, print a specific suggestion:
🔶 pkg/bar (0.6) - Shallow
→ Read the test files to understand expected behavior under
edge cases: `pkg/bar/*_test.go`
→ Trace the internal flow through <specific function identified>
→ Ask: "walk me through what happens when X"
🔴 pkg/baz (0.2) - Not analyzed
→ Run /ctx-architecture with focus area: pkg/baz
→ Or: open pkg/baz/README.md if present
### Convergence Verdict
One of:
- ✅ CONVERGED - all modules ≥ 0.9, frontier empty. Further runs
without code changes won't improve coverage.
- 🟡 MOSTLY CONVERGED - core modules ≥ 0.9, peripheral modules
shallow. Diminishing returns on full re-run; use focus areas.
- 🔶 PARTIAL - significant modules below 0.7. Re-run with focus
areas or read tests.
- 🔴 INCOMPLETE - substantial portions unanalyzed. Run again.
Convergence thresholds: use the bands from the Overall section above (converged ≥ 0.9, solid 0.7-0.89, shallow 0.4-0.69, stubbed < 0.4).
Blocker vocabulary (use these consistently in the table):
- Internal flow unclear - exports known, internals not traced
- Not analyzed - directory listed only
- Tests not read - implementation known, behavior under edge cases unknown
- Design rationale unknown - code understood, "why" is unclear
- Converged - nothing left to learn from static reading

After printing the convergence verdict, append a Search Prompts section. The skill has just read the codebase and knows its jargon - this is the most useful thing it can hand back to someone who is not blocked by intelligence but by not knowing the right words.
Format:
## Search Prompts
The right keyword changes everything. Based on what I found in
the codebase, here are targeted searches worth running - in your
internal docs, Confluence, Notion, Slack, or publicly:
### Fill the gaps (ranked by how much they'd help)
For modules/areas still below 0.9:
🔶 pkg/bar - Internal flow unclear
Try searching:
- "<SpecificTypeName> design" or "<SpecificTypeName> internals"
- "<pattern observed, e.g. 'leader election'> <project name>"
- "why does <ProjectName> use <pattern>" (ADR or design doc)
🔴 pkg/baz - Not analyzed
Try searching:
- "<package name> <project name> explained"
- "<key interface or type found> behavior"
### Concepts worth understanding deeply
List 3-5 technical concepts the codebase clearly depends on but
that can't be learned from the code alone. Give the exact search
phrase, not a topic:
- "<ExactConceptName> explained" - e.g. "etcd watch semantics
explained", "CRDT merge strategies", "OIDC token refresh flow"
- "<pattern name> tradeoffs" - e.g. "saga pattern vs 2PC tradeoffs"
### Architecture decision records (if relevant)
If the code shows signs of a deliberate non-obvious choice
(e.g. custom retry logic instead of a library, unusual data
structure), suggest:
- "<ProjectName> <decision> ADR"
- "<ProjectName> <decision> RFC"
- "why <ProjectName> doesn't use <obvious alternative>"
---
Note: I won't run these searches for you - you may have internal
docs where these are more useful than public results, and you know
which sources to trust. Pick the phrases that match what's blocking
you.
Rules for this section:
Run all default mode phases first (0-5), then continue below. Principal mode is for strategic thinking - beyond "what is" to "what could be" and "what should concern us."
In addition to the default phase sources, read:
- .context/TASKS.md - outstanding work, future plans
- CHANGELOG.md or docs/changelog.md - trajectory of decisions
- docs/ - any design rationale in user-facing docs
- `git log --oneline -30`

Two-tier behavior - do not stall:
If answers are available (user provided them in the prompt,
or they exist in .context/TASKS.md / DECISIONS.md): use them.
Do not ask for what you already have.
If answers are not available: do NOT stop. Generate a provisional principal analysis with assumptions explicitly labeled (see Principal Mode Fallback below). Include a "Questions That Would Sharpen This" section at the end of ARCHITECTURE-PRINCIPAL.md.
When asking the user, present all questions at once as a numbered list - do not ask one-at-a-time:
Before I write the principal analysis, a few questions - skip
or say "unsure" on anything you don't know:
0. **Focus areas** (if not already set in Phase 0.5)
1. **Vision**: What is this project trying to become in 12-24 months?
2. **Future direction**: Any architectural pivots being considered?
(plugin system, multi-tenant, cloud sync, daemon model, etc.)
3. **Known bottlenecks**: Where does the current design hurt you?
4. **Implementation alternatives**: Any decisions you'd do
differently starting fresh?
5. **Gaps**: What's missing that you expect to need?
6. **Areas of improvement**: Known tech debt or structural awkwardness?
After collecting answers, write .context/ARCHITECTURE-PRINCIPAL.md
(separate from ARCHITECTURE.md - speculation must not pollute
the authoritative doc).
# Architecture - Principal Analysis
_Generated <date>. Strategic analysis only; see ARCHITECTURE.md
for the authoritative architecture reference._
## Current State Summary
[Condensed narrative of the current architecture - ~1 page max]
## Vision Alignment
[How does the current architecture support or constrain the stated
vision? What structural changes would enable it?]
## Future Direction
[Architectural implications of planned pivots or new capabilities.
What would need to change if [feature X] were added?]
## Known Bottlenecks
[Analysis of performance, scalability, or dev-experience pain
points identified in the codebase or raised by the user]
## Implementation Alternatives
[For 2-3 key design decisions: current approach, alternatives,
tradeoffs]
## Gaps
[Missing capabilities or abstractions the architecture doesn't
handle yet but probably will need to]
## Areas of Improvement
Ranked by impact/effort:
- **High impact, low effort** (do first)
- **High impact, high effort** (plan for)
- **Low impact** (defer or skip)
## Risks
[Architectural risks as the system scales, team grows, or
requirements evolve]
## Intervention Points
Top 5 highest-leverage places to implement new features or
improvements, ranked by impact/effort:
1. `<symbol or subsystem>` - what kind of change fits here and why
2. ...
(These are concrete locations - package paths, interface names,
function boundaries - not vague subsystem labels.)
## Upstream Proposals
2-3 changes worth proposing to the project upstream (KEP / RFC /
issue style thinking). For each:
- **What**: one-sentence description of the change
- **Why**: what problem it solves that the current design can't
- **Where**: which abstraction boundary it touches
- **Risk**: what it breaks or complicates
Each proposal must cross an abstraction boundary - it must affect
how modules interact, not just refactor internals. If it doesn't
change an interface, a contract, or an ownership boundary, it's
not upstream-worthy; it's a local improvement (put it in
Improvement Ideas instead).
## Productization Gaps
What would need to change for this to work at enterprise scale?
- Multi-cluster / multi-tenant gaps
- Observability and debuggability holes
- Operational hardening missing from current design
- What a large customer would hit first
## Failure-First Analysis
[Hidden assumptions baked into the architecture. What breaks
silently vs. loudly? What would cause a cascade? What does the
system assume about its environment that may not hold?]
## Onboarding Friction
[Practical, not theoretical - this is what a new engineer actually
hits in week one:]
- What makes this system hard to understand quickly?
- Which modules require tribal knowledge to use safely?
- Where would a new engineer get stuck first, and why?
- What isn't written down anywhere?
Boundary hygiene - ARCHITECTURE-PRINCIPAL.md is for synthesis, leverage, risk, direction, and judgment. Do NOT restate module details that already exist in DETAILED_DESIGN.md. Reference module paths only where needed to ground an argument. If you find yourself summarizing what a module does, stop - link to it instead.
Principal mode fallback - if Phase P2 answers were not provided, label speculative sections clearly and add at the end:
## Questions That Would Sharpen This Analysis
Answering any of these would move speculative sections to grounded ones:
1. **Vision** - What is this project trying to become in 12-24 months?
2. **Future direction** - Any architectural pivots being considered?
3. **Known bottlenecks** - Where does the current design hurt?
4. **Assumptions marked** - These sections are labeled [inferred]:
[list them]
Autonomous inferences - principal mode must also surface, from the codebase alone and without waiting for user input, the things the code is silently deciding. These go in a dedicated "Silent Choices" section in ARCHITECTURE-PRINCIPAL.md. The code is making bets - name them.
Opinion floor - ARCHITECTURE-PRINCIPAL.md must take positions, not just give descriptions. Generate opinions; if you find yourself writing neutral summaries, push harder.
When in doubt, prefer a strong, falsifiable opinion over a safe, generic one. Weak opinions are noise; strong opinions can be corrected.
Cross-project comparison (include when the codebase shows non-obvious design choices or when focus areas have well-known peers):
For any module where a comparable exists in another project, add:
### Compared to <PeerProject>/<Component>
- What <ThisProject> does differently
- What <PeerProject> does better
- What could be unified or learned from
Compare against well-known peer projects when relevant.
Skip if no meaningful peer exists. Do not force comparisons.
Be direct. This document is for engineering judgment, not external audiences.
Extract danger zones from all DETAILED_DESIGN.md module sections
and compile them into a standalone .context/DANGER-ZONES.md.
This is the consolidated view - one document a reviewer or new
engineer can read to know where the dragons live.
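One way to gather the raw material before compiling - a grep sketch that assumes the `**Danger zones**` heading format from the module template (adjust paths to wherever the design docs live):

```bash
# Collect every danger-zone block, with a few lines of context, across the design docs.
grep -n -A 6 '^\*\*Danger zones\*\*' DETAILED_DESIGN*.md
```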
# Danger Zones
_Generated <date> from DETAILED_DESIGN.md danger zone sections.
Run `/ctx-architecture-enrich` to add verified blast radius data._
## Summary
| Module | Zone | Risk | Why |
|--------|------|------|-----|
| <path> | <symbol/area> | HIGH/MEDIUM/LOW | one-line reason |
## By Module
### <module_path>
1. **<symbol or area>** - <why it's dangerous>
- Hidden coupling / ordering assumption / shared mutable state
- Modification advice: <what to check before changing>
2. ...
Rules:
- /ctx-architecture-enrich can later add verified blast radius numbers - leave room for that (don't claim precision you don't have from reading alone)

Score by decision usefulness, not descriptive completeness. Ask: "What could an engineer safely do with this understanding?"
| Level | Decision usefulness |
|---|---|
| 0.0 - 0.3 | Stubbed: not safe to make any decisions; directory listed only |
| 0.4 - 0.6 | Shallow: can describe purpose; not safe to modify without more reading |
| 0.7 - 0.79 | Safe to make localized changes with care; can review simple PRs |
| 0.8 - 0.89 | Can reason about design tradeoffs; safe to design changes in this module |
| 0.9 - 1.0 | Can predict likely breakage from non-trivial changes; safe to own the module |
Inflate scores and you lie to the next agent that reads the tracking file. Under-score and the convergence report will never clear. Score the decision-usefulness honestly.
If the user says "never", "don't ask again", or similar: set opted_out: true and opted_out_at: "<timestamp>" in map-tracking.json, and note that deleting .context/map-tracking.json re-enables mapping.

The agent MAY suggest /ctx-architecture during session start when appropriate; the nudge is a suggestion, not automatic execution.
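A minimal sketch of that opt-out write, assuming `jq` (the timestamp follows the ISO-8601 format from the tracking schema):

```bash
jq --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
   '.opted_out = true | .opted_out_at = $ts' \
   .context/map-tracking.json > .context/map-tracking.json.tmp \
   && mv .context/map-tracking.json.tmp .context/map-tracking.json
```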
After running, verify:
- Principal output was written to ARCHITECTURE-PRINCIPAL.md, not overwriting ARCHITECTURE.md