Review, create, update, check, write, document, or audit architecture documentation (docs/architecture.md). Use when the user wants to review the architecture, check architecture docs, write architecture docs, document the architecture, or update architecture documentation to match organizational standards with accurate technical content.
npx claudepluginhub cloud-officer/claude-code-plugin-dev --plugin co-dev
Review the `docs/architecture.md` file in a repository and create or update it to match organizational standards. This skill deeply analyzes the codebase to ensure architecture documentation is accurate, complete, and reflects the actual implementation. Works for all repository types and languages.
You MUST maintain an analysis checklist throughout execution. At each step, record what was found. This ensures consistent, reproducible results.
Before starting, create this tracking structure and update it as you progress:
=== ANALYSIS CHECKPOINT LOG ===
[ ] Step 1: Repository Information
- organization: (pending)
- repository: (pending)
- has_architecture_doc: (pending)
- has_docs_dir: (pending)
- doc_last_modified: (pending)
- code_last_modified: (pending)
[ ] Step 2: Exemption Check
- existing_exemption: (pending)
- exempt_type_detected: (pending)
[ ] Step 3: Project Type Detection
- project_type: (pending)
- ml_frameworks: (pending)
- has_model_files: (pending)
[ ] Step 4/5: Deep Codebase Analysis (complete ALL applicable sub-checks)
For Standard Projects:
[ ] 4.1 Architecture Diagram - diagrams_found: (pending), referenced_in_doc: (pending)
[ ] 4.2 Software Units - modules_in_code: (pending), modules_in_doc: (pending), missing_from_doc: (pending)
[ ] 4.3 SOUP Validation - soup_json_exists: (pending), packages_in_lockfile: (pending), packages_in_soup: (pending), missing: (pending), stale: (pending)
[ ] 4.4 Critical Algorithms - algorithms_found: (pending), documented: (pending), undocumented: (pending)
[ ] 4.5 Risk Controls - auth_patterns: (pending), validation_patterns: (pending), error_handling: (pending), logging: (pending)
For ML/DL Projects:
[ ] 5.1 Datasets - datasets_found: (pending), documented: (pending)
[ ] 5.2 Data Preprocessing - preprocessing_found: (pending), documented: (pending)
[ ] 5.3 Data Splits - splits_found: (pending), documented: (pending)
[ ] 5.4 Model Architecture - models_found: (pending), documented: (pending)
[ ] 5.5 Model Training - training_config_found: (pending), documented: (pending)
[ ] 5.6 Model Evaluation - metrics_found: (pending), documented: (pending)
[ ] 5.7 Model Deployment - deployment_found: (pending), documented: (pending)
[ ] Step 6: Document Structure Validation
- h1_title_correct: (pending)
- required_sections_present: (pending)
- section_order_correct: (pending)
- toc_links_valid: (pending)
[ ] Step 7: Report Generated
- all_checks_completed: (pending)
- issues_found: (pending)
=== END CHECKPOINT LOG ===
COMPLETION REQUIREMENT: Before generating the final report, you MUST verify that ALL applicable checkpoints show actual values (not "pending"). If any checkpoint is still "pending", go back and complete that analysis step.
EVIDENCE REQUIREMENT: For every check, you MUST record the exact command(s) run and the relevant output (file paths, line numbers, matched text) as evidence. A bare "PASS" without evidence is not acceptable. If you cannot provide evidence, the check is incomplete.
DO NOT SKIP STEPS. Even if an earlier check seems to suggest no issues, you MUST complete ALL steps. Issues are often only revealed when cross-referencing multiple sources.
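A mechanical gate can catch skipped steps before the report stage. The sketch below assumes the checkpoint log is kept in a scratch file; the name `checkpoint.log` is illustrative, not mandated:

```shell
# Completeness gate: any "(pending)" left in the log means the analysis
# is not finished. checkpoint.log is a hypothetical scratch file holding
# the checkpoint log shown above.
if grep -n "(pending)" checkpoint.log 2>/dev/null; then
  echo "INCOMPLETE: finish the checkpoints listed above before reporting"
else
  echo "all checkpoints recorded"
fi
```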
Before any code analysis, read the entire docs/architecture.md (if it exists) and extract every factual claim that needs verification:
cat docs/architecture.md 2>/dev/null
Create a claims inventory listing every verifiable claim in the document (module names, dependency statements, algorithm descriptions, security claims), each paired with how it will be verified.
This claims inventory becomes your verification checklist for Steps 4-5. Every claim must be checked against actual code.
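The inventory can live in a simple scratch file. The format and path below are a suggestion only, and the claims shown are hypothetical examples:

```shell
# Illustrative claims inventory: one claim per line, with the doc location
# it came from and the verification target. The path and contents are
# examples, not a mandated artifact.
cat > claims-inventory.md <<'EOF'
# Claims Inventory - docs/architecture.md
- [ ] L12: "Authentication uses JWT tokens" -> verify against auth module
- [ ] L34: "All input is validated before persistence" -> verify against request handlers
- [ ] L58: "PostgreSQL is the only datastore" -> verify against connection code
EOF
```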
There are two types of architecture documents based on project type:
Required H2 sections:
## Table of Contents
## Architecture diagram
## Software units
## Software of Unknown Provenance
## Critical algorithms
## Risk controls
For machine learning and deep learning projects, required H2 sections:
## Table of Contents
## Datasets
## Data Preprocessing
## Data Splits
## Model Architecture
## Model Training
## Model Evaluation
## Software of Unknown Provenance
## Risk controls
## Model Deployment
This skill uses MCP tools when available and falls back gracefully if they are unavailable or return errors.
Prefer MCP tools (mcp__github__*) when available. If MCP tools are not available (tool not found errors), fall back to the gh CLI.
| Operation | MCP Tool | CLI Fallback |
|---|---|---|
| Get repo metadata | mcp__github__get_file_contents (path: /) for top-level structure; for richer metadata use the CLI fallback | gh repo view --json owner,name,visibility,description |
| Get file contents | mcp__github__get_file_contents | cat <file> |
| Get repo owner/name | Parse from git remote get-url origin | gh repo view --json owner,name |
Use mcp__context7__resolve-library-id then mcp__context7__query-docs to look up current documentation for libraries and frameworks found in the project. If Context7 is unavailable or returns errors (quota exceeded, timeouts), fall back to WebSearch and then mcp__fetch__fetch to retrieve documentation from official sources. Do not let Context7 failures block the review.
Run these commands to collect repository metadata:
# Get organization and repository name (fallback if MCP tools unavailable)
gh repo view --json owner,name,visibility,description
# Check if docs/architecture.md exists
ls -la docs/architecture.md 2>/dev/null || echo "No docs/architecture.md found"
# Check if docs directory exists
ls -la docs/ 2>/dev/null || echo "No docs directory found"
# Get last modified date of architecture.md vs source code
git log -1 --format="%ci" -- docs/architecture.md 2>/dev/null || echo "N/A"
git log -1 --format="%ci" -- src lib app pkg internal cmd 2>/dev/null | head -5
Store these values:
- organization: The owner/organization name
- repository: The repository name
- has_architecture_doc: true/false
- has_docs_dir: true/false
- doc_last_modified: Date of last architecture.md change
- code_last_modified: Date of most recent source code change

Some repository types do not require architecture documentation. Detect these and create an exemption file instead of nonsensical documentation.
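The freshness comparison can be done directly on the stored dates, since the ISO-8601 timestamps emitted by `git log --format=%ci` sort lexically (a sketch; it assumes both paths exist in history):

```shell
# ISO-8601 timestamps sort lexically, so the newer of the two dates can be
# found with a plain sort; flag the doc as stale if code changed after it.
DOC_DATE=$(git log -1 --format="%ci" -- docs/architecture.md 2>/dev/null)
CODE_DATE=$(git log -1 --format="%ci" -- src lib app pkg internal cmd 2>/dev/null)
NEWEST=$(printf '%s\n%s\n' "$DOC_DATE" "$CODE_DATE" | sort | tail -n 1)
if [ -n "$DOC_DATE" ] && [ "$NEWEST" = "$CODE_DATE" ] && [ "$CODE_DATE" != "$DOC_DATE" ]; then
  echo "freshness: STALE"
else
  echo "freshness: CURRENT"
fi
```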
# Check if already marked as not required
head -5 docs/architecture.md 2>/dev/null | grep -q "Architecture documentation is not required" && echo "EXEMPT" ||
echo "NOT_EXEMPT"
If the file already contains the exemption marker, stop here - no further action needed.
Homebrew Taps:
# Check for Homebrew tap pattern
gh repo view --json name --jq '.name' | grep -qE "^homebrew-" && echo "HOMEBREW_TAP"
ls -la Formula/ Casks/ 2>/dev/null
Claude Code Plugins:
# Check for Claude Code plugin
ls -la .claude-plugin/plugin.json skills/ commands/ 2>/dev/null
Configuration/Dotfiles Repositories:
# Check if repo is mostly config files
find . -maxdepth 2 -type f \( -name "*.yml" -o -name "*.yaml" -o -name "*.json" -o -name "*.toml" -o -name ".*" \) 2>/dev/null |
wc -l
find . -maxdepth 2 -type f \( -name "*.py" -o -name "*.js" -o -name "*.ts" -o -name "*.go" -o -name "*.rs" -o -name "*.rb" -o -name "*.java" \) 2>/dev/null |
wc -l
Documentation-Only Repositories:
# Check if repo is only documentation
find . -maxdepth 3 -type f -name "*.md" 2>/dev/null | wc -l
find . -maxdepth 3 -type f \( -name "*.py" -o -name "*.js" -o -name "*.ts" -o -name "*.go" -o -name "*.rs" -o -name "*.rb" \) 2>/dev/null |
wc -l
GitHub Profile Repositories:
# Check if repo name matches owner (profile README repo)
OWNER=$(gh repo view --json owner --jq '.owner.login')
NAME=$(gh repo view --json name --jq '.name')
[ "$OWNER" = "$NAME" ] && echo "PROFILE_REPO"
GitHub Actions:
# Check for GitHub Action
ls -la action.yml action.yaml 2>/dev/null
cat action.yml action.yaml 2>/dev/null | grep -q "runs:" && echo "GITHUB_ACTION"
Terraform Modules:
# Check for Terraform module (no main application)
ls -la *.tf modules/ 2>/dev/null
find . -name "*.tf" -not -path "*/.terraform/*" 2>/dev/null | head -5
Ansible Roles/Playbooks:
# Check for Ansible
ls -la playbooks/ roles/ tasks/ handlers/ ansible.cfg 2>/dev/null
Kubernetes/Helm Charts:
# Check for Helm chart or K8s manifests only
ls -la Chart.yaml values.yaml templates/ 2>/dev/null
find . -name "*.yaml" -path "*/templates/*" 2>/dev/null | head -5
Meta/Organization Repositories:
# Check for org-wide config repos
gh repo view --json name --jq '.name' | grep -qiE "^\.github$|^meta$|^org-|^team-|-config$|-settings$" && echo "META_REPO"
| Type | Detection | Reason |
|---|---|---|
| Homebrew Tap | homebrew-* name, Formula/ or Casks/ dirs | Package distribution, no application logic |
| Claude Code Plugin | .claude-plugin/, skills/, commands/ dirs | Plugin config/prompts, no application logic |
| Dotfiles/Config | >80% config files, no source code | Configuration only |
| Documentation | Only .md files, no source code | No software architecture |
| GitHub Profile | Repo name matches owner | Profile README only |
| GitHub Action | action.yml with runs: | Simple action wrapper |
| Terraform Module | Only .tf files, no application | Infrastructure as code, not software |
| Ansible Role | playbooks/, roles/, tasks/ | Automation scripts, not software |
| Helm Chart | Chart.yaml, templates/ | K8s deployment config |
| Meta Repository | .github, meta, org-*, *-config | Org settings, no application |
If the repository matches an exempt type, create the exemption file:
mkdir -p docs
Exemption Template:
# Architecture Design
Architecture documentation is not required for this repository.
## Reason
This repository is a **{type}** which does not contain application software requiring architecture documentation.
### Repository Type: {type}
{Description of why this type doesn't need architecture docs}
## Documentation
For more information about this repository type, see:
{Link to relevant documentation}
## When This Might Change
Architecture documentation would be required if this repository evolves to include:
- Application source code with business logic
- Software components that interact with each other
- External dependencies that need to be documented (SOUP)
- Critical algorithms or risk controls
If the repository scope changes, remove this file and run the architecture review again.
Exemption Messages and Documentation Links by Type:
| Type | Message | Documentation |
|---|---|---|
| Homebrew Tap | Homebrew taps contain package formulae for distribution, not application source code. | Homebrew Taps |
| Claude Code Plugin | Claude Code plugins contain skill definitions and prompts, not application architecture. | Claude Code Extensions |
| Dotfiles/Config | This repository contains configuration files only, with no application logic to document. | N/A |
| Documentation | This repository contains documentation only, with no software architecture. | N/A |
| GitHub Profile | This is a GitHub profile README repository, not a software project. | GitHub Profile README |
| GitHub Action | GitHub Actions are simple workflow wrappers, not applications requiring architecture docs. | Creating Actions |
| Terraform Module | Terraform modules define infrastructure, not software architecture. | Terraform Modules |
| Ansible Role | Ansible roles define automation tasks, not software architecture. | Ansible Roles |
| Helm Chart | Helm charts define Kubernetes deployments, not software architecture. | Helm Charts |
| Meta Repository | Meta repositories contain organization settings, not software projects. | GitHub Organizations |
After creating the exemption file, STOP - do not proceed with architecture documentation steps.
Determine if this is a Machine Learning / Deep Learning project.
gh repo view --json name --jq '.name'
ML/DL indicators in repository name:
ML/DL indicators in repository name: `-ml`, `-dl`, `-ai-model`, `-models`, `machine-learning`, `deep-learning`.

Python projects:
# Check requirements.txt
cat requirements.txt 2>/dev/null | grep -iE "tensorflow|pytorch|torch|keras|scikit-learn|sklearn|xgboost|lightgbm|transformers|huggingface|jax|mlflow|wandb|optuna|numpy|pandas|scipy"
# Check pyproject.toml
cat pyproject.toml 2>/dev/null | grep -iE "tensorflow|pytorch|torch|keras|scikit-learn|sklearn|xgboost|lightgbm|transformers|huggingface|jax|mlflow|wandb|optuna"
# Check poetry.lock and requirements.txt for pinned ML frameworks
grep -iE '^name = "(tensorflow|torch|keras|scikit-learn)"' poetry.lock 2>/dev/null | head -10
grep -iE "^(tensorflow|torch|keras|scikit-learn)==" requirements.txt 2>/dev/null | head -10
Node.js projects:
cat package.json 2>/dev/null | jq -r '.dependencies, .devDependencies | keys[]' 2>/dev/null | grep -iE "tensorflow|brain|ml5|synaptic"
# Check for ML-specific directories
ls -la models/ model/ training/ train/ data/ datasets/ notebooks/ checkpoints/ weights/ experiments/ 2>/dev/null
# Check for Jupyter notebooks
find . -maxdepth 3 -name "*.ipynb" 2>/dev/null | wc -l
# Check for model files
find . -maxdepth 3 \( -name "*.h5" -o -name "*.pkl" -o -name "*.pt" -o -name "*.pth" -o -name "*.onnx" -o -name "*.pb" -o -name "*.safetensors" \) 2>/dev/null |
head -5
# Search for ML patterns in Python files
grep -rl "model\.fit\|model\.train\|DataLoader\|tf\.keras\|torch\.nn\|sklearn\." --include="*.py" . 2>/dev/null | wc -l
Classify as ML/DL project if ANY of these are true:
- Repository name contains an ML/DL indicator
- ML frameworks found in dependency manifests
- models/, training/, datasets/ directories with content
- Model artifact files present (.h5, .pt, .onnx, etc.)
- ML API calls found in source files

Otherwise, classify as Standard project.
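The indicator checks above can be combined into a single classification flag (a sketch; each probe is best-effort and tolerates missing files):

```shell
# Classify as ML/DL if any indicator fires: frameworks in manifests,
# model artifacts on disk, or ML API calls in Python sources.
IS_ML=false
grep -qiE "tensorflow|torch|keras|scikit-learn" requirements.txt pyproject.toml 2>/dev/null && IS_ML=true
find . -maxdepth 3 \( -name "*.h5" -o -name "*.pt" -o -name "*.pth" -o -name "*.onnx" -o -name "*.safetensors" \) 2>/dev/null | grep -q . && IS_ML=true
grep -rq "torch\.nn\|tf\.keras\|model\.fit" --include="*.py" . 2>/dev/null && IS_ML=true
if [ "$IS_ML" = true ]; then echo "project_type: ml_dl"; else echo "project_type: standard"; fi
```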
Store:
- project_type: "ml_dl" or "standard"
- ml_frameworks: List of detected ML frameworks
- has_model_files: true/false

# Find existing diagram files
find . -maxdepth 3 \( -name "*.png" -o -name "*.svg" -o -name "*.drawio" -o -name "*.mmd" -o -name "*.mermaid" -o -name "*.puml" \) 2>/dev/null |
grep -iE "arch|diagram|overview|system|structure"
# Check if diagrams are referenced in architecture.md
grep -iE '!\[.*\]\(.*\.(png|svg|drawio)\)' docs/architecture.md 2>/dev/null
grep -F '```mermaid' docs/architecture.md 2>/dev/null
Verification:
Discover actual module structure:
# Python packages
find . -name "__init__.py" -not -path "*/venv/*" -not -path "*/.venv/*" -not -path "*/node_modules/*" 2>/dev/null |
sed 's|/[^/]*$||' | sort -u
# Node.js/TypeScript modules
cat package.json 2>/dev/null | jq -r '.main, .exports | if type == "object" then keys[] else . end' 2>/dev/null
ls -la src/ lib/ 2>/dev/null
# Go packages
find . -name "*.go" -not -path "*/vendor/*" 2>/dev/null | xargs -I {} dirname {} | sort -u
# Rust crates
find . -name "Cargo.toml" 2>/dev/null | xargs -I {} dirname {}
For each discovered module, extract:
# Python: Get module docstring and main classes/functions
head -30 {module}/__init__.py 2>/dev/null
grep -E "^class |^def |^async def " {module}/*.py 2>/dev/null | head -20
# Node.js: Get exports
grep -E "^export |^module\.exports" {module}/index.{js,ts} {module}.{js,ts} 2>/dev/null | head -20
# Go: Get package doc and exported functions
head -20 {module}/*.go 2>/dev/null | grep -E "^package |^// |^func [A-Z]"
Cross-reference with documentation (MANDATORY - do not skip):
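One way to mechanize the cross-reference for a Python project (a sketch; it assumes documented modules appear as `###` headings in architecture.md, which may not match every document's layout):

```shell
# Modules documented in architecture.md (H3 headings) vs. top-level
# Python packages discovered on disk; print anything undocumented.
grep -E "^### " docs/architecture.md 2>/dev/null | sed 's/^### //' | sort -u > /tmp/doc_modules.txt
find . -maxdepth 2 -name "__init__.py" -not -path "*/venv/*" 2>/dev/null \
  | xargs -I {} dirname {} | xargs -I {} basename {} | sort -u > /tmp/code_modules.txt
# comm -13: lines only in the second (sorted) file, i.e. in code but not in docs
comm -13 /tmp/doc_modules.txt /tmp/code_modules.txt
```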
IMPORTANT: The source of truth for SOUP data is soup.json (not soup.md). The soup.md file is auto-generated from soup.json and must never be edited directly. All validation and changes must target soup.json.
Verify soup.json exists:
ls -la docs/soup.json soup.json 2>/dev/null || echo "No soup.json found"
Extract dependency list from lock files for comparison:
# Python
cat poetry.lock 2>/dev/null | grep -E "^name = " | sed 's/name = "//;s/"//' | head -50
cat requirements.txt 2>/dev/null | grep -v "^#" | cut -d'=' -f1 | cut -d'>' -f1 | cut -d'<' -f1 | head -50
# Node.js
cat package.json 2>/dev/null | jq -r '.dependencies, .devDependencies | keys[]' 2>/dev/null | head -50
# Ruby
cat Gemfile.lock 2>/dev/null | grep -E "^ [a-z]" | awk '{print $1}' | head -50
# Go
cat go.mod 2>/dev/null | grep -E "^\t" | awk '{print $1}' | head -50
Validate soup.json content against actual code usage:
Step 1: Read soup.json and extract all package entries:
# Read soup.json to see all documented packages with their Risk Level, Requirements, and Verification Reasoning
cat docs/soup.json soup.json 2>/dev/null
Step 2: For EACH package in soup.json, validate the three fields:
You MUST validate EVERY package, not a sample. For each package, record:
For each package entry, run these commands to verify accuracy:
# Find how the package is actually used in the codebase
grep -rn "require.*{package}\|import.*{package}\|from {package}\|use {package}" --include="*.py" --include="*.js" --include="*.ts" --include="*.rb" --include="*.go" --include="*.rs" . 2>/dev/null | grep -v node_modules | grep -v vendor | head -20
Then validate:
Requirements field: Does the stated purpose match the actual usage found above?
Risk Level: Is it appropriate for what the package does?
| Package Type | Expected Risk Level |
|---|---|
| Auth, crypto, security | High |
| Network, HTTP, API clients | High |
| Database, data storage | High |
| File system access | Medium |
| Logging, monitoring | Medium |
| UI, formatting, colors | Low |
| Dev tools, linters, test utilities | Low |
Verification Reasoning: Does it explain why THIS package was chosen?
Step 3: Check completeness and staleness:
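Assuming soup.json is a JSON array of objects with a "name" field (adjust the jq path to the actual schema), completeness and staleness reduce to two set differences. A Node.js manifest is used here as one example dependency source:

```shell
# Package names documented in soup.json (schema assumed: [{"name": ...}, ...]).
jq -r '.[].name' docs/soup.json 2>/dev/null | sort -u > /tmp/soup_pkgs.txt
# Package names declared in package.json, as one example source.
jq -r '.dependencies // {}, .devDependencies // {} | keys[]' package.json 2>/dev/null | sort -u > /tmp/lock_pkgs.txt
echo "Missing from soup.json:"; comm -13 /tmp/soup_pkgs.txt /tmp/lock_pkgs.txt
echo "Stale in soup.json:";     comm -23 /tmp/soup_pkgs.txt /tmp/lock_pkgs.txt
```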
Cross-reference with architecture.md:
Discover algorithm implementations:
# Search for algorithm-related files
find . -name "*algorithm*" -o -name "*crypto*" -o -name "*hash*" -o -name "*sort*" -o -name "*search*" -o -name "*calculate*" -o -name "*compute*" -o -name "*process*" -o -name "*engine*" 2>/dev/null |
grep -v node_modules | grep -v venv
# Search for cryptographic operations
grep -rn "crypto\|encrypt\|decrypt\|hash\|hmac\|sha\|md5\|aes\|rsa" --include="*.py" --include="*.js" --include="*.ts" --include="*.go" --include="*.rs" . 2>/dev/null |
grep -v node_modules | grep -v venv | head -20
# Search for complex mathematical operations
grep -rn "matrix\|vector\|gradient\|derivative\|integral\|fourier\|transform" --include="*.py" --include="*.js" --include="*.ts" --include="*.go" --include="*.rs" . 2>/dev/null |
grep -v node_modules | grep -v venv | head -20
# Search for custom data structures
grep -rn "class.*Tree\|class.*Graph\|class.*Queue\|class.*Stack\|class.*Heap" --include="*.py" --include="*.js" --include="*.ts" --include="*.go" --include="*.rs" . 2>/dev/null |
head -20
For each discovered algorithm, extract details:
Cross-reference with documentation:
Discover security measures:
# Authentication/Authorization patterns
grep -rn "auth\|login\|session\|token\|jwt\|oauth\|permission\|role\|acl" --include="*.py" --include="*.js" --include="*.ts" --include="*.go" . 2>/dev/null |
grep -v node_modules | grep -v test | head -20
# Input validation patterns
grep -rn "validate\|sanitize\|escape\|filter\|whitelist\|blacklist" --include="*.py" --include="*.js" --include="*.ts" --include="*.go" . 2>/dev/null |
grep -v node_modules | head -20
# Error handling patterns
grep -rn "try:\|catch\|except\|error\|throw\|panic\|recover" --include="*.py" --include="*.js" --include="*.ts" --include="*.go" . 2>/dev/null |
grep -v node_modules | grep -v test | wc -l
# Logging patterns
grep -rn "log\.\|logger\.\|logging\.\|console\.log\|fmt\.Print" --include="*.py" --include="*.js" --include="*.ts" --include="*.go" . 2>/dev/null |
grep -v node_modules | grep -v test | head -20
Check for security configurations:
# Environment variables
grep -rn "process\.env\|os\.environ\|os\.Getenv\|env::" --include="*.py" --include="*.js" --include="*.ts" --include="*.go" --include="*.rs" . 2>/dev/null |
grep -v node_modules | head -20
# Security headers/middleware
grep -rn "helmet\|cors\|csrf\|xss\|rate.limit\|security" --include="*.py" --include="*.js" --include="*.ts" --include="*.go" . 2>/dev/null |
grep -v node_modules | head -10
Discover dataset definitions:
# Find dataset classes/loaders
grep -rn "class.*Dataset\|DataLoader\|tf\.data\|torch\.utils\.data" --include="*.py" . 2>/dev/null | grep -v venv |
head -20
# Find data directories and files
find data datasets raw processed -type f 2>/dev/null | head -30
ls -la data/ datasets/ 2>/dev/null
# Extract dataset statistics
wc -l data/*.csv datasets/*.csv 2>/dev/null
find data datasets -name "*.json" -exec wc -l {} \; 2>/dev/null | head -10
For each dataset, extract:
Cross-reference with documentation:
Discover preprocessing code:
# Find preprocessing functions/classes
grep -rn "def preprocess\|def transform\|def normalize\|def augment\|def clean\|class.*Transform\|class.*Preprocess" --include="*.py" . 2>/dev/null |
grep -v venv | head -20
# Find preprocessing pipelines
grep -rn "Pipeline\|Compose\|Sequential.*transform" --include="*.py" . 2>/dev/null | grep -v venv | head -10
For each preprocessing step, extract:
Cross-reference with documentation:
Discover split implementation:
# Find train/test split code
grep -rn "train_test_split\|split\|StratifiedKFold\|KFold\|random_split" --include="*.py" . 2>/dev/null | grep -v venv |
head -15
# Extract split ratios from code
grep -rn "test_size\|val_size\|train_size\|split.*=" --include="*.py" . 2>/dev/null | grep -v venv | head -15
# Check for split configuration files
cat config.yaml config.yml config.json 2>/dev/null | grep -iE "split|train|val|test"
Cross-reference with documentation:
Discover model definitions:
# Find model classes
grep -rn "class.*Model\|class.*Net\|class.*Network\|nn\.Module\|tf\.keras\.Model" --include="*.py" . 2>/dev/null |
grep -v venv | head -20
# Find model configuration
cat model_config.json model_config.yaml config/model.* 2>/dev/null
For each model, extract architecture details:
# Read model class definition (first 100 lines)
# For each model file found above, read it to extract:
# - Layer definitions
# - Forward pass logic
# - Input/output shapes
# Extract layer specifications from code
grep -rn "nn\.Linear\|nn\.Conv\|Dense\|Conv2D\|LSTM\|Transformer\|Attention" --include="*.py" . 2>/dev/null |
grep -v venv | head -30
# Check for model summary/print
grep -rn "model\.summary\|print.*model\|torchsummary" --include="*.py" . 2>/dev/null | grep -v venv | head -5
Cross-reference with documentation:
Discover training configuration:
# Find training scripts
find . -name "train*.py" -o -name "*training*.py" -o -name "main.py" 2>/dev/null | grep -v venv
# Extract hyperparameters from code
grep -rn "learning_rate\|lr\|batch_size\|epochs\|optimizer\|Adam\|SGD\|loss" --include="*.py" . 2>/dev/null |
grep -v venv | head -30
# Check for config files
cat config.yaml config.yml config.json training_config.* hyperparameters.* 2>/dev/null | head -50
# Find argument parsers for hyperparameters
grep -rn "add_argument.*lr\|add_argument.*batch\|add_argument.*epoch" --include="*.py" . 2>/dev/null | head -15
Extract actual training parameters:
Cross-reference with documentation:
Discover evaluation code:
# Find evaluation scripts/functions
find . -name "eval*.py" -o -name "*evaluate*.py" -o -name "test*.py" 2>/dev/null | grep -v venv | grep -v __pycache__
# Extract metrics used
grep -rn "accuracy\|precision\|recall\|f1\|auc\|roc\|confusion\|mse\|mae\|loss" --include="*.py" . 2>/dev/null |
grep -v venv | head -30
# Find metric computation
grep -rn "sklearn\.metrics\|torchmetrics\|tf\.keras\.metrics" --include="*.py" . 2>/dev/null | grep -v venv | head -15
# Check for saved evaluation results
find . -name "*results*.json" -o -name "*metrics*.json" -o -name "*eval*.json" 2>/dev/null | head -5
cat results.json metrics.json evaluation_results.json 2>/dev/null | head -30
Cross-reference with documentation:
Discover deployment configuration:
# Find deployment files
ls -la deploy/ deployment/ serving/ inference/ 2>/dev/null
find . -name "Dockerfile*" -o -name "docker-compose*" -o -name "*deploy*" -o -name "*serve*" 2>/dev/null |
grep -v node_modules | head -15
# Find inference code
grep -rn "def predict\|def inference\|@app\.route\|@api\|FastAPI\|Flask" --include="*.py" . 2>/dev/null | grep -v venv |
head -15
# Check for model serving configs
cat serve.yaml serving.yaml deployment.yaml kubernetes/*.yaml 2>/dev/null | head -50
# Find hardware requirements
grep -rn "cuda\|gpu\|device\|cpu\|memory" --include="*.py" --include="*.yaml" --include="*.yml" . 2>/dev/null |
grep -v venv | head -15
Cross-reference with documentation:
If docs/architecture.md exists, validate its structure.
head -5 docs/architecture.md
grep "^# " docs/architecture.md | head -1
Expected: # Architecture Design (exactly this)
grep "^## " docs/architecture.md
For Standard projects, must start with (in order):
## Table of Contents
## Architecture diagram
## Software units
## Software of Unknown Provenance
## Critical algorithms
## Risk controls
For ML/DL projects, must start with (in order):
## Table of Contents
## Datasets
## Data Preprocessing
## Data Splits
## Model Architecture
## Model Training
## Model Evaluation
## Software of Unknown Provenance
## Risk controls
## Model Deployment
Additional H2 sections may appear after the required ones.
# Extract TOC links
grep -E "^\s*-\s*\[.*\]\(#" docs/architecture.md
Verify each link resolves to an actual heading in the document.
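GitHub derives anchors from headings by lowercasing, dropping most punctuation, and replacing spaces with hyphens, so the check can be approximated mechanically (a sketch covering the common case; edge cases like duplicate headings still need manual review):

```shell
# Anchors implied by the document's headings (GitHub slug rules, simplified).
grep -E "^#{1,6} " docs/architecture.md 2>/dev/null \
  | sed -E 's/^#+ //; s/[^A-Za-z0-9 -]//g; s/ /-/g' | tr 'A-Z' 'a-z' | sort -u > /tmp/headings.txt
# Anchors referenced by in-document links such as the Table of Contents.
grep -oE "\(#[^)]+\)" docs/architecture.md 2>/dev/null | sed 's/(#//; s/)//' | sort -u > /tmp/toc.txt
# Links with no matching heading are broken.
echo "Broken TOC links:"; comm -13 /tmp/headings.txt /tmp/toc.txt
```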
MANDATORY PRE-REPORT VERIFICATION:
Before generating the report, you MUST:
If you skipped any step, the review is incomplete and results will be inconsistent.
After deep analysis, provide a detailed report:
## Architecture Documentation Review Report
### Analysis Checkpoint Log
{Include your completed checkpoint log here - ALL values must be filled in, none should say "pending"}
### Repository Info
- **Organization:** {org}
- **Repository:** {repo}
- **Project Type:** {standard/ml_dl}
- **Document Status:** {exists/missing}
- **Last Doc Update:** {date}
- **Last Code Update:** {date}
- **Documentation Freshness:** {CURRENT/STALE - code changed since last doc update}
### Structure Checks
- [ ] H1 title "# Architecture Design": {PASS/FAIL - found: "{actual}"}
- [ ] Required H2 sections present: {PASS/FAIL}
- [ ] Section order correct: {PASS/FAIL}
- [ ] Table of Contents links valid: {PASS/FAIL}
### Content Accuracy Checks
#### {For Standard: "Architecture Diagram" / For ML: "Datasets"}
- **Status:** {PASS/FAIL/NEEDS UPDATE/MISSING}
- **Issues:**
- {Specific issue 1}
- {Specific issue 2}
- **Discovered in code:** {what was actually found}
- **Documented:** {what's currently in docs}
{Repeat for each section}
#### Software of Unknown Provenance
- **Status:** {PASS/FAIL/NEEDS UPDATE}
- **soup.json exists:** {yes/no}
- **architecture.md references soup.md:** {yes/no - flag if duplicating content}
- **Total dependencies in lock files:** {n}
- **Documented in soup.json:** {n}
- **Missing from soup.json:** {list}
- **In soup.json but not in code:** {list}
- **Inaccurate Requirements fields:** {list packages where stated purpose doesn't match actual code usage}
- **Misclassified Risk Levels:** {list packages with inappropriate risk level for their function}
- **Weak Verification Reasoning:** {list packages with generic reasoning like "popular library"}
### Summary
- **Sections accurate:** {n}/{total}
- **Sections need update:** {n}
- **Sections missing:** {n}
- **Critical issues:** {list of high-priority fixes}
### Proposed Changes
{Show exact changes needed with before/after for each section}
Ask the user before making changes:
"I found the following issues with docs/architecture.md. Would you like me to fix them?"
First create the docs directory if needed:
mkdir -p docs
# Architecture Design
## Table of Contents
- [Architecture diagram](#architecture-diagram)
- [Software units](#software-units)
- [Software of Unknown Provenance](#software-of-unknown-provenance)
- [Critical algorithms](#critical-algorithms)
- [Risk controls](#risk-controls)
## Architecture diagram
{Include or reference architecture diagram - create if missing}

### System Overview
{High-level description based on discovered modules and their interactions}
### Component Interactions
{Description of how components interact - based on imports/dependencies analysis}
## Software units
{For each discovered module:}
### {Module Name}
**Purpose:** {Extracted from docstring or inferred from code}
**Location:** `{actual/path/to/module}`
**Key Components:**
- `{ClassName}`: {description from docstring}
- `{function_name}`: {description from docstring}
**Internal Dependencies:**
- {Other modules this depends on}
**External Dependencies:**
- {Third-party packages used}
## Software of Unknown Provenance
See [soup.md](soup.md) for the complete list of third-party dependencies.
**Verification:** Cross-reference soup.md entries against actual code usage to ensure accuracy:
### Risk Level
Classify the potential harm if the library has a vulnerability (per IEC 62304):
| Level | Definition |
|-------|------------|
| Low | Cannot lead to harm |
| Medium | Can lead to reversible harm |
| High | Can lead to irreversible harm |
### Requirements
Answer: "Why do you need this library in your project?"
Examples:
- "HTTP client for REST API communication"
- "CLI argument parsing and validation"
- "YAML/JSON configuration file parsing"
- "Dependency" (for transitive dependencies only)
### Verification Reasoning
Answer: "Why did you select this library among alternatives?"
Examples:
- "Industry standard with active maintenance and security updates"
- "Official SDK provided by the service vendor"
- "Recommended by framework documentation"
- "Dependency" (for transitive dependencies only)
### Validation Checks
1. **Accuracy:** Verify each package's Requirements field matches its actual usage in the codebase (e.g., an AWS SDK should not say "image processing")
2. **Completeness:** All packages in lock files must be in soup.json
3. **Staleness:** Packages removed from lock files must be removed from soup.json
4. **Risk Level:** Verify risk classifications are appropriate (e.g., crypto/auth libraries should be High)
**Note:** `soup.md` is auto-generated from `soup.json`. All edits must be made to `soup.json`.
## Critical algorithms
{For each discovered algorithm:}
### {Algorithm/Function Name}
**Purpose:** {From docstring or inferred}
**Location:** `{actual/path/to/file}` in `{ClassName}` or `{function_name}`
**Implementation:**
{Brief description of how it works}
**Complexity:** {If documented or inferrable}
**Security Considerations:** {If applicable}
## Risk controls
### Security Measures
{Based on discovered security patterns:}
- **Authentication:** {Discovered auth mechanisms}
- **Authorization:** {Discovered authz patterns}
- **Input Validation:** {Discovered validation}
- **Encryption:** {Discovered crypto usage}
### Error Handling
{Based on discovered error handling patterns}
### Logging & Monitoring
{Based on discovered logging patterns}
### Failure Modes
| Failure Mode | Impact | Mitigation |
|--------------|--------|------------|
| {Inferred from error handling} | {Impact} | {Mitigation} |
# Architecture Design
## Table of Contents
- [Datasets](#datasets)
- [Data Preprocessing](#data-preprocessing)
- [Data Splits](#data-splits)
- [Model Architecture](#model-architecture)
- [Model Training](#model-training)
- [Model Evaluation](#model-evaluation)
- [Software of Unknown Provenance](#software-of-unknown-provenance)
- [Risk controls](#risk-controls)
- [Model Deployment](#model-deployment)
## Datasets
### Data Sources
| Dataset | Source | Size | Format |
|---------|--------|------|--------|
{For each discovered dataset:} | {name} | {source if found} | {actual size} | {format} |
### Data Description
{Based on discovered dataset classes and data files}
**Features:**
{Extracted from data loading code}
**Labels:**
{Extracted from data loading code}
### Data Statistics
{Based on actual data file analysis}
## Data Preprocessing
### Preprocessing Pipeline
{Based on discovered preprocessing code:}
1. **{Step from code}**: {Description}
- Implementation: `{file}:{function}`
- Parameters: {extracted parameters}
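Discovered preprocessing steps usually map to a chain of callables applied in order. A generic sketch of that shape (the `normalize` and `tokenize` step names are illustrative, not taken from any particular repository):

```python
def normalize(text: str) -> str:
    # Lowercase and collapse runs of whitespace
    return " ".join(text.lower().split())

def tokenize(text: str) -> list:
    # Split normalized text on single spaces
    return text.split(" ")

def run_pipeline(raw, steps):
    """Apply preprocessing steps in order; each step is a callable."""
    for step in steps:
        raw = step(raw)
    return raw
```

Documenting each step as `Implementation: file:function` then amounts to naming the callables in this chain.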
### Data Transformations
| Transformation | Purpose | Implementation |
|----------------|---------|----------------|
{For each discovered transform:} | {transform_name} | {from docstring} | `{file}` in `{class/function}` |
### Data Augmentation
{Based on discovered augmentation code}
## Data Splits
### Split Configuration
| Split | Ratio | Size | Method |
|-------|-------|------|--------|
| Training | {from code}% | {n} samples | {method} |
| Validation | {from code}% | {n} samples | {method} |
| Test | {from code}% | {n} samples | {method} |
### Split Implementation
**Location:** `{file}` in `{function_name}`
**Method:** {random/stratified/temporal/custom}
**Random Seed:** {if found}
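When the split method is a seeded random shuffle, the implementation typically looks like this sketch. The 80/10/10 ratios and seed 42 are placeholders; record the actual values found in the repository's split code:

```python
import random

def split_indices(n, ratios=(0.8, 0.1, 0.1), seed=42):
    """Deterministic random train/val/test split over n sample indices."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed makes the split reproducible
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```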
## Model Architecture
### Architecture Overview
{Based on discovered model class}
**Model Type:** {CNN/RNN/Transformer/etc.}
**Framework:** {PyTorch/TensorFlow/etc.}
### Architecture Diagram
{Generate or reference based on model structure}
### Layer Specifications
| Layer | Type | Parameters | Output Shape |
|-------|------|------------|--------------|
{For each layer discovered in model:} | {layer_name} | {layer_type} | {params} | {shape if inferrable} |
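When output shapes are not documented, they can often be inferred from layer parameters. For a 2-D convolution the standard per-dimension formula is `out = floor((in + 2*padding - kernel) / stride) + 1`, sketched as:

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Output spatial size of a 2-D convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1
```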
### Model Configuration
**Location:** `{model_file}` in `{ClassName}`
~~~python
{Actual model class signature and key layers}
~~~
### Input/Output Specifications
- **Input:** {shape, dtype from code}
- **Output:** {shape, dtype from code}
## Model Training
### Training Configuration
| Parameter | Value | Source |
|-----------|-------|--------|
| Optimizer | {actual optimizer} | `{file}` in `{function/class}` |
| Learning Rate | {actual lr} | `{file}` in `{function/class}` |
| Batch Size | {actual batch_size} | `{file}` in `{function/class}` |
| Epochs | {actual epochs} | `{file}` in `{function/class}` |
| Loss Function | {actual loss} | `{file}` in `{function/class}` |
| LR Scheduler | {if found} | `{file}` in `{function/class}` |
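Filling the Value and Source columns means locating literal assignments in the training code. A regex sketch for the simple `name = value` case (it will miss values passed through config objects or CLI flags, so always confirm against the actual source):

```python
import re

HYPERPARAM_RE = re.compile(r"(learning_rate|lr|batch_size|epochs)\s*=\s*([\d.e-]+)")

def extract_hyperparams(source: str) -> dict:
    """Pull simple `name = value` hyperparameter assignments from source text."""
    return {name: value for name, value in HYPERPARAM_RE.findall(source)}
```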
### Training Script
**Location:** `{training_script}`
### Training Procedure
{Based on actual training loop analysis}
### Checkpointing
{Based on discovered checkpoint saving code}
## Model Evaluation
### Evaluation Metrics
| Metric | Implementation | Latest Value |
|--------|----------------|--------------|
| {metric_name} | `{file}` in `{function/class}` | {from results file if exists} |
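For reference when verifying a discovered metric implementation, a minimal accuracy computation looks like this sketch (the repository's own metric code should always be cited in the table, not this example):

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where predicted label matches the true label."""
    if not y_true:
        raise ValueError("empty label list")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```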
### Evaluation Script
**Location:** `{eval_script}`
### Benchmark Results
{From discovered results files}
| Dataset | Metric | Value | Date |
|---------|--------|-------|------|
| {dataset} | {metric} | {value} | {date} |
## Software of Unknown Provenance
See [soup.md](soup.md) for the complete list of third-party dependencies including ML frameworks and data processing libraries.
**Verification:** Cross-reference soup.json entries against actual code usage. See the Standard Project Template above for Risk Level, Requirements, and Verification Reasoning guidelines. Note that `soup.md` is auto-generated from `soup.json`; all edits must target `soup.json`.
## Risk controls
### Model Risks
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Model drift | {assess} | {assess} | {from code} |
| Data leakage | {assess} | {assess} | {from code} |
| Overfitting | {assess} | {assess} | {from code} |
### Data Risks
{Based on data handling code analysis}
### Operational Risks
{Based on deployment code analysis}
## Model Deployment
### Deployment Architecture
{Based on discovered deployment configs}
### Inference Implementation
**Location:** `{inference_file}`
**Entry Point:** `{function/endpoint}`
### Hardware Requirements
| Requirement | Specification | Source |
|-------------|---------------|--------|
| GPU | {from code} | `{file}` |
| Memory | {from code/config} | `{file}` |
| Storage | {estimated} | - |
### Serving Configuration
{From discovered serving configs}
### Monitoring
{Based on discovered monitoring/logging code}
Before completing, verify that every item in the analysis checkpoint log has been filled in.
After making changes to `docs/architecture.md`, run the linters skill to ensure the file passes all markdown linting rules:

/co-dev:run-linters

Fix any linting errors before considering the task complete.