From rp1-base
Orchestrates parallel knowledge base generation from codebases using spatial analysis, map-reduce architecture, and sub-agents for concepts, architecture, modules, and patterns. Supports feature learning from archived features.
npx claudepluginhub rp1-run/rp1

This skill is limited to using the following tools:
Orchestrates parallel knowledge base generation using spatial analysis, map-reduce architecture, incremental updates, and feature learning from archived features.
Analyzes source code to discover and extract implicit knowledge—architecture patterns, conventions, API contracts, config structures, codebase rules—into KB articles.
Manages persistent knowledge graph for specs by caching agent discoveries, codebase analysis, patterns, components, and APIs. Use to remember findings across sessions, validate task dependencies, and query prior work.
Share bugs, ideas, or general feedback.
Extract these parameters from the user's input:
| Parameter | Required | Default | Description |
|---|---|---|---|
| FEATURE_ID | No | - | Feature ID to incorporate learnings from an archived feature into KB |
Environment values (resolve via shell):
RP1_ROOT: !rp1 agent-tools rp1-root-dir (extract data.root from JSON response)

This command orchestrates parallel knowledge base generation using a map-reduce architecture.
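As a sketch, the resolution step could be a small helper; the JSON shape (`{"data":{"root":...}}`) is assumed from the extraction note above, and the helper name is illustrative:

```shell
# Sketch: extract data.root from the rp1 CLI's JSON response.
# Assumes a payload shaped like {"data":{"root":"/path/to/.rp1"}}.
parse_rp1_root() {
  printf '%s' "$1" | jq -r '.data.root'
}

# Usage: RP1_ROOT=$(parse_rp1_root "$(rp1 agent-tools rp1-root-dir)")
```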
CRITICAL: This is an ORCHESTRATOR command, not a thin wrapper. This command must handle parallel execution coordination, result aggregation, and state management.
Phase 1 (Sequential): Spatial Analyzer -> Categorized file lists
Phase 2 (Parallel): 4 Analysis Agents -> JSON outputs (concept, arch, module, pattern)
Phase 3 (Sequential): Command -> Merge JSON -> Generate index.md -> Write KB files
Key Design: The main orchestrator generates index.md directly (not via sub-agent) because:
DO NOT ask for user approval. Execute immediately.
If FEATURE_ID is provided, this is a feature learning build that captures knowledge from an archived feature. Skip Phase 0 entirely (no git commit parsing needed).
Locate archived feature:
FEATURE_PATH = {{$RP1_ROOT}}/work/archives/features/{FEATURE_ID}/
If not found, check active features:
FEATURE_PATH = {{$RP1_ROOT}}/work/features/{FEATURE_ID}/
If neither exists, error:
Feature not found: {FEATURE_ID}
Checked: {{$RP1_ROOT}}/work/archives/features/{FEATURE_ID}/
{{$RP1_ROOT}}/work/features/{FEATURE_ID}/
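The lookup order above can be sketched as follows (helper name hypothetical; the two paths mirror the locations checked):

```shell
# Sketch: resolve FEATURE_PATH, preferring the archive over active features.
find_feature_path() {
  local root="$1" id="$2" dir
  for dir in "$root/work/archives/features/$id" "$root/work/features/$id"; do
    [ -d "$dir" ] && { printf '%s\n' "$dir"; return 0; }
  done
  printf 'Feature not found: %s\n' "$id" >&2
  return 1
}
```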
Read feature documentation:
{FEATURE_PATH}/requirements.md - What was built
{FEATURE_PATH}/design.md - How it was designed
{FEATURE_PATH}/field-notes.md - Learnings and discoveries (if exists)
{FEATURE_PATH}/tasks.md - Implementation details with files modified

Extract files modified from tasks.md:
Parse implementation summaries to build FILES_MODIFIED list:
Look for patterns:
- **Files**: `src/file1.ts`, `src/file2.ts`
- **Files Modified**: ...
Extract all file paths into FILES_MODIFIED array.
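A sketch of the extraction, assuming the `**Files**:` line shapes shown above (helper name hypothetical):

```shell
# Sketch: collect backticked paths from "**Files**:" / "**Files Modified**:"
# lines in tasks.md (line shapes assumed from the examples above).
extract_files_modified() {
  grep -E '\*\*Files( Modified)?\*\*:' "$1" |
    grep -oE '`[^`]+`' |
    tr -d '`' |
    sort -u
}
```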
Extract feature context:
Build a FEATURE_CONTEXT object containing:
files_modified: FILES_MODIFIED array

Jump directly to Phase 1 (Spatial Analysis):
Pass FILES_MODIFIED to the spatial analyzer instead of a git diff.

Spatial analyzer prompt (Feature Learning Mode):
FEATURE_LEARNING mode. Categorize these files modified during feature implementation:
FILES: {{stringify(FILES_MODIFIED)}}
Rank each file 0-5, categorize by KB section (index_files, concept_files, arch_files, module_files).
Return JSON with categorized files.
Sub-agent prompts include:
FEATURE_CONTEXT: {{stringify(feature_context)}}
MODE: FEATURE_LEARNING
Incorporate learnings from this completed feature:
- Update patterns.md with implementation patterns discovered
- Update architecture.md if new architectural patterns emerged
- Update modules.md with new components/dependencies
- Update concept_map.md with new domain concepts
Focus on files that were modified: {{stringify(FILES_MODIFIED)}}
NOTE: Skip this phase entirely if FEATURE_ID is provided (Feature Learning Mode).
Check for existing KB state:
Check whether {{$RP1_ROOT}}/context/state.json exists; if it does, read the git_commit field from state.json.

Check current git commit:
Run git rev-parse HEAD to get the current commit hash.

Determine build strategy:
CASE A: No changes detected (state.json exists AND git commit unchanged):
CASE A-MONOREPO: No changes in this service (monorepo: git commit changed but no changes in CODEBASE_ROOT):
ACTION: Update git_commit field to the new commit hash.

CASE B: First-time build (no state.json):
CASE C: Incremental update (state.json exists AND commit changed AND files changed in CODEBASE_ROOT):
ACTION: Incremental analysis mode - get changed files with diffs
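The A/B/C decision can be sketched as follows (the NOOP/FULL/INCREMENTAL labels are illustrative, and the change-set size check later in this phase can still downgrade INCREMENTAL to FULL):

```shell
# Sketch of the CASE A/B/C decision.
determine_strategy() {
  # $1: path to state.json (may not exist); $2: current commit hash
  local state="$1" new_commit="$2" old_commit
  [ -f "$state" ] || { echo "FULL"; return; }   # CASE B: first-time build
  old_commit=$(jq -r '.git_commit // empty' "$state")
  if [ "$old_commit" = "$new_commit" ]; then
    echo "NOOP"                                 # CASE A: no changes
  else
    echo "INCREMENTAL"                          # CASE C: commit changed
  fi
}
```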
Read monorepo metadata from state.json AND local values from meta.json:
# Read shareable state
repo_type=$(jq -r '.repo_type // "single-project"' {{$RP1_ROOT}}/context/state.json)
# Read local values from meta.json (with fallback to state.json for backward compatibility)
if [ -f "{{$RP1_ROOT}}/context/meta.json" ]; then
repo_root=$(jq -r '.repo_root // "."' {{$RP1_ROOT}}/context/meta.json)
current_project_path=$(jq -r '.current_project_path // "."' {{$RP1_ROOT}}/context/meta.json)
else
# Backward compatibility: read from state.json if meta.json doesn't exist
repo_root=$(jq -r '.repo_root // "."' {{$RP1_ROOT}}/context/state.json)
current_project_path=$(jq -r '.current_project_path // "."' {{$RP1_ROOT}}/context/state.json)
fi
Get changed files list:
# If monorepo, run git diff from repo root and filter to current project
if [ "$repo_type" = "monorepo" ]; then
cd "$repo_root"
# Get all changed files
all_changes=$(git diff --name-only {{old_commit}} {{new_commit}})
# Filter to current project (skip filtering if root project)
if [ "$current_project_path" = "." ] || [ "$current_project_path" = "" ]; then
# Root project - include all files
echo "$all_changes"
else
# Subdirectory project - filter to project path
echo "$all_changes" | grep "^${current_project_path}/"
fi
else
# Single-project - get all changes
git diff --name-only {{old_commit}} {{new_commit}}
fi
Check if any files changed in scope:
Check change set size (prevent token limit issues):
changed_file_count=$(echo "$changed_files" | wc -l)
if [ $changed_file_count -gt 50 ]; then
echo "Large change set ($changed_file_count files changed). Using FULL mode for reliability."
# Fall back to FULL mode (skip getting diffs)
MODE="FULL"
else
MODE="INCREMENTAL"
fi
MESSAGE:
Get detailed diffs for each changed file (only if MODE=INCREMENTAL):
# Only if incremental mode (< 50 files)
git diff {{old_commit}} {{new_commit}} -- <filepath>
Store diffs: Create FILE_DIFFS JSON mapping filepath -> diff content (only if MODE=INCREMENTAL)
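Building FILE_DIFFS might look like this sketch (reads one path per line on stdin; helper name hypothetical, jq assumed available):

```shell
# Sketch: build the FILE_DIFFS JSON object ({"path": "<diff text>"}).
# Only meaningful in INCREMENTAL mode.
build_file_diffs() {
  local old="$1" new="$2" json='{}' f diff_text
  while IFS= read -r f; do
    diff_text=$(git diff "$old" "$new" -- "$f")
    json=$(printf '%s' "$json" | jq --arg k "$f" --arg v "$diff_text" '. + {($k): $v}')
  done
  printf '%s\n' "$json"
}
```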
Filter changed files: Apply EXCLUDE_PATTERNS, filter to relevant extensions
Store changed files list: Will be passed to spatial analyzer
MODE: INCREMENTAL (< 50 files) or FULL (>= 50 files)
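The exclude/extension filter could be sketched as (default prefixes taken from the EXCLUDE_PATTERNS default; the extension list is illustrative, not exhaustive):

```shell
# Sketch: drop excluded path prefixes and keep likely source files.
# Reads one path per line on stdin.
filter_changed_files() {
  grep -Ev '^(node_modules/|\.git/|build/|dist/)' |
    grep -E '\.(ts|tsx|js|jsx|py|go|rs|java|md)$'
}
```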
Spawn spatial analyzer agent:
For full build (CASE B):
{% dispatch_agent "rp1-base:kb-spatial-analyzer" %} FULL SCAN mode. Scan all files in repository at {{CODEBASE_ROOT}}, rank files 0-5, categorize by KB section. Return JSON with index_files, concept_files, arch_files, module_files arrays. {% enddispatch_agent %}
For incremental build (CASE C):
{% dispatch_agent "rp1-base:kb-spatial-analyzer" %} INCREMENTAL mode. Only categorize these changed files: {{changed_files_list}}. Rank each file 0-5, categorize by KB section (index_files, concept_files, arch_files, module_files). Return JSON with categorized changed files. {% enddispatch_agent %}
Parse spatial analyzer output:
Expected fields: repo_type, monorepo_projects, total_files_scanned, index_files, concept_files, arch_files, module_files, local_meta
Store repo_type and monorepo_projects for state.json.
local_meta: repo_root, current_project_path (will be written to meta.json)

Handle spatial analyzer failure:
Spawn 4 analysis agents in parallel (CRITICAL: Use a SINGLE message with 4 Task tool calls):
Agent 1 - Concept Extractor:
{% dispatch_agent "rp1-base:kb-concept-extractor" %} MODE={{mode}}. Extract domain concepts for concept_map.md. Repository type: {{repo_type}}. Files to analyze (JSON): {{stringify(concept_files)}}. {{if mode==INCREMENTAL}}File diffs (JSON): {{stringify(file_diffs_for_concept_files)}}{{endif}}. Return JSON with concepts, terminology, relationships. {% enddispatch_agent %}
Agent 2 - Architecture Mapper:
{% dispatch_agent "rp1-base:kb-architecture-mapper" %} MODE={{mode}}. Map system architecture for architecture.md. Repository type: {{repo_type}}. Files to analyze (JSON): {{stringify(arch_files)}}. {{if mode==INCREMENTAL}}File diffs (JSON): {{stringify(file_diffs_for_arch_files)}}{{endif}}. Return JSON with patterns, layers, diagram. {% enddispatch_agent %}
Agent 3 - Module Analyzer:
{% dispatch_agent "rp1-base:kb-module-analyzer" %} MODE={{mode}}. Analyze modules for modules.md. Repository type: {{repo_type}}. Files to analyze (JSON): {{stringify(module_files)}}. {{if mode==INCREMENTAL}}File diffs (JSON): {{stringify(file_diffs_for_module_files)}}{{endif}}. Return JSON with modules, components, dependencies. {% enddispatch_agent %}
Agent 4 - Pattern Extractor:
{% dispatch_agent "rp1-base:kb-pattern-extractor" %} MODE={{mode}}. Extract implementation patterns for patterns.md. Repository type: {{repo_type}}. Files to analyze (JSON): {{stringify(concept_files + module_files)}}. {{if mode==INCREMENTAL}}File diffs (JSON): {{stringify(file_diffs_for_pattern_files)}}{{endif}}. Return JSON with patterns (<=150 lines when rendered). {% enddispatch_agent %}
Collect agent outputs:
Handle partial failures:
If 1 agent fails:
If 2+ agents fail:
Load KB templates:
Use Skill tool with:
skill: rp1-base:knowledge-base-templates
Merge agent data into templates (concept_map, architecture, modules, patterns):
concept_map.md:
architecture.md:
modules.md:
patterns.md:
Validate Mermaid diagrams:
Use Skill tool with:
skill: rp1-base:mermaid
Generate index.md directly (orchestrator-owned, not agent):
The orchestrator generates index.md as the "jump off" entry point by aggregating data from all 4 sub-agents.
Follow the index.md generation instructions in the knowledge-base-templates skill:
Write KB files:
Use Write tool to write:
- {{$RP1_ROOT}}/context/index.md
- {{$RP1_ROOT}}/context/concept_map.md
- {{$RP1_ROOT}}/context/architecture.md
- {{$RP1_ROOT}}/context/modules.md
- {{$RP1_ROOT}}/context/patterns.md
Aggregate metadata:
Generate state.json (shareable metadata - safe to commit/share):
{
"strategy": "parallel-map-reduce",
"repo_type": "{{repo_type}}",
"monorepo_projects": ["{{project1}}", "{{project2}}"],
"generated_at": "{{ISO timestamp}}",
"git_commit": "{{git rev-parse HEAD}}",
"files_analyzed": {{total_files}},
"languages": ["{{lang1}}", "{{lang2}}"],
"metrics": {
"modules": {{module_count}},
"components": {{component_count}},
"concepts": {{concept_count}}
}
}
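A minimal sketch of assembling state.json with `jq -n` (covers a subset of the fields above; monorepo_projects, languages, and metrics are omitted for brevity):

```shell
# Sketch: emit a minimal state.json (subset of the fields shown above).
write_state_json() {
  # $1: repo_type  $2: git commit hash  $3: files_analyzed (number)
  jq -n --arg rt "$1" --arg gc "$2" --argjson fa "$3" \
    --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    '{strategy: "parallel-map-reduce", repo_type: $rt,
      generated_at: $ts, git_commit: $gc, files_analyzed: $fa}'
}
```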
Generate meta.json (local values - should NOT be committed/shared):
{
"repo_root": "{{repo_root}}",
"current_project_path": "{{current_project_path}}"
}
NOTE: meta.json contains local paths that may differ per team member. This file should be added to .gitignore.
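An idempotent way to add that entry (helper name hypothetical):

```shell
# Sketch: append an entry to .gitignore only if it's not already present.
ensure_gitignored() {
  local entry="$1" gitignore="${2:-.gitignore}"
  grep -qxF "$entry" "$gitignore" 2>/dev/null ||
    printf '%s\n' "$entry" >> "$gitignore"
}
```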
Write state files:
Use Write tool to write:
- {{$RP1_ROOT}}/context/state.json
- {{$RP1_ROOT}}/context/meta.json
Error Conditions:
Error Handling Procedure:
Final Report:
Knowledge Base Generated Successfully
Strategy: Parallel map-reduce
Repository: {{repo_type}}
Files Analyzed: {{total_files}}
KB Files Written:
- {{$RP1_ROOT}}/context/index.md
- {{$RP1_ROOT}}/context/concept_map.md
- {{$RP1_ROOT}}/context/architecture.md
- {{$RP1_ROOT}}/context/modules.md
- {{$RP1_ROOT}}/context/patterns.md
- {{$RP1_ROOT}}/context/state.json (shareable metadata)
- {{$RP1_ROOT}}/context/meta.json (local paths - add to .gitignore)
Next steps:
- KB is automatically loaded by agents when needed (no manual /knowledge-load required)
- Subsequent runs will use the same parallel approach (10-15 min)
- Incremental updates (changed files only) are faster (2-5 min)
- Add meta.json to .gitignore to prevent sharing local paths
Final Report (Feature Learning Mode):
Feature Learnings Captured
Feature: {{FEATURE_ID}}
Source: {{FEATURE_PATH}}
Learnings Incorporated:
- patterns.md: {{N}} new patterns from implementation
- architecture.md: {{N}} architectural decisions
- modules.md: {{N}} new components/dependencies
- concept_map.md: {{N}} domain concepts
KB Files Updated:
- {{$RP1_ROOT}}/context/index.md
- {{$RP1_ROOT}}/context/concept_map.md
- {{$RP1_ROOT}}/context/architecture.md
- {{$RP1_ROOT}}/context/modules.md
- {{$RP1_ROOT}}/context/patterns.md
The knowledge from feature "{{FEATURE_ID}}" has been captured into the KB.
Future agents will benefit from these learnings.
| Parameter | Default | Purpose |
|---|---|---|
| RP1_ROOT | .rp1/ | Root directory for KB artifacts |
| CODEBASE_ROOT | . | Repository root to analyze |
| EXCLUDE_PATTERNS | node_modules/,.git/,build/,dist/ | Patterns to exclude from scanning |
CRITICAL - Keep Output Concise:
Example of CORRECT output:
First-time KB generation with parallel analysis (10-15 min)
Analyzing... (Phase 2/5)
Knowledge Base Generated Successfully
[Final Report as shown above]
Example of INCORRECT output (DO NOT DO THIS):
Checking for state.json...
state.json not found, proceeding with first-time build
Running git rev-parse HEAD to get commit...
Commit is 475b03e...
Spawning kb-spatial-analyzer agent...
Parsing spatial analyzer output...
Found 90 files in index_files category...
Now spawning 4 parallel agents...
Spawning kb-concept-extractor...
Spawning kb-architecture-mapper...
Spawning kb-module-analyzer...
etc. (too verbose!)
No changes detected:
First-time build (no state.json - full analysis):
Incremental update (commit changed - changed files only):