Unified three-tier memory system for AI agents: Tier 1 Semantic (Serena+Forgetful search), Tier 2 Episodic (session replay), Tier 3 Causal (decision patterns). Enables memory-first architecture per ADR-007.
Searches a three-tier memory system to investigate code architecture decisions before making changes.
/plugin marketplace add rjmurillo/ai-agents
/plugin install project-toolkit@ai-agents

This skill inherits all available tools. When active, it can use any tool Claude has access to.
Unified memory operations across three tiers for AI agents.
# Check system health
python3 .claude/skills/memory/scripts/test_memory_health.py
# Search memory (Tier 1)
python3 .claude/skills/memory/scripts/search_memory.py "git hooks"
# Extract episode from session (Tier 2)
python3 .claude/skills/memory/scripts/extract_session_episode.py ".agents/sessions/2026-01-01-session-126.json"
# Update causal graph (Tier 3)
python3 .claude/skills/memory/scripts/update_causal_graph.py
| Scenario | Use Memory Router? | Alternative |
|---|---|---|
| Script needs memory | Yes | - |
| Agent needs deep context | No | context-retrieval agent |
| Human at CLI | No | /memory-search command |
| Cross-project semantic search | No | Forgetful MCP directly |
See context-retrieval agent for complete decision tree.
Core Insight: Memory-first architecture implements Chesterton's Fence principle for AI agents.
"Do not remove a fence until you know why it was put up" - G.K. Chesterton
Translation for agents: Do not change code/architecture/protocol until you search memory for why it exists.
Without memory search (removing fence without investigation):
With memory search (Chesterton's Fence investigation):
search_memory.py "validation logic edge case"

When you encounter something you want to change:
| Change Type | Memory Search Required |
|---|---|
| Remove ADR constraint | search_memory.py "[constraint name]" |
| Bypass protocol | search_memory.py "[protocol name] why" |
| Delete >100 lines | search_memory.py "[component] purpose" |
| Refactor complex code | search_memory.py "[component] edge case" |
| Change workflow | search_memory.py "[workflow] rationale" |
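As a sketch, the table above maps onto a small lookup. The change-type keys and the helper function here are hypothetical conveniences for illustration, not part of the skill's scripts:

```python
# Hypothetical pre-change gate: map a change type to the query that
# must be run through search_memory.py before the change is allowed.
REQUIRED_SEARCHES = {
    "remove-adr-constraint": "{name}",
    "bypass-protocol": "{name} why",
    "delete-large-block": "{name} purpose",
    "refactor-complex-code": "{name} edge case",
    "change-workflow": "{name} rationale",
}

def required_query(change_type: str, name: str) -> str:
    """Return the memory query required before making this kind of change."""
    return REQUIRED_SEARCHES[change_type].format(name=name)
```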
Tier 1 (Semantic): Facts, patterns, constraints
Tier 2 (Episodic): Past session outcomes
Tier 3 (Causal): Decision patterns
Before changing existing systems, you MUST:
python3 .claude/skills/memory/scripts/search_memory.py "[topic]"

Why BLOCKING: <50% compliance with "check memory first" guidance. Making it BLOCKING achieves 100% compliance (same pattern as session protocol gates).
Verification: Session logs must show memory search BEFORE decisions, not after.
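A minimal sketch of that verification, assuming a hypothetical session-log shape where events carry a `type` field of `memory_search` or `decision`:

```python
def memory_search_precedes_decisions(events: list) -> bool:
    """Return True only if every decision in the log is preceded by
    at least one memory search, per the BLOCKING-gate verification rule.

    The event shape is an assumption of this sketch, not the real
    session-log schema.
    """
    searched = False
    for event in events:
        if event["type"] == "memory_search":
            searched = True
        elif event["type"] == "decision" and not searched:
            return False  # decision made before any memory search
    return True
```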
See .agents/analysis/chestertons-fence.md for:
Key takeaway: Memory IS your investigation tool. It contains the "why" that Chesterton's Fence requires you to discover.
This skill implements progressive disclosure principles from Anthropic and claude-mem.ai research through a three-layer architecture.
| Layer | Tool | Cost | When to Use |
|---|---|---|---|
| Index | search_memory.py | ~100-500 tokens | Always start here |
| Details | mcp__serena__read_memory | ~500-10K tokens | After index confirms relevance |
| Deep Dive | Follow cross-references | Variable | For complete understanding |
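The escalation order in the table can be sketched as a two-step retrieval, with `index_search` and `read_memory` as stand-ins for search_memory.py and mcp__serena__read_memory:

```python
def layered_retrieval(query: str, index_search, read_memory) -> list:
    """Start at the cheap index layer; only read full memories whose
    index entries the search confirmed as relevant.

    The callables are stand-ins for the real tools, so this shows the
    escalation order rather than the actual API.
    """
    relevant_names = index_search(query)          # Index layer: ~100-500 tokens
    return [read_memory(name) for name in relevant_names]  # Details layer
```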
# Count tokens before retrieval (informed ROI decision)
python3 .claude/skills/memory/scripts/count_memory_tokens.py .serena/memories/memory-index.md
# Output: memory-index.md: 2,450 tokens
Caching: SHA-256 hash-based cache in .serena/.token-cache.json provides 10-100x speedup on repeated queries.
See: scripts/README-count-tokens.md
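A minimal sketch of the hash-based cache, with a pluggable `count_fn` standing in for the real tiktoken-based counter:

```python
import hashlib

def cached_token_count(text: str, cache: dict, count_fn) -> int:
    """Look up a token count by SHA-256 content hash; compute on miss.

    Repeated queries over unchanged content hit the cache, which is what
    gives the 10-100x speedup. The real implementation persists the cache
    to .serena/.token-cache.json; this sketch keeps it in memory.
    """
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in cache:
        cache[key] = count_fn(text)  # expensive path, taken once per content
    return cache[key]
```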
# Pre-commit hook: enforce atomicity thresholds
python3 .claude/skills/memory/scripts/test_memory_size.py .serena/memories --pattern "*.md"
# Exit 0 (pass) or 1 (fail) with decomposition recommendations
Thresholds (from memory-size-001-decomposition-thresholds):
See: scripts/README-test-size.md
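The exit-code contract above can be sketched as follows. The whitespace-split token estimate and the threshold value are stand-ins; the real thresholds live in memory-size-001-decomposition-thresholds:

```python
def check_memory_sizes(memories: dict, max_tokens: int) -> int:
    """Exit-code-style atomicity check: 0 = pass, 1 = violations found.

    `memories` maps file name to content; the real script walks
    .serena/memories and uses tiktoken rather than word counts.
    """
    violations = [
        (name, len(text.split()))
        for name, text in memories.items()
        if len(text.split()) > max_tokens
    ]
    for name, estimate in violations:
        # Decomposition recommendation, mirroring test_memory_size.py output
        print(f"DECOMPOSE {name}: ~{estimate} tokens (threshold {max_tokens})")
    return 1 if violations else 0
```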
Progressive Disclosure: List names → Read details → Deep dive on cross-references. Prevents loading 9,500 tokens when only 1,200 are relevant (87% waste reduction).
Just-in-Time Retrieval: Serena-first with Forgetful augmentation. High precision through lexical search before expensive semantic operations.
Size Enforcement: Atomic memories prevent token waste. One retrievable concept per file.
For full analysis, see: .agents/analysis/context-engineering.md
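The 87% figure is straightforward arithmetic on the two example token counts:

```python
# Progressive-disclosure savings from the example figures above.
full_load = 9_500   # tokens if every candidate memory is loaded eagerly
relevant = 1_200    # tokens of memories actually relevant to the task
waste_reduction = (full_load - relevant) / full_load
print(f"{waste_reduction:.0%}")  # roughly 87% of the eager load was waste
```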
Use this skill when the user says:
"search memory" for semantic search across tiers
"check memory health" for system status
"extract episode from session" for session replay
"update causal graph" for pattern tracking
"count memory tokens" for budget analysis

| Operation | Script | Key Parameters |
|---|---|---|
| Search facts/patterns | search_memory.py | query, --lexical-only, --max-results |
| Extract episode | extract_session_episode.py | session_log_path, --output-path |
| Update patterns | update_causal_graph.py | --episode-path, --dry-run |
| Health check | test_memory_health.py | --format (json/table) |
| Benchmark performance | measure_memory_performance.py | --serena-only, --format |
| Convert index links | convert_index_table_links.py | --memory-path, --dry-run |
| Cross-reference | invoke_memory_cross_reference.py | --memory-path, --threshold |
| Improve graph density | improve_memory_graph_density.py | --memory-path, --dry-run |
What do you need?
│
├─► Current facts, patterns, or rules?
│ └─► TIER 1: search_memory.py
│
├─► What happened in a specific session?
│ └─► TIER 2: Episode JSON in .agents/memory/episodes/
│
├─► Need to store new knowledge?
│ ├─ From completed session? → extract_session_episode.py
│ └─ Factual knowledge? → using-forgetful-memory skill
│
├─► Update decision patterns?
│ └─► TIER 3: update_causal_graph.py
│
└─► Not sure which tier?
└─► Start with TIER 1 (search_memory.py), escalate if insufficient
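The tree above can be sketched as a small router. The `need` labels are illustrative; the tier/tool mapping comes from the tree itself:

```python
def route_memory_request(need: str) -> str:
    """Route a memory request to a tier, mirroring the decision tree."""
    routes = {
        "facts": "TIER 1: search_memory.py",
        "session-replay": "TIER 2: .agents/memory/episodes/",
        "store-from-session": "extract_session_episode.py",
        "store-fact": "using-forgetful-memory skill",
        "decision-patterns": "TIER 3: update_causal_graph.py",
    }
    # When unsure, start with Tier 1 and escalate if insufficient.
    return routes.get(need, "TIER 1: search_memory.py")
```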
| Anti-Pattern | Do This Instead |
|---|---|
| Skipping memory search | Always search before multi-step reasoning |
| Tier confusion | Follow decision tree explicitly |
| Forgetful dependency | Use --lexical-only fallback |
| Stale causal graph | Run update_causal_graph.py after extractions |
| Incomplete extraction | Only extract from COMPLETED sessions |
| Document | Content |
|---|---|
| quick-start.md | Common workflows |
| skill-reference.md | Detailed script parameters |
| tier-selection-guide.md | When to use each tier |
| memory-router.md | ADR-037 router architecture |
| reflexion-memory.md | ADR-038 episode/causal schemas |
| troubleshooting.md | Error recovery |
| benchmarking.md | Performance targets |
| agent-integration.md | Multi-agent patterns |
| Data | Location |
|---|---|
| Serena memories | .serena/memories/*.md |
| Forgetful memories | HTTP MCP (vector DB) |
| Episodes | .agents/memory/episodes/*.json |
| Causal graph | .agents/memory/causality/causal-graph.json |
| Operation | Verification |
|---|---|
| Search completed | Result count > 0 OR logged "no results" |
| Episode extracted | JSON file in .agents/memory/episodes/ |
| Graph updated | Stats show nodes/edges added |
| Health check | All tiers show "available: true" |
python3 .claude/skills/memory/scripts/test_memory_health.py --format table
Determine the memory tier and run the appropriate script.
Verify results are non-empty and relevant to the query context.
Return structured results to the caller with source attribution.
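A sketch of the verify-and-attribute step, assuming a hypothetical result shape with `content` and `source` fields:

```python
def verify_search_results(results: list, query: str) -> dict:
    """Package search results with source attribution for the caller.

    Empty results are logged explicitly as "no results" rather than
    silently dropped, matching the verification table above. The result
    shape is an assumption of this sketch.
    """
    if not results:
        return {"query": query, "status": "no results", "results": []}
    return {
        "query": query,
        "status": "ok",
        "results": [
            {"content": r["content"], "source": r.get("source", "serena")}
            for r in results
        ],
    }
```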
| Script | Purpose | Exit Codes |
|---|---|---|
| search_memory.py | Tier 1 semantic search across Serena and Forgetful | 0=success, 1=error |
| count_memory_tokens.py | Token counting with tiktoken caching | 0=success, 1=error |
| test_memory_size.py | Memory atomicity validation | 0=pass, 1=violations |
| test_memory_health.py | System health dashboard | 0=success |
| extract_session_episode.py | Episode extraction from session logs | 0=success, 1=error |
| update_causal_graph.py | Causal graph pattern tracking | 0=success, 1=error |
| measure_memory_performance.py | Serena/Forgetful benchmark | 0=success, 1=error |
| Skill | When to Use Instead |
|---|---|
| using-forgetful-memory | Deep Forgetful operations (create, update, link) |
| curating-memories | Memory maintenance (obsolete, deduplicate) |
| exploring-knowledge-graph | Multi-hop graph traversal |