Memory Recall Plugin
Multi-dimension context recall for Claude Code. Automatically surfaces relevant memories, skills, tools, and agent types on every user message.
Features
- 4 hooks: recall (UserPromptSubmit/SubagentStart), memory save (Stop), pair programmer (PostToolUse)
- 4 recall dimensions: memory files, skills, tools (MCP + deferred), agent types
- 3 backends per dimension: reminder (zero-cost), agentic (Haiku selection), embedding (local RAG)
- Pair programmer: evaluates agent actions against user preferences, past experience, strategic direction
- Memory save: auto-saves conversation knowledge to Memory Bank after each turn
- Configurable sync/async: each hook can run synchronously or asynchronously
- 4 skills: /dream (consolidation), /remember (quick save), /setup (config), /diagnose (troubleshooting)
Installation
claude plugin marketplace add t2ance/memory-recall-plugin
claude plugin install memory-recall@memory-recall
Then configure via /setup or manually in ~/.claude/settings.json:
{
  "pluginConfigs": {
    "memory-recall@memory-recall": {
      "options": {
        "memory": "agentic",
        "skills": "agentic",
        "tools": "agentic",
        "agents": "agentic"
      }
    }
  }
}
Each dimension accepts: "off", "reminder", "agentic", or "embedding".
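As a rough illustration of how these values could be checked before use, the sketch below reads the options block and falls back to the defaults from the configuration reference. The function name `resolve_backends` and the validation behavior are assumptions made for this example, not the plugin's actual code.

```python
VALID_BACKENDS = {"off", "reminder", "agentic", "embedding"}

# Defaults as listed in the configuration reference.
DEFAULTS = {"memory": "reminder", "skills": "off", "tools": "off", "agents": "off"}

PLUGIN_KEY = "memory-recall@memory-recall"


def resolve_backends(settings: dict) -> dict:
    """Return the backend for each dimension (hypothetical helper)."""
    options = (
        settings.get("pluginConfigs", {})
        .get(PLUGIN_KEY, {})
        .get("options", {})
    )
    resolved = {}
    for dim, default in DEFAULTS.items():
        value = options.get(dim, default)
        if value not in VALID_BACKENDS:
            raise ValueError(f"{dim}: unknown backend {value!r}")
        resolved[dim] = value
    return resolved
```

Calling `resolve_backends` on the parsed contents of ~/.claude/settings.json would return one backend name per dimension, or raise on a misspelled value.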
Configuration Reference
Recall options:
| Option | Description | Default |
|---|---|---|
| memory / skills / tools / agents | Backend per dimension: off, reminder, agentic, embedding | reminder (memory), off (others) |
| agentic_mode | parallel (one call/dim) or merged (single call) | parallel |
| {dim}_input | What selector sees: title_desc or full | title_desc |
| {dim}_output | What gets injected: title_desc or full | full (memory), title_desc (others) |
| model | Agentic model: haiku / sonnet / opus | haiku |
| context_messages | Recent messages for search context | 5 |
| context_max_chars | Max chars of conversation context | 2000 |
| max_content_chars | Global cap on total injected content | 9000 |
| recall_effort | Effort for recall calls: low or "" | "" |
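
One way to read context_messages and context_max_chars together is as a window over the recent conversation: keep the last few messages, then cap the joined text. The sketch below assumes the cap keeps the most recent characters; the README does not say which end is truncated, and `build_search_context` is a hypothetical name.

```python
def build_search_context(
    messages: list[str],
    context_messages: int = 5,
    context_max_chars: int = 2000,
) -> str:
    """Join the most recent messages and cap the result (illustrative only)."""
    if context_messages <= 0 or context_max_chars <= 0:
        return ""
    recent = messages[-context_messages:]
    text = "\n".join(recent)
    # Assumption: when over the cap, keep the tail, which holds the newest text.
    return text[-context_max_chars:]
```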
Embedding options:
| Option | Description | Default |
|---|---|---|
| embedding_model | HuggingFace model name | intfloat/multilingual-e5-small |
| embedding_python | Python path with sentence-transformers | ~/miniconda3/envs/memory-recall/bin/python |
| embedding_threshold | Cosine similarity threshold | 0.85 |
| embedding_top_k | Max results per dimension | 3 |
| embedding_device | cpu or cuda | cpu |
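
The interaction between embedding_threshold and embedding_top_k can be sketched as a filter followed by a ranked cut. The example uses plain Python lists in place of real model embeddings, and `select_items` is a hypothetical helper; whether the plugin filters before or after ranking is an assumption.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_items(
    query_vec: list[float],
    item_vecs: dict[str, list[float]],
    threshold: float = 0.85,
    top_k: int = 3,
) -> list[str]:
    """Keep items at or above the threshold, best first, at most top_k."""
    scored = [(cosine(query_vec, vec), name) for name, vec in item_vecs.items()]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in kept[:top_k]]
```

With the default threshold of 0.85, an item that is only loosely related to the query is dropped even when fewer than embedding_top_k items remain.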
Memory save options:
| Option | Description | Default |
|---|---|---|
| auto_save_enabled | Enable auto-save after each turn | true |
| auto_save_targets | native (project), global, or both | native |
| auto_save_context_turns | Conversation turns for analysis | 3 |
| auto_save_effort | Effort level for save calls | "" |
Pair programmer options:
| Option | Description | Default |
|---|---|---|
| pp_enabled | Enable pair programmer | false |
| pp_model | Model for evaluation | haiku |
| pp_sample_rate | Probability of evaluating each tool call (0-1) | 1.0 |
| pp_cooldown_s | Min seconds between evaluations | 0 |
| pp_context_messages | Recent messages for trajectory | 5 |
| pp_context_max_chars | Max conversation context chars | 3000 |
| pp_effort | Effort level | "" |
| pp_max_tool_input_chars | Max tool input chars in trajectory | 2000 |
| pp_max_tool_output_chars | Max tool output chars in trajectory | 1000 |
| pp_max_recall_files | Max memory files to recall | 5 |
| pp_max_memory_file_chars | Max chars per recalled memory file | 2000 |
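
pp_sample_rate and pp_cooldown_s both limit how often a tool call is evaluated. A minimal sketch of such a gate follows, assuming the cooldown is checked first and the sample is drawn second. `EvaluationGate` is a hypothetical class, and it keeps the last evaluation time in memory; how the plugin tracks that time across hook invocations is not stated here.

```python
import random
import time


class EvaluationGate:
    """Decide whether to evaluate a tool call (illustrative only)."""

    def __init__(self, sample_rate=1.0, cooldown_s=0.0,
                 rng=random.random, clock=time.monotonic):
        self.sample_rate = sample_rate
        self.cooldown_s = cooldown_s
        self._rng = rng          # returns a float in [0, 1)
        self._clock = clock      # returns seconds as a float
        self._last = None        # time of the last evaluation, if any

    def should_evaluate(self) -> bool:
        now = self._clock()
        # Skip while still inside the cooldown window.
        if self._last is not None and now - self._last < self.cooldown_s:
            return False
        # Evaluate with probability sample_rate.
        if self._rng() >= self.sample_rate:
            return False
        self._last = now
        return True
```

With the defaults (rate 1.0, cooldown 0) every tool call passes the gate, which matches the table above.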
Async options:
| Option | Description | Default |
|---|---|---|
| recall_async | Run recall hook asynchronously | false |
| memory_save_async | Run memory save hook asynchronously | true |
| pp_async | Run pair programmer hook asynchronously | true |
How It Works
Recall (UserPromptSubmit / SubagentStart)
On every user message and sub-agent spawn, the hook:
- Discovers available resources per enabled dimension (file scan + hardcoded fallback)
- Recalls relevant items using the configured backend (parallel for agentic)
- Injects results as additionalContext into the model's context
Sub-agents and teammates also receive recall context. On SubagentStart, the hook extracts the parent agent's prompt from the transcript and runs the full recall pipeline.
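The final injection step can be pictured as joining the per-dimension results, applying the max_content_chars cap, and emitting JSON. In the sketch below the `hookSpecificOutput` shape reflects one reading of the Claude Code hook output format and should be checked against the hooks reference; `build_hook_output` and the section headings are hypothetical.

```python
import json


def build_hook_output(
    sections: dict[str, str],
    event: str = "UserPromptSubmit",
    max_content_chars: int = 9000,
) -> str:
    """Combine recall results into one capped additionalContext payload."""
    # Skip dimensions that returned nothing.
    blocks = [f"[{name}]\n{body}" for name, body in sections.items() if body]
    text = "\n\n".join(blocks)[:max_content_chars]
    payload = {
        # Assumed output shape; verify against the hooks documentation.
        "hookSpecificOutput": {
            "hookEventName": event,
            "additionalContext": text,
        }
    }
    return json.dumps(payload)
```

A hook script would print the returned string to stdout. This simple cap truncates the tail of the combined text; a real implementation might budget each dimension separately.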
Memory Save (Stop)
After each assistant turn, the hook:
- Extracts recent conversation turns from the transcript
- Calls Haiku to decide what knowledge to persist (ADD/UPDATE/DELETE/NOOP)
- Writes memory files and updates the MEMORY.md index
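
The last two steps can be sketched as applying a list of decisions to a memory directory and then rebuilding the index. The operation format (`action`, `file`, `content` keys) and the one-bullet-per-file index layout are assumptions for illustration; the real Memory Bank format is not described in this README.

```python
from pathlib import Path


def apply_operations(memory_dir: Path, operations: list[dict]) -> None:
    """Apply ADD/UPDATE/DELETE/NOOP decisions, then rebuild MEMORY.md."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    for op in operations:
        action = op["action"]
        if action == "NOOP":
            continue  # nothing worth persisting
        path = memory_dir / op["file"]
        if action in ("ADD", "UPDATE"):
            path.write_text(op["content"], encoding="utf-8")
        elif action == "DELETE":
            path.unlink(missing_ok=True)
        else:
            raise ValueError(f"unknown action: {action!r}")

    # Hypothetical index layout: one bullet per memory file, sorted by name.
    names = sorted(
        p.name for p in memory_dir.glob("*.md") if p.name != "MEMORY.md"
    )
    index = "".join(f"- {name}\n" for name in names)
    (memory_dir / "MEMORY.md").write_text(index, encoding="utf-8")
```

Rebuilding the index from the directory contents on every run keeps MEMORY.md consistent even if an earlier write was interrupted.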