Interactive setup for memory-recall plugin. Configure recall dimensions (memory, skills, tools, agents) and backends (reminder/agentic/embedding). Use when the user says /setup or wants to configure the memory-recall plugin.
Install: `npx claudepluginhub t2ance/memory-recall-plugin`
This skill uses the workspace's default tool permissions.
Guide the user through configuring the memory-recall plugin: 4 recall dimensions (memory, skills, tools, agents), each set independently to one of 4 backends (off, reminder, agentic, embedding).
Plugin options are stored in ~/.claude/settings.json under:

```json
{
  "pluginConfigs": {
    "memory-recall@memory-recall": {
      "options": {
        "memory": "reminder",
        "skills": "off",
        "tools": "off",
        "agents": "off",
        ...
      }
    }
  }
}
```
The hook reads these as CLAUDE_PLUGIN_OPTION_* env vars. Changes require session restart or /reload-plugins.
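For illustration, a hook script might read these options like so. This is a sketch, not the plugin's actual code; the exact variable naming (option key uppercased into the `CLAUDE_PLUGIN_OPTION_` suffix) is an assumption, and the defaults mirror the sample config above:

```sh
#!/bin/sh
# Sketch: reading plugin options from the environment inside a hook.
# Assumption: each option key maps to CLAUDE_PLUGIN_OPTION_<UPPERCASED KEY>.
memory_backend="${CLAUDE_PLUGIN_OPTION_MEMORY:-reminder}"  # default from sample config
skills_backend="${CLAUDE_PLUGIN_OPTION_SKILLS:-off}"

if [ "$skills_backend" = "off" ]; then
  echo "skills recall disabled"
fi
echo "memory backend: $memory_backend"
```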
Read ~/.claude/settings.json and extract pluginConfigs.memory-recall@memory-recall.options. Show current settings. If missing, all at defaults.
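One way to show the current options is with jq (assuming jq is available); this sketch prints `{}` when the plugin has never been configured:

```sh
#!/bin/sh
# Show current memory-recall options, or {} if not configured yet.
SETTINGS="$HOME/.claude/settings.json"
if [ -f "$SETTINGS" ]; then
  jq '.pluginConfigs["memory-recall@memory-recall"].options // {}' "$SETTINGS"
else
  echo '{}'
fi
```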
Use AskUserQuestion to ask about each recall dimension. Each dimension can be independently set to one of 4 backends:
| Backend | Description | Requirements | Cost |
|---|---|---|---|
| off | Disabled. No recall for this dimension. | None | Free |
| reminder | Lists all available resources. Agent decides what to use. | None | Free |
| agentic | Haiku selects the most relevant resources. Precise. | claude-agent-sdk | ~$0.003/dim/query |
| embedding | Local vector similarity search. Fast after setup. | conda env + sentence-transformers | Free after setup |
| Dimension | What it recalls | Discovery |
|---|---|---|
| memory | Memory topic files from project + global memory dirs | Scans ~/.claude/projects/*/memory/ and global memory |
| skills | Slash commands (plugin skills + built-in skills) | Scans plugin cache + hardcoded built-in list |
| tools | MCP servers + CC deferred tools (WebFetch, WebSearch, etc.) | Reads settings.json mcpServers + hardcoded deferred tools |
| agents | Agent types (general-purpose, Explore, debugger, etc.) | Scans .claude/agents/ + plugin agents + hardcoded built-in |
Recommend starting with memory: agentic and one or two other dimensions on agentic. All agentic dimensions run in parallel (~5s total latency regardless of count).
If any dimension uses agentic:
- model: which model performs selection? Options: haiku (fast/cheap, recommended), sonnet (smarter but costlier), opus (most capable, most expensive). Default: haiku.
- agentic_mode: parallel (one call per dimension, better quality) or merged (single call for all, faster). Default: parallel.

Per-dimension granularity (ask if the user wants fine-grained control):
Each dimension has independent input and output granularity:
- {dim}_input: what the selection backend (Haiku/embedding) sees when choosing resources.
- {dim}_output: what gets injected into the main model's context after selection.

Both accept title_desc or full:
| Value | As input (what selector sees) | As output (what main model gets) |
|---|---|---|
| title_desc | Resource name + one-line description | Name + reason (from Haiku's recommendation) |
| full | File content (truncated to 500 chars) | Full file content injected into context |
Concrete examples:
- memory_output: full (default) = selected memory files are read in full and injected
- memory_output: title_desc = only file paths injected; the model must Read the files itself
- skills_output: full = selected skills' SKILL.md content injected
- skills_output: title_desc (default) = only skill name + recommendation reason

Defaults: all {dim}_input default to title_desc; memory_output defaults to full, all others to title_desc.
All 8 configs: memory_input, memory_output, skills_input, skills_output, tools_input, tools_output, agents_input, agents_output.
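Putting this together, a configured options block might look like the following (the values are illustrative; the keys are the ones listed above):

```json
{
  "pluginConfigs": {
    "memory-recall@memory-recall": {
      "options": {
        "memory": "agentic",
        "skills": "agentic",
        "model": "haiku",
        "agentic_mode": "parallel",
        "memory_input": "title_desc",
        "memory_output": "full",
        "skills_input": "full",
        "skills_output": "title_desc"
      }
    }
  }
}
```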
N/A combinations (setting has no effect):
| Backend | {dim}_input | {dim}_output |
|---|---|---|
| reminder | N/A (no selection step, returns everything) | Works |
| agentic | Works | Works |
| embedding (memory) | N/A (daemon always searches full content) | Works |
| embedding (non-memory) | Works | Works |
If any dimension uses embedding:
- embedding_model: HuggingFace model name. Default: intfloat/multilingual-e5-small.
- embedding_python: path to a Python with sentence-transformers. Default: ~/miniconda3/envs/memory-recall/bin/python.
- embedding_threshold: cosine similarity threshold (0-1). Default: 0.85.
- embedding_top_k: max results per dimension (1-10). Default: 3.
- embedding_device: cpu or cuda. Default: cpu.

Shared options (ask for any non-off dimension):
- context_messages: recent conversation messages used as search context (0-20). Default: 5.
- context_max_chars: max chars of context (0-10000). Default: 2000.
- max_content_chars: global cap on total injected content across all dimensions (1000-10000). Default: 9000. This is a shared budget: dimensions are processed in order (memory, skills, tools, agents), and each dimension's output decrements the remaining budget. If memory uses 7000 chars, only 2000 chars remain for the other dimensions.

Only ask about options the user wants to customize. Defaults are fine for most users.
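The shared-budget behavior of max_content_chars can be sketched as follows (illustrative only, not the plugin's code; the `inject` helper and the 7000/3000-char payloads are made up for the example):

```sh
#!/bin/sh
# Sketch of the shared max_content_chars budget: dimensions run in order,
# and each injection shrinks what the remaining dimensions may use.
budget=9000

inject() {  # inject <dimension> <content>
  d=$1; c=$2
  n=${#c}
  if [ "$n" -gt "$budget" ]; then n=$budget; fi  # truncate to remaining budget
  budget=$((budget - n))
  echo "$d: injected $n chars, $budget remaining"
}

inject memory "$(head -c 7000 /dev/zero | tr '\0' m)"  # memory output: 7000 chars
inject skills "$(head -c 3000 /dev/zero | tr '\0' s)"  # only 2000 of 3000 fit
```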
Read ~/.claude/settings.json, merge new options into pluginConfigs.memory-recall@memory-recall.options, write back. Preserve all other settings. Create keys if missing.
If any dimension uses embedding:

Create the conda env (skip if it already exists):

```bash
conda create -n memory-recall python=3.11 -y
```

Install dependencies:

```bash
conda run -n memory-recall pip install torch --index-url https://download.pytorch.org/whl/cpu
conda run -n memory-recall pip install sentence-transformers numpy
```

For embedding_device: cuda, install the CUDA build of torch instead (conda run -n memory-recall pip install torch; the default PyPI wheels include CUDA support on Linux).

Pre-download the model:

```bash
conda run -n memory-recall python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('MODEL_NAME')"
```

Start the daemon (it auto-starts on first use, but can be pre-started):

```bash
PLUGIN_DATA="${CLAUDE_PLUGIN_DATA:-$HOME/.claude/plugins/data/memory-recall-memory-recall}"
EMBEDDING_MODEL="MODEL_NAME" EMBEDDING_DEVICE="DEVICE" nohup ~/miniconda3/envs/memory-recall/bin/python "$(find ~/.claude -path '*/memory-recall/hooks/embedding_daemon.py' | head -1)" >> "${PLUGIN_DATA}/daemon.log" 2>&1 &
```
Verify each configured backend:
- agentic: `python3 -c "from claude_agent_sdk import query; print('OK')"`
- embedding: test daemon connectivity via socket query.
- reminder: no verification needed.
Show configured dimensions and backends. Remind user to restart session or /reload-plugins for changes to take effect.
If something isn't working after setup, tell the user to run /diagnose for interactive troubleshooting.
The plugin writes hook execution status to JSON files. To see this status in your Claude Code statusLine, integrate the reading logic into your statusLine script.
Each hook (recall, memory_save, pair_programmer) writes its execution state (running/done/error) to:
~/.claude/plugins/data/memory-recall-memory-recall/status/<session_id>/<hook>.json
Your statusLine script reads these files and appends one line per hook showing its current state, timing, cost, and summary.
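As a sketch, a status file's contents might look like this. The state, started_at, and agent_id field names come from this document's display requirements; the epoch-seconds format of started_at is an assumption:

```json
{
  "state": "running",
  "started_at": 1730000000,
  "agent_id": "Explore"
}
```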
Read the user's existing statusLine configuration from ~/.claude/settings.json (the statusLine.command field). Then read the script it points to and append the plugin status reading logic.
Key requirements for the reading logic:
- Get session_id from the stdin JSON (.session_id).
- Read the *.json files from ~/.claude/plugins/data/memory-recall-memory-recall/status/<session_id>/.
- Hooks without an agent_id: always display.
- Hooks with an agent_id (subagent hooks): only display if state == "running".
- If started_at is older than 120 seconds, show the hook as "timeout".
- Always exit 0 at the end. CC discards output on non-zero exit.

If there is no existing statusLine script, create a new one at ~/.claude/statusline-memory-recall.sh that reads the status files and outputs them, then set statusLine.command in ~/.claude/settings.json to bash ~/.claude/statusline-memory-recall.sh.
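A minimal sketch of that reading logic, written as a function for clarity (not the plugin's shipped script; epoch-seconds timestamps and jq availability are assumptions):

```sh
#!/bin/sh
# Sketch: read hook status files and print one line per hook.
# plugin_status <status_dir> <now_epoch_seconds>
plugin_status() {
  dir=$1; now=$2
  for f in "$dir"/*.json; do
    [ -e "$f" ] || continue                      # no status files yet
    state=$(jq -r '.state // empty' "$f")
    agent=$(jq -r '.agent_id // empty' "$f")
    # Subagent hooks (agent_id set) are shown only while running.
    if [ -n "$agent" ] && [ "$state" != "running" ]; then continue; fi
    started=$(jq -r '.started_at // 0' "$f")
    if [ "$state" = "running" ] && [ $((now - started)) -gt 120 ]; then
      state="timeout"                            # stale running entry
    fi
    printf '%s: %s\n' "$(basename "$f" .json)" "$state"
  done
  return 0  # statusLine must exit 0; CC discards output otherwise
}
```

In the real script, session_id would come from the stdin JSON (e.g. `jq -r '.session_id'`) and the directory would be the status/<session_id>/ path shown above.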
Example statusLine output:

```
recall: 3 items recalled | 0.42s $0.003 haiku 14:32:05 x42
memory_save: running... (started 14:33:01)
pair_programmer[Explore]: running... (started 14:34:01)
```