Interactive setup for memory-recall plugin. Configure recall dimensions (memory, skills, tools, agents) and backends (reminder/agentic/embedding). Use when the user says /setup or wants to configure the memory-recall plugin.
Install: `npx claudepluginhub t2ance/memory-recall-plugin`
This skill uses the workspace's default tool permissions.
Guide the user through configuring the memory-recall plugin: 4 recall dimensions (memory, skills, tools, agents), each set independently to one of 4 backends (off, reminder, agentic, embedding).
Plugin options are stored in ~/.claude/settings.json under:

```json
{
  "pluginConfigs": {
    "memory-recall@memory-recall": {
      "options": {
        "memory": "reminder",
        "skills": "off",
        "tools": "off",
        "agents": "off",
        ...
      }
    }
  }
}
```
The hook reads these as CLAUDE_PLUGIN_OPTION_* env vars. Changes require session restart or /reload-plugins.
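For illustration, a hook script might read these options like so. This is a sketch, not the plugin's actual code; the exact variable naming (option key uppercased into the `CLAUDE_PLUGIN_OPTION_` suffix) is an assumption, and the defaults mirror the sample config above:

```sh
#!/bin/sh
# Sketch: reading plugin options from the environment inside a hook.
# Assumption: each option key maps to CLAUDE_PLUGIN_OPTION_<UPPERCASED KEY>.
memory_backend="${CLAUDE_PLUGIN_OPTION_MEMORY:-reminder}"  # default from sample config
skills_backend="${CLAUDE_PLUGIN_OPTION_SKILLS:-off}"

if [ "$skills_backend" = "off" ]; then
  echo "skills recall disabled"
fi
echo "memory backend: $memory_backend"
```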
Read ~/.claude/settings.json and extract pluginConfigs.memory-recall@memory-recall.options. Show current settings. If missing, all at defaults.
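One way to show the current options is with jq (assuming jq is available); this sketch prints `{}` when the plugin has never been configured:

```sh
#!/bin/sh
# Show current memory-recall options, or {} if not configured yet.
SETTINGS="$HOME/.claude/settings.json"
if [ -f "$SETTINGS" ]; then
  jq '.pluginConfigs["memory-recall@memory-recall"].options // {}' "$SETTINGS"
else
  echo '{}'
fi
```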
Use AskUserQuestion to ask about each recall dimension. Each dimension can be independently set to one of 4 backends:
| Backend | Description | Requirements | Cost |
|---|---|---|---|
| off | Disabled. No recall for this dimension. | None | Free |
| reminder | Lists all available resources. Agent decides what to use. | None | Free |
| agentic | Haiku selects the most relevant resources. Precise. | claude-agent-sdk | ~$0.003/dim/query |
| embedding | Local vector similarity search. Fast after setup. | conda env + sentence-transformers | Free after setup |
| Dimension | What it recalls | Discovery |
|---|---|---|
| memory | Memory topic files from project + global memory dirs | Scans ~/.claude/projects/*/memory/ and global memory |
| skills | Slash commands (plugin skills + built-in skills) | Scans plugin cache + hardcoded built-in list |
| tools | MCP servers + CC deferred tools (WebFetch, WebSearch, etc.) | Reads settings.json mcpServers + hardcoded deferred tools |
| agents | Agent types (general-purpose, Explore, debugger, etc.) | Scans .claude/agents/ + plugin agents + hardcoded built-in |
Recommend starting with memory: agentic and one or two other dimensions on agentic. All agentic dimensions run in parallel (~5s total latency regardless of count).
If any dimension uses agentic:
- model: which model performs selection? Options: haiku (fast/cheap, recommended), sonnet (smarter but costlier), opus (most capable, most expensive). Default: haiku.
- agentic_mode: parallel (one call per dimension, better quality) or merged (single call for all, faster). Default: parallel.

Per-dimension granularity (ask if the user wants fine-grained control):
Each dimension has independent input and output granularity:
- {dim}_input: what the selection backend (Haiku/embedding) sees when choosing resources.
- {dim}_output: what gets injected into the main model's context after selection.

Both accept title_desc or full:
| Value | As input (what selector sees) | As output (what main model gets) |
|---|---|---|
| title_desc | Resource name + one-line description | Name + reason (from Haiku's recommendation) |
| full | File content (truncated to 500 chars) | Full file content injected into context |
Concrete examples:
- memory_output: full (default) = selected memory files are read in full and injected
- memory_output: title_desc = only file paths injected; the model must Read the files itself
- skills_output: full = selected skills' SKILL.md content injected
- skills_output: title_desc (default) = only skill name + recommendation reason

Defaults: all {dim}_input default to title_desc; memory_output defaults to full, all others to title_desc.
All 8 configs: memory_input, memory_output, skills_input, skills_output, tools_input, tools_output, agents_input, agents_output.
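Putting this together, a configured options block might look like the following (the values are illustrative; the keys are the ones listed above):

```json
{
  "pluginConfigs": {
    "memory-recall@memory-recall": {
      "options": {
        "memory": "agentic",
        "skills": "agentic",
        "model": "haiku",
        "agentic_mode": "parallel",
        "memory_input": "title_desc",
        "memory_output": "full",
        "skills_input": "full",
        "skills_output": "title_desc"
      }
    }
  }
}
```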
N/A combinations (setting has no effect):
| Backend | {dim}_input | {dim}_output |
|---|---|---|
| reminder | N/A (no selection step, returns everything) | Works |
| agentic | Works | Works |
| embedding (memory) | N/A (daemon always searches full content) | Works |
| embedding (non-memory) | Works | Works |
If any dimension uses embedding:
- embedding_model: HuggingFace model name. Default: intfloat/multilingual-e5-small.
- embedding_python: path to a Python with sentence-transformers. Default: ~/miniconda3/envs/memory-recall/bin/python.
- embedding_threshold: cosine similarity threshold (0-1). Default: 0.85.
- embedding_top_k: max results per dimension (1-10). Default: 3.
- embedding_device: cpu or cuda. Default: cpu.

Shared options (ask for any non-off dimension):
- context_messages: recent conversation messages used as search context (0-20). Default: 5.
- context_max_chars: max chars of context (0-10000). Default: 2000.
- max_content_chars: global cap on total injected content across all dimensions (1000-10000). Default: 9000. This is a shared budget: dimensions are processed in order (memory, skills, tools, agents), and each dimension's output decrements the remaining budget. If memory uses 7000 chars, only 2000 chars remain for the other dimensions.

Only ask about options the user wants to customize. Defaults are fine for most users.
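The shared-budget behavior of max_content_chars can be sketched as follows (illustrative only, not the plugin's code; the `inject` helper and the 7000/3000-char payloads are made up for the example):

```sh
#!/bin/sh
# Sketch of the shared max_content_chars budget: dimensions run in order,
# and each injection shrinks what the remaining dimensions may use.
budget=9000

inject() {  # inject <dimension> <content>
  d=$1; c=$2
  n=${#c}
  if [ "$n" -gt "$budget" ]; then n=$budget; fi  # truncate to remaining budget
  budget=$((budget - n))
  echo "$d: injected $n chars, $budget remaining"
}

inject memory "$(head -c 7000 /dev/zero | tr '\0' m)"  # memory output: 7000 chars
inject skills "$(head -c 3000 /dev/zero | tr '\0' s)"  # only 2000 of 3000 fit
```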
Read ~/.claude/settings.json, merge new options into pluginConfigs.memory-recall@memory-recall.options, write back. Preserve all other settings. Create keys if missing.
If any dimension uses embedding:

Create the conda env (skip if it already exists):

```bash
conda create -n memory-recall python=3.11 -y
```

Install dependencies:

```bash
conda run -n memory-recall pip install torch --index-url https://download.pytorch.org/whl/cpu
conda run -n memory-recall pip install sentence-transformers numpy
```

For embedding_device: cuda, install the CUDA build of torch instead (conda run -n memory-recall pip install torch; the default PyPI wheels include CUDA support on Linux).

Pre-download the model:

```bash
conda run -n memory-recall python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('MODEL_NAME')"
```

Start the daemon (it auto-starts on first use, but can be pre-started):

```bash
PLUGIN_DATA="${CLAUDE_PLUGIN_DATA:-$HOME/.claude/plugins/data/memory-recall-memory-recall}"
EMBEDDING_MODEL="MODEL_NAME" EMBEDDING_DEVICE="DEVICE" nohup ~/miniconda3/envs/memory-recall/bin/python "$(find ~/.claude -path '*/memory-recall/hooks/embedding_daemon.py' | head -1)" >> "${PLUGIN_DATA}/daemon.log" 2>&1 &
```
Verify each configured backend:
- agentic: `python3 -c "from claude_agent_sdk import query; print('OK')"`
- embedding: test daemon connectivity via socket query.
- reminder: no verification needed.
Show configured dimensions and backends. Remind user to restart session or /reload-plugins for changes to take effect.
If something isn't working after setup, tell the user to run /diagnose for interactive troubleshooting.
The plugin writes hook execution status to JSON files. To see this status in your Claude Code statusLine, integrate the reading logic into your statusLine script.
Each hook (recall, memory_save, pair_programmer) writes its execution state (running/done/error) to:
~/.claude/plugins/data/memory-recall-memory-recall/status/<session_id>/<hook>.json
Your statusLine script reads these files and appends one line per hook showing its current state, timing, cost, and summary.
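As a sketch, a status file's contents might look like this. The state, started_at, and agent_id field names come from this document's display requirements; the epoch-seconds format of started_at is an assumption:

```json
{
  "state": "running",
  "started_at": 1730000000,
  "agent_id": "Explore"
}
```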
Read the user's existing statusLine configuration from ~/.claude/settings.json (the statusLine.command field). Then read the script it points to and append the plugin status reading logic.
Key requirements for the reading logic:
- Get session_id from the stdin JSON (.session_id).
- Read the *.json files from ~/.claude/plugins/data/memory-recall-memory-recall/status/<session_id>/.
- Hooks without an agent_id: always display.
- Hooks with an agent_id (subagent hooks): only display if state == "running".
- If started_at is older than 120 seconds, show the hook as "timeout".
- Always exit 0 at the end. CC discards output on non-zero exit.

If there is no existing statusLine script, create a new one at ~/.claude/statusline-memory-recall.sh that reads the status files and outputs them, then set statusLine.command in ~/.claude/settings.json to bash ~/.claude/statusline-memory-recall.sh.
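A minimal sketch of that reading logic, written as a function for clarity (not the plugin's shipped script; epoch-seconds timestamps and jq availability are assumptions):

```sh
#!/bin/sh
# Sketch: read hook status files and print one line per hook.
# plugin_status <status_dir> <now_epoch_seconds>
plugin_status() {
  dir=$1; now=$2
  for f in "$dir"/*.json; do
    [ -e "$f" ] || continue                      # no status files yet
    state=$(jq -r '.state // empty' "$f")
    agent=$(jq -r '.agent_id // empty' "$f")
    # Subagent hooks (agent_id set) are shown only while running.
    if [ -n "$agent" ] && [ "$state" != "running" ]; then continue; fi
    started=$(jq -r '.started_at // 0' "$f")
    if [ "$state" = "running" ] && [ $((now - started)) -gt 120 ]; then
      state="timeout"                            # stale running entry
    fi
    printf '%s: %s\n' "$(basename "$f" .json)" "$state"
  done
  return 0  # statusLine must exit 0; CC discards output otherwise
}
```

In the real script, session_id would come from the stdin JSON (e.g. `jq -r '.session_id'`) and the directory would be the status/<session_id>/ path shown above.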
Example statusLine output:

```
recall: 3 items recalled | 0.42s $0.003 haiku 14:32:05 x42
memory_save: running... (started 14:33:01)
pair_programmer[Explore]: running... (started 14:34:01)
```