Personal semantic memory for Claude — store, search, and recall information across sessions using local vector embeddings
npx claudepluginhub anelcanto/recall
A personal semantic memory system. Store, search, and manage memories locally using vector embeddings.
Everything runs on your machine: FastAPI server, Qdrant vector database (Docker), and Ollama for local embeddings. Zero cost, full privacy.
pip install recall-cli
brew tap anelcanto/recall-cli
brew install recall-cli
git clone https://github.com/anelcanto/recall.git
cd recall
./install.sh
Or the quick version:
make install
Prerequisites: Ollama (brew install ollama && ollama pull nomic-embed-text) and uv (brew install uv).

# One-time setup (creates ~/.recall/.env, starts Qdrant)
recall init
# Start the API server (in a terminal, keep it running)
recall serve
# In another terminal
recall add "The quick brown fox" --tag test
recall search "fox"
recall list
recall status
recall init # Set up config + start Qdrant
recall serve [--host 127.0.0.1] [--port 8100] [--no-qdrant] # Start API server
recall add "text" --tag work --source cli [--dedupe-key "..."]
recall search "query" --top-k 10 [--no-text] [--output table|json]
recall ingest <file> [--format lines|jsonl] [--source name] [--auto-dedupe]
recall list [--limit 20] [--cursor ...] [--output table|json]
recall delete <id>
recall status
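To make the ingest flags concrete, here is a minimal sketch of how the two file formats and --auto-dedupe could behave. The field names and the content-hash dedupe strategy are assumptions for illustration, not recall's actual schema:

```python
import hashlib
import json

def parse_ingest(payload: str, fmt: str = "lines"):
    """Illustrative parser for the two ingest formats:
    'lines' -> one memory per non-empty line
    'jsonl' -> one JSON object per line with a 'text' field
    (Field names here are assumptions, not recall's real schema.)
    """
    records = []
    for line in payload.splitlines():
        line = line.strip()
        if not line:
            continue
        text = json.loads(line)["text"] if fmt == "jsonl" else line
        # --auto-dedupe plausibly derives a stable key from content;
        # a content hash is one hypothetical strategy:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        records.append({"text": text, "dedupe_key": key})
    return records

print(parse_ingest("fact one\n\nfact two\n"))
print(parse_ingest('{"text": "hello"}', fmt="jsonl"))
```

Either way, identical text yields an identical key, so re-running an import would not create duplicates.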
| Variable | Default | Description |
|---|---|---|
| RECALL_API_URL | http://127.0.0.1:8100 | API server URL |
| RECALL_API_TOKEN | (none) | Bearer token for auth |
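A client resolving these two variables might look like the following sketch, with the defaults taken from the table above (the helper name is illustrative):

```python
import os

def resolve_config(env=None):
    """Resolve recall client settings from the documented
    environment variables, falling back to the defaults."""
    env = os.environ if env is None else env
    api_url = env.get("RECALL_API_URL", "http://127.0.0.1:8100")
    token = env.get("RECALL_API_TOKEN")  # optional, no default
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return api_url, headers

url, headers = resolve_config({})
print(url)  # http://127.0.0.1:8100
print(headers)  # no Authorization header when no token is set
```

The Authorization header is only attached when RECALL_API_TOKEN is set, matching the "(none)" default.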
```
recall CLI --> FastAPI server (:8100) --> Qdrant (Docker :6333)
                     |
                     v
              Ollama (:11434)
              nomic-embed-text
```
User config lives in ~/.recall/.env. Qdrant data persists in a Docker volume.
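Since the stack is three separate local services, a quick readiness check can probe each port. The API's /health path is documented below; the Qdrant and Ollama paths are those projects' standard endpoints, and the fetch callable is injected so the logic is testable without a live stack:

```python
# Default local endpoints for the three services in the stack.
SERVICES = {
    "recall API": "http://127.0.0.1:8100/health",
    "Qdrant": "http://127.0.0.1:6333/healthz",
    "Ollama": "http://127.0.0.1:11434/api/tags",
}

def probe(fetch):
    """Check each service with fetch(url) -> bool, e.g. a small
    urllib wrapper that returns True on an HTTP 200."""
    return {name: fetch(url) for name, url in SERVICES.items()}

print(probe(lambda url: True))
```

This mirrors what recall status and the check_health MCP tool report.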
| Method | Path | Description |
|---|---|---|
| POST | /memory | Store a memory |
| POST | /search | Semantic search |
| POST | /ingest | Batch import |
| GET | /memories | List with pagination |
| DELETE | /memory/{id} | Delete a memory |
| GET | /health | Service health check |
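Calling the API from your own code only needs the standard library. The sketch below builds requests for two of the endpoints above; the JSON field names follow the CLI flags but the exact schema is an assumption, and only the request construction is shown (pass the result to urllib.request.urlopen against a running recall serve):

```python
import json
from urllib import request

API_URL = "http://127.0.0.1:8100"  # RECALL_API_URL default

def build_request(method, path, body=None, token=None):
    """Build a urllib Request for one recall endpoint."""
    data = json.dumps(body).encode() if body is not None else None
    req = request.Request(API_URL + path, data=data, method=method)
    req.add_header("Content-Type", "application/json")
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    return req

# Store a memory, then search for it (field names assumed from the CLI):
store = build_request("POST", "/memory", {"text": "quick brown fox", "tags": ["test"]})
search = build_request("POST", "/search", {"query": "fox", "top_k": 10})
print(store.get_method(), store.full_url)
```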
recall ships as a Claude Code plugin — Claude can store and search your memories directly during conversations, with no manual CLI commands needed.
/plugin marketplace add anelcanto/recall
/plugin install recall@recall
That's it. Claude Code handles everything else.
If you prefer to wire up just the MCP server without the plugin system:
pip install recall-cli
Then add it to .mcp.json at your project root:
{
"mcpServers": {
"recall": {
"type": "stdio",
"command": "recall-mcp"
}
}
}
recall serve must be running before Claude can use the MCP tools:
recall serve
| Tool | Description |
|---|---|
| store_memory(text, tags?, source?, dedupe_key?) | Store a new memory. Returns ID. |
| search_memories(query, top_k?) | Semantic search. Returns scored results. |
| list_memories(limit?) | List recent memories. |
| delete_memory(memory_id) | Delete a memory by ID. |
| check_health() | Check if recall API, Qdrant, and Ollama are up. |
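Each tool lines up with one endpoint from the HTTP API table. The mapping below is an inference from the two tables, not taken from recall's source:

```python
# Hypothetical tool-to-endpoint mapping, inferred from the docs:
TOOL_TO_ENDPOINT = {
    "store_memory": ("POST", "/memory"),
    "search_memories": ("POST", "/search"),
    "list_memories": ("GET", "/memories"),
    "delete_memory": ("DELETE", "/memory/{id}"),
    "check_health": ("GET", "/health"),
}

for tool, (method, path) in TOOL_TO_ENDPOINT.items():
    print(f"{tool} -> {method} {path}")
```

So anything Claude stores through the plugin is also reachable from the CLI and raw HTTP API.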
Once connected, just talk to Claude naturally:
"Remember that I prefer using uv for Python projects"
→ Claude calls store_memory(...)
"What do you know about my React setup?"
→ Claude calls search_memories("React setup")
"Show me all my memories"
→ Claude calls list_memories()
"Forget the last thing you stored"
→ Claude calls delete_memory(id)
From the terminal:
# Refresh the marketplace index first
claude plugin marketplace update recall
# Then update the plugin
claude plugin update recall@recall
Or from inside Claude Code, use the /recall:update command to check for a newer version on PyPI and upgrade.
The MCP server reads config from ~/.recall/.env (created by recall init) and respects the same environment variables as the CLI:
| Variable | Default | Description |
|---|---|---|
| RECALL_API_URL | http://127.0.0.1:8100 | API server URL |
| RECALL_API_TOKEN | (none) | Bearer token for auth |