claude-mem-lite

Lightweight persistent memory system for Claude Code. Automatically captures coding observations, decisions, and bug fixes during sessions, then provides full-text search to recall them later.

Built as an MCP server + Claude Code hooks. Zero external services, single SQLite database, minimal overhead.

Why claude-mem-lite?

A ground-up redesign of claude-mem, replacing its heavyweight architecture with a smarter, leaner approach.

Architecture comparison

| Aspect | claude-mem (original) | claude-mem-lite |
| --- | --- | --- |
| LLM calls | Every tool use triggers a Sonnet call | Only on episode flush (5-10 ops batched) |
| LLM input | Raw `tool_input` + `tool_output` JSON | Pre-processed action summaries |
| Conversation | Multi-turn, accumulates full history | Stateless single-turn extraction |
| Noise filtering | LLM decides via "WHEN TO SKIP" prompt | Deterministic code-level Tier 1 filter |
| Runtime | Long-running worker process (1.8MB .cjs) | On-demand spawn, exits immediately |
| Dependencies | Bun + Python/uv + Chroma vector DB | Node.js only (3 npm packages) |
| Source size | ~2.3MB compiled bundles | ~50KB readable source |
| Data directory | `~/.claude-mem/` | `~/.claude-mem-lite/` (hidden, auto-migrates) |

Token & cost efficiency

For a typical 50-tool-call session:

| Metric | claude-mem | claude-mem-lite | Ratio |
| --- | --- | --- | --- |
| LLM calls | ~50 (every tool use) | ~5-8 (per episode) | 6-10x fewer |
| Tokens per call | 1,000-5,000 (raw JSON + history) | 200-500 (summaries only) | 5-10x smaller |
| Total tokens | ~100K-250K | ~1K-4K | 50-100x less |
| Model cost | Sonnet ($3/$15 per M tokens, input/output) | Haiku ($0.25/$1.25 per M tokens, input/output) | 12x cheaper |
| Combined savings | | | 600x+ lower cost |
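
The combined figure is the token ratio multiplied by the price ratio. A minimal sketch of that arithmetic follows, using only the numbers quoted in the table; the names `PRICE_PER_M_INPUT` and `combinedSavings` are illustrative and not part of claude-mem-lite.

```javascript
// Figures copied from the table above; nothing here is measured.
const PRICE_PER_M_INPUT = { sonnet: 3.0, haiku: 0.25 }; // USD per million input tokens

// Ratio of per-token prices. The output prices ($15 vs $1.25) give the same ratio.
const priceRatio = PRICE_PER_M_INPUT.sonnet / PRICE_PER_M_INPUT.haiku;

// Combined cost ratio = (times fewer tokens) * (times cheaper per token).
function combinedSavings(tokenRatio) {
  return tokenRatio * priceRatio;
}

const lowEnd = combinedSavings(50);   // table's lower bound: 50x fewer tokens
const highEnd = combinedSavings(100); // table's upper bound: 100x fewer tokens

console.log(`price ratio: ${priceRatio}x`);
console.log(`combined savings: ${lowEnd}x to ${highEnd}x`);
```

The low end of the token range reproduces the 600x figure; the high end would put the saving at roughly twice that.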

Quality comparison

| Dimension | Winner | Why |
| --- | --- | --- |
| Classification accuracy | Tie | Both produce correct type/title/narrative |
| Noise filtering | lite | Code-level filtering is deterministic; LLM "WHEN TO SKIP" is unreliable |
| Observation coherence | lite | Episode batching groups related edits into one coherent observation |
| Code-level detail | original | Sees full diffs, but rarely useful for memory search |
| Search recall | Tie | Users search semantic concepts ("auth bug"), not code lines |
| Hook latency | lite | Async background workers; original blocks 2-5s per hook |

Design philosophy

The original sends everything to the LLM and hopes it filters well. claude-mem-lite filters first with code, then sends only what matters to a smaller model. This is not a downgrade; it's a smarter architecture that produces equivalent search quality at a fraction of the cost.
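
The "filter first, then batch" idea can be sketched in a few lines. This is an illustration under stated assumptions, not the project's code: `NOISY_TOOLS`, `FLUSH_THRESHOLD`, `isNoise`, `summarize`, and `EpisodeBuffer` are hypothetical names, and the filter rules and the event shape (`tool`, `file`, `output`) are invented for the example.

```javascript
// Hypothetical sketch: none of these names come from the claude-mem-lite source.
const NOISY_TOOLS = new Set(["LS", "Glob"]); // assumed deny list, for illustration only
const FLUSH_THRESHOLD = 5; // assumed; the README only says 5-10 ops are batched

// Tier 1 filter: a pure function of the event, so the same input always gets the same verdict.
function isNoise(event) {
  if (NOISY_TOOLS.has(event.tool)) return true;
  if (!event.output || event.output.trim() === "") return true;
  return false;
}

// Reduce a raw tool event to a short action summary instead of forwarding raw JSON.
function summarize(event) {
  const target = event.file ?? "(no file)";
  return `${event.tool} ${target}`;
}

// Collects summaries and hands a whole batch to `onFlush` once the threshold is reached.
class EpisodeBuffer {
  constructor(onFlush, threshold = FLUSH_THRESHOLD) {
    this.onFlush = onFlush;
    this.threshold = threshold;
    this.summaries = [];
  }

  // Returns false when the event was filtered out, true when it was kept.
  add(event) {
    if (isNoise(event)) return false; // dropped before any LLM is involved
    this.summaries.push(summarize(event));
    if (this.summaries.length >= this.threshold) this.flush();
    return true;
  }

  flush() {
    if (this.summaries.length === 0) return;
    const batch = this.summaries;
    this.summaries = [];
    this.onFlush(batch); // a real system would make its single LLM call here
  }
}
```

Because the filter runs before anything is buffered, a noisy event costs no tokens at all, and the model sees one compact batch per episode rather than one request per tool call.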

Features

Automatic capture -- Hooks into Claude Code lifecycle (PostToolUse, SessionStart, Stop, UserPromptSubmit) to record observations without manual effort
Hybrid search -- FTS5 BM25 + TF-IDF vector cosine similarity, merged via Reciprocal Rank Fusion (RRF). FTS5 handles keyword matching; 512-dim TF-IDF vectors capture semantic similarity for recall beyond exact terms
Timeline browsing -- Navigate observations chronologically with anchor-based context windows
Episode batching -- Groups related file operations into coherent episodes before LLM encoding
Error-triggered recall -- Automatically searches memory when Bash errors occur, surfacing relevant past fixes
Proactive file history -- When editing a file, automatically shows relevant past observations for that file
Session summaries -- LLM-generated summaries at session end (via background workers using `claude -p`)
Project-scoped context -- Injects recent memory into CLAUDE.md and session startup for immediate context
Observation types -- Categorized as decision, bugfix, feature, refactor, discovery, or change
Importance grading -- LLM assigns 1-3 importance levels (routine / notable / critical) to each observation
Observation relations -- Bidirectional links between related observations based on file overlap
User prompt capture -- Records user prompts via UserPromptSubmit hook for intent tracking
Read file tracking -- Tracks files read during sessions for richer episode context
Zero data loss -- If the LLM call fails, observations are saved with degraded (inferred) metadata instead of being discarded
Two-tier dedup -- Jaccard similarity (5-minute window) + MinHash signatures (7-day cross-session window) prevent duplicates
Synonym expansion -- Abbreviations like K8s, DB, auth automatically expand to full forms in FTS5 search (100+ pairs including CJK↔EN cross-language mappings)
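
To make the hybrid search item concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked result lists. The function name `reciprocalRankFusion`, the observation ids, and the constant `k = 60` (a commonly used default) are assumptions; the README does not state which value the project uses.

```javascript
// Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank) per document.
// k = 60 is a commonly used default, assumed here; the README does not state the value.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical observation ids, best match first in each list.
const keywordHits = [101, 102, 103]; // e.g. from an FTS5 BM25 query
const vectorHits = [103, 101, 104];  // e.g. from TF-IDF cosine similarity
const fused = reciprocalRankFusion([keywordHits, vectorHits]);

console.log(fused);
```

RRF uses only rank positions, so BM25 scores and cosine similarities never need to be put on a common scale. Observations that appear in both lists, such as 101 and 103 here, rise above those found by only one method.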
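
The first dedup tier in the list above can be sketched as a Jaccard comparison over word sets inside a time window. Only the 5-minute window comes from the README; the 0.8 threshold, the choice to compare titles, and the names `tokenSet`, `jaccard`, and `isDuplicate` are assumptions for illustration. The MinHash tier is not shown.

```javascript
const WINDOW_MS = 5 * 60 * 1000; // 5-minute window, as stated in the feature list
const SIMILARITY_THRESHOLD = 0.8; // assumed cutoff; the README does not give one

// Split text into a set of lowercase word tokens (Unicode letters and digits).
function tokenSet(text) {
  return new Set(text.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? []);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, in the range [0, 1].
function jaccard(a, b) {
  const setA = tokenSet(a);
  const setB = tokenSet(b);
  let intersection = 0;
  for (const token of setA) {
    if (setB.has(token)) intersection++;
  }
  const union = setA.size + setB.size - intersection;
  return union === 0 ? 0 : intersection / union; // two empty texts are not treated as duplicates
}

// A candidate is a duplicate if a recent observation is both close in time and similar in wording.
function isDuplicate(candidate, recentObservations) {
  return recentObservations.some(
    (prev) =>
      Math.abs(candidate.timestamp - prev.timestamp) <= WINDOW_MS &&
      jaccard(candidate.title, prev.title) >= SIMILARITY_THRESHOLD
  );
}
```

For example, "fix auth bug" and "fix login bug" share two of four distinct words, giving a similarity of 0.5, which is below the assumed threshold, so both would be kept.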

