npx claudepluginhub xiaoconstantine/sgrep
Smart semantic + hybrid code search for local repositories
Semantic + hybrid code search that complements ripgrep and ast-grep.
┌──────────────────┬───────────────┬───────────────────┐
│ ripgrep (rg)     │ ast-grep (sg) │ sgrep             │
├──────────────────┼───────────────┼───────────────────┤
│ Exact text/regex │ AST patterns  │ Semantic + hybrid │
│ "findUser"       │ $fn($args)    │ "auth validation" │
└──────────────────┴───────────────┴───────────────────┘
Coding agents (Amp, Claude Code, Cursor) waste tokens on failed grep attempts when searching for concepts rather than exact strings. sgrep understands what you mean, not just what you type.
# ❌ Agent tries 10+ grep patterns, burns 2000 tokens
rg "authenticate" && rg "auth" && rg "login" && rg "session" ...
# ✅ One semantic query, 50 tokens
sgrep "how does user authentication work"
brew tap XiaoConstantine/tap
brew install sgrep
curl -fsSL https://raw.githubusercontent.com/XiaoConstantine/sgrep/main/install.sh | bash
go install github.com/XiaoConstantine/sgrep/cmd/sgrep@latest
git clone https://github.com/XiaoConstantine/sgrep.git
cd sgrep
# Default build (uses libSQL with DiskANN vector search)
go build -o sgrep ./cmd/sgrep
# Alternative: sqlite-vec backend
go build -tags=sqlite_vec -o sgrep ./cmd/sgrep
Requirements: llama.cpp (for the embedding server)
brew install llama.cpp # macOS
# or build from source: https://github.com/ggerganov/llama.cpp
go get github.com/XiaoConstantine/sgrep@latest
# One-time setup: downloads embedding model (~130MB)
sgrep setup
# Index your codebase (auto-starts embedding server)
sgrep index .
# Semantic search (quick)
sgrep "error handling for database connections"
# Hybrid + ColBERT (recommended - best accuracy)
sgrep --hybrid --colbert "JWT token validation logic"
sgrep --hybrid --colbert "how are API rate limits implemented"
# Hybrid with custom weights
sgrep --hybrid --colbert "authentication middleware" --semantic-weight 0.5 --bm25-weight 0.5
# Watch mode (background indexing)
sgrep watch .
The embedding server starts automatically when needed and stays running as a daemon.
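The --colbert flag in the commands above refers to ColBERT-style reranking. As a rough illustration of the general technique (not sgrep's actual code), ColBERT scores a document by "late interaction": each query token vector is matched against its best document token vector, and those best matches are summed. The sketch below assumes token embeddings are already available as equal-length float slices; the names MaxSim and dot are hypothetical.

```go
package main

import "fmt"

// dot returns the dot product of two vectors.
// Assumption: a and b have the same length.
func dot(a, b []float64) float64 {
	var sum float64
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// MaxSim computes a ColBERT-style late-interaction score:
// for every query token vector, take the highest similarity to any
// document token vector, then sum those maxima.
func MaxSim(query, doc [][]float64) float64 {
	if len(doc) == 0 {
		return 0
	}
	var total float64
	for _, q := range query {
		best := dot(q, doc[0])
		for _, d := range doc[1:] {
			if s := dot(q, d); s > best {
				best = s
			}
		}
		total += best
	}
	return total
}

func main() {
	// Toy 2-dimensional "token embeddings", purely illustrative.
	query := [][]float64{{1, 0}, {0, 1}}
	docA := [][]float64{{1, 0}, {0, 1}, {0.5, 0.5}}
	docB := [][]float64{{1, 0}, {0.5, 0.5}}

	fmt.Println(MaxSim(query, docA), MaxSim(query, docB)) // 2 1.5
}
```

Because every query token is matched independently, a document that covers all parts of the query (docA) outranks one that covers only some of them (docB).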
Search across conversations from Claude Code, Codex CLI, Cursor, and OpenCode.
# Index conversations (auto-starts embedding server)
sgrep conv index
# Index a single agent
sgrep conv index --source claude
sgrep conv index --source codex
sgrep conv index --source cursor
sgrep conv index --source opencode
# Watch mode (auto-index new conversations)
sgrep conv index --watch
# Search conversations
sgrep conv "authentication"
sgrep conv "JWT token" --hybrid
sgrep conv "database migration" --agent claude --since 7d
# View, export, or resume a session
sgrep conv view <session_id>
sgrep conv export <session_id> -o conversation.md
sgrep conv resume <session_id>
# Extract context for injection into new session
sgrep conv context <session_id>
# Copy to clipboard
sgrep conv copy <session_id>
# Check index status
sgrep conv status
Watch mode monitors the conversation directories of all agents and automatically indexes new sessions as they are created, so conversation search stays up to date without manual re-indexing.
Conversations are stored at ~/.sgrep/conversations/conv.db. Re-running sgrep conv index backfills missing embeddings for existing sessions.
Hybrid search combines semantic understanding with lexical matching (BM25) for improved accuracy. This helps when a query contains exact terms, such as an identifier like parseAST, that should match literally:
# Default: semantic-only search
sgrep "authentication"
# Hybrid: semantic (60%) + BM25 (40%) - default weights
sgrep --hybrid "authentication"
# Custom weights: more emphasis on exact matches
sgrep --hybrid --semantic-weight 0.4 --bm25-weight 0.6 "parseAST"
Note: Hybrid search requires building with FTS5 support (see From Source). The FTS5 index is created automatically on first hybrid search - no re-indexing needed.
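To make the 60/40 weighting concrete, here is a minimal sketch of weighted score fusion. It is an illustration of the general idea, not sgrep's implementation: it assumes both signals are "higher is better", min-max normalizes each list so the scales are comparable, and treats a document missing from one list as scoring 0 there. The names Fuse, minMax, and Scored are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Scored pairs a document ID with its fused score.
type Scored struct {
	ID    string
	Score float64
}

// minMax rescales scores into [0, 1]. If all scores are equal,
// every document gets 1.
func minMax(scores map[string]float64) map[string]float64 {
	out := make(map[string]float64, len(scores))
	first := true
	var lo, hi float64
	for _, s := range scores {
		if first {
			lo, hi, first = s, s, false
			continue
		}
		if s < lo {
			lo = s
		}
		if s > hi {
			hi = s
		}
	}
	for id, s := range scores {
		if hi == lo {
			out[id] = 1
		} else {
			out[id] = (s - lo) / (hi - lo)
		}
	}
	return out
}

// Fuse returns documents ranked by semW*semantic + bm25W*bm25,
// computed on normalized scores. Ties are broken by ID.
func Fuse(semantic, bm25 map[string]float64, semW, bm25W float64) []Scored {
	combined := make(map[string]float64)
	for id, v := range minMax(semantic) {
		combined[id] += semW * v
	}
	for id, v := range minMax(bm25) {
		combined[id] += bm25W * v
	}
	out := make([]Scored, 0, len(combined))
	for id, v := range combined {
		out = append(out, Scored{ID: id, Score: v})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].ID < out[j].ID
	})
	return out
}

func main() {
	// Made-up scores for three files, purely illustrative.
	semantic := map[string]float64{"auth.go": 0.9, "session.go": 0.7, "parser.go": 0.2}
	bm25 := map[string]float64{"parser.go": 12.0, "auth.go": 3.0}

	for _, r := range Fuse(semantic, bm25, 0.6, 0.4) {
		fmt.Printf("%s %.3f\n", r.ID, r.Score)
	}
}
```

Raising the BM25 weight, as in the parseAST example above, pushes files with strong literal matches (parser.go in this toy data) higher in the ranking.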
sgrep uses a multi-stage retrieval pipeline to improve accuracy: