NextPlaid & ColGREP
NextPlaid is a multi-vector search engine. ColGREP is semantic code search built on top of it.
ColGREP
Semantic code search for your terminal and your coding agents. Searches combine regex filtering with semantic ranking. Everything runs locally; your code never leaves your machine.
Quick start
Install:
# Homebrew (macOS / Linux)
brew install lightonai/tap/colgrep
# Shell installer
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/lightonai/next-plaid/releases/latest/download/colgrep-installer.sh | sh
Build the index:
colgrep init /path/to/project # specific project
colgrep init # current directory
Search:
colgrep "database connection pooling"
That's it. No server, no API, no dependencies. ColGREP is a single Rust binary with everything baked in. colgrep init builds the index for the first time. After that, every search detects file changes and updates the index automatically before returning results.
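Incremental updates are why there is no separate reindex command: each search starts with a cheap freshness check and re-embeds only what changed. A minimal sketch of that idea, using content hashes for change detection (a hypothetical illustration, not ColGREP's actual mechanism):

# Sketch: hash-based change detection for incremental reindexing.
# Hypothetical; shown to illustrate the idea, not ColGREP's internals.
import hashlib
from pathlib import Path

def digest(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_files(root: Path, stored: dict[str, str]) -> list[Path]:
    """Return source files whose content no longer matches the stored digest."""
    return [
        path
        for path in root.rglob("*.py")
        if stored.get(str(path)) != digest(path)
    ]

# Only the returned files get re-parsed and re-embedded;
# everything else is served from the existing index.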
Regex meets semantics:
colgrep -e "async.*await" "error handling"
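Conceptually, -e is a hard filter and the free-text query is a soft ranker: the regex narrows the candidate set, and only the survivors are scored semantically. A toy sketch of that two-stage flow (semantic_score is a hypothetical stand-in for the real multi-vector scorer):

import re

def search(units, pattern, query, semantic_score, k=10):
    """Two-stage search: regex prefilter, then semantic ranking.

    units is a list of (path, code) pairs; semantic_score stands in
    for the multi-vector scorer sketched under "How it works".
    """
    rx = re.compile(pattern, re.DOTALL)
    # Stage 1: keep only code units the regex matches.
    candidates = [(path, code) for path, code in units if rx.search(code)]
    # Stage 2: rank the survivors by semantic relevance to the query.
    candidates.sort(key=lambda u: semantic_score(query, u[1]), reverse=True)
    return candidates[:k]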
Change the model
The default is lightonai/LateOn-Code-edge. Switch to any other ColBERT-style model on HuggingFace:
# Persist as the default (existing indexes for other models are kept)
colgrep set-model lightonai/LateOn-Code
# One-shot override for a single query or init
colgrep --model lightonai/LateOn-Code "database connection pooling"
# See which model an index was built with
colgrep status
# Private HuggingFace model
HF_TOKEN=hf_xxx colgrep set-model myorg/private-model
Each (project, model) pair has its own index directory, so switching models never corrupts an existing index and you can flip back and forth without re-indexing. colgrep clear removes only the active model's index; colgrep clear --all wipes every index.
Agent integrations
| Tool | Install |
| --- | --- |
| Claude Code | colgrep --install-claude-code |
| OpenCode | colgrep --install-opencode |
| Codex | colgrep --install-codex |
| Hermes | colgrep --install-hermes |
Restart your agent after installing. Claude Code has full hooks support; the OpenCode, Codex, and Hermes integrations are basic for now, and PRs are welcome.
How it works
flowchart TD
A["Your codebase"] --> B["Tree-sitter"]
B --> C["Structured representation"]
C --> D["LateOn-Code-edge · 17M"]
D --> E["NextPlaid"]
E --> F["Search"]
B -.- B1["Parse functions, methods, classes"]
C -.- C1["Signature, params, calls, docstring, code"]
D -.- D1["Multi-vector embedding per code unit · runs on CPU"]
E -.- E1["Rust index binary · quantized · memory-mapped · incremental"]
F -.- F1["grep-compatible flags · SQLite filtering · semantic ranking<br/>100% local, your code never leaves your machine"]
style A fill:#4a90d9,stroke:#357abd,color:#fff
style B fill:#50b86c,stroke:#3d9956,color:#fff
style C fill:#50b86c,stroke:#3d9956,color:#fff
style D fill:#e8913a,stroke:#d07a2e,color:#fff
style E fill:#e8913a,stroke:#d07a2e,color:#fff
style F fill:#9b59b6,stroke:#8445a0,color:#fff
style B1 fill:none,stroke:#888,stroke-dasharray:5 5,color:#888
style C1 fill:none,stroke:#888,stroke-dasharray:5 5,color:#888
style D1 fill:none,stroke:#888,stroke-dasharray:5 5,color:#888
style E1 fill:none,stroke:#888,stroke-dasharray:5 5,color:#888
style F1 fill:none,stroke:#888,stroke-dasharray:5 5,color:#888
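The multi-vector step is ColBERT-style late interaction: every token keeps its own embedding, and a document's score is the sum, over query tokens, of each token's best match among the document's tokens (MaxSim). A numpy sketch of that scoring (illustrative only; the production scorer is NextPlaid's quantized Rust implementation):

import numpy as np

def maxsim(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """ColBERT-style MaxSim score.

    query_vecs: (query_tokens, dim), rows L2-normalized
    doc_vecs:   (doc_tokens, dim),   rows L2-normalized
    """
    # Cosine similarity of every query token against every document token.
    sims = query_vecs @ doc_vecs.T          # (query_tokens, doc_tokens)
    # Each query token keeps its single best-matching document token,
    # and the document's score is the sum of those best matches.
    return float(sims.max(axis=1).sum())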
What the model sees. Each code unit is converted to structured text before embedding:
# Function: fetch_with_retry
# Signature: def fetch_with_retry(url: str, max_retries: int = 3) -> Response
# Description: Fetches data from a URL with retry logic.
# Parameters: url, max_retries
# Returns: Response
# Calls: range, client.get
# Variables: i, e
# Uses: client, RequestError
# File: src/utils/http_client.py
def fetch_with_retry(url: str, max_retries: int = 3) -> Response:
"""Fetches data from a URL with retry logic."""
for i in range(max_retries):
try:
return client.get(url)
except RequestError as e:
if i == max_retries - 1:
raise e
This structured input gives the model richer signal than raw code alone.
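For a feel of how such a representation can be assembled, here is a stdlib-only sketch using Python's ast module. ColGREP itself uses Tree-sitter, which covers many languages beyond Python; this is the same idea in miniature, and the exact fields are illustrative:

import ast

def structured_units(source: str, filename: str) -> list[str]:
    """Render each top-level function as structured text for embedding."""
    units = []
    for node in ast.parse(source).body:
        if not isinstance(node, ast.FunctionDef):
            continue
        params = ", ".join(a.arg for a in node.args.args)
        doc = ast.get_docstring(node) or ""
        # Collect everything this function calls, e.g. "client.get".
        calls = sorted({
            ast.unparse(n.func)
            for n in ast.walk(node)
            if isinstance(n, ast.Call)
        })
        header = (
            f"# Function: {node.name}\n"
            f"# Parameters: {params}\n"
            f"# Description: {doc}\n"
            f"# Calls: {', '.join(calls)}\n"
            f"# File: {filename}\n"
        )
        units.append(header + ast.unparse(node))
    return units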
Documentation: install variants, performance tuning, all flags and options → ColGREP documentation
Why multi-vector?