RAG-CLI v2.0
DO NOT USE THIS TOOL FOR ANTHROPIC / CLAUDE - SEE BELOW
Just a heads-up, turns out Anthropic / Claude does not like it when you avoid token usage cost by routing traffic to the CLI tool from them. This shadow banned me from their platform when I was on their $200 a month plan. They refuse to respond after months of submitting an appeal, etc, and no project I worked on violated any aspect of their Terms. After research, I see many people have been banned on similar cases. You have been warned.
Local Retrieval-Augmented Generation system for Claude Code with Multi-Agent Framework integration.
A production-ready Claude Code plugin that combines ChromaDB vector embeddings with intelligent document retrieval and Multi-Agent Framework (MAF) orchestration for context-aware development assistance.
Project Status
Current Version: 2.0.0
Status: Production Ready (with known limitations documented in KNOWN_ISSUES.md)
Key Features:
- ChromaDB-based vector storage with HNSW indexing
- Hybrid search combining semantic and keyword matching
- Multi-Agent Framework for intelligent query routing
- Zero external API costs for document processing
- Comprehensive plugin system (hooks, MCP server, slash commands)
Alternative Project: For a standalone CLI experience with extended features, see dt-cli. Both projects are actively maintained and can be used together.
Overview
RAG-CLI is a production-ready local Retrieval-Augmented Generation system that enhances your development workflow by providing instant access to your project documentation, codebase context, and external resources. It works seamlessly with Claude Code as a native plugin, eliminating the need for external API calls while processing documents locally with enterprise-grade security and performance.
Why Use RAG-CLI?
- Zero API Overhead: Process documents locally without incurring API costs
- Instant Context: Get relevant documentation in milliseconds instead of manual searches
- Improved Code Quality: Make better decisions with context-aware assistance
- Complete Privacy: All document processing stays on your machine
- Developer Focused: Optimized for development workflows and Claude Code integration
Features
- Local-First Architecture: Everything runs locally except Claude API calls
- Fast Performance: <100ms vector search, <5s end-to-end responses
- Hybrid Search: Combines semantic vector search with keyword matching for superior accuracy
- Claude Code Integration: Seamless plugin for enhanced development workflow
- Multi-Format Support: Process MD, PDF, DOCX, HTML, and TXT files
- Real-Time Monitoring: TCP server with PowerShell interface for system observability
- Background File Watching: Automatic document indexing with watchdog library (debounced events)
- Multi-Agent Orchestration: Intelligent routing between RAG and code analysis agents
- Production Ready: Comprehensive error handling, logging, and monitoring
Installation Guide
Prerequisites
- Python: 3.8 or higher (tested with 3.13)
- RAM: 4GB minimum (8GB recommended for large document sets)
- Disk Space: 2GB for dependencies + space for document vectors
- Claude Code: Latest version (for plugin mode)
- Anthropic API Key: Optional (only for standalone mode)
System Requirements
RAG-CLI runs efficiently on:
- Windows 10+ / macOS / Linux
- Laptops with limited resources (scales gracefully)
- Cloud instances and Docker containers
- CI/CD pipelines
Installation Methods
Method 1: Claude Code Marketplace (Recommended)
The easiest way to get RAG-CLI as a Claude Code plugin:
# In Claude Code terminal
/plugin marketplace add https://github.com/ItMeDiaTech/rag-cli.git
/plugin install rag-cli
Then restart Claude Code. The plugin will activate automatically with zero configuration.
Benefits:
- Automatic installation of all dependencies
- Plugin manages its own lifecycle
- No API key needed (uses Claude Code internally)
- One-command updates via
/plugin update rag-cli
Method 2: Manual Installation from Source
For development, testing, or custom configuration:
# Clone the repository
git clone https://github.com/ItMeDiaTech/rag-cli.git
cd rag-cli
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Verify installation
python -c "from rag_cli.core import embeddings; print('Installation successful!')"
Method 3: Development Installation
For contributing to RAG-CLI:
# Clone and install in editable mode
git clone https://github.com/ItMeDiaTech/rag-cli.git
cd rag-cli
# Create virtual environment
python -m venv venv
source venv/bin/activate
# Install with development dependencies
pip install -e ".[dev]"