From claude-patent-creator-standalone
Manages MPEP search index lifecycle: downloads USPTO PDFs, extracts/chunks text, generates embeddings, builds FAISS/BM25 indexes, verifies health, optimizes, and updates. Use for setup, maintenance, or troubleshooting.
npx claudepluginhub robthepcguy/claude-patent-creator --plugin claude-patent-creator-standalone
This skill uses the workspace's default tool permissions.
Expert system for managing MPEP search index lifecycle: PDF downloads, index building, maintenance, updates, optimization.
FOR CLAUDE: All dependencies installed, system operational.
Use when: building or rebuilding the MPEP index, recovering from corrupt or missing index files, optimizing the index, adding content, or troubleshooting.
PDFs Not Present -> Download (2-5 min, 500MB)
-> Extract & Parse (500MB data)
-> Generate Embeddings (5-10 min GPU, 35-65 min CPU)
-> Build FAISS + BM25 Indexes
-> Index Ready (mcp_server/index/)
-> Maintenance (Verify -> Optimize -> Update)
Check Status:
ls pdfs/ # Should show mpep-*.pdf, consolidated_laws.pdf, consolidated_rules.pdf
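The same status check can be scripted so that missing files are reported by name. This is a minimal sketch, assuming the three patterns in the comment above are the complete expected set; `missing_pdf_patterns` is a hypothetical helper and not part of the `patent-creator` CLI.

```python
from fnmatch import fnmatch
from pathlib import Path

# Patterns copied from the status comment above; treated here as the expected set.
EXPECTED_PATTERNS = ["mpep-*.pdf", "consolidated_laws.pdf", "consolidated_rules.pdf"]


def missing_pdf_patterns(pdf_dir="pdfs", patterns=EXPECTED_PATTERNS):
    """Return the patterns that match no file in pdf_dir."""
    directory = Path(pdf_dir)
    names = [p.name for p in directory.iterdir()] if directory.is_dir() else []
    return [pat for pat in patterns if not any(fnmatch(n, pat) for n in names)]


if __name__ == "__main__":
    missing = missing_pdf_patterns()
    print("All expected PDFs present" if not missing else f"Missing: {missing}")
```

An empty result means every pattern matched at least one file. It does not confirm that the files are readable.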
Download PDFs:
patent-creator download-mpep
# Or: python install.py (Select "Download MPEP PDFs")
Verify Integrity:
python -c "
import fitz
from pathlib import Path
for pdf in Path('pdfs').glob('*.pdf'):
    try:
        doc = fitz.open(pdf)
        print(f'[OK] {pdf.name}: {len(doc)} pages')
        doc.close()
    except Exception as e:
        print(f'[X] {pdf.name}: ERROR - {e}')
"
patent-creator rebuild-index
# Or: python mcp_server/server.py --rebuild-index
Timeline:
Total: 5-15 min (GPU) or 35-65 min (CPU)
Custom Build:
from mcp_server.mpep_search import MPEPIndex
index = MPEPIndex(use_hyde=False)
index.build_index(
    chunk_size=500,
    overlap=50,
    batch_size=32,  # Reduce to 16/8 if OOM
)
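To make the `chunk_size` and `overlap` parameters concrete, here is a generic sliding-window sketch. It is not the project's implementation: the source does not say whether `build_index` counts words, tokens, or characters, so this example assumes a plain list of items, and `chunk_with_overlap` is a hypothetical name.

```python
def chunk_with_overlap(tokens, chunk_size=500, overlap=50):
    """Split tokens into windows of chunk_size that share `overlap` items."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


words = ("word " * 1200).split()
print([len(c) for c in chunk_with_overlap(words)])  # [500, 500, 300]
```

With these defaults, consecutive chunks share 50 items, so a sentence that straddles a chunk boundary still appears whole in at least one chunk in most cases.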
# Check files
ls -lh mcp_server/index/
# Expected: mpep_index.faiss (~150MB), mpep_metadata.json (~80MB), mpep_bm25.pkl (~60MB)
# Verify health
patent-creator health
# Should show: [OK] MPEP Index: Ready (12,543 chunks)
# Manual test
python -c "
from mcp_server.mpep_search import MPEPIndex
index = MPEPIndex()
print(f'Chunks: {len(index.chunks)}')
results = index.search('claim definiteness', top_k=3)
print(f'Search results: {len(results)}')
"
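The file check can also be scripted. The sketch below uses the three file names from the expected listing above and reports each file's size, flagging missing or empty files. `check_index_files` is a hypothetical helper. The approximate sizes quoted above are not enforced, since they may vary between builds.

```python
from pathlib import Path

# File names taken from the expected listing above.
INDEX_FILES = ["mpep_index.faiss", "mpep_metadata.json", "mpep_bm25.pkl"]


def check_index_files(index_dir="mcp_server/index", names=INDEX_FILES):
    """Map each expected file name to its size in MB, or None if missing or empty."""
    report = {}
    for name in names:
        path = Path(index_dir) / name
        if path.is_file() and path.stat().st_size > 0:
            report[name] = round(path.stat().st_size / 1_000_000, 1)
        else:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, size_mb in check_index_files().items():
        status = f"[OK] {size_mb} MB" if size_mb is not None else "[X] missing or empty"
        print(f"{status}  {name}")
```

This only confirms that the files exist and are non-empty. `patent-creator health` and the manual search test remain the checks that the index actually loads.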
When to Rebuild:
Rebuild Process:
# Backup (optional)
cp -r mcp_server/index mcp_server/index_backup_$(date +%Y%m%d)
# Rebuild
patent-creator rebuild-index
# Verify
patent-creator health
# Remove backup if successful
rm -rf mcp_server/index_backup_*
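The backup and rebuild steps above could be wrapped so that a failed rebuild restores the previous index. This is a sketch under one loud assumption: that `patent-creator rebuild-index` exits with a non-zero status on failure, which the source does not confirm. `rebuild_with_backup` is a hypothetical helper.

```python
import shutil
import subprocess
from datetime import date
from pathlib import Path


def rebuild_with_backup(index_dir="mcp_server/index",
                        rebuild_cmd=("patent-creator", "rebuild-index")):
    """Back up index_dir, run rebuild_cmd, and restore the backup if it fails."""
    index = Path(index_dir)
    backup = index.with_name(f"{index.name}_backup_{date.today():%Y%m%d}")
    if backup.exists():
        shutil.rmtree(backup)  # replace an earlier backup from the same day
    shutil.copytree(index, backup)

    # Assumption: a non-zero exit code signals a failed rebuild.
    result = subprocess.run(list(rebuild_cmd))
    if result.returncode != 0:
        shutil.rmtree(index, ignore_errors=True)
        shutil.copytree(backup, index)
        return False
    return True


if __name__ == "__main__":
    print("Rebuild succeeded" if rebuild_with_backup() else "Rebuild failed, backup restored")
```

Unlike the shell steps, this sketch keeps the backup after a successful rebuild, so it can be removed once `patent-creator health` has confirmed the new index.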
# Download new PDF
wget https://www.uspto.gov/web/offices/pac/mpep/mpep-2900.pdf -O pdfs/mpep-2900.pdf
# Rebuild (includes new section)
patent-creator rebuild-index
Note: Incremental updates not supported. Full rebuild required.
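Because every change to the PDFs requires a full rebuild, it can help to check whether anything actually changed first. The sketch below compares SHA-256 fingerprints of the PDFs against a saved manifest. The manifest file and all function names are hypothetical and are not something the tool provides.

```python
import hashlib
import json
from pathlib import Path


def pdf_fingerprints(pdf_dir="pdfs"):
    """Return {file name: SHA-256 hex digest} for every PDF in pdf_dir."""
    return {
        pdf.name: hashlib.sha256(pdf.read_bytes()).hexdigest()
        for pdf in sorted(Path(pdf_dir).glob("*.pdf"))
    }


def needs_rebuild(pdf_dir="pdfs", manifest="pdfs/manifest.json"):
    """True if the PDFs differ from the fingerprints saved in the manifest."""
    manifest_path = Path(manifest)
    current = pdf_fingerprints(pdf_dir)
    saved = json.loads(manifest_path.read_text()) if manifest_path.is_file() else None
    return current != saved


def save_manifest(pdf_dir="pdfs", manifest="pdfs/manifest.json"):
    """Record the current fingerprints; call this after a successful rebuild."""
    Path(manifest).write_text(json.dumps(pdf_fingerprints(pdf_dir), indent=2))
```

A typical flow would be to run `patent-creator rebuild-index` only when `needs_rebuild()` returns `True`, then call `save_manifest()` once the health check passes.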
| Command | Purpose |
|---|---|
| `patent-creator download-mpep` | Download MPEP PDFs |
| `patent-creator rebuild-index` | Build/rebuild search index |
| `patent-creator health` | Check index health |
| `ls -lh mcp_server/index/` | View index files |
Best Practices: