From neely-brain-dump
Elasticsearch-based distributed file search across all cluster nodes. Use when searching for files, finding duplicates, or querying storage metadata.
Install via:

```shell
npx claudepluginhub built-simple/claude-brain-dump-repo --plugin neely-brain-dump
```

This skill is limited to using the following tools:
- Elasticsearch + FSCrawler deployment for searching files across the entire Proxmox cluster.
- Guides Elasticsearch index mapping design, Query DSL (match, term, bool, aggregations), bulk indexing, cluster management, and performance tuning for full-text search and complex queries.
- Interact with Elasticsearch and Kibana via curl REST API for querying (Query DSL), indexing, CRUD, index management, mappings, aggregations, cluster health, ILM, ES|QL, dashboards, OpenTelemetry patterns, and troubleshooting.
- Searches documents, codebases, and knowledge bases using BM25 keyword, semantic vector, hybrid, graph, and multi retrieval modes for dependencies, relationships, and references.
Elasticsearch + FSCrawler deployment for searching files across the entire Proxmox cluster.
```
                  ┌──────────────────────┐
                  │    Elasticsearch     │
                  │  192.168.1.122:9200  │
                  │   (CT501 Giratina)   │
                  └──────────┬───────────┘
          ┌──────────────────┼──────────────────┐
          │                  │                  │
   ┌──────▼──────┐    ┌──────▼──────┐    ┌──────▼──────┐
   │  Giratina   │    │    Talon    │    │   Victini   │
   │  1 Crawler  │    │  3 Crawlers │    │  3 Crawlers │
   │    RAID6    │    │    5.5TB    │    │    29TB     │
   └─────────────┘    └─────────────┘    └─────────────┘
          │                  │
   ┌──────▼──────┐    ┌──────▼──────┐
   │    Hoopa    │    │  Silvally   │
   │  1 Crawler  │    │  1 Crawler  │
   └─────────────┘    └─────────────┘
```
- Elasticsearch: http://192.168.1.122:9200
- Total Storage Indexed: ~18.5TB
- Total Documents: 3.4M+ files
- Active Crawlers: 9
```shell
# Search by filename
curl -s "http://192.168.1.122:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {"match": {"file.filename": "document.pdf"}},
  "size": 20
}'
```
```shell
# Search by path (any path containing "Legal")
curl -s "http://192.168.1.122:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {"wildcard": {"path.real": "*Legal*"}}
}'
```
```shell
# Find files by extension
curl -s "http://192.168.1.122:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {"wildcard": {"file.filename": "*.pdf"}}
}'
```
```shell
# Find large files (1 GiB and up), largest first
curl -s "http://192.168.1.122:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {"range": {"file.filesize": {"gte": 1073741824}}},
  "sort": [{"file.filesize": {"order": "desc"}}]
}'
```
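The `1073741824` in the range filter above is 1 GiB (1024³ bytes). A small illustrative helper for computing other thresholds to plug into the query (the function and unit table here are not part of the deployment):

```python
# Convert a human-friendly size to bytes for file.filesize range queries.
# Illustrative helper; the unit names are an assumption, not index fields.
UNITS = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

def to_bytes(value: float, unit: str) -> int:
    return int(value * UNITS[unit])

print(to_bytes(1, "GiB"))   # 1073741824, the threshold used above
print(to_bytes(0.5, "TiB"))
```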
```shell
# Files modified in the last 7 days, newest first
curl -s "http://192.168.1.122:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {"range": {"file.last_modified": {"gte": "now-7d"}}},
  "sort": [{"file.last_modified": {"order": "desc"}}]
}'
```
```shell
# Find potential duplicates: file sizes shared by 2+ files
curl -s "http://192.168.1.122:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "duplicate_sizes": {
      "terms": {"field": "file.filesize", "min_doc_count": 2, "size": 100}
    }
  }
}'
```
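The aggregation above only reports which sizes are shared by two or more files; a second query per size is still needed to list the actual paths. A sketch of that second stage, building follow-up queries from the standard terms-aggregation response shape (the bucket values below are invented for illustration):

```python
import json

# Sample response in the shape Elasticsearch returns for a terms aggregation
# (the bucket contents here are made up, not real index data).
response = {
    "aggregations": {
        "duplicate_sizes": {
            "buckets": [
                {"key": 1048576, "doc_count": 3},
                {"key": 2048, "doc_count": 2},
            ]
        }
    }
}

def duplicate_size_queries(resp):
    """For each duplicated size, build a query that lists the matching files."""
    queries = []
    for bucket in resp["aggregations"]["duplicate_sizes"]["buckets"]:
        queries.append({
            "query": {"term": {"file.filesize": bucket["key"]}},
            "_source": ["file.filename", "path.real"],
            "size": bucket["doc_count"],
        })
    return queries

for q in duplicate_size_queries(response):
    print(json.dumps(q))
```

Each printed body can be POSTed to `_search` just like the curl examples above.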
```shell
# Search only the *-storage indices (replace your-search-term)
curl -s "http://192.168.1.122:9200/*-storage/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {"match": {"file.filename": "your-search-term"}}
}'
```
**Giratina**

| Crawler | Path | Index | Documents |
|---|---|---|---|
| raid6-storage | /mnt/raid6 | raid6-storage | ~265 files |
**Talon**

| Crawler | Path | Capacity | Documents |
|---|---|---|---|
| talon-nvme-storage | /mnt/nvme-storage | 931GB (88%) | 218K+ |
| talon-pmc-data | /mnt/pmc_data | 1.9TB (86%) | 576K+ |
| talon-t9 | /mnt/t9 | 3.7TB (100%) | 2.3M+ |
**Victini**

| Crawler | Path | Capacity | Documents |
|---|---|---|---|
| victini-storage | /mnt/storage | 22TB (8.2TB used) | 253K+ |
| victini-ext4-drive | /mnt/storage/ext4_drive | 3.6TB (2.3TB used) | Growing |
| victini-new-volume | /mnt/storage/new_volume | 3.7TB (2.4TB used) | Growing |
**Hoopa**

| Crawler | Path | Capacity | Documents |
|---|---|---|---|
| hoopa-storage | /mnt/network_transfer | 393GB (90GB used) | 750+ |
**Silvally**

| Crawler | Path | Capacity | Documents |
|---|---|---|---|
| silvally-storage | /mnt/raid6 | 832GB | 3 folders |
```shell
# List all indices
curl "http://192.168.1.122:9200/_cat/indices?v"

# Storage indices only, with document counts and sizes
curl "http://192.168.1.122:9200/_cat/indices/*-storage*?v&h=index,docs.count,store.size&s=index"

# Cluster health
curl "http://192.168.1.122:9200/_cluster/health?pretty"

# Document count for a single index
curl "http://192.168.1.122:9200/talon-t9/_count?pretty"
```
```shell
# Check crawler service status on a node
ssh root@192.168.1.X "systemctl status fscrawler*"

# Restart a specific crawler
ssh root@192.168.1.X "systemctl restart fscrawler-NAME"

# Follow a crawler's logs
ssh root@192.168.1.X "journalctl -u fscrawler-NAME -f"
```
Each indexed file has this metadata:
```json
{
  "file": {
    "filename": "example.pdf",
    "extension": "pdf",
    "filesize": 1048576,
    "indexing_date": "2025-12-05T08:00:00.000Z",
    "last_modified": "2025-12-01T10:30:00.000Z"
  },
  "path": {
    "real": "/mnt/storage/expansion/Legal/example.pdf",
    "root": "/mnt/storage",
    "virtual": "/expansion/Legal/example.pdf"
  },
  "meta": {
    "title": "Example Document",
    "author": "John Doe"
  }
}
```
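Search hits can be post-processed locally once retrieved; a minimal sketch that summarizes a hit using the schema above (the `summarize` helper is illustrative, not part of the deployment, and the hit re-uses the example document):

```python
# A hit as returned under hits.hits[] in a search response, with _source
# shaped like the metadata schema above.
hit = {
    "_source": {
        "file": {"filename": "example.pdf", "extension": "pdf",
                 "filesize": 1048576,
                 "last_modified": "2025-12-01T10:30:00.000Z"},
        "path": {"real": "/mnt/storage/expansion/Legal/example.pdf"},
    }
}

def summarize(hit):
    """One-line summary: real path, size in MiB, modification date."""
    src = hit["_source"]
    size_mib = src["file"]["filesize"] / 1024**2
    return f'{src["path"]["real"]} ({size_mib:.1f} MiB, modified {src["file"]["last_modified"][:10]})'

print(summarize(hit))
# → /mnt/storage/expansion/Legal/example.pdf (1.0 MiB, modified 2025-12-01)
```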
```shell
mkdir -p /root/.fscrawler/new-crawler-name
cat > /root/.fscrawler/new-crawler-name/_settings.yaml << 'EOF'
---
name: "new-crawler-name"
fs:
  url: "/path/to/storage"
  update_rate: "30m"
  indexed_chars: "0"
  add_filesize: true
  continue_on_error: true
  remove_deleted: true
  excludes:
    - "*/node_modules/*"
    - "*/.git/*"
    - "*/.cache/*"
    - "*.tmp"
    - "*.log"
  ocr:
    enabled: false
elasticsearch:
  nodes:
    - url: "http://192.168.1.122:9200"
  index: "node-name-storage"
  bulk_size: 500
EOF
```
```shell
cat > /etc/systemd/system/fscrawler-new-name.service << 'EOF'
[Unit]
Description=FSCrawler - New Storage Indexer
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/fscrawler
ExecStart=/opt/fscrawler/bin/fscrawler new-crawler-name --loop 999
Restart=on-failure
RestartSec=30
Environment="FS_JAVA_OPTS=-Xms512m -Xmx1g"

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now fscrawler-new-name
```
Yellow cluster status is normal for a single-node deployment: replica shards can never be assigned. Clear it by setting replicas to zero:

```shell
curl -X PUT "http://192.168.1.122:9200/_all/_settings" -H 'Content-Type: application/json' -d'
{"index": {"number_of_replicas": 0}}'
```
```shell
# Rebuild an index from scratch
systemctl stop fscrawler-NAME
curl -X DELETE "http://192.168.1.122:9200/index-name"
rm -rf /root/.fscrawler/crawler-name/.fscrawler   # clear crawler checkpoint state
systemctl start fscrawler-NAME
```
Last Updated: December 5, 2025