Personal Claude Code plugin marketplace
Add this marketplace to Claude Code:

```
/plugin marketplace add <owner>/bailey-claude-marketplace
```
Then install individual plugins:

```
/plugin install <plugin-name>@bailey-marketplace
```
| Plugin | Description | Source |
|---|---|---|
| claude-mem | Persistent memory system for Claude Code. Captures tool usage, compresses observations with AI, and re-injects relevant context into future sessions. | External (thedotmack/claude-mem) |
| llama-tune | Tune llama-server for optimal performance and GPU utilization. Supports dense and MoE models. | In-repo |
Persistent memory across Claude Code sessions. Automatically captures everything Claude does, compresses it with AI, and provides continuity in future sessions.
Dependencies are auto-installed on first run.

Runtime: a local service at `http://localhost:37777`; data is stored in `~/.claude-mem/`.

Install:

```
/plugin install claude-mem@bailey-marketplace
```
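The capture/compress/re-inject flow described above can be sketched as a minimal pipeline. All names here are illustrative, not the plugin's actual code; the real claude-mem hooks into Claude Code sessions and keeps its data under `~/.claude-mem/`.

```python
# Illustrative sketch of the claude-mem pipeline: capture tool-use
# observations, compress them into a summary, and re-inject that
# summary as context at the start of the next session.
import json
import tempfile
from pathlib import Path

# Stand-in for ~/.claude-mem/ so the sketch is self-contained.
MEM_DIR = Path(tempfile.gettempdir()) / "claude-mem-demo"


def capture(observations, session_id):
    """Persist raw tool-use observations for one session."""
    MEM_DIR.mkdir(exist_ok=True)
    (MEM_DIR / f"{session_id}.json").write_text(json.dumps(observations))


def compress(session_id):
    """Reduce observations to a compact summary.

    The real plugin compresses with an AI model; here we naively keep
    one line per observation to show the shape of the output.
    """
    obs = json.loads((MEM_DIR / f"{session_id}.json").read_text())
    return "\n".join(f"- {o['tool']}: {o['note']}" for o in obs)


def inject(summary):
    """Build the context block prepended to the next session."""
    return f"<memory>\n{summary}\n</memory>"


capture([{"tool": "Edit", "note": "refactored parser"},
         {"tool": "Bash", "note": "tests passing"}], "s1")
print(inject(compress("s1")))
```

The key design point is that only the compressed summary, not the raw observation log, is carried into future sessions, which keeps the injected context small.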
Tunes llama-server (llama.cpp) launch parameters for maximum tok/s on your hardware. Auto-detects GPU VRAM, CPU cores, and system RAM. Inspects GGUF model files to determine architecture (dense vs MoE), then calculates optimal flags including KV cache quantization, flash attention, expert offloading (MoE), and partial GPU layer placement.
Features:
- `llama-gguf`

Skill:

```
/llama-tune <model.gguf> [--ctx SIZE] [--slots N] [--port PORT] [--launch]
```

Install:

```
/plugin install llama-tune@bailey-marketplace
```
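The core calculation the plugin performs (fit as many layers as VRAM allows, then emit flags) can be sketched roughly as follows. The function, its headroom constant, and the sample numbers are hypothetical stand-ins; in practice the layer count and model size come from the GGUF metadata, and flag spellings vary across llama.cpp versions.

```python
# Hypothetical sketch of a llama-tune-style planner: given VRAM and
# model metadata, estimate how many transformer layers fit on the GPU
# and assemble a llama-server launch command.
def plan_launch(vram_gb, model_size_gb, n_layers, ctx=8192, is_moe=False):
    # Reserve headroom for the KV cache and scratch buffers (illustrative).
    headroom_gb = 2.0
    usable_gb = max(vram_gb - headroom_gb, 0.0)

    # Approximate per-layer weight size, then place whole layers on GPU.
    per_layer_gb = model_size_gb / n_layers
    gpu_layers = min(n_layers, int(usable_gb / per_layer_gb))

    flags = [
        f"--n-gpu-layers {gpu_layers}",
        f"--ctx-size {ctx}",
        "--flash-attn",          # flash attention, per the description above
        "--cache-type-k q8_0",   # quantized KV cache to save VRAM
        "--cache-type-v q8_0",
    ]
    if is_moe and gpu_layers < n_layers:
        # For MoE models that don't fully fit, keep expert tensors on the
        # CPU so attention and shared weights stay on the GPU.
        flags.append("--override-tensor exps=CPU")
    return "llama-server " + " ".join(flags)


# Example: a 40 GB model with 60 layers on a 24 GB GPU.
print(plan_launch(vram_gb=24, model_size_gb=40, n_layers=60))
```

With these sample numbers the planner offloads 33 of 60 layers; the real plugin's estimates also account for context size and KV cache growth.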
In-repo plugins go in the `plugins/` directory. External plugins are referenced by source in `.claude-plugin/marketplace.json`.
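As a sketch of how the two source types might sit side by side in the manifest (field names here are illustrative; consult the Claude Code plugin documentation for the exact schema):

```json
{
  "name": "bailey-marketplace",
  "plugins": [
    {
      "name": "llama-tune",
      "source": "./plugins/llama-tune"
    },
    {
      "name": "claude-mem",
      "source": { "source": "github", "repo": "thedotmack/claude-mem" }
    }
  ]
}
```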
License: MIT