Audio engineering & production primitives — mic-bound voice profiling, EQ preset suggestion with A/B auditions, compression, de-essing, normalization, VAD segmentation, mastering, tagging, and podcast assembly. Persists mic profiles, presets, and auditions to a versioned user-data directory.
npx claudepluginhub danielrosehill/claude-code-plugins --plugin audio-production
Register a new microphone profile in the plugin's user-data directory. Captures mic metadata, extracts a 3-min sample from a source recording, runs voice profiling, and (optionally) seeds default presets bound to this mic.
Run a full audio-engineering chain on a file — highpass, EQ, de-essing, compression, optional loudness normalisation — in a single ffmpeg invocation. Use a saved preset or build a chain inline.
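As an illustration of what a single-invocation chain can look like, the sketch below assembles an ffmpeg `-af` filter string from a few settings. The function name, argument layout, and dictionary keys are hypothetical and are not the plugin's preset schema; the de-essing stage is left out, and the loudnorm true-peak and LRA values are assumed defaults. Only the ffmpeg filter names and options (`highpass`, `equalizer`, `acompressor`, `loudnorm`) are real.

```python
def db_to_linear(db: float) -> float:
    """Convert a dB value to a linear amplitude ratio."""
    return 10 ** (db / 20)


def build_chain(highpass_hz, eq_bands, comp, loudnorm_lufs=None) -> str:
    """Join filter stages into one comma-separated ffmpeg -af string.

    eq_bands and comp use hypothetical keys chosen for this sketch.
    """
    filters = [f"highpass=f={highpass_hz}"]
    for band in eq_bands:
        # Peaking EQ band: centre frequency, Q-based width, gain in dB.
        filters.append(
            f"equalizer=f={band['freq']}:t=q:w={band['q']}:g={band['gain_db']}"
        )
    # acompressor expects its threshold as a linear value, not dB.
    threshold = db_to_linear(comp["threshold_db"])
    filters.append(
        f"acompressor=threshold={threshold:.4f}:ratio={comp['ratio']}"
        f":attack={comp['attack_ms']}:release={comp['release_ms']}"
    )
    if loudnorm_lufs is not None:
        # TP and LRA here are assumed values, not taken from the plugin.
        filters.append(f"loudnorm=I={loudnorm_lufs}:TP=-1.5:LRA=11")
    return ",".join(filters)


chain = build_chain(
    80,
    [{"freq": 3000, "q": 1.0, "gain_db": 2.5}],
    {"threshold_db": -18, "ratio": 3, "attack_ms": 10, "release_ms": 150},
    loudnorm_lufs=-16,
)
print(chain)
```

The resulting string would be passed as `ffmpeg -i input.wav -af "<chain>" output.wav`, so every stage runs in one decode and encode pass.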
Apply a saved EQ + dynamics preset to an audio file via ffmpeg. Reads the preset JSON from the plugin's user-data directory, builds a filter chain, and writes a processed copy alongside the input.
Concatenate intro + body + outro (with optional crossfade) into a single episode master (podcast workspace)
Apply a saved preset to a 1-minute clip from the bound mic's reference sample (or any audio file) and emit before/after WAVs side-by-side for A/B listening.
Quick peak or RMS normalization — lightweight alternative to full EBU R128 loudnorm
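The difference between the two modes comes down to which level measurement drives the gain. The sketch below is a minimal illustration on floating-point samples in the range -1.0 to 1.0; the function names and the default targets (-1 dBFS peak, -20 dBFS RMS) are assumptions for the example, not the plugin's values.

```python
import math


def peak_gain_db(samples, target_dbfs=-1.0):
    """Gain (dB) that moves the highest absolute sample to target_dbfs."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return 0.0  # silent input: nothing to scale
    return target_dbfs - 20 * math.log10(peak)


def rms_gain_db(samples, target_dbfs=-20.0):
    """Gain (dB) that moves the RMS level of the signal to target_dbfs."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return 0.0
    return target_dbfs - 20 * math.log10(rms)


def apply_gain(samples, gain_db):
    """Scale samples by a gain expressed in dB."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


samples = [0.5, -0.25, 0.1]
gain = peak_gain_db(samples)
normalized = apply_gain(samples, gain)
```

With ffmpeg, a gain computed this way can be applied through the `volume` filter, which accepts a dB value (for example `volume=5.02dB`). Peak mode guarantees headroom; RMS mode tracks perceived level more closely but can clip if the gain pushes peaks above 0 dBFS.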
Embed cover art into an existing MP3 without re-encoding the audio
Measure an audio file's integrated LUFS, true peak, and loudness range without modifying it
Apply dynamic range compression to an audio file via ffmpeg's acompressor. Use for taming peaks and increasing perceived loudness on spoken-word, vocal, or podcast material. Standalone — does not require a saved preset.
Concatenate multiple audio files (with optional crossfade) into a single output
Convert audio between formats (WAV / FLAC / MP3 / Opus / AAC) with explicit bitrate and sample-rate control
Reduce sibilance ("s" / "sh" harshness) in a voice recording using a band-limited dynamic cut. ffmpeg-only proxy for a true sidechain de-esser — fast and good enough for most podcast and spoken-word material.
Remove background noise from an audio file. Local-first — DeepFilterNet (ML, validated) or ffmpeg afftdn (non-ML, instant). Use before mastering, before applying EQ presets, or whenever a recording has hum, hiss, room noise, or constant background sound that needs reducing.
Encode a mastered WAV to tagged MP3 with embedded cover art, placed in finished/ (podcast workspace)
Extract a fixed-duration sample (default 3 minutes) from a longer audio file for profiling. Auto-picks the loudest contiguous window or accepts an explicit start time. Produces a clean WAV at 48 kHz mono PCM.
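One way to pick the loudest contiguous window is a sliding-window sum over a per-second energy series. The sketch below assumes such a series has already been measured (for instance, mean-square level per second); the function name and the one-second resolution are hypothetical, and the plugin's actual selection method may differ.

```python
def loudest_window_start(energy_per_second, window_s=180):
    """Return the start index (seconds) of the window with the highest total energy.

    Falls back to 0 when the recording is no longer than the window.
    """
    n = len(energy_per_second)
    if n <= window_s:
        return 0
    current = sum(energy_per_second[:window_s])
    best, best_start = current, 0
    for start in range(1, n - window_s + 1):
        # Slide by one second: add the entering value, drop the leaving one.
        current += energy_per_second[start + window_s - 1] - energy_per_second[start - 1]
        if current > best:
            best, best_start = current, start
    return best_start


start = loudest_window_start([0, 1, 5, 5, 1, 0], window_s=2)
```

The returned offset could then feed an extraction such as `ffmpeg -ss <start> -t 180 -i input -ac 1 -ar 48000 -c:a pcm_s16le sample.wav`, which matches the 48 kHz mono PCM output described above; the 16-bit depth is an assumption.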
Generate cover art (text-to-image or image-to-image) via Fal AI Nano Banana 2
List all registered microphone profiles in the plugin's user-data directory, with metadata and the presets bound to each.
List EQ + dynamics presets saved in the plugin's user-data directory. Shows name, use case, when created, and a one-line summary of the filter chain.
Move a finished episode from finished/ to uploaded/ with an ISO date stamp (podcast workspace)
Convert a mono file to stereo (duplicate, pan, or pseudo-stereo widening)
Scaffold a new episode folder under episodes/ with standard subfolders and a notes template (podcast workspace)
Normalize an audio file to a target LUFS (default -16, podcast standard) using ffmpeg two-pass loudnorm
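In a two-pass loudnorm run, the first pass measures the file (with `print_format=json`) and the second pass feeds those measurements back so the filter can apply a linear gain. The sketch below shows only the hand-off between passes; the helper names are hypothetical, and the true-peak and LRA targets are assumed values, since only the -16 LUFS default is stated above.

```python
import json


def parse_loudnorm_json(stderr_text: str) -> dict:
    """Extract the JSON block that loudnorm prints at the end of ffmpeg's stderr."""
    start = stderr_text.rindex("{")
    end = stderr_text.rindex("}") + 1
    return json.loads(stderr_text[start:end])


def second_pass_filter(measured, target_i=-16.0, target_tp=-1.5, target_lra=11.0):
    """Build the second-pass loudnorm filter string from first-pass measurements."""
    return (
        f"loudnorm=I={target_i}:TP={target_tp}:LRA={target_lra}"
        f":measured_I={measured['input_i']}"
        f":measured_TP={measured['input_tp']}"
        f":measured_LRA={measured['input_lra']}"
        f":measured_thresh={measured['input_thresh']}"
        f":offset={measured['target_offset']}"
        ":linear=true:print_format=summary"
    )


# Illustrative first-pass output; the numbers are made up for the example.
example_stderr = """[Parsed_loudnorm_0 @ 0x0]
{
    "input_i" : "-23.50",
    "input_tp" : "-4.20",
    "input_lra" : "6.10",
    "input_thresh" : "-33.80",
    "target_offset" : "0.30"
}
"""
measured = parse_loudnorm_json(example_stderr)
filter_str = second_pass_filter(measured)
```

The first pass would be run as `ffmpeg -i input -af loudnorm=I=-16:TP=-1.5:LRA=11:print_format=json -f null -`, with stderr captured and passed to the parser.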
Full production chain orchestrator — runs (optional denoise) → truncate-silence → EQ preset chain → loudnorm in one invocation. Two modes: clean (default, source is already clean) and noisy (adds DeepFilterNet denoise). Designed as the one-shot finisher for podcast and spoken-word recordings.
Prep an audio file for transcription pipelines (Whisper/Gemini) — optional VAD, then mono 16 kHz Opus @ 24 kbps
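The encode step described here maps onto a handful of standard ffmpeg options. The sketch below builds the argument list only; the function name is hypothetical, the optional VAD stage is omitted, and the `-vn` flag (dropping any embedded cover-art stream) is an addition assumed for the example.

```python
def transcription_prep_cmd(src: str, dst: str) -> list:
    """Return an ffmpeg argument list for mono 16 kHz Opus at 24 kbps."""
    return [
        "ffmpeg",
        "-i", src,
        "-vn",              # ignore video/cover-art streams (assumed extra)
        "-ac", "1",         # downmix to mono
        "-ar", "16000",     # resample to 16 kHz
        "-c:a", "libopus",  # Opus encoder
        "-b:a", "24k",      # 24 kbps target bitrate
        dst,
    ]


cmd = transcription_prep_cmd("episode.wav", "episode.opus")
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting issues with file names that contain spaces.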
Analyse a mic's reference sample and write a spectral profile (F0, sibilance, mud, resonant peaks) into the plugin's user-data directory. Bound to a specific mic — re-run after changing mic, room, or capture chain.
Strip all metadata (tags, comments, cover art, encoder history) from an audio file for privacy or a clean slate
Downmix stereo (or multichannel) audio to mono
Translate a mic's voice analysis into an EQ + dynamics preset for a use case (podcast, vocals, spoken-word, broadcast). Saves the preset (mic-bound) and emits a 1-min A/B audition.
Generate episode title and description suggestions from a transcript or show notes
View or set audio metadata tags (ID3 / Vorbis / FLAC) and embed cover art
Trim leading and trailing silence from an audio file using ffmpeg silenceremove
Remove internal silences (gaps, pauses, dead air) throughout a recording using VAD. Different from trim-silence (which only strips the leading/trailing edges) and vad-segment (which splits into per-utterance files). Use to compact a long recording for podcast assembly or to tighten meandering takes.
Upscale a cover art image using Fal AI SeedVR image upscaler
Voice-activity-detect an audio file and emit a timing sidecar or per-utterance segment files
Use when Daniel wants to import call recordings from Cube Call Recorder (CCR) on his Android phone via ADB. Triggers on phrases like "import from cube", "grab the CCR recording", "pull yesterday's call from my phone", "import call recorder file", or any request to fetch an audio file produced by Cube Call Recorder from a connected Android device.
Use when the user wants to denoise audio using modern ML-based filtering. DeepFilterNet produces cleaner speech than traditional FFT/rnnoise methods — no watery/hollow artifacts. Wraps `deepFilter` binary for podcast/voice cleanup.
Use when the user wants automatic cue/chapter timestamps based on acoustic features (onsets, energy spikes, silence boundaries, beat positions, pitch changes) rather than transcript content. Wraps the `aubio` CLI tools to emit a sidecar JSON with timestamps and metadata. Complements `suggest-title-description` (which derives chapters from a transcript) by surfacing places to *look* in the audio that the transcript may not flag.
Use when Daniel wants to prep raw audio tracks for DistroKid upload — convert to FLAC, EBU R128 normalize, and group under distrokid/to-upload/<release>/. Triggers on "distrokid prep", "prep for distrokid", "prep these tracks for upload", "normalize and FLAC for distrokid".
Use when the user wants to quickly extract vocals (or isolated instrumentals) from a song or recording. Convenience wrapper around 2-stem `demucs` mode. Fastest stem separation for the common case.
Pre-generate the TTS announcement clips ("Sample 1", "Sample 2") that tune-preset and audition-preset stitch into combined comparison files so the user can identify which take is which without context-switching to a UI. Uses edge-tts (Microsoft Edge neural voices) by default; falls back to espeak-ng if edge-tts isn't installed or has no network. Idempotent — only re-runs when forced.
Provision the plugin's tools — system binaries via the host package manager, all Python tools into a plugin-owned uv venv at <data-dir>/venv/. Idempotent doctor — run before onboard or any time a command reports a missing dep. Never touches system Python or fights PEP 668.
Provision a new audio-production workspace on disk. Use when the user wants to start a new audio engineering project or podcast production repo. Accepts a workspace name and optional variant (audio-engineering | podcast). Scaffolds the workspace, personalises CLAUDE.md from the user's global memory, and (by default) creates a GitHub repo.
First-run setup for the audio-production plugin. Provisions the persistent user-data directory, registers the user's primary microphone, captures a 3-min sample, profiles it, seeds default EQ presets, and produces 1-min A/B auditions. Run once before using profile-voice, suggest-eq, or apply-preset. Re-run any time to refresh.
Use when the user wants to separate an audio file into vocal and instrumental stems (or 6-stem separation). Wraps `demucs` for isolating drums, bass, vocals, and other instruments. Common uses — karaoke prep, podcast remixing, music production.
Use when the user wants the silence-cut decisions as an editable timeline (Kdenlive / Final Cut Pro / Premiere / Shotcut XML) rather than a baked audio file, so they can review and adjust cuts in a video/audio editor before committing. Wraps `auto-editor` with a timeline export target.
Use when the user wants to remove silent sections from a recording with real cuts (not just collapsed gaps), producing a tightened audio file. Wraps `auto-editor` for podcast/voice cleanup. Distinct from `truncate-silence` — that collapses internal silences via ffmpeg `silenceremove`; this performs threshold-driven edit decisions with margin/padding and crossfades, more aggressive at finding good cut points.
Use when the user wants to speed up or slow down audio while preserving pitch — podcast tightening (1.05–1.15× is common), slow-talker correction, fitting an episode to a target duration, or de-chipmunking a sped-up source. Wraps `rubberband-cli` for high-quality time-stretch; falls back to ffmpeg `atempo` if rubberband isn't installed.
Iteratively A/B tune an EQ preset to the user's taste. Each round renders two 15-second variants of the user's mic sample (with side-by-side spectrograms), the user picks A or B (or describes what they want changed), and the skill perturbs the preset accordingly until the user is happy. Then saves the winner. Use after suggest-eq when the heuristic preset doesn't quite land.
Voice activity detection (VAD) — detect speech regions in an audio file and either emit a timing sidecar (JSON/CSV) or split the file into per-utterance segments. Use when the user wants to chunk long recordings for transcription, trim non-speech regions, find silence boundaries between speakers, or preprocess audio for a pipeline that expects short utterances. The cluster's namesake primitive — globally reachable from any cwd.