Use when the user wants automatic cue/chapter timestamps based on acoustic features (onsets, energy spikes, silence boundaries, beat positions, pitch changes) rather than transcript content. Wraps the `aubio` CLI tools to emit a sidecar JSON with timestamps and metadata. Complements `suggest-title-description` (which derives chapters from a transcript) by surfacing places to *look* in the audio that the transcript may not flag.
Install:

`npx claudepluginhub danielrosehill/claude-code-plugins --plugin audio-production`

This skill uses the workspace's default tool permissions.
Acoustic cue detection — find timestamps where *something happens* in the audio, independent of what was said.
Do not use this skill when:

- The user wants chapters derived from transcript content; use `suggest-title-description` instead.
- The user wants silence detected and trimmed; use `silence-cut` / `silence-cut-edl` instead.

Modes:

- `onset` (default) — generic note/event onsets. Best for spoken-word topic shifts.
- `beat` — beat detection. For music or rhythmic material.
- `pitch` — pitch track. Returns a stream, not discrete cues; reduce to cue points by detecting large jumps.

Parameters:

- Method (onset mode, `-O`): `default`, `energy`, `hfc`, `complex`, `phase`, `specdiff`, `kl`, `mkl`, `specflux`. Default = `default` (HFC-based). For voice, `energy` or `hfc` work well; for music, `complex` or `specflux`.
- Threshold (`-t`): default `0.3`. Lower = more cues (more sensitive); higher = only the strongest events.
- Minimum interval (`-M`): `2.0` s for voice, `0.3` s for music/beat.
- Output: `<input-stem>.cues.json` next to the input.

Steps:

1. Verify the relevant aubio binary is on PATH (`which aubioonset` / `aubiotrack` / `aubiopitch`).
2. Build the command per mode:
onset:

```
aubioonset -i "<input>" -O <method> -t <threshold> -M <min-interval>
```

Output is one timestamp per line (seconds, float).
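A minimal sketch of steps 1–2 for onset mode, assuming the aubio CLI tools are installed; `detect_onsets` is an illustrative helper name, not part of the skill. Beat mode has the same shape: swap in `aubiotrack -i "<input>"` and parse identically.

```python
import shutil
import subprocess

def detect_onsets(path: str, method: str = "energy",
                  threshold: float = 0.3, min_interval: float = 2.0) -> list[float]:
    # Step 1: confirm the binary is on PATH before running it.
    if shutil.which("aubioonset") is None:
        raise RuntimeError("aubioonset not found on PATH")
    result = subprocess.run(
        ["aubioonset", "-i", path, "-O", method,
         "-t", str(threshold), "-M", str(min_interval)],
        capture_output=True, text=True, check=True,
    )
    # One timestamp per line, seconds as a float.
    return [float(line) for line in result.stdout.splitlines() if line.strip()]
```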
beat:

```
aubiotrack -i "<input>"
```

Output is one beat time per line.
pitch:

```
aubiopitch -i "<input>"
```

Output is two columns: `<time> <pitch-hz>`. To reduce this stream to cues, post-process: emit a cue each time the pitch moves more than N semitones away from the running median, keeping cues at least the minimum interval apart.
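A sketch of that reduction, with the pitch lines already read from `aubiopitch` stdout. The 3-semitone jump and 20-frame median window are illustrative defaults, not aubio parameters.

```python
import math
from collections import deque
from statistics import median

def pitch_cues(lines: list[str], jump_semitones: float = 3.0,
               min_interval: float = 2.0, window: int = 20) -> list[tuple[float, float]]:
    cues: list[tuple[float, float]] = []  # (time, pitch_hz)
    recent: deque[float] = deque(maxlen=window)
    last_cue = -min_interval
    for line in lines:
        t_str, hz_str = line.split()
        t, hz = float(t_str), float(hz_str)
        if hz <= 0:  # aubiopitch may emit 0 for unvoiced frames; skip those
            continue
        if recent:
            # Distance from the running median, in semitones: 12 * log2(f1/f2).
            jump = abs(12 * math.log2(hz / median(recent)))
            if jump > jump_semitones and t - last_cue >= min_interval:
                cues.append((t, hz))
                last_cue = t
        recent.append(hz)
    return cues
```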
3. Parse the output and assemble JSON:
```json
{
  "input": "<absolute-path>",
  "duration": <seconds>,
  "mode": "<onset|beat|pitch>",
  "params": { "method": "...", "threshold": 0.3, "min_interval": 2.0 },
  "cues": [
    { "t": 12.34, "label": "onset" },
    { "t": 47.81, "label": "onset" },
    ...
  ],
  "count": <n>
}
```
For pitch mode, label each cue with the detected pitch in Hz and a coarse note name (optional).
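For the note name, the standard equal-temperament conversion (assuming A4 = 440 Hz) is enough; the aubio tools emit only Hz, so this helper is illustrative.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_name(hz: float) -> str:
    midi = round(69 + 12 * math.log2(hz / 440.0))  # nearest MIDI note number
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"  # 440.0 -> "A4"
```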
4. Write the JSON to the output path.
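A sketch of the assemble-and-write step, matching the schema above; `duration` is assumed to come from elsewhere (e.g. `ffprobe`), and the helper name is illustrative.

```python
import json
from pathlib import Path

def write_cues(input_path: str, duration: float, mode: str,
               params: dict, times: list[float]) -> Path:
    # <input-stem>.cues.json next to the input.
    out_path = Path(input_path).with_suffix(".cues.json")
    sidecar = {
        "input": str(Path(input_path).resolve()),
        "duration": duration,
        "mode": mode,
        "params": params,
        # For pitch mode, extend each entry with the Hz value and note name.
        "cues": [{"t": round(t, 2), "label": mode} for t in times],
        "count": len(times),
    }
    out_path.write_text(json.dumps(sidecar, indent=2))
    return out_path
```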
5. Report: `<input> (<duration>s) → <output> (<n> cues, mode <mode>)`.

Tips:

- For voice, prefer `-O energy` and a higher `-M` (2–5 s) — voice has many micro-onsets you don't want as cues.
- Denoise first (`/audio-production:denoise`) — background noise inflates onset counts.
- The cue JSON is consumed by `assemble-episode` for crossfade alignment, and by `suggest-title-description` as a complement to transcript-derived chapters.