From claude-transcription
Cluster unique voices in an audio recording and extract a short sample of each to a file, then prompt the user to label them. Feeds diarization in downstream transcription. Use when the user asks to identify speakers, extract voice samples, prep for diarization, or label voices.
npx claudepluginhub danielrosehill/claude-code-plugins --plugin claude-transcription

This skill uses the workspace's default tool permissions.
Cluster distinct voices in a recording, save a representative clip of each, and ask the user to put names to voices. The labeled samples feed diarized transcription.
pyannote.audio (best quality, requires HuggingFace token for pyannote/speaker-diarization-3.1)
uv run --with pyannote.audio --with torch --with torchaudio python <<'PY'
import os, torchaudio
from pyannote.audio import Pipeline

pipe = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1",
                                use_auth_token=os.environ["HF_TOKEN"])
diar = pipe("INPUT.wav")

# For each speaker, save a 5 s clip from their first turn longer than 5 s.
waveform, sr = torchaudio.load("INPUT.wav")
saved = set()
for turn, _, speaker in diar.itertracks(yield_label=True):
    if speaker in saved or turn.duration <= 5:
        continue
    start = int(turn.start * sr)
    torchaudio.save(f"{speaker}.wav", waveform[:, start:start + 5 * sr], sr)
    saved.add(speaker)
PY
resemblyzer (lighter, no auth token needed); use as a fallback when no HF token is available.
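The resemblyzer fallback works by embedding short windows of audio (its VoiceEncoder yields one vector per window) and then grouping similar vectors into speakers. As a minimal sketch of that clustering step only, here is a greedy cosine-similarity grouping over synthetic embeddings; the 0.75 threshold and the 2-dimensional vectors are illustrative assumptions, not resemblyzer defaults, and each cluster is represented by its first member rather than a running centroid:

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cluster(embeddings, threshold=0.75):
    # Greedily assign each embedding to the most similar existing
    # cluster (represented by its first member); otherwise open a
    # new cluster, i.e. a new presumed speaker.
    reps, labels = [], []
    for emb in embeddings:
        best, best_sim = None, threshold
        for i, rep in enumerate(reps):
            sim = cosine(emb, rep)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            reps.append(list(emb))
            labels.append(len(reps) - 1)
        else:
            labels.append(best)
    return labels

# Two synthetic "voices" as near-orthogonal directions.
embs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(cluster(embs))  # → [0, 0, 1, 1]
```

In practice a hierarchical clustering with a tuned threshold is more robust, but the idea is the same: windows whose embeddings point the same way belong to the same voice.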
Outputs: speakers/<source_stem>_speaker_<id>.wav clips extracted via ffmpeg, plus speakers/<source_stem>_speakers.json with {"speaker_1": {"sample": "...", "label": null}, ...}. The user updates the JSON or renames the sample files. Downstream transcription skills (e.g. transcribe-assemblyai) consume _speakers.json for speaker name mapping.
Directory: speakers/ next to the source audio.
Files: <stem>_speaker_1.wav, <stem>_speaker_2.wav, ..., <stem>_speakers.json.
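Putting the output layout together, the skill needs one ffmpeg cut per speaker plus the label manifest. A small sketch of that planning step, assuming a hypothetical source file "meeting.wav" and made-up turn start times (the real times come from the diarization pass); `-ss` before `-i` seeks fast and `-t 5` caps each clip at 5 seconds:

```python
import json
from pathlib import Path

def plan_samples(source, turns, out_dir="speakers"):
    # turns maps speaker id -> start (seconds) of that speaker's
    # first turn longer than 5 s. Returns the ffmpeg commands to
    # run and the manifest dict to write as <stem>_speakers.json.
    src, out = Path(source), Path(out_dir)
    cmds, manifest = [], {}
    for speaker, start in turns.items():
        clip = out / f"{src.stem}_{speaker}.wav"
        cmds.append(["ffmpeg", "-y", "-ss", str(start), "-t", "5",
                     "-i", str(src), str(clip)])
        manifest[speaker] = {"sample": str(clip), "label": None}
    return cmds, manifest

cmds, manifest = plan_samples("meeting.wav",
                              {"speaker_1": 12.4, "speaker_2": 87.0})
print(json.dumps(manifest, indent=2))
```

Each command would then be executed (e.g. via subprocess.run) and the manifest written next to the clips; `"label": None` serializes to JSON `null`, the slot the user fills in with a name.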