Clean up a raw whisper transcript (SRT or TXT) — strip filler words, fix recurring mistranscriptions via a per-project glossary, and (optionally) re-flow line breaks. Use when the user says "clean this transcript", "remove the ums", "fix the transcription", "tidy up the SRT before burning". Pairs with burn-subtitles (run before burning) and generate-deliverables.
npx claudepluginhub danielrosehill/claude-code-plugins --plugin video-editing
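Two-stage cleanup: a heuristic pass (filler removal, stutter collapse, glossary substitutions), then an optional LLM polish.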
Output is a parallel file: <basename>.clean.srt (or .clean.txt). Original is never modified.
| Field | Default |
|---|---|
| Source | required (.srt or .txt) |
| Mode | heuristic (default) / llm / both |
| Glossary | <project>/subtitles/glossary.json if present; or path supplied inline |
| Output | sibling <basename>.clean.<ext> |
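As a minimal sketch of the Output default, the sibling path can be derived from the source path. The function name `clean_output_path` is hypothetical, and rejecting other extensions is an assumption based on the Source row.

```python
from pathlib import Path

# Extensions the Source field accepts, per the table above.
SUPPORTED = {".srt", ".txt"}


def clean_output_path(source: str) -> Path:
    """Return the sibling <basename>.clean.<ext> path for a source transcript.

    Hypothetical helper: the name and the ValueError are illustrative choices.
    """
    src = Path(source)
    if src.suffix.lower() not in SUPPORTED:
        raise ValueError(f"expected .srt or .txt, got {src.suffix!r}")
    # Same directory, same extension, ".clean" inserted before the extension.
    return src.with_name(f"{src.stem}.clean{src.suffix}")
```

Because the result is always a new file name, the original is never opened for writing.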
Default list (case-insensitive, word-boundary):
um, uh, erm, ah, like, you know, sort of, kind of, basically, literally, I mean, right
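Don't blanket-strip: like and right have legitimate uses. Strip such ambiguous words only when they are set off by sentence-internal commas mid-cue ("..., like, ..."); pure hesitation sounds (um, uh, erm) can be removed wherever they appear, as the awk pass below does. For SRT input, operate on the per-cue text without touching the index or timing lines.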
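```bash
# Per-cue text only: preserve the "N\n00:00:01,000 --> 00:00:03,000\n..." structure.
# Needs GNU awk (gawk): \< \> word boundaries and IGNORECASE are gawk extensions.
gawk 'BEGIN{RS=""; ORS="\n\n"; IGNORECASE=1} {   # RS="" reads one cue per record
  n=split($0, lines, "\n");
  for(i=3;i<=n;i++) {                             # lines 1-2 are index and timing
    gsub(/\<(um|uh|erm)\>[ ,]*/, "", lines[i])
    gsub(/ +/, " ", lines[i])
  }
  # printf does not append ORS, so emit it after the last line of each cue
  for(i=1;i<=n;i++) printf "%s%s", lines[i], (i==n?ORS:"\n")
}' "$SRC" > "$OUT"
```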
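The awk pass only covers the unambiguous fillers. For words such as like and right, the comma rule described above could be sketched as follows. This is an illustration in Python rather than awk; the function name and the choice of which default-list entries count as ambiguous are assumptions.

```python
import re

# Assumption: every default-list entry other than um/uh/erm/ah is "ambiguous".
AMBIGUOUS_FILLERS = [
    "you know", "sort of", "kind of", "basically",
    "literally", "I mean", "like", "right",
]

# Match ", filler" only when another comma follows, i.e. the filler is
# set off by commas on both sides. The lookahead leaves the trailing comma
# in place so that consecutive fillers are caught in a single pass.
_COMMA_FILLER = re.compile(
    r",\s*(?:" + "|".join(re.escape(f) for f in AMBIGUOUS_FILLERS) + r")\s*(?=,)",
    re.IGNORECASE,
)


def strip_comma_fillers(text: str) -> str:
    """Remove ambiguous fillers only where commas delimit them on both sides."""
    return _COMMA_FILLER.sub("", text)
```

For example, `strip_comma_fillers("It was, like, really good")` gives `"It was, really good"`, while `"I like the right one"` is left alone because no commas surround the words.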
Collapse adjacent identical short tokens: the the cat → the cat. Only collapse 1–4 character tokens (don't merge really really good).
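One way to express this rule is a single backreference regex. The sketch below is in Python; `collapse_stutters` is a hypothetical name, and matching case-insensitively (so that "The the" also collapses) is an assumption.

```python
import re

# A 1-4 character word followed by one or more repeats of itself.
# \b on both sides keeps "the then" and "really really" from matching.
_STUTTER = re.compile(r"\b(\w{1,4})(?:\s+\1\b)+", re.IGNORECASE)


def collapse_stutters(text: str) -> str:
    """Collapse runs of an identical short token down to its first occurrence."""
    return _STUTTER.sub(r"\1", text)
```

Note that a purely lexical rule like this also collapses legitimate repeats such as "that that", so the sampled diff is worth checking for those.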
glossary.json holds a substitutions array of {wrong, right} pairs. Apply each as a literal substitution (not regex) unless explicitly marked; an entry with case_insensitive set to true matches regardless of case.
```json
{
  "substitutions": [
    { "wrong": "claud", "right": "Claude" },
    { "wrong": "ml flow", "right": "MLflow" },
    { "wrong": "kdenlive", "right": "Kdenlive", "case_insensitive": true }
  ]
}
```
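Walk the array and apply each pair with sed, escaping any regex metacharacters in wrong so the match stays literal. For SRT, again confine the substitutions to the text lines.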
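For illustration, here is the same walk sketched in Python rather than sed, so the JSON can be parsed directly. `apply_glossary` is a hypothetical name. The whole-word guard is an assumption added so that "claud" does not turn "claudia" into "Claudeia"; it is not something the glossary format specifies.

```python
import json
import re
from typing import Dict, List, Tuple


def apply_glossary(text: str, substitutions: List[Dict]) -> Tuple[str, int]:
    """Apply each {wrong, right} pair literally; return (new text, number of subs)."""
    total = 0
    for entry in substitutions:
        flags = re.IGNORECASE if entry.get("case_insensitive") else 0
        # re.escape keeps the match literal. The (?<!\w)/(?!\w) guards are an
        # assumption of this sketch: match whole words only.
        pattern = re.compile(
            r"(?<!\w)" + re.escape(entry["wrong"]) + r"(?!\w)", flags
        )
        # A function replacement avoids backslash handling in the "right" string.
        text, count = pattern.subn(lambda _m, right=entry["right"]: right, text)
        total += count
    return text, total


GLOSSARY = json.loads("""
{
  "substitutions": [
    { "wrong": "claud", "right": "Claude" },
    { "wrong": "ml flow", "right": "MLflow" },
    { "wrong": "kdenlive", "right": "Kdenlive", "case_insensitive": true }
  ]
}
""")
```

The returned count is what would feed the "applied N glossary subs" figure in the report. For SRT input the function would be called on each cue's text lines only.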
When mode is llm or both, ask Claude to rewrite cue text in place. Constraints:
Process cues in batches of ~30 to keep context tight. After each batch, validate that cue count and timestamp lines are unchanged before writing.
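The batching and per-batch check might look like the sketch below. All names are hypothetical, and `rewrite` stands in for whatever performs the LLM call; it is passed in as a plain callable so the validation logic can be shown without assuming any particular API.

```python
import re
from typing import Callable, Iterator, List, Tuple

Cue = Tuple[str, str, str]  # (index line, timing line, text)


def parse_srt(srt: str) -> List[Cue]:
    """Split SRT content into cues on blank lines."""
    cues = []
    for block in re.split(r"\n{2,}", srt.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        if len(lines) >= 2:
            cues.append((lines[0], lines[1], "\n".join(lines[2:])))
    return cues


def batches(cues: List[Cue], size: int = 30) -> Iterator[List[Cue]]:
    """Yield consecutive slices of at most `size` cues."""
    for start in range(0, len(cues), size):
        yield cues[start:start + size]


def batch_is_intact(before: List[Cue], after: List[Cue]) -> bool:
    """True when cue count, index lines and timing lines are all unchanged."""
    return len(before) == len(after) and all(
        b[:2] == a[:2] for b, a in zip(before, after)
    )


def polish_in_batches(
    cues: List[Cue],
    rewrite: Callable[[List[Cue]], List[Cue]],
    size: int = 30,
) -> List[Cue]:
    """Rewrite cue text batch by batch, refusing any batch that alters structure."""
    polished: List[Cue] = []
    for batch in batches(cues, size):
        rewritten = rewrite(batch)
        if not batch_is_intact(batch, rewritten):
            raise ValueError(f"batch starting at cue {batch[0][0]} changed structure")
        polished.extend(rewritten)
    return polished
```

Raising on the first bad batch means nothing is written for a run in which the rewrite dropped, merged, or retimed a cue.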
Show before/after stats and a small sampled diff (5 random cues):
```text
Source : in.srt — 142 cues, 8.2 KB
Output : in.clean.srt — 142 cues, 7.4 KB
Heuristic: removed 38 fillers, 6 stutters, applied 12 glossary subs
LLM polish: 142 cues rewritten

Sample diff (cue 27):
- Um, so the the thing is, you know, we want to..
+ The thing is, we want to..
```
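A sampled diff in this shape could be produced roughly as follows. `sample_diff` is a hypothetical name, and sampling only from cues whose text actually changed is an assumption (unchanged cues would make for an empty diff).

```python
import random
from typing import List, Optional


def sample_diff(
    before: List[str],
    after: List[str],
    k: int = 5,
    seed: Optional[int] = None,
) -> List[str]:
    """Return report lines for up to k randomly chosen changed cues."""
    if len(before) != len(after):
        raise ValueError("cue counts differ; cannot diff cue by cue")
    changed = [i for i, (b, a) in enumerate(zip(before, after)) if b != a]
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    picked = sorted(rng.sample(changed, min(k, len(changed))))
    lines: List[str] = []
    for i in picked:
        # Cue numbers in the report are 1-based, matching SRT indices.
        lines += [f"Sample diff (cue {i + 1}):", f"- {before[i]}", f"+ {after[i]}"]
    return lines
```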
Always validate that the cue count is unchanged. If it isn't, abort and keep the .tmp file for inspection.
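A minimal sketch of that final check, assuming the cleaned output is first written to a .tmp file and only renamed into place once it passes. The function names and the timing-line regex used for counting are illustrative choices.

```python
import os
import re
from pathlib import Path

# One timing line per cue, e.g. "00:00:01,000 --> 00:00:03,000".
_TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}", re.MULTILINE
)


def count_cues(srt_text: str) -> int:
    """Count cues by counting timing lines."""
    return len(_TIMING.findall(srt_text))


def commit_if_intact(source: Path, tmp: Path, final: Path) -> bool:
    """Move tmp to final only if the cue count matches the source.

    On a mismatch the .tmp file is left where it is for inspection.
    """
    before = count_cues(source.read_text(encoding="utf-8"))
    after = count_cues(tmp.read_text(encoding="utf-8"))
    if before != after:
        return False
    os.replace(tmp, final)
    return True
```

Counting timing lines rather than blank-line-separated blocks keeps the check independent of trailing whitespace differences between the two files.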