AI-powered media transcription, translation, caption conversion, and audio-text alignment
npx claudepluginhub lattifai/omni-captions-skills
Captions Made Easy — Claude Code Caption Skills
"I need bilingual captions for this Fireship vibe coding video https://youtube.com/watch?v=Tw18-4U7mts"
One sentence. Claude handles the download, transcription, and translation.
npx skills add https://github.com/lattifai/omni-captions-skills
Claude Code Plugin System:
/plugin marketplace add lattifai/omni-captions-skills
/plugin install omnicaptions@lattifai-omni-captions-skills
Local Development:
git clone https://github.com/lattifai/omni-captions-skills.git
claude --plugin-dir ./omni-captions-skills
❯ Make bilingual captions for this Fireship vibe coding video https://youtube.com/watch?v=Tw18-4U7mts
1
00:00:00,000 --> 00:00:03,200
Mass hysteria satisfies a deep human need.
群体性癔症满足了人类某种深层需求。
2
00:00:03,200 --> 00:00:07,440
Vibe coding is programming without actually writing any code yourself.
Vibe coding 就是不用自己写代码的编程方式。
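Each cue in the sample output holds the original line followed by its translation. If you want to post-process such a file, here is a minimal Python sketch. It assumes standard SRT with blank lines between cues and exactly this two-line layout; `Cue` and `parse_bilingual_srt` are illustrative names, not part of omnicaptions.

```python
import re
from dataclasses import dataclass

# Matches an SRT timing line such as "00:00:03,200 --> 00:00:07,440".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    index: int
    start: float  # seconds
    end: float  # seconds
    original: str
    translation: str


def _seconds(h, m, s, ms):
    """Convert SRT time fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_bilingual_srt(text):
    """Parse SRT text whose cues hold an original line, then a translated line.

    Assumption: cues are separated by blank lines and the first two text
    lines are (original, translation). Cues that do not fit are skipped.
    """
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) < 4:
            continue  # not a bilingual cue
        match = TIMING.fullmatch(lines[1])
        if match is None:
            continue  # malformed timing line
        parts = match.groups()
        cues.append(
            Cue(
                index=int(lines[0]),
                start=_seconds(*parts[:4]),
                end=_seconds(*parts[4:]),
                original=lines[2],
                translation=lines[3],
            )
        )
    return cues
```

Calling `parse_bilingual_srt(open("captions.srt", encoding="utf-8").read())` returns a list of `Cue` objects, which makes it easy to, for example, export only the translated lines.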
| Skill | Description |
|---|---|
| transcribe | YouTube/video → Markdown with timestamps |
| translate | Translate captions, bilingual output supported |
| convert | Convert between 30+ caption formats |
| download | Download YouTube video/audio/captions |
| LaiCut | Forced alignment, word-level timing accuracy |
Invoke via /omnicaptions:transcribe or /omnicaptions-transcribe
Standard transcription gives only approximate timestamps. LaiCut uses the LattifAI Lattice-1 model to match text precisely to the audio waveform, achieving word-level accuracy.
Install LaiCut:
# Using uv (recommended, auto-configures package index)
uv pip install "omni-captions-skills[laicut]" --extra-index-url https://lattifai.github.io/pypi/simple/
# Using pip
pip install "omni-captions-skills[laicut]" --extra-index-url https://lattifai.github.io/pypi/simple/
Supported languages: English, Chinese, German, and mixed-language audio
Recommended workflow: align before translating, because translated text no longer matches the original audio.
| Feature | API Key | Note |
|---|---|---|
| Translation | None required | Uses Claude by default, works out of the box |
| Transcription | Gemini API | Optional, only needed for transcription |
| LaiCut alignment | LattifAI API | Optional, only needed for precise alignment |
Gemini is used only for video transcription. When a video has no captions, you'll be asked whether to transcribe it, and you can configure the Gemini key at that point. Translation uses Claude by default and works out of the box.
You are prompted for API keys automatically when they are needed, and they are saved to ~/.config/omnicaptions/config.json
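To check which settings have already been saved without printing any secret values, something like the following Python sketch may help. It assumes only that the file is a JSON object; the key names inside it are not documented here, so the helper (a hypothetical `configured_keys`) just lists whatever top-level names it finds.

```python
import json
from pathlib import Path

# Path taken from the README; everything about the file's schema is unknown.
CONFIG_PATH = Path.home() / ".config" / "omnicaptions" / "config.json"


def configured_keys(path=CONFIG_PATH):
    """Return the sorted top-level key names in the config file.

    Values are never returned, so API keys are not exposed.
    Returns an empty list if the file is missing or is not a JSON object.
    """
    path = Path(path)
    if not path.exists():
        return []
    data = json.loads(path.read_text(encoding="utf-8"))
    return sorted(data) if isinstance(data, dict) else []
```

An empty list from `configured_keys()` suggests that no key has been saved yet and that you will be prompted on first use.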
# With captions: download → align → translate
omnicaptions download "https://youtube.com/watch?v=xxx"
omnicaptions LaiCut video.mp4 video.en.vtt -o video_LaiCut.srt
omnicaptions translate video_LaiCut.srt -l zh --bilingual
# Without captions: transcribe → align → translate
omnicaptions transcribe video.mp4
omnicaptions LaiCut video.mp4 video_GeminiUnd.md -o video_LaiCut.srt
omnicaptions translate video_LaiCut.srt -l zh --bilingual
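The two workflows differ only in their first step, so they can be scripted together. The Python sketch below builds the same commands shown above and prints them by default. `build_commands` and `run` are hypothetical helper names, and the file names produced by the download and transcribe steps are assumptions copied from the examples, so pass in whatever your run actually produces.

```python
import shlex
import subprocess


def build_commands(video, captions, aligned, target_lang, url=None):
    """Return argv lists for the align-then-translate workflow.

    With `url`, the first step downloads the video and its captions;
    without it, the first step transcribes the local video instead.
    Only subcommands and flags shown in the README are used.
    """
    if url:
        first = ["omnicaptions", "download", url]
    else:
        first = ["omnicaptions", "transcribe", video]
    return [
        first,
        ["omnicaptions", "LaiCut", video, captions, "-o", aligned],
        ["omnicaptions", "translate", aligned, "-l", target_lang, "--bilingual"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(argv, check=True)
```

For example, `run(build_commands("video.mp4", "video_GeminiUnd.md", "video_LaiCut.srt", "zh"))` prints the three commands of the no-captions path. Pass `dry_run=False` to execute them once the printed commands look right.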
Credits: @dotey for the transcription prompt | Built on lattifai-captions