By lattifai
Audio-text alignment, transcription, translation, karaoke, and subtitle toolkit. Built on the Agent Skills standard — works in Claude Code, Codex CLI, Gemini CLI, and any agent that loads SKILL.md files. Powered by the LattifAI Lattice-1 forced-alignment model.
npx claudepluginhub lattifai/lattifai-skills --plugin lattifai-skills
Align existing captions to audio/video with word-level precision using the Lattice-1 model. Trigger when the user has both a media file AND a caption/transcript that need to be synchronized, or says "fix caption timing", "字幕对不上", "对齐字幕", "word-level timestamps", "karaoke timing", "timestamps are off". Do NOT trigger without existing text; use `/lai-transcribe` first.
Convert between 30+ caption/subtitle formats (SRT, VTT, ASS, JSON, TextGrid, LRC, FCPXML, Premiere, …) and shift timing. Trigger on "convert captions", "SRT to VTT", "转换字幕格式", "shift timing", "ASS styling", "karaoke effect", "导入Premiere", or any caption-format question. Do NOT trigger to fix timing accuracy (`/lai-align`) or translate (`/lai-translate`).
Identify speakers ("who said what") in aligned captions via pyannote.audio. Real speaker names come from the agent's own reasoning over transcript + context (default), with a CLI-LLM fallback for headless runs. Trigger on multi-speaker content (podcasts, interviews, meetings) or phrases like "diarize", "speaker detection", "说话人识别", "区分说话人", "label the speakers". Requires aligned captions — run `/lai-align` first.
Generate viral-ready karaoke / lyric-video subtitles with per-word highlighting in one command. Adaptive font size from media resolution. Ready-made style presets for TikTok, Reels, Shorts, 抖音, 小红书, YouTube, and cinematic long-form. Trigger on "karaoke 字幕", "卡拉OK", "lyric video", "逐词歌词", "按字高亮", "TikTok 字幕", "抖音字幕", "Reels caption", "Shorts caption", "小红书歌词", "viral subtitles", "爆款字幕", or when a song / podcast / video clip is given and per-word highlighted captions are asked for. For general caption format conversion without karaoke styling, use /lai-caption.
Install LattifAI and get a free trial API key. Trigger on first mention of LattifAI / `lai`, authentication errors, trial requests, or "how do I get started with alignment" / "我想做字幕对齐" before `lai --version` is verified. Do NOT trigger when already authenticated — route to `/lai-align`, `/lai-transcribe`, etc.
Audio-text alignment, transcription, translation, karaoke, and subtitle toolkit for Claude Code — and also installable in OpenAI Codex CLI and Gemini CLI. Powered by the LattifAI Lattice-1 forced-alignment model.
Each skill follows the Agent Skills standard — a self-contained SKILL.md per capability that any compatible agent can discover and load. Verified install paths for Claude Code, Codex CLI, and Gemini CLI are documented below.
Nine composable skills cover the full pipeline from a raw YouTube URL to multilingual, per-word-highlighted, production-grade captions.
Pick the flow that matches your agent.
Claude Code (/plugin marketplace)
This repo ships both a plugin manifest and a plugin marketplace at the root. Inside any Claude Code session:
/plugin marketplace add lattifai/lattifai-skills
/plugin install lattifai-skills@lattifai-skills
/reload-plugins
Or non-interactively:
claude plugin marketplace add lattifai/lattifai-skills
claude plugin install lattifai-skills@lattifai-skills
All skills become available under the plugin namespace:
/lattifai-skills:lai-karaoke https://youtu.be/VIDEO_ID
Codex CLI (codex plugin marketplace)
Codex CLI accepts the same <owner>/<repo> source format and reads the .claude-plugin/marketplace.json at the repo root, so registration is one command:
codex plugin marketplace add lattifai/lattifai-skills
The marketplace is registered in ~/.codex/config.toml and skills become available in your next Codex session — no separate install step.
Gemini CLI (gemini skills install)
Gemini CLI installs skills directly from a git URL. Because the skills live under skills/<name>/SKILL.md (not at the repo root), pass --path skills to install all 9 in one call:
# Install all 9 skills from the bundle
gemini skills install https://github.com/lattifai/lattifai-skills --path skills
# Or for live development against a local checkout (auto-reloads on edit)
git clone https://github.com/lattifai/lattifai-skills.git
gemini skills link ./lattifai-skills/skills
Manual install (SKILL.md directory)
Each skills/<name>/ directory is a self-contained skill following the Agent Skills standard. Copy whichever ones you want to wherever your agent watches:
git clone https://github.com/lattifai/lattifai-skills.git
mkdir -p .claude/skills && cp -r lattifai-skills/skills/lai-karaoke .claude/skills/
Common destinations:
| Agent | Drop-in path |
|---|---|
| Claude Code (project) | .claude/skills/<name>/SKILL.md |
| Claude Code (personal) | ~/.claude/skills/<name>/SKILL.md |
| Cursor / Continue / custom agents | wherever your agent loads skills from |
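To copy every skill rather than a single one, the cp command above can be wrapped in a loop. The following is a minimal sketch, not something shipped in the repository: copy_all_skills is a hypothetical helper name, and it assumes each skill is a directory under skills/ that contains a SKILL.md, as described above.

```shell
# Hypothetical helper (not part of lattifai-skills): copy every directory
# that contains a SKILL.md from a local checkout into a skills folder.
copy_all_skills() {
  src="$1"   # e.g. lattifai-skills/skills
  dest="$2"  # e.g. .claude/skills
  mkdir -p "$dest" || return 1
  for dir in "$src"/*/; do
    # Skip anything that is not a skill (no SKILL.md inside).
    [ -f "${dir}SKILL.md" ] || continue
    # Strip the trailing slash so BSD and GNU cp both copy the directory
    # itself rather than only its contents.
    cp -r "${dir%/}" "$dest"/ || return 1
  done
}
```

After cloning, it could be called as copy_all_skills lattifai-skills/skills .claude/skills for a project-level install, or with ~/.claude/skills as the destination for a personal one.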
Edits to .claude/skills/ are picked up live within the current Claude Code session, so no reload command is needed. Other install sources and agents follow their own conventions: /reload-plugins in Claude Code for plugin-dir and marketplace sources, gemini skills enable/disable for Gemini CLI, and a restart for agents without hot-reload.
# Claude Code:
/plugin marketplace update lattifai-skills
# Codex CLI:
codex plugin marketplace upgrade lattifai-skills
# Gemini CLI:
gemini skills install https://github.com/lattifai/lattifai-skills --path skills
Different skills have different dependencies — install what you need.
Important: the lattifai package depends on lattifai-core, which is hosted on the LattifAI PyPI mirror. Always include --extra-index-url https://lattifai.github.io/pypi/simple/ when installing.
# pip (default, takes 5–15 min on first install)
pip install "lattifai[all]" --extra-index-url https://lattifai.github.io/pypi/simple/
# uv — recommended for ~10–15× faster install (≈1 min on broadband)
uv pip install "lattifai[all]" --extra-index-url https://lattifai.github.io/pypi/simple/
uv pip install --reinstall-package lattifai "lattifai[all]" \
--extra-index-url https://lattifai.github.io/pypi/simple/
# (The second uv command is a workaround for a known entry-point name
# conflict — re-installs lattifai last so its `lai` console script wins.
# See `/lai-setup` for details.)
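Once either install finishes, it may be worth confirming that the lai console script is actually reachable before running any skill. The sketch below relies only on the lai --version check mentioned in the setup skill description; check_lai is a hypothetical helper name, and the error message wording is an assumption.

```shell
# Hypothetical helper (not part of the lattifai package): report whether
# the `lai` console script is reachable and print its version if so.
check_lai() {
  if command -v lai >/dev/null 2>&1; then
    lai --version
  else
    echo "lai not found on PATH; re-run the install command above" >&2
    return 1
  fi
}
```

Calling check_lai prints the installed version on success and returns a non-zero status otherwise, which makes it usable as a guard in a setup script.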