From pika
Translates and dubs a video into another language while preserving each speaker's voice, with optional lipsync and bilingual subtitles.
npx claudepluginhub pika-labs/pika-plugins --plugin pika
Slash command
/pika:language-swap <video-url> --to <language> [--no-lipsync] [--no-bgm] [--bilingual-subtitles]
<!-- source-of-truth: pika-claude-plugin/skills/language-swap -->
Translate and dub a video into another language while preserving the original speaker's voice. Pipeline: dub (one worker call) → lipsync (default ON) → burn target-language captions or bilingual captions.
The dubbing worker does the heavy lifting in a single call: it transcribes, translates, preserves each speaker's voice server-side (no separate clone step), and returns a fully A/V-synced video — so there is no manual transcribe/clone/TTS/replace chain to manage and no duration-drift handling to do by hand.
Segmented / multi-language dub
Use this when the user wants different languages on different parts of one video (e.g. first half Spanish, second half Japanese), or wants to translate only some sections and keep the rest in the original voice. Both are the same thing: a timeline of segments, each tagged with a language; any uncovered range keeps the original audio.
mcp__plugin_pika_pika__dub_video takes a segments plan instead of target_language (pass exactly one — they are mutually exclusive):
mcp__plugin_pika_pika__dub_video(source_video_url=<video_url>, segments=[
{start_s: 0, end_s: 30, target_language: "es"},
{start_s: 30, end_s: 60, target_language: "ja"}
])
How to build the plan: the user needs to know where the content is before they can pick ranges, so transcribe first — extract the audio with mcp__claude_ai_pika__extract_audio_from_video, then mcp__claude_ai_pika__transcribe_audio(audio=<audio_url>, timestamps=true), show the user the timestamped segments, and let them say which time range goes to which language. Then assemble segments[] (seconds, ordered, non-overlapping) and make ONE dub_video call. There is no separate "video understanding" tool — the timestamped transcript is the understanding step.
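The "seconds, ordered, non-overlapping" requirement on segments[] can be checked before the single dub_video call. The following is a minimal sketch, not part of the plugin: the helper name build_segments and the tuple input shape are assumptions for illustration, and only the start_s / end_s / target_language keys come from the call shown above.

```python
from typing import TypedDict


class Segment(TypedDict):
    start_s: float
    end_s: float
    target_language: str


def build_segments(ranges: list[tuple[float, float, str]]) -> list[Segment]:
    """Sort (start_s, end_s, language) ranges and reject empty or overlapping ones.

    Hypothetical helper: the field names mirror the dub_video example,
    everything else is an illustrative assumption.
    """
    segments: list[Segment] = []
    for start_s, end_s, language in sorted(ranges, key=lambda r: r[0]):
        if start_s < 0 or end_s <= start_s:
            raise ValueError(f"invalid range {start_s}-{end_s}")
        # Touching ranges (end == next start) are allowed, as in the 0-30 / 30-60 example.
        if segments and start_s < segments[-1]["end_s"]:
            raise ValueError(f"range starting at {start_s} overlaps the previous one")
        segments.append(
            {"start_s": start_s, "end_s": end_s, "target_language": language}
        )
    return segments
```

Any range the user leaves out of the input is simply absent from the plan, which matches the rule that uncovered ranges keep the original audio.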
Behavior of the segmented path:
- Each speaker's voice is preserved server-side within the single dub_video call. You never clone or delete a voice yourself.
- Language coverage is documented in references/language-coverage.md.
- target_language echoes the covered languages comma-joined (e.g. "spa,jpn"), and no single transcript_language is returned (the track is multi-language).
- Lipsync (Step 2, default ON, ≤5 min) still runs on the whole dubbed video.
- For captions (Step 3), use the returned multi-language subtitles[] in caption_mode="manual"; auto re-transcription can't pick a single language for a mixed track.

If mcp__plugin_pika_pika__dub_video rejects segments (older deployment without segmented support), fall back to dubbing each range single-language and concatenating, but prefer the one-call segmented path when available.
- Target language: --to <language>. Prefer language codes: es, fr, ja, de, pt-BR, zh-Hans. The dubbing worker accepts ISO/BCP-47-like tags and normalizes script/region subtags before calling ElevenLabs (for example zh-Hans → zh; zh-Hant-TW → zh).
- Lipsync is ON by default. Pass --no-lipsync to skip it when the source has no on-camera face or to avoid the meaningful cost (~$4/min on the sync-2-pro tier). Applies only to videos ≤5 min: edit_lipsync hard-caps at 300 s upstream, so longer sources auto-skip lipsync (see Step 2); the dub itself has no length limit.
- Pass --no-bgm for a translate-only output: the worker drops the original music and keeps only the translated speech (drop_background_audio=true).
- Set bilingual_subtitles=true when the user passes --bilingual-subtitles or asks for "bilingual subtitles", "dual subtitles", "two-language captions", "original + translated subtitles", "双语字幕", or "原文+译文字幕".
- Language coverage is documented in references/language-coverage.md. Do not proactively surface provider-specific language-list details in normal user replies.

Variables:
- video_url: input, from the positional arg
- source_input_url: original positional URL, preserved for diagnostics if video_url is rehosted
- target_language: text, from --to <language>
- with_lipsync: boolean, defaults true; false only when --no-lipsync
- no_bgm: boolean, true when --no-bgm (maps to drop_background_audio=true)
- bilingual_subtitles: boolean, true when the user asks for bilingual / dual subtitles
- dubbed_video_url: dubbed, A/V-synced video, produced by Step 1
- dub_subtitles: optional target-language timed subtitles from the dub result, consumed by Step 3
- source_subtitles: optional source-language timed subtitles from the dub result, consumed by Step 3 for bilingual captions
- dub_transcript_srt: optional target-language SRT from the dub result, returned for review/debugging
- source_transcript_srt: optional source-language SRT from the dub result, returned for review/debugging
- source_transcript_language: optional source-language code from the dub result
- lipsynced_video_url: dubbed video with mouth re-matched, produced by Step 2 (when lipsync runs)
- caption_target_video_url: final visual video URL before captions are burned
- final_video_url: video with target-language captions burned in, produced by Step 3

Required:
- video_url: MUST be https://...
- --to <language>: target language (free-text or BCP-47 code)

Optional:
- --no-lipsync: skip the default mouth-matching step.
- --no-bgm: translate-only output; drop the original music/SFX bed.
- --bilingual-subtitles: burn source-language + target-language subtitle rows.

Infer bilingual_subtitles=true from user wording even if the explicit flag is absent.
If --to is missing, STOP and prompt the user — UNLESS the user wants different languages on different parts, or to translate only some sections: that is the per-range segmented path (see "Segmented / multi-language dub" above), which uses a segments plan instead of --to.
For the segmented path, first build the time-range plan: extract the audio with mcp__claude_ai_pika__extract_audio_from_video, then transcribe it with timestamps via mcp__claude_ai_pika__transcribe_audio(audio=<audio_url>, timestamps=true), show the user the timestamped segments, and capture which time range maps to which language into segments[].
Outputs: video_url, target_language, with_lipsync (default true), no_bgm (default false), bilingual_subtitles (default false).
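As an illustration of the parsing rules above, here is a minimal sketch of turning the slash-command tail into the listed variables. The parse_args function, the SwapArgs dataclass, and the separate user_message parameter are hypothetical names introduced for this example; the flags, defaults, and trigger phrases are the ones listed above.

```python
import shlex
from dataclasses import dataclass
from typing import Optional

# Phrases that imply bilingual_subtitles=true even without the explicit flag.
BILINGUAL_PHRASES = (
    "bilingual subtitles", "dual subtitles", "two-language captions",
    "original + translated subtitles", "双语字幕", "原文+译文字幕",
)


@dataclass
class SwapArgs:
    video_url: str
    target_language: Optional[str] = None  # None means: prompt, or use a segments plan
    with_lipsync: bool = True
    no_bgm: bool = False
    bilingual_subtitles: bool = False


def parse_args(command_tail: str, user_message: str = "") -> SwapArgs:
    """Parse '<video-url> --to <language> [flags]' (hypothetical helper)."""
    tokens = shlex.split(command_tail)
    if not tokens or not tokens[0].startswith("https://"):
        raise ValueError("video_url must be an https:// URL")
    args = SwapArgs(video_url=tokens[0])
    i = 1
    while i < len(tokens):
        token = tokens[i]
        if token == "--to":
            if i + 1 >= len(tokens):
                raise ValueError("--to needs a language")
            args.target_language = tokens[i + 1]
            i += 1
        elif token == "--no-lipsync":
            args.with_lipsync = False
        elif token == "--no-bgm":
            args.no_bgm = True
        elif token == "--bilingual-subtitles":
            args.bilingual_subtitles = True
        else:
            raise ValueError(f"unknown argument: {token}")
        i += 1
    # Infer the bilingual request from wording when the flag is absent.
    if any(phrase in user_message.lower() for phrase in BILINGUAL_PHRASES):
        args.bilingual_subtitles = True
    return args
```

A returned target_language of None is left for the caller to handle: either stop and prompt the user, or take the segmented path with a segments plan.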
Step 1: Dub (→ dubbed_video_url)
Call mcp__plugin_pika_pika__dub_video with:
- source_video_url: <video_url>
- target_language: <target_language> (ISO/BCP-47-like tag, e.g. es, pt-BR, zh-Hans)
- source_language: "auto"
- drop_background_audio: true only when no_bgm is set; otherwise omit (keeps the original music bed)

In Claude plugin installs the tool is exposed as mcp__plugin_pika_pika__dub_video. If your host exposes the same Pika server under a different local namespace, call that fully-qualified local tool with the same arguments. The Claude.ai connector surface may lag this plugin-only tool, so do not assume the connector prefix has it.
mcp__plugin_pika_pika__dub_video is worker-backed: if the response comes back as {task_id, status}, poll mcp__plugin_pika_pika__task_status until completed, then read the dubbed video from the result (video_url for a video source; audio_url for an audio source). Also capture optional subtitles[], transcript_srt, and transcript_language — these are target-language transcript metadata the dub worker produced, consumed in Step 3.
For bilingual captions, also capture optional source_subtitles[], source_transcript_srt, and source_transcript_language. These source-language transcript fields are best-effort. The dubbed media is still valid when transcript fields are absent.
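The poll-then-read flow described above might look like the sketch below. It is written against an injected get_status callable standing in for mcp__plugin_pika_pika__task_status, and it assumes the completed payload is a flat dict carrying the result fields named above; the real response shape may differ. The function names, poll interval, and poll limit are illustrative assumptions.

```python
import time
from typing import Any, Callable


def wait_for_dub(
    response: dict[str, Any],
    get_status: Callable[[str], dict[str, Any]],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
) -> dict[str, Any]:
    """Return the final dub payload, polling only for {task_id, status} responses."""
    task_id = response.get("task_id")
    if task_id is None:
        return response  # the tool answered synchronously
    status = response
    polls = 0
    while status.get("status") not in ("completed", "failed"):
        if polls >= max_polls:
            raise TimeoutError(f"dub task {task_id} did not finish")
        time.sleep(poll_seconds)
        status = get_status(task_id)
        polls += 1
    if status.get("status") == "failed":
        raise RuntimeError(f"dub task {task_id} failed")
    return status


def collect_dub_outputs(result: dict[str, Any]) -> dict[str, Any]:
    """Map result fields to the Step 1 outputs; transcript fields are best-effort."""
    return {
        # video_url for a video source, audio_url for an audio source
        "dubbed_video_url": result.get("video_url") or result.get("audio_url"),
        "dub_subtitles": result.get("subtitles"),
        "dub_transcript_srt": result.get("transcript_srt"),
        "source_subtitles": result.get("source_subtitles"),
        "source_transcript_srt": result.get("source_transcript_srt"),
        "source_transcript_language": result.get("source_transcript_language"),
    }
```

Missing transcript fields come back as None rather than raising, since the dubbed media is still valid without them.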
Source not worker-fetchable: if mcp__plugin_pika_pika__dub_video fails because the source URL cannot be fetched — especially HTTP 403 / 4xx, hotlink protection, UA-gated hosts (Wikimedia/news CDNs), or "Access Denied" errors — do not keep retrying the same call. Rehost first:
1. Download the source bytes in the client/host environment.
2. Call mcp__claude_ai_pika__upload_asset with the downloaded filename, MIME type, and exact byte size, then upload the bytes to the returned presigned URL.
3. Set source_input_url = <original URL> and replace video_url with the returned Pika CDN public_url. Do not construct CDN URLs manually.
4. Retry Step 1 once with the new video_url.

If the client/host also cannot download the source bytes, stop and tell the user the host blocks direct fetch; ask them to upload the file or provide a different URL.
Outputs: dubbed_video_url, dub_subtitles, source_subtitles, dub_transcript_srt, source_transcript_srt, source_transcript_language.
Step 2: Lipsync (→ lipsynced_video_url)
Default ON. Skip entirely when --no-lipsync is passed (then Step 3 captions dubbed_video_url directly).
Hard 5-minute cap — check duration before calling. mcp__claude_ai_pika__edit_lipsync enforces a 300-second (5-minute) audio limit upstream (sync.so) and rejects anything longer with invalid_input before billing; every variant tier shares the same cap, so falling back through tiers does NOT help. If the dubbed video's duration_seconds (returned by Step 1) is > 300, skip lipsync entirely, go straight to Step 3 captioning dubbed_video_url, and tell the user lipsync isn't available past 5 minutes (the dub itself works at any length). Only run the lipsync call below when duration_seconds ≤ 300.
Cost heads-up first. Lipsync is the dominant cost (~$4/min on the v2-pro tier). Before calling it, estimate from the dubbed video's duration_seconds (returned by Step 1) — ceil(duration_seconds / 60) × $4 — and send the user a one-line heads-up, e.g. "Lipsync on — ~2 min video, est. ~$8 (pass --no-lipsync to skip). Starting now." Then proceed straight into the call; this is a heads-up, not an approval gate.
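The duration check and the cost estimate can be combined into one small decision. This is a sketch using the numbers stated above (300-second cap, roughly $4 per started minute); the function name plan_lipsync and the returned dict shape are assumptions for illustration.

```python
import math

LIPSYNC_MAX_SECONDS = 300    # upstream cap described above
LIPSYNC_USD_PER_MINUTE = 4   # approximate rate quoted above


def plan_lipsync(duration_seconds: float, with_lipsync: bool = True) -> dict:
    """Decide whether to run lipsync and estimate its cost (hypothetical helper)."""
    if not with_lipsync:
        return {"run": False, "reason": "--no-lipsync was passed"}
    if duration_seconds > LIPSYNC_MAX_SECONDS:
        return {"run": False, "reason": "lipsync is not available past 5 minutes"}
    # ceil(duration_seconds / 60) x $4: every started minute is counted.
    estimated_usd = math.ceil(duration_seconds / 60) * LIPSYNC_USD_PER_MINUTE
    return {"run": True, "estimated_usd": estimated_usd}
```

For a 110-second dub this yields an estimate of $8, which lines up with the "~2 min video, est. ~$8" heads-up wording.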
Call mcp__claude_ai_pika__edit_lipsync(video_url=<dubbed_video_url>) with no audio_url — the worker syncs to the dubbed video's own embedded translated audio. Do not extract the audio just to feed it back in. (variant defaults to v2-pro, with sync-3 / v2 as fallbacks.)
Outputs: lipsynced_video_url (read from url of response). When this step runs, Step 3 captions this video, not dubbed_video_url — otherwise the lip-matching is dropped.
Step 3: Captions (→ final_video_url)
Caption the final video so the output carries readable subtitles (matches the common "translate + subtitle" expectation). Set caption_target_video_url to lipsynced_video_url when lipsync ran (the default), or dubbed_video_url when --no-lipsync skipped it.
If this request is part of a Double video / split-screen comparison flow, build that Double video first and set caption_target_video_url to the final composed video URL. Do not burn captions onto only one panel before the Double video is composed; the bilingual caption burn should happen once, on the final visual output.
Call mcp__claude_ai_pika__add_captions once on caption_target_video_url.
When bilingual_subtitles=true, use manual bilingual mode if both tracks are available: call mcp__claude_ai_pika__add_captions(video_url=<caption_target_video_url>, caption_mode="manual", subtitles=<dub_subtitles>, secondary_subtitles=<source_subtitles>, language=<target_language>, secondary_language=<source_transcript_language if available>, secondary_subtitles_position="below", style="branded-space-mono", position="bottom"). The target-language (translated) row is the primary subtitles and renders on top; the source-language (original) row is the secondary reference and renders below it (secondary_subtitles_position="below") — after dubbing the translated speech is what's actually spoken, so it leads. It works for every dub worker provider branch as long as mcp__plugin_pika_pika__dub_video returns both subtitle tracks.
If bilingual_subtitles=true but source_subtitles is missing, fall back to target-language captions only and tell the user the source transcript was unavailable from the dubbing provider. Do not invent a source-language row by retranscribing the final dubbed audio; that audio is already in the target language.
For target-language-only captions, prefer the target-language subtitles the dub worker already returned: if dub_subtitles is non-empty, call mcp__claude_ai_pika__add_captions(video_url=<caption_target_video_url>, caption_mode="manual", subtitles=<dub_subtitles>, language=<target_language>, style="branded-space-mono", position="bottom"). Manual mode skips a duplicate transcription pass and preserves the dubbing provider's target-language text.
If dub_subtitles is missing, empty, or rejected by mcp__claude_ai_pika__add_captions, fall back to auto: call mcp__claude_ai_pika__add_captions(video_url=<caption_target_video_url>, caption_mode="auto", language=<target_language>, style="branded-space-mono", position="bottom"). Auto mode re-transcribes the dubbed audio; use it only as the fallback because it costs extra time and can introduce CJK/proper-noun drift.
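The three caption paths (bilingual manual, target-only manual, auto fallback) reduce to one argument-building step. The sketch below assembles the keyword arguments for mcp__claude_ai_pika__add_captions using only the parameter names quoted above; the helper name caption_args is hypothetical, and the actual tool call is left to the host.

```python
from typing import Any, Optional


def caption_args(
    caption_target_video_url: str,
    target_language: str,
    dub_subtitles: Optional[list] = None,
    source_subtitles: Optional[list] = None,
    source_transcript_language: Optional[str] = None,
    bilingual_subtitles: bool = False,
    style: str = "branded-space-mono",
) -> dict[str, Any]:
    """Pick the caption mode and build the add_captions arguments (illustrative)."""
    args: dict[str, Any] = {
        "video_url": caption_target_video_url,
        "language": target_language,
        "style": style,
        "position": "bottom",
    }
    if not dub_subtitles:
        # No usable target-language track: fall back to re-transcription.
        args["caption_mode"] = "auto"
        return args
    args["caption_mode"] = "manual"
    args["subtitles"] = dub_subtitles
    if bilingual_subtitles and source_subtitles:
        # Translated row leads; the original row renders below it.
        args["secondary_subtitles"] = source_subtitles
        args["secondary_subtitles_position"] = "below"
        if source_transcript_language:
            args["secondary_language"] = source_transcript_language
    return args
```

If add_captions rejects the manual subtitles at call time, calling the helper again with dub_subtitles=None produces the auto-mode arguments for the fallback.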
Use style="branded-space-mono" unless the user asks for a punchier style (tiktok / hormozi / karaoke). Skip this step only if the user explicitly asked for audio-only dubbing with no captions.
Outputs: final_video_url (read from url of response).
Reply with final_video_url + the translated transcript (from dub_transcript_srt / the dub result) for user review.
Offer a bilingual-subtitle version. When this run burned target-language-only captions (bilingual_subtitles=false) and a source transcript is available (source_subtitles is non-empty), close the reply by asking whether the user also wants a dual-subtitle version, e.g. "Want a bilingual version with the original + translated subtitles stacked? I can add it." If they say yes, re-run Step 3 in bilingual manual mode on the same caption_target_video_url (the pre-caption visual video) — no re-dub or re-lipsync is needed, only the caption burn changes — and return the new final_video_url. Skip the offer when bilingual captions were already burned (bilingual_subtitles=true), or when source_subtitles is missing — without a source transcript a bilingual version can't be produced (the dubbed audio is already in the target language), so do not offer what can't be delivered.
| Class | Trigger | Mitigation | Fallback |
|---|---|---|---|
| Source URL not worker-fetchable | mcp__plugin_pika_pika__dub_video returns 403 / 4xx, hotlink / UA-gated fetch failure, or "Access Denied" for a public HTTPS URL | Download source bytes in the client/host environment, mcp__claude_ai_pika__upload_asset them to Pika, replace video_url with the Pika CDN URL, then retry Step 1 once | If local download also fails, ask the user to upload the file or provide a different URL |
| Extra target language | Target is Cantonese (yue / cantonese / zh-HK), Thai, Hebrew, Persian, Slovenian, Catalan, Norwegian Nynorsk, or Afrikaans | Supported — call mcp__plugin_pika_pika__dub_video with the target as usual; the original speaker's voice is kept | Background music isn't preserved for these languages (dubbed speech only) |
| Dub call fails (not fetchability) | mcp__plugin_pika_pika__dub_video errors for another reason — unsupported target language, provider/worker 5xx, status: failed from mcp__plugin_pika_pika__task_status | Surface the error to the user; if the message points at the language, check references/language-coverage.md and suggest a supported tag; otherwise suggest a retry. There is no manual chain to fall back to — dub is the single path | None — return the error, do not silently produce a non-dubbed video |
| Dub returns no speech | Silent video — nothing to translate | Surface to user: "no detectable speech in video — nothing to translate" | None |
| Original voice can't be kept | For the languages above, the source is too short or noisy to keep the original speaker's voice | Surface the error and ask the user for a cleaner / longer source clip | None — the dub fails rather than using a different voice |
| Lipsync source too long | Dubbed video >5 min — mcp__claude_ai_pika__edit_lipsync rejects with invalid_input (sync.so 300 s cap); all variant tiers share the cap so retrying won't help | Check duration_seconds from Step 1 first and skip lipsync when >300; caption dubbed_video_url directly and tell the user lipsync caps at 5 min | Dubbed video, no lip-match |
| Lipsync step fails | mcp__claude_ai_pika__edit_lipsync errors (no clear face track, provider 4xx) | Fall back through variant tiers (v2-pro → sync-3 → v2); if all fail, return the dubbed video without lip-matching and tell the user | Audio-replaced video, no lip-match |
| Captions wrong language | Step 3 auto-transcription mis-detects language | Pass explicit language tag; if dub_subtitles exists, use caption_mode="manual" with it instead of auto | Manual subtitles[] |
| Bilingual source row unavailable | User asked for bilingual subtitles but source_subtitles is absent | Use target-language captions and explain the source transcript was unavailable | Target-language captions only |
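The "Lipsync step fails" row can be sketched as a loop over the variant tiers. The call_lipsync callable below is a stand-in for mcp__claude_ai_pika__edit_lipsync and is an assumption, as is the choice to treat any raised exception as a tier failure; a real integration would likely skip the loop for the 300-second invalid_input case, since every tier shares that cap.

```python
from typing import Callable, Optional

LIPSYNC_VARIANTS = ("v2-pro", "sync-3", "v2")  # fallback order from the table


def lipsync_with_fallback(
    dubbed_video_url: str,
    call_lipsync: Callable[[str, str], dict],
) -> tuple[str, Optional[str]]:
    """Try each variant in order; return (video_url, variant_used or None)."""
    for variant in LIPSYNC_VARIANTS:
        try:
            response = call_lipsync(dubbed_video_url, variant)
        except Exception:
            continue  # this tier failed; try the next one
        return response["url"], variant
    # Every tier failed: keep the dubbed video without lip-matching.
    return dubbed_video_url, None
```

When the second element is None, the caller captions dubbed_video_url directly and tells the user that lip-matching was not applied.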
Primary target: Claude Code. Uses standard MCP tools only. Works on Codex / Cursor / Claude Desktop.