Help us improve
Share bugs, ideas, or general feedback.
From pika
Generates cinematic 1080p iOS app teaser videos from App Store screenshots, enhanced with GPT-image-2. Sources screens from App Store, websites, or user files, and outputs a beat-driven teaser with brand logo and COMING SOON overlay.
npx claudepluginhub pika-labs/pika-plugins --plugin pikaHow this skill is triggered — by the user, by Claude, or both
Slash command
/pika:app-sizzle <app-name-or-url> [screens=<app-store-url|website-url|paths>] [logo=<path-or-url>] [aspect=16:9|9:16|1:1]<app-name-or-url> [screens=<app-store-url|website-url|paths>] [logo=<path-or-url>] [aspect=16:9|9:16|1:1]The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Generate a polished 15-second app teaser from real app screens. Each selected screen is passed through GPT-image-2 before Seedance so compressed captures become cleaner references without inventing UI.
Generates App Store preview videos using Remotion (React) with agent team for storyboarding, scripting, coding, review, and frame inspection.
Plans and produces demo videos, GIFs, screenshots, and Remotion programmatic videos for product launches. Covers scripting, recording, best practices, and platform formats.
Turns screen recordings into polished app demo videos with AI voiceover, device frames, background music, and FFmpeg-based assembly.
Share bugs, ideas, or general feedback.
Generate a polished 15-second app teaser from real app screens. Each selected screen is passed through GPT-image-2 before Seedance so compressed captures become cleaner references without inventing UI.
Generation contract: use resolution="1080p", duration=15, and sound=True. Skip fast=true because it caps Seedance at 720p. The skill owns duration and sound so the user only has to supply app identity, screens, logo, and aspect ratio.
The visual aesthetic is derived from the app's personality — not defaulted to liquid glass. The agent reads the app's soul from its icon, screenshots, and category, then chooses a treatment. The user provides the app identity and assets; the agent decides everything else (mode, prompt, camera, style).
Primary: mcp__pika__generate_reference_video(provider="seedance", resolution="1080p") with 3–5 screenshots + the app icon/logo as the final reference.
Fallback to provider="kling", quality_mode="pro" (= 1080p) when:
partner_validation_failed (celebrity faces, screen-recording UI)insufficient_balanceseedance timed out after ...Do not treat generated-audio moderation as an immediate Kling fallback. See the Seedance generated-audio moderation recovery runbook in Generate Video first.
Kling prompt uses <<<image_1>>> … <<<image_5>>> tokens instead of @Image1 … @Image5. Drop the resolution param (Kling uses quality_mode instead). See Gotchas.
If invoked with empty args and no relevant prior context, print this menu verbatim and stop. Do not call tools until the user supplies the app identity and screen source.
To make your app promo, I need:
1. App name + one-line description of what it does
(e.g. "Nova — an AI journaling app for iOS")
2. Where should I pull the app screens from?
— iOS App Store: give me the App Store URL or app name → I'll use
`mcp__pika__fetch_appstore_screens` to fetch screenshots, metadata, and icon
— Web app / website: give me the URL → I'll capture it with Pika MCP
— Local files / URLs: drop the paths and I'll upload them
3. Brand logo — path or URL (preferred) or skip to use the App Store icon
The logo anchors the end card and prevents Seedance from hallucinating brand text.
If you don't have a logo file, use the fetched App Store icon as the fallback.
4. Aspect ratio: 16:9 (landscape/YouTube) / 9:16 (Reels/TikTok) / 1:1
If the trigger message or prior context already supplies part of this, ask only for the missing required fields before touching any tool. These are the only questions the user needs to answer; the agent decides mode, prompt, camera, and style.
Once answered, the agent:
Before calling any generation tool, verify both assets are in hand:
| Asset | Required | If missing |
|---|---|---|
| Real app screenshots (≥1 actual sourced image) | Yes | Stop and ask for screenshots |
| Brand logo OR app icon | Yes | Use the mcp__pika__fetch_appstore_screens icon when App Store sourcing is used; otherwise stop and ask for a logo/icon |
If either is missing, tell the user exactly what's needed and wait. Real assets are what keep the teaser grounded; text-to-video placeholders make Seedance invent UI.
Avoid:
The only acceptable path forward is real assets from the user. If MCP fetching or capturing failed (App Store returned nothing, website screenshot errored), report what happened and ask the user to provide the screens manually. Never invent them.
Use Pika MCP mcp__pika__fetch_appstore_screens; do not use a local scraper. It accepts a full App Store URL, numeric app ID, or app-name search term:
fetch_appstore_screens(
query: <app_store_url | numeric_app_id | search_term>,
country: "us",
max_screens: 10,
include_icon: true
)
Expected result shape:
{
"app_url": "https://apps.apple.com/...",
"metadata": { "name": "...", "subtitle": "...", "description": "...", "category": "...", "icon_url": "https://..." },
"icon": { "url": "https://cdn.pika.art/...", "source_url": "https://is...mzstatic.com/...", "filename": "appstore-icon.png", "mime_type": "image/png", "width": 1024, "height": 1024 },
"screenshots": [
{ "url": "https://cdn.pika.art/...", "source_url": "https://is...mzstatic.com/.../1290x2796bb.png", "filename": "appstore-screen-01.png", "mime_type": "image/png", "width": 1290, "height": 2796 }
],
"count": 1
}
If mcp__pika__fetch_appstore_screens returns no screenshots, report the error and ask the user for 3-5 real screenshots plus a logo/icon. Do not fall back to Playwright/headless App Store capture and do not invent UI.
After App Store assets are fetched, pick the 3–5 screens that show the core UI. Skip:
Screen selection principle — maximize visual contrast. Each selected screen should look as different as possible from the others: dark vs. light background, UI-dense vs. photo-heavy, micro close-up vs. wide grid, minimal vs. busy. If all your screens look similar, Seedance blends them into a visual mush. What made Dazz Cam work: 3D camera grid + Polaroid output + VHS panels + fisheye orb — four completely distinct visual worlds. What makes teasers fail: four screens of the same UI at slightly different scroll positions.
Use Pika MCP's capture tool:
capture_website(url="https://example.com", mode="screenshot")
# Returns image_url — use directly as a reference
Call once per distinct page/view you want to include.
User provides paths → upload each via Pika MCP (see Asset Upload section below).
After sourcing screens, read every screenshot using Claude's vision before writing a single word of prompt. This is the most important step — skip it and you get a generic glass blob with no story.
For each screenshot, record:
Also pull the app metadata from the mcp__pika__fetch_appstore_screens result, or from the user-provided description:
Output of Stage 1: A numbered feature map:
Screen 1 — [filename]: Shows [X UI]. Represents [Y feature]. Moment: [hook/build/reveal].
Screen 2 — [filename]: ...
...
After the map, score each screen for visual uniqueness: does it look completely different from the others you've mapped? Prefer screens with distinct color palettes, distinct layout density, and distinct subject matter. A great set has maximum visual spread — the hook should feel nothing like the build, which should feel nothing like the reveal.
Do NOT proceed to Stage 2 until this map is written out.
Every 15s promo needs a spine. Design the story arc before touching the prompt template.
| Beat | Seconds | Job | Which screen(s) |
|---|---|---|---|
| Hook | 0–3s | Grab attention — show the most dramatic UI moment or the problem being solved | The most visually striking screen |
| Build | 3–10s | Feature walkthrough in logical user-journey order | 2–3 screens in sequence |
| Reveal | 10–13s | Pull-back or product overview — the "so that's what it does" moment | Wide shot or most complete screen |
| Logo | 13–15s | Brand lock — wordmark materializes, accent color pulse | Logo (@Image6 or last ref). COMING SOON is added later as a post-generation text overlay. |
| Arc | When to use | Structure |
|---|---|---|
| Problem → Solution | Productivity/tool apps | Hook = pain point UI → Build = app solves it → Reveal = result |
| Feature Parade | Feature-rich apps | Hook = most impressive feature → Build = 2 more features → Reveal = overview |
| Journey | Consumer/lifestyle apps | Hook = entry point → Build = the experience → Reveal = outcome |
| Transformation | Before/after type apps | Hook = the "before" → Build = the process → Reveal = the "after" |
Write out the arc explicitly before generating:
Arc type: [Problem→Solution / Feature Parade / Journey / Transformation]
Hook (0-3s): Screen [N] — [what happens] — camera: [extreme close-up on X]
Build (3-10s): Screen [N] → [N] → [N] — [what each reveals] — camera: [whip pan / orbital / etc.]
Reveal (10-13s): Screen [N] — [what it shows] — camera: [pull-back to show full product]
Logo (13-15s): @Image[N] — wordmark materializes whole in a burst of [accent color] light and holds. Do not ask the video model to render the `COMING SOON` copy; it is added later as a post-generation text overlay.
Do NOT write the Seedance prompt until this arc is defined.
After the arc is defined and the 3–5 screens are selected, enhance each one with GPT-image-2 before uploading to Seedance. This lifts compressed website captures and App Store thumbnails to a cleaner, higher-fidelity reference.
For each selected screen (including the logo/end card reference):
result = generate_image(
provider="gpt-image-2",
prompt="High quality version, preserve all content exactly",
reference_images=["<original_cdn_url>"],
aspect_ratio="16:9", # match the capture — use 9:16 for portrait screens
quality="medium",
)
# use result.image_url (or result.url) as the Seedance reference
Rules:
aspect_ratio to the original capture (desktop = 16:9, mobile = 9:16).reference_images array in the Seedance call — not the originals.With the feature map (Stage 1) and arc (Stage 2) in hand, write the Seedance prompt. Every @Image description must reference the real UI content from the feature map — never write generic descriptions like "a mobile interface with controls."
Choose the template by reading the app's screenshots — don't default to liquid glass. Read exactly one of these based on the app's personality; the other never loads:
references/template-a-cinematic.md. The proven BEAT-structure template plus validated examples.references/liquid-glass.md. Template B skeleton plus glass transformation vocabulary.The accent color is always from the brand — read the icon and primary UI color, never invent one.
@Image description comes directly from the Stage 1 feature map@ImageN description as its primary brief — "a dark chat interface" vs "a VHS three-panel grid of city streets, a skate park, and a coastal sunset with retro timestamp overlays" produce completely different results. Copy the most visually specific details from your Stage 1 feature map verbatim.Primary — Seedance:
generate_reference_video(
provider="seedance",
reference_images=["<url1>", "<url2>", "<url3>", "<url4>", "<url5>"], # 3–5 screens + icon
prompt="<prompt using @Image1 … @Image5 tokens>",
resolution="1080p", # always
duration=15, # always
sound=True, # always
aspect_ratio="16:9", # or 9:16 / 1:1 per user request
seed=<int>, # set one; reuse it for content-policy recovery
)
If Seedance finishes generation and then returns a 422 whose body includes type: "content_policy_violation", reason: "partner_validation_failed", loc: ["body", "generated_video"], and msg: "Output audio has sensitive content.", treat it as a recoverable generated-audio moderation false positive.
reference_images with sound=False and the same seed.sound=True and the same seed.sound=True replay succeeds, route the recovered sound-on URL into Stage 4 as generated_teaser_url. Keep the silent probe URL only as debugging context.sound=True replay fails again, run the Kling fallback once. If Kling is unavailable, route the silent URL into Stage 4 as generated_teaser_url and explicitly note that generated-audio moderation remained flaky.Do not change the prompt, references, aspect ratio, duration, or seed during this recovery path. Changing any of them turns the silent probe into a new generation instead of testing whether only generated audio triggered moderation.
If task status remains queued or running until Seedance returns a timeout such as seedance timed out after 900s or seedance timed out after 1200s, treat it as provider queue saturation, not a prompt/content failure.
When this happens, run the Kling fallback with the same selected references, same beat structure, duration=15, sound=True, and quality_mode="pro". Convert @ImageN prompt tokens to <<<image_N>>> before calling Kling.
Do not keep retrying Seedance after a timeout unless the user explicitly asks to wait for Seedance. The timeout path has already spent the launch-demo wall-clock budget; switching provider is the documented recovery.
Fallback — Kling (non-audio partner_validation_failed or insufficient_balance):
generate_reference_video(
provider="kling",
reference_images=["<url1>", "<url2>", "<url3>", "<url4>", "<url5>"],
prompt="<prompt using <<<image_1>>> … <<<image_5>>> tokens>",
quality_mode="pro", # = 1080p on Kling (NOT resolution=)
duration=15,
sound=True,
aspect_ratio="16:9",
)
Seedance tokens: @Image1 … @Image5 | Kling tokens: <<<image_1>>> … <<<image_5>>>
Seedance constraints: skip fast=True because it caps at 720p; skip negative_prompt because Seedance rejects it; skip auto_duration because this path is fixed at 15s.
Kling constraint: use quality_mode="pro" for 1080p; Kling rejects resolution=.
Kling fallback is async. If generate_reference_video(provider="kling") returns a task_id, follow the task until terminal.
If task_status returns status: queued with statusMessage containing Worker handoff: task was requeued for retry on another worker., treat it as a worker restart handoff, not a failed render. Keep polling mcp__pika__task_status(task_id); the next worker should reclaim the same task.
If statusMessage starts with Kling is at capacity, treat it as provider capacity wait. Keep polling the same task while lastUpdatedAt continues moving.
Do not submit a duplicate Kling request while the original task is still queued or running. Duplicates can burn provider quota and make artifact provenance unclear.
If status stays queued for more than 10 minutes with no lastUpdatedAt movement, capture the task_id, status, statusMessage, and lastUpdatedAt, then cancel the stalled original with mcp__pika__task_cancel(task_id) before retrying. Only after cancel returns cancelled, retry the exact same Kling request once with the same prompt, references, shots, aspect ratio, duration, and quality mode. If cancel fails because the task already completed or failed, inspect that terminal result instead of retrying. If the retry also stalls, stop and report both task IDs instead of changing the creative prompt.
If the user provides local file paths, convert them to public URLs before calling generate:
mcp__pika__upload_asset(filename, mime_type, size_bytes).presigned_url using the host client's file-upload capability.public_url as the reference URL in generation calls.Supported mime types: image/png, image/jpeg, image/webp, video/mp4, audio/mpeg, audio/wav
Do not ask Seedance or Kling to render COMING SOON. Video models garble new
typography, especially all-caps CTA text, so the final two seconds use a
deterministic COMING SOON overlay as a post-generation text overlay.
After Seedance or Kling returns the 15s teaser URL, call:
edit_text_overlay(
video_url=<generated_teaser_url>,
text="COMING SOON",
position="bottom_center",
font_size=56,
font_color="white",
start_s=13,
end_s=15,
)
If edit_text_overlay returns { task_id }, poll mcp__pika__task_status
until it reaches completed, failed, or cancelled, then unwrap the returned
URL. Save the returned URL as final_url. If the overlay call fails, surface
that failure and the unoverlaid teaser URL as a diagnostic preview; do not
deliver a teaser whose only COMING SOON text was generated by the video model.
Return the final Pika CDN URL as the primary deliverable. If the host client requires local media markers, create that local preview outside this skill flow after confirming the CDN URL is reachable.
If generation completes asynchronously: follow the MCP tool's returned status handle until the video reaches a terminal state, then deliver the final URL.
The prompt is the output of Stages 1 + 2, not a starting point. Never fill in the template from imagination — fill it from the feature map and arc you built. A prompt written without Stage 1 analysis will produce a generic glass blob.
Use specific camera language — Seedance responds to it:
| Term | Effect |
|---|---|
extreme macro close-up on [specific element] | Tight detail shot — glass edge, button, icon |
crash zoom into [element] | Fast push-in, creates energy |
whip pan to | Hard lateral cut with motion blur |
orbital sweep around | 360° arc around the floating panel |
push-in drift | Slow, cinematic dolly |
pull-back to reveal | Classic product reveal — shows full form |
hard cut to black | Clean beat before logo |
Alternate fast cuts with slower drifts — pure rapid cuts feel chaotic, pure slow drifts feel boring.
(Glass transformation vocabulary lives in references/liquid-glass.md — only relevant on the Template B path.)
For product shots, lock the device to a black void — never place in environments:
# Floating desktop screens (SaaS / desktop apps)
Show the desktop screens floating in 3D space on a pure black background, tilted at
slight angles like a MacBook product shot. The UI elements on screen become translucent
glass with reflections and refractions. No text, no logos, no words.
# iPad reveal
An iPad Pro floating in empty black space, tilted at a cinematic angle like an Apple
product shot. The iPad is a real solid device with visible bezels — only the screen
content has the glass effect. The device slowly rotates. No text, no logos.
# MacBook
A MacBook Pro floating in empty black space, open at a cinematic angle. The screen
displays [content]. Light catches the aluminium edges. No text, no logos.
All runs are 15s, 1080p. Select 3–5 screens based on the narrative arc.
| Refs | Use case |
|---|---|
| 3 | Standard — one screen per beat (hook / build / reveal) + icon as @Image4 |
| 4 | Two build beats + hook + icon |
| 5 | Feature-rich — hook + 3 build beats + icon. Don't exceed 5. |
The golden rule: 1 reference per ~3 seconds of video.
These phrases are empirical prompt/flow anchors. Keep them when simplifying the skill:
| Phrase | Where | Why load-bearing |
|---|---|---|
High quality version, preserve all content exactly | GPT-image-2 enhancement pass | Keeps the enhancement pass from inventing UI while cleaning compression artifacts. |
Do NOT write the Seedance prompt until this arc is defined | Stage 2.5 gate | Prevents generic motion prompts that are not grounded in the selected screens. |
The prompt is the output of Stages 1 + 2, not a starting point | Prompting guide | Forces the agent to use the screen feature map and story arc instead of template-filling from imagination. |
pure black background / floating in empty black space | Device framing prompts | Keeps product shots focused on the app UI rather than hallucinated environments. |
materializes whole / crystallizes as a single form / fades in as a complete element | Logo reveal wording | Avoids per-letter logo construction, which causes garbled brand text. |
Typical run time is 4-8 minutes:
| Step | Wall clock | Notes |
|---|---|---|
| Asset sourcing | 10-60s | App Store via mcp__pika__fetch_appstore_screens; website capture depends on page load |
| Screen analysis + arc | 2-5 min | User confirmation can add time |
| GPT-image-2 enhancement | 30-90s | Run selected screens in parallel |
| Seedance generation | 3-5 min | Generated-audio moderation recovery adds one silent probe plus one same-seed sound replay |
| Kling fallback | 5-15 min | Capacity wait or worker handoff may temporarily show queued; follow the Kling queued/handoff recovery runbook |
| Download verification | <30s | Local sanity check before delivery |
Seedance is the default because it handles polished motion-graphics references and 1080p app teasers well. Kling is the fallback for moderation, balance, or Seedance timeout failures because it is more permissive on some screen content and uses quality_mode="pro" for 1080p.
| Symptom | Cause | Fix |
|---|---|---|
fast=True with resolution="1080p" | Seedance caps fast mode at 720p | Remove fast; keep resolution="1080p" |
negative_prompt rejected | Seedance does not accept this field | Use positive framing such as "smooth motion, stable camera" |
Seedance generated-audio moderation: content_policy_violation / partner_validation_failed, generated_video, "Output audio has sensitive content." | Often a false positive on non-sensitive app-sizzle references | Follow the generated-audio recovery runbook: same-seed sound=False probe, then same-seed sound=True replay |
Seedance timeout such as seedance timed out after ... | Provider queue saturation or tail latency exceeded the tool budget | Run the Kling fallback; do not keep retrying Seedance unless the user explicitly asks to wait |
Seedance partner_validation_failed on video | Screen content includes recording UI, celebrity faces, or similar moderation triggers | Switch to provider="kling" and convert tokens to <<<image_N>>> |
| Faces in screenshots trigger content policy | Screenshot includes real people | Crop faces out before upload, or use Kling |
| 6+ reference images reduce quality | The model blends too many refs | Keep to 3-5 references, roughly one per 3 seconds |
| Prompt tail ignored | Prompt exceeds about 200 words | Trim to the beat structure and the concrete UI details |
| Text in output is garbled | Video model is asked to render new text | Keep text as existing reference-image content; overlay any new branding in post |
| Logo reveal hallucinates letterforms | "assemble/build/construct" language triggers per-glyph rendering | Use "materializes whole", "crystallizes as a single form", or "fades in as a complete element" |
Task returns { task_id } instead of inline | Long-running generation exceeded inline budget | Poll mcp__pika__task_status(task_id) until completed, failed, or cancelled; unwrap result.structuredContent when present |
Kling task returns status: queued after previously running | Worker handoff or provider capacity wait | Follow the Kling queued/handoff recovery runbook; do not duplicate-submit unless queued for more than 10 minutes with no lastUpdatedAt movement |
Kling rejects resolution= | Kling uses a different quality knob | Use quality_mode="pro" |
| App Store icon URL points to promo art | App Store metadata fallback found feature artwork | Prefer the icon.url returned by mcp__pika__fetch_appstore_screens; if missing, ask for a logo/icon file |