This skill should be used when the user asks to "create a video script", "convert ideas to script", "transcribe video content", "structure video narrative", "optimize script for AI video", "generate hooks and CTAs", "time video segments", "write script for Sora", "write script for Veo", or needs timed, segmented video scripts for AI video generation platforms.
From video-production-suite
npx claudepluginhub nbkm8y5/claude-plugins --plugin video-production-suite
This skill uses the workspace's default tool permissions.
references/hook-library.md
references/segment-structures.md
references/timing-guidelines.md
Transform raw ideas, transcripts, or bullet points into production-ready video scripts optimized for AI video generation platforms (Sora 2, Veo 3.1). This skill structures unorganized content into engaging, properly-timed scripts with narrative arcs, hooks, and calls-to-action, specifically optimized for 15-second segment generation.
This skill converts any form of content input into production-ready video scripts with precise timing, narrative structure, and platform-specific optimization. Output scripts are structured for deterministic parsing by post-processing tools and are ready for immediate use with the video-prompt skill.
Extract and identify input type and requirements:
Input Types:
Extract from input:
Choose structure based on video length and AI platform constraints:
Structure: Hook → Context → Value → CTA
Target Word Count (English): 35-41 words (promotional pace, 160 WPM)
Target Word Count (Spanish): 32-38 words (promotional pace, 145 WPM)
Best for: Platform-native content, single-idea videos, maximum AI generation reliability
Critical: Both Sora 2 (15s max) and Veo 3.1 (max 8-second clips, but can batch 1-4 clips) benefit from 15-second segment planning.
Structure:
Word count: 70-82 words total (English promotional)
Best for: Reels, TikTok, educational content, testimonials
Structure:
Word count: 140-164 words total (English promotional)
Best for: Deep dives, case studies, detailed tutorials
Production Note: For Sora 2, generate each segment separately and stitch in post. For Veo 3.1, batch generate using Ingredients-to-Video for character continuity.
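The 15-second segment planning described above can be sketched as a small helper. This is only an illustration: `segment_ranges` and `SEGMENT_SECONDS` are hypothetical names, not part of this skill or of any Sora 2 or Veo 3.1 API.

```python
import math

SEGMENT_SECONDS = 15  # segment length this skill plans around


def segment_ranges(total_seconds: int) -> list[tuple[int, int]]:
    """Split a video duration into consecutive 15-second (start, end) windows."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    count = math.ceil(total_seconds / SEGMENT_SECONDS)
    return [
        (i * SEGMENT_SECONDS, min((i + 1) * SEGMENT_SECONDS, total_seconds))
        for i in range(count)
    ]


print(segment_ranges(60))  # [(0, 15), (15, 30), (30, 45), (45, 60)]
```

Each window corresponds to one clip to generate, so a 60-second video maps to four separately generated segments.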
Create each narrative component using proven patterns:
Purpose: Stop the scroll, spark immediate curiosity
Pattern Library:
Deliverable: Generate 2-3 hook alternatives for A/B testing
See references/hook-library.md for 50+ proven hook templates.
Purpose: Establish relevance and show understanding
Pattern:
Key: Be specific and relatable, not generic.
Purpose: Deliver core insight or teaching
Pattern:
Constraints by Length:
Purpose: Direct viewer to next action
Pattern Options (lowest to highest commitment):
Deliverable: Generate 2 CTA variations for testing
CRITICAL: Scripts must fit within target duration with buffer time.
English:
Spanish (Latin American):
See references/timing-guidelines.md for 10+ languages with complete WPM tables.
Buffer Requirements:
Map physical gestures to script moments for AI video generation:
Gesture Patterns by Moment:
Format:
0-3s → Open hand gesture (welcoming) — "[script phrase]"
3-7s → Pointing emphasis — "[script phrase]"
7-10s → Two-handed gesture — "[script phrase]"
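A downstream tool could read cue lines in this format with a regular expression. The sketch below assumes each cue is a time range in seconds, an arrow, a gesture description, a long dash, and a quoted phrase, exactly as in the lines above; `parse_gesture_cue` and the sample phrase are hypothetical and not part of the skill.

```python
import re

ARROW = "\u2192"  # arrow between the time range and the gesture
DASH = "\u2014"   # long dash between the gesture and the quoted phrase

# An optional leading "*" allows for cues written as bullet items.
CUE_RE = re.compile(
    rf'^\*?\s*(\d+)-(\d+)s\s*{ARROW}\s*(.+?)\s*{DASH}\s*"(.*)"\s*$'
)


def parse_gesture_cue(line: str) -> dict:
    """Turn one cue line into start/end seconds, gesture, and phrase."""
    match = CUE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a gesture cue: {line!r}")
    start, end, gesture, phrase = match.groups()
    return {
        "start": int(start),
        "end": int(end),
        "gesture": gesture,
        "phrase": phrase,
    }


sample = f'0-3s {ARROW} Open hand gesture (welcoming) {DASH} "Welcome back"'
print(parse_gesture_cue(sample))
```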
Provide B-roll, text overlay, and emphasis recommendations:
Overlay: 'Key Formula = Value' at 5s mark
B-roll: Show [specific visual] during "[key phrase]"
Pause: 0.3s before key number
Text emphasis: 'BENEFIT' at 8s
Indicate delivery adjustments:
[PAUSE 0.3s]
[EMPHASIS on "critical statistic"]
[SLOWER pace: "Here's what you do"]
[ENERGY UP: "Act now"]
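Pause markers add silence on top of the spoken words, so it may help to total them when checking a segment against its time budget. A minimal sketch, assuming markers are written exactly as `[PAUSE 0.3s]`; `total_pause_seconds` is a hypothetical helper and the sample line is illustrative.

```python
import re

# Matches markers such as [PAUSE 0.3s]; other marker types are ignored here.
PAUSE_RE = re.compile(r"\[PAUSE\s+(\d+(?:\.\d+)?)s\]")


def total_pause_seconds(script: str) -> float:
    """Sum the durations of every [PAUSE Ns] marker in the script."""
    return sum(float(seconds) for seconds in PAUSE_RE.findall(script))


line = "Your income covers [PAUSE 0.3s] 1.2 times the payment. [PAUSE 0.3s] Act now."
print(round(total_pause_seconds(line), 2))  # 0.6
```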
CRITICAL: Output must follow this exact section structure for post-processing scripts and integration with video-prompt skill.
## SCRIPT METADATA
**Title:** [Descriptive video title]
**Total Duration:** [X] seconds
**Segments:** [N] × 15s clips
**Language:** [Language]
**Tone:** [Conversational/Professional/Promotional/Urgent]
**Platform Optimization:** [Sora 2 | Veo 3.1 | Both]
**Total Word Count:** [X] words
---
## HOOKS
### Hook Option A: [Hook Type Name]
**Pattern:** [Question/Mistake/Promise/etc.]
**Text:** "[Complete hook text]"
**Psychology:** [Why this hook works]
### Hook Option B: [Hook Type Name]
**Pattern:** [Hook type]
**Text:** "[Complete hook text]"
**Psychology:** [Why this hook works]
### Hook Option C: [Hook Type Name]
**Pattern:** [Hook type]
**Text:** "[Complete hook text]"
**Psychology:** [Why this hook works]
---
## SEGMENTS
### Segment 1 (0-15s) — [Narrative Component Names]
**Full Script:**
[Complete script text for this segment, optimized for timing]
**Word Count:** [X] words
**Calculated Duration:** [Y] seconds at [Z] WPM
**Validation:** ✓ Within 14.0-14.5s target
**Gesture Cues:**
* 0-3s → [Gesture description] — "[key phrase]"
* 3-7s → [Gesture description] — "[key phrase]"
* 7-10s → [Gesture description] — "[key phrase]"
* 10-15s → [Gesture description] — "[key phrase]"
**Visual Suggestions:**
* [Overlay/B-roll/emphasis recommendation]
* [Text overlay timing and content]
* [Pacing notes]
---
### Segment 2 (15-30s) — [Narrative Component Names]
[Same format as Segment 1]
---
[Additional segments for 30s/60s videos]
---
## CTA VARIATIONS
### CTA Option 1: [CTA Type — Engagement/Comment/DM/Link]
**Text:** "[Complete CTA text]"
**Commitment Level:** [Low/Medium/High]
**Best For:** [Content type/audience context]
### CTA Option 2: [CTA Type]
**Text:** "[Complete CTA text]"
**Commitment Level:** [Level]
**Best For:** [Context]
---
## PRODUCTION NOTES
### Narrative Structure
[Overall arc description: Hook → Context → Value → CTA]
### Energy & Pacing
[Delivery style notes, energy modulation across segments]
### Platform-Specific Considerations
[Sora 2: Generate segments separately, stitch in post]
[Veo 3.1: Use Ingredients-to-Video for character continuity]
### Post-Production Requirements
[Any compositing, editing, or stitching requirements]
### Alternative Structures
[If applicable, show alternative ways to structure the same content]
Post-processing scripts parse these exact section headers (## SCRIPT METADATA, ## HOOKS, ## SEGMENTS, ## CTA VARIATIONS, ## PRODUCTION NOTES) to extract structured data for JSON export or direct integration with video-prompt generation.
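The actual post-processing scripts are not shown here. As a rough sketch of what header-based parsing could look like, the following splits a script on its `## ` headers and reports any of the five required sections that are absent. `split_sections` and `missing_sections` are hypothetical names.

```python
REQUIRED_SECTIONS = [
    "SCRIPT METADATA",
    "HOOKS",
    "SEGMENTS",
    "CTA VARIATIONS",
    "PRODUCTION NOTES",
]


def split_sections(script_text: str) -> dict[str, str]:
    """Map each top-level '## ' header to the text beneath it."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in script_text.splitlines():
        if line.startswith("## "):  # '### ' sub-headers do not match
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def missing_sections(script_text: str) -> list[str]:
    """List required section headers that the script does not contain."""
    found = split_sections(script_text)
    return [name for name in REQUIRED_SECTIONS if name not in found]


draft = "## SCRIPT METADATA\n**Title:** Demo\n---\n## HOOKS\n### Hook Option A: Question\n"
print(missing_sections(draft))  # ['SEGMENTS', 'CTA VARIATIONS', 'PRODUCTION NOTES']
```

Because the split keys on exact header text, renaming or re-leveling a header would make a section appear missing, which is why the headers must be reproduced verbatim.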
| Language | Promotional WPM | 15s Target Words | 30s Target Words | 60s Target Words |
|---|---|---|---|---|
| English | 160 | 38-41 | 75-82 | 150-164 |
| Spanish (LA) | 145 | 35-38 | 69-76 | 138-152 |
| Spanish (EU) | 160 | 38-41 | 75-82 | 150-164 |
| French | 160 | 38-41 | 75-82 | 150-164 |
| German | 130 | 30-33 | 60-66 | 120-132 |
| Portuguese | 150 | 36-39 | 72-78 | 144-156 |
| Italian | 150 | 36-39 | 72-78 | 144-156 |
| Mandarin | 140 CPM | 13-16 | 26-32 | 52-64 |
| Japanese | 400 CPM | 19-21 | 38-42 | 76-84 |
| Korean | 300 SPM | 18-21 | 36-42 | 72-84 |
See references/timing-guidelines.md for complete multi-language timing specifications.
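The word-count-to-duration arithmetic behind these targets is words × 60 ÷ WPM. A minimal sketch, using the 14.0-14.5s validation window from the segment template as the default; `estimated_seconds` and `fits_target` are hypothetical helper names.

```python
def estimated_seconds(word_count: int, wpm: float) -> float:
    """Spoken duration in seconds for a word count at a given pace."""
    return word_count * 60 / wpm


def fits_target(word_count: int, wpm: float,
                low: float = 14.0, high: float = 14.5) -> bool:
    """Check whether the estimated duration falls inside the target window."""
    return low <= estimated_seconds(word_count, wpm) <= high


print(estimated_seconds(38, 160))           # 14.25
print(fits_target(38, 160))                 # True
print(fits_target(40, 160))                 # False: 15.0s exceeds 14.5s
print(fits_target(40, 160, high=15.0))      # True with a relaxed final-segment window
```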
This skill outputs timed, structured scripts. The video-prompt skill adds production specifications.
This skill provides:
Next skill (video-prompt) adds:
Combined Workflow:
Raw idea/concept
↓
video-script (this skill)
↓
Timed, segmented script with narrative structure
↓
video-prompt
↓
Production-ready AI video prompts (platform-specific)
↓
Generate in Sora 2 or Veo 3.1 → Stitch if needed
Before finalizing the script, verify:
Content:
Timing:
Production Readiness:
Structure:
Language:
Structure: Hook → Context → Value → CTA
Example (English, 160 WPM):
Hook: "Here's what you need to know about DSCR loans."
Context: "Most people miss this key detail."
Value: "The secret is rental income, not W-2s. If your property brings in 1.2 times the payment, you qualify."
CTA: "Save this for later."
Total: 38 words = 14.25s at 160 WPM ✓
Segment 1 (Hook + Context): Grab attention, establish offer or problem
Segment 2 (Value + CTA): Deliver benefit, urgency, and action
Segment 1: Hook
Segment 2: Problem + Context in detail
Segment 3: Solution Part 1 (core concept)
Segment 4: Solution Part 2 + Implementation + CTA
Solution:
Solution:
references/hook-library.md for proven patterns
Solution:
Solution:
Solution:
## SCRIPT METADATA
**Title:** DSCR Loans Explained for Real Estate Investors
**Total Duration:** 30 seconds
**Segments:** 2 × 15s clips
**Language:** English
**Tone:** Educational/Professional
**Platform Optimization:** Both (Sora 2 + Veo 3.1)
**Total Word Count:** 78 words
---
## HOOKS
### Hook Option A: Question Pattern
**Pattern:** Question
**Text:** "What if you could get a loan based on your property's cash flow—not your paycheck?"
**Psychology:** Engages curiosity, promises solution to common pain point
### Hook Option B: Mistake Pattern
**Pattern:** Mistake/Problem
**Text:** "Most investors hit a wall because they can't qualify using W-2s."
**Psychology:** Identifies relatable frustration, positions solution
### Hook Option C: Promise Pattern
**Pattern:** Promise/Benefit
**Text:** "Property loans without W-2s? Here's exactly how."
**Psychology:** Direct benefit promise, creates knowledge gap
---
## SEGMENTS
### Segment 1 (0-15s) — Hook + Context
**Full Script:**
What if you could get a loan based on your property's cash flow—not your paycheck? Many investors hit a wall because they don't qualify using W-2s. Here's a DSCR method that changes that.
**Word Count:** 38 words
**Calculated Duration:** 14.25s at 160 WPM
**Validation:** ✓ Within 14.0-14.5s target
**Gesture Cues:**
* 0-3s → Open palms (welcoming) — "What if you could..."
* 3-8s → Pointing gesture (emphasis) — "Many investors hit a wall"
* 8-15s → Forward lean (engaging) — "Here's a DSCR method"
**Visual Suggestions:**
* B-roll: Property exterior or cash flow diagram
* Text overlay: "Cash Flow > Paycheck" at 7s
* Pause 0.3s after "W-2s" for emphasis
---
### Segment 2 (15-30s) — Value + CTA
**Full Script:**
DSCR means your rental income covers 1.2 times your payment. Lenders see you as low risk. I've done 100+ of these. If the numbers work, you qualify. DM me 'DSCR' for my free tool.
**Word Count:** 40 words
**Calculated Duration:** 15.0s at 160 WPM
**Validation:** ✓ Within 14.0-15.0s target (acceptable for final segment)
**Gesture Cues:**
* 15-18s → Counting gesture (one hand) — "1.2 times"
* 18-22s → Confident gesture (hands together) — "low risk"
* 22-26s → Pointing to camera — "I've done 100+"
* 26-30s → Inviting gesture toward camera — "DM me 'DSCR'"
**Visual Suggestions:**
* Overlay: "DSCR = Rental Income ÷ Debt" at 16s
* B-roll: Calculator or spreadsheet visual
* Text emphasis: "100+ Deals" at 24s
* Stronger energy for CTA
---
## CTA VARIATIONS
### CTA Option 1: DM for Resource
**Text:** "DM me 'DSCR' for my free tool."
**Commitment Level:** Medium
**Best For:** Educational content with lead magnet offer
### CTA Option 2: Save for Later
**Text:** "Save this and follow for more investor tips."
**Commitment Level:** Low
**Best For:** Building audience, lower-commitment ask
---
## PRODUCTION NOTES
### Narrative Structure
Hook → Context → Problem → Solution → Credibility → CTA
### Energy & Pacing
- Segment 1: Moderate energy, build curiosity
- Segment 2: Confident energy, authoritative delivery on CTA
### Platform-Specific Considerations
**Sora 2:** Generate two separate 15s clips, stitch in post with cut or dissolve transition
**Veo 3.1:** Use Ingredients-to-Video mode with character ingredient for continuity across both clips
### Post-Production Requirements
- Stitch segments with 0.5s crossfade or hard cut
- Add overlays at specified timestamps
- Ensure audio levels consistent across segments
This skill includes detailed reference materials:
Refer to these files for deeper guidance on specific components.