Authoritative Python-FFmpeg parameter integration reference ensuring type safety, accurate parameter mappings, and proper unit conversions. PROACTIVELY activate for: (1) ffmpeg-python library usage, (2) Python subprocess FFmpeg calls, (3) Caption/subtitle parameter mapping (drawtext, ASS), (4) Color format conversions (BGR, RGB, ABGR, ASS &HAABBGGRR), (5) Time unit conversions (seconds, centiseconds, milliseconds), (6) Type safety validation (int, float, string), (7) Coordinate systems, (8) Parameter range enforcement, (9) Frame pipe handling, (10) Error detection for type mismatches. Provides: Complete parameter type reference, color format conversion tables, time unit conversion formulas, validation patterns, working Python examples with proper typing.
Provides type-safe Python-FFmpeg parameter mappings for color formats, time units, and validation.
/plugin marketplace add JosiahSiegel/claude-plugin-marketplace
/plugin install ffmpeg-master@claude-plugin-marketplace

This skill inherits all available tools. When active, it can use any tool Claude has access to.
MANDATORY: Always Use Backslashes on Windows for File Paths
When using Edit or Write tools on Windows, you MUST use backslashes (\) in file paths, NOT forward slashes (/).
Complete type-safe parameter mapping reference for integrating FFmpeg with Python.
| FFmpeg Parameter | Python Type | Range/Format | Common Mistake |
|---|---|---|---|
| -crf | int or str | 0-51 (H.264/H.265) | Using float: crf=18.5 ❌ |
| -b:v | str | "5M", "1000k" | Using int: b:v=5000000 ❌ |
| fontsize | int | 1-999 | Using str: fontsize="24" ❌ |
| fontcolor | str | "white", "#FFFFFF", "0xFFFFFF" | Wrong format: fontcolor="255,255,255" ❌ |
| ASS PrimaryColour | str | "&HAABBGGRR" | Using RGB order: red as &H00FF0000 ❌ (should be &H000000FF, BGR) |
| alpha | str (expression) | "0.5", "min(1,t/2)" | Using float: alpha=0.5 ❌ |
| x, y | str or int | "100", "(w-tw)/2" | Forgetting quotes on expressions ❌ |
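As a minimal sketch (file names are placeholders), the same parameters passed with the correct types through ffmpeg-python:

import ffmpeg

src = ffmpeg.input("input.mp4")
video = src.video.drawtext(
    text="Sample",
    fontsize=48,        # int, not "48"
    fontcolor="white",  # named or hex string
    x="(w-tw)/2",       # FFmpeg expression, passed as a quoted string
    y="h-th-40"
)
# crf is an int; bitrates such as "192k" or "5M" are strings with a unit suffix
out = ffmpeg.output(video, src.audio, "output.mp4", vcodec="libx264", crf=18, audio_bitrate="192k")
ffmpeg.run(out, overwrite_output=True)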
| Context | Format | Byte Order | Example | Python Type |
|---|---|---|---|---|
| FFmpeg drawtext | Named/Hex | RGB | "white", "#FFFFFF", "0xFFFFFF" | str |
| ASS PrimaryColour | &HAABBGGRR | ABGR | &H00FFFFFF (white) | str |
| ASS OutlineColour | &HAABBGGRR | ABGR | &H00000000 (black) | str |
| OpenCV cv2 | Array | BGR | [255, 255, 255] | np.ndarray |
| PIL/Pillow | Tuple/Hex | RGB | (255, 255, 255), "#FFFFFF" | tuple or str |
| NumPy | Array | RGB or BGR | Depends on source | np.ndarray |
from typing import Tuple
def rgb_to_bgr_hex(r: int, g: int, b: int) -> str:
"""
Convert RGB (0-255) to BGR hex string.
Used for OpenCV color specifications.
Args:
r, g, b: Red, Green, Blue (0-255)
Returns:
Hex string in BGR order: "0xBBGGRR"
"""
return f"0x{b:02X}{g:02X}{r:02X}"
def rgb_to_ass_color(r: int, g: int, b: int, alpha: int = 0) -> str:
"""
Convert RGB to ASS/SSA color format (&HAABBGGRR).
CRITICAL: ASS uses BGR order, not RGB!
Args:
r, g, b: Red, Green, Blue (0-255)
alpha: Alpha channel (0=opaque, 255=transparent)
Returns:
ASS color string: "&HAABBGGRR"
Examples:
>>> rgb_to_ass_color(255, 255, 255) # White
'&H00FFFFFF'
>>> rgb_to_ass_color(255, 0, 0) # Red
'&H000000FF'
>>> rgb_to_ass_color(0, 255, 0) # Green
'&H0000FF00'
>>> rgb_to_ass_color(0, 0, 255) # Blue
'&H00FF0000'
"""
return f"&H{alpha:02X}{b:02X}{g:02X}{r:02X}"
def ass_color_to_rgb(ass_color: str) -> Tuple[int, int, int, int]:
"""
Parse ASS color format (&HAABBGGRR) to RGBA.
Args:
ass_color: ASS color string like "&H00FFFFFF"
Returns:
Tuple (r, g, b, alpha)
"""
# Remove &H prefix
hex_val = ass_color.replace("&H", "").replace("&h", "")
# Pad to 8 characters if needed (some formats omit alpha)
hex_val = hex_val.zfill(8)
# Extract AABBGGRR
alpha = int(hex_val[0:2], 16)
blue = int(hex_val[2:4], 16)
green = int(hex_val[4:6], 16)
red = int(hex_val[6:8], 16)
return (red, green, blue, alpha)
def ffmpeg_color_to_rgb(color: str) -> Tuple[int, int, int]:
"""
Parse FFmpeg named or hex color to RGB.
Args:
color: Named color ("white") or hex ("#FFFFFF", "0xFFFFFF")
Returns:
Tuple (r, g, b)
"""
# Named colors (subset)
named_colors = {
"white": (255, 255, 255),
"black": (0, 0, 0),
"red": (255, 0, 0),
"green": (0, 255, 0),
"blue": (0, 0, 255),
"yellow": (255, 255, 0),
"cyan": (0, 255, 255),
"magenta": (255, 0, 255),
}
if color.lower() in named_colors:
return named_colors[color.lower()]
# Parse hex
hex_val = color.replace("#", "").replace("0x", "")
r = int(hex_val[0:2], 16)
g = int(hex_val[2:4], 16)
b = int(hex_val[4:6], 16)
return (r, g, b)
# Common color presets (ASS format)
ASS_COLORS = {
"white": "&H00FFFFFF",
"black": "&H00000000",
"red": "&H000000FF",
"green": "&H0000FF00",
"blue": "&H00FF0000",
"yellow": "&H0000FFFF",
"cyan": "&H00FFFF00",
"magenta": "&H00FF00FF",
"orange": "&H0000A5FF", # RGB(255, 165, 0)
"purple": "&H00800080", # RGB(128, 0, 128)
}
# Transparency examples (ASS alpha channel)
ASS_ALPHA = {
"opaque": 0x00, # Fully opaque (0%)
"transparent_10": 0x1A, # 10% transparent
"transparent_25": 0x40, # 25% transparent
"transparent_50": 0x80, # 50% transparent (common for shadows)
"transparent_75": 0xBF, # 75% transparent
"invisible": 0xFF, # Fully transparent (100%)
}
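A small hedged helper (the function name is illustrative) showing how these alpha presets combine with rgb_to_ass_color defined above:

def ass_color_with_alpha(r: int, g: int, b: int, alpha_name: str = "opaque") -> str:
    """Build an ASS color string using a named alpha preset from ASS_ALPHA."""
    return rgb_to_ass_color(r, g, b, alpha=ASS_ALPHA[alpha_name])

# 50% transparent black, e.g. for BackColour (shadow)
shadow_colour = ass_color_with_alpha(0, 0, 0, "transparent_50")  # '&H80000000'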
| Color | RGB | Hex (RGB) | ASS (&HAABBGGRR) | OpenCV BGR Array |
|---|---|---|---|---|
| White | (255,255,255) | #FFFFFF | &H00FFFFFF | [255,255,255] |
| Black | (0,0,0) | #000000 | &H00000000 | [0,0,0] |
| Red | (255,0,0) | #FF0000 | &H000000FF | [0,0,255] |
| Green | (0,255,0) | #00FF00 | &H0000FF00 | [0,255,0] |
| Blue | (0,0,255) | #0000FF | &H00FF0000 | [255,0,0] |
| Yellow | (255,255,0) | #FFFF00 | &H0000FFFF | [0,255,255] |
| Cyan | (0,255,255) | #00FFFF | &H00FFFF00 | [255,255,0] |
| Magenta | (255,0,255) | #FF00FF | &H00FF00FF | [255,0,255] |
| Orange | (255,165,0) | #FFA500 | &H0000A5FF | [0,165,255] |
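As a quick sanity check, the converters defined above reproduce the table entries:

# Spot-check the equivalence table with the conversion helpers defined earlier
assert rgb_to_ass_color(255, 165, 0) == "&H0000A5FF"      # Orange
assert rgb_to_bgr_hex(255, 0, 0) == "0x0000FF"            # Red as BGR hex (OpenCV-style)
assert ass_color_to_rgb("&H00FF0000") == (0, 0, 255, 0)   # ASS blue round-trips to RGB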
| Context | Unit | Python Type | Conversion Formula | Example |
|---|---|---|---|---|
| FFmpeg filters (fade, xfade) | Seconds | float or int | N/A | duration=1.5 |
| ASS karaoke (\k, \kf, \ko) | Centiseconds | int | cs = seconds * 100 | {\k50} = 0.5s |
| ASS animation (\t, \fad, \move) | Milliseconds | int | ms = seconds * 1000 | \t(0,500,...) = 0.5s |
from typing import Union
def seconds_to_centiseconds(seconds: float) -> int:
"""
Convert seconds to centiseconds for ASS karaoke tags.
Args:
seconds: Time in seconds
Returns:
Centiseconds (1/100 second)
Examples:
>>> seconds_to_centiseconds(0.5)
50
>>> seconds_to_centiseconds(1.0)
100
>>> seconds_to_centiseconds(2.5)
250
"""
return int(seconds * 100)
def seconds_to_milliseconds(seconds: float) -> int:
"""
Convert seconds to milliseconds for ASS animation tags.
Args:
seconds: Time in seconds
Returns:
Milliseconds (1/1000 second)
Examples:
>>> seconds_to_milliseconds(0.5)
500
>>> seconds_to_milliseconds(1.0)
1000
>>> seconds_to_milliseconds(0.2)
200
"""
return int(seconds * 1000)
def centiseconds_to_seconds(centiseconds: int) -> float:
"""
Convert ASS karaoke centiseconds to seconds.
Args:
centiseconds: Duration in centiseconds
Returns:
Seconds
"""
return centiseconds / 100.0
def milliseconds_to_seconds(milliseconds: int) -> float:
"""
Convert ASS animation milliseconds to seconds.
Args:
milliseconds: Duration in milliseconds
Returns:
Seconds
"""
return milliseconds / 1000.0
def format_ass_timestamp(seconds: float) -> str:
"""
Format seconds as ASS timestamp (H:MM:SS.CC).
Args:
seconds: Time in seconds
Returns:
ASS timestamp string
Examples:
>>> format_ass_timestamp(1.5)
'0:00:01.50'
>>> format_ass_timestamp(65.25)
'0:01:05.25'
>>> format_ass_timestamp(3661.0)
'1:01:01.00'
"""
hours = int(seconds // 3600)
minutes = int((seconds % 3600) // 60)
secs = int(seconds % 60)
centis = int((seconds % 1) * 100)
return f"{hours}:{minutes:02d}:{secs:02d}.{centis:02d}"
def parse_ass_timestamp(timestamp: str) -> float:
"""
Parse ASS timestamp to seconds.
Args:
timestamp: ASS timestamp like "0:00:05.50"
Returns:
Time in seconds
Examples:
>>> parse_ass_timestamp("0:00:01.50")
1.5
>>> parse_ass_timestamp("0:01:05.25")
65.25
"""
parts = timestamp.split(":")
hours = int(parts[0])
minutes = int(parts[1])
sec_parts = parts[2].split(".")
seconds = int(sec_parts[0])
centiseconds = int(sec_parts[1]) if len(sec_parts) > 1 else 0
total = hours * 3600 + minutes * 60 + seconds + centiseconds / 100.0
return total
# Quick conversion constants
SECOND_TO_CS = 100 # Centiseconds per second
SECOND_TO_MS = 1000 # Milliseconds per second
CS_TO_MS = 10 # Milliseconds per centisecond
| Human Readable | Seconds | Centiseconds (ASS \k) | Milliseconds (ASS \t) |
|---|---|---|---|
| 100ms (flash) | 0.1 | 10 | 100 |
| 250ms (quick) | 0.25 | 25 | 250 |
| 500ms (half second) | 0.5 | 50 | 500 |
| 1 second | 1.0 | 100 | 1000 |
| 1.5 seconds | 1.5 | 150 | 1500 |
| 2 seconds | 2.0 | 200 | 2000 |
| Parameter | Python Type | Description | Example |
|---|---|---|---|
| text | str | Text to display | text='Hello World' |
| textfile | str (path) | Path to text file | textfile='/path/to/text.txt' |
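When the caption contains characters that drawtext treats specially (colons, quotes, percent signs), writing it to a file and passing textfile is often the safer route; a minimal sketch using a temporary file:

import tempfile
import ffmpeg

caption = 'Price: 100% "official"'
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as f:
    f.write(caption)
    caption_path = f.name

stream = ffmpeg.input("input.mp4").drawtext(
    textfile=caption_path,  # sidesteps drawtext escaping rules for ':', '%', quotes
    fontsize=36,
    fontcolor="white"
)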
| Parameter | Python Type | Range/Format | Validation | Example |
|---|---|---|---|---|
| fontfile | str (path) | Absolute path to .ttf/.otf | File exists | fontfile='/fonts/Arial.ttf' |
| fontsize | int | 1-999 (practical: 12-200) | 1 <= size <= 999 | fontsize=48 |
| fontcolor | str | Named or hex RGB | Valid color | fontcolor='white' |
| fontcolor_expr | str (expression) | Dynamic color expression | Valid FFmpeg expr | fontcolor_expr='0xFFFFFF' |
CRITICAL: fontsize MUST be int, not str. Common error:
# ❌ WRONG:
drawtext(fontsize="24")
# ✅ CORRECT:
drawtext(fontsize=24)
| Parameter | Python Type | Format | Example |
|---|---|---|---|
| x | str or int | Pixel value or expression | x=10 or x='(w-tw)/2' |
| y | str or int | Pixel value or expression | y=50 or y='h-th-20' |
Position Expression Variables:
- w: Video width (pixels)
- h: Video height (pixels)
- tw: Text width (pixels)
- th: Text height (pixels)
- t: Time in seconds

# Static position (int)
x=100, y=50
# Dynamic position (str expression)
x='(w-tw)/2' # Centered horizontally
y='h-th-20' # 20px from bottom
# Time-based animation (str expression)
x='w-mod(t*100,w+tw)' # Scrolling ticker
| Parameter | Python Type | Range | Example |
|---|---|---|---|
| borderw | int | 0-20 (practical) | borderw=3 |
| bordercolor | str | Named or hex RGB | bordercolor='black' |
| shadowx | int | -50 to 50 (practical) | shadowx=2 |
| shadowy | int | -50 to 50 (practical) | shadowy=2 |
| shadowcolor | str | Named or hex RGB | shadowcolor='black' |
| box | int | 0 (off) or 1 (on) | box=1 |
| boxcolor | str | Named/hex + alpha | boxcolor='black@0.5' |
| boxborderw | int | 0-50 | boxborderw=5 |
Alpha Transparency in Colors:
# Format: "color@opacity"
# opacity: 0.0 (transparent) to 1.0 (opaque)
boxcolor='black@0.5' # 50% transparent black
shadowcolor='red@0.3' # 30% opaque red
fontcolor='white@0.8' # 80% opaque white
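Putting the styling parameters together, a hedged sketch of a lower-third caption with border, shadow, and a semi-transparent background box (file name is a placeholder):

import ffmpeg

styled = ffmpeg.input("input.mp4").drawtext(
    text="Lower third caption",
    fontsize=42,               # int
    fontcolor="white",
    x=40,
    y="h-th-60",
    borderw=3,                 # int
    bordercolor="black",
    shadowx=2,
    shadowy=2,
    shadowcolor="black@0.5",   # color@opacity
    box=1,                     # 0 or 1
    boxcolor="black@0.4",
    boxborderw=10
)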
| Parameter | Python Type | Format | Example |
|---|---|---|---|
| enable | str (expression) | Boolean expression | enable='gte(t,2)' |
| alpha | str (expression) | 0.0-1.0 | alpha='min(1,t/2)' |
Timing Expressions:
# Show after 2 seconds
enable='gte(t,2)'
# Show between 2-5 seconds
enable='between(t,2,5)'
# Fade in over 1 second
alpha='min(1,t)'
# Fade out over last 2 seconds (10s video)
alpha='if(gt(t,8),1-(t-8)/2,1)'
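These expressions are passed verbatim as strings and FFmpeg evaluates t at render time; a minimal sketch combining enable and alpha (file name is a placeholder):

import ffmpeg

timed = ffmpeg.input("input.mp4").drawtext(
    text="Appears at 2s, fades in over 1s",
    fontsize=48,
    fontcolor="white",
    x="(w-tw)/2",
    y="(h-th)/2",
    enable="gte(t,2)",                # visible from t=2s onward
    alpha="if(lt(t,2),0,min(1,t-2))"  # alpha ramps 0 -> 1 during the second after t=2
)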
from typing import NamedTuple
class ASSStyle(NamedTuple):
"""Type-safe ASS style definition."""
name: str
fontname: str
fontsize: int
primary_colour: str # &HAABBGGRR format
secondary_colour: str # &HAABBGGRR format
outline_colour: str # &HAABBGGRR format
back_colour: str # &HAABBGGRR format (shadow)
bold: int # 0 or -1 (FFmpeg quirk: -1 for bold)
italic: int # 0 or 1
underline: int # 0 or 1
strikeout: int # 0 or 1
scale_x: int # Percentage (100 = normal)
scale_y: int # Percentage (100 = normal)
spacing: int # Letter spacing in pixels
angle: float # Rotation angle in degrees
border_style: int # 1 (outline) or 3 (opaque box)
outline: float # Outline width (0.0-4.0)
shadow: float # Shadow distance (0.0-4.0)
alignment: int # Numpad alignment (1-9)
margin_l: int # Left margin (pixels)
margin_r: int # Right margin (pixels)
margin_v: int # Vertical margin (pixels)
encoding: int # Character encoding (1=UTF-8)
def ass_style_to_string(style: ASSStyle) -> str:
"""
Convert ASSStyle to ASS format string.
Returns:
ASS Style line
"""
return (
f"Style: {style.name},"
f"{style.fontname},{style.fontsize},"
f"{style.primary_colour},{style.secondary_colour},"
f"{style.outline_colour},{style.back_colour},"
f"{style.bold},{style.italic},{style.underline},{style.strikeout},"
f"{style.scale_x},{style.scale_y},{style.spacing},{style.angle},"
f"{style.border_style},{style.outline},{style.shadow},"
f"{style.alignment},"
f"{style.margin_l},{style.margin_r},{style.margin_v},"
f"{style.encoding}"
)
# Example usage
karaoke_style = ASSStyle(
name="Karaoke",
fontname="Arial Black",
fontsize=72,
primary_colour="&H00FFFFFF", # White text (unhighlighted)
secondary_colour="&H000000FF", # Red text (highlighted)
outline_colour="&H00000000", # Black outline
back_colour="&H80000000", # 50% transparent black shadow
bold=-1, # Bold (FFmpeg uses -1)
italic=0,
underline=0,
strikeout=0,
scale_x=100,
scale_y=100,
spacing=0,
angle=0.0,
border_style=1, # Outline + shadow
outline=3.0, # 3px outline
shadow=2.0, # 2px shadow
alignment=2, # Bottom center
margin_l=10,
margin_r=10,
margin_v=50, # 50px from bottom
encoding=1 # UTF-8
)
print(ass_style_to_string(karaoke_style))
| Parameter | Python Type | Range | Unit | Notes |
|---|---|---|---|---|
| fontsize | int | 1-999 | Points | Screen-relative |
| primary_colour | str | &H00000000 - &HFFFFFFFF | ABGR hex | Text color |
| secondary_colour | str | &H00000000 - &HFFFFFFFF | ABGR hex | Karaoke fill |
| outline_colour | str | &H00000000 - &HFFFFFFFF | ABGR hex | Border color |
| back_colour | str | &H00000000 - &HFFFFFFFF | ABGR hex | Shadow color |
| bold | int | -1 (on), 0 (off) | Boolean | FFmpeg quirk: -1 for bold |
| italic | int | 0, 1 | Boolean | Standard |
| scale_x, scale_y | int | 1-1000 | Percentage | 100 = normal |
| outline | float | 0.0-4.0 | Pixels | Border width |
| shadow | float | 0.0-4.0 | Pixels | Shadow offset |
| alignment | int | 1-9 | Numpad | See alignment chart |
7 (top-left) 8 (top-center) 9 (top-right)
4 (middle-left) 5 (middle-center) 6 (middle-right)
1 (bottom-left) 2 (bottom-center) 3 (bottom-right)
ASS_ALIGNMENT = {
"bottom_left": 1,
"bottom_center": 2,
"bottom_right": 3,
"middle_left": 4,
"middle_center": 5,
"middle_right": 6,
"top_left": 7,
"top_center": 8,
"top_right": 9,
}
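A small hedged helper (the function name is illustrative) that turns a named position into an inline \an override tag for a single dialogue line:

def an_tag(position: str) -> str:
    """Return an inline alignment override such as '{\\an8}' for top-center."""
    return f"{{\\an{ASS_ALIGNMENT[position]}}}"

# Pins just this line to the top-center of the frame, overriding the style's alignment
line_text = an_tag("top_center") + "Breaking news"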
| Tag | Name | Unit | Python Type | Range | Effect |
|---|---|---|---|---|---|
| \k | Karaoke | Centiseconds | int | 0-9999 | Instant highlight |
| \kf / \K | Karaoke Fill | Centiseconds | int | 0-9999 | Progressive fill |
| \ko | Karaoke Outline | Centiseconds | int | 0-9999 | Outline sweep |
def generate_karaoke_line(
words: list[str],
durations: list[float], # In SECONDS
style: str = "Karaoke"
) -> str:
"""
Generate ASS karaoke dialogue line.
Args:
words: List of words/syllables
durations: Duration for each word IN SECONDS
style: ASS style name
Returns:
ASS dialogue line with karaoke tags
Example:
>>> generate_karaoke_line(
... ["Hello", "world"],
... [0.5, 0.6]
... )
'{\\k50}Hello {\\k60}world'
"""
# Convert seconds to centiseconds
karaoke_tags = []
for word, duration_sec in zip(words, durations):
cs = int(duration_sec * 100) # Centiseconds
karaoke_tags.append(f"{{\\k{cs}}}{word}")
return " ".join(karaoke_tags)
# Example usage
words = ["Never", "gonna", "give", "you", "up"]
durations = [0.8, 0.6, 0.6, 0.5, 0.7] # seconds
karaoke_text = generate_karaoke_line(words, durations)
print(karaoke_text)
# Output: {\k80}Never {\k60}gonna {\k60}give {\k50}you {\k70}up
| Tag | Format | Unit | Example | Description |
|---|---|---|---|---|
| \t | \t(t1,t2,tags) | Milliseconds | \t(0,500,\fscx120) | Animate over time |
| \fad | \fad(in,out) | Milliseconds | \fad(300,200) | Fade in/out |
| \move | \move(x1,y1,x2,y2,t1,t2) | Milliseconds | \move(0,0,100,100,0,1000) | Move position |
| \fscx, \fscy | \fscxN, \fscyN | Percentage | \fscx120\fscy120 | Scale X/Y |
| \frz | \frzN | Degrees | \frz45 | Rotate (Z-axis) |
| \c, \1c | \c&HBBGGRR& | BGR hex | \c&HFF0000& | Primary color |
| \3c | \3c&HBBGGRR& | BGR hex | \3c&H000000& | Outline color |
| \4c | \4c&HBBGGRR& | BGR hex | \4c&H808080& | Shadow color |
from typing import List, Tuple
def create_scale_animation(
duration_ms: int,
start_scale: int = 80,
peak_scale: int = 120,
end_scale: int = 100
) -> str:
"""
Create bounce scale animation (pop effect).
Args:
duration_ms: Total animation duration in MILLISECONDS
start_scale: Initial scale percentage
peak_scale: Peak scale (overshoot)
end_scale: Final settled scale
Returns:
ASS animation tags
Example:
>>> create_scale_animation(400)
'\\fscx80\\fscy80\\t(0,150,\\fscx120\\fscy120)\\t(150,300,\\fscx95\\fscy95)\\t(300,400,\\fscx100\\fscy100)'
"""
t1 = int(duration_ms * 0.375) # 37.5% to peak
t2 = int(duration_ms * 0.75) # 75% to settle
t3 = duration_ms
    mid_scale = end_scale - 5  # Slight undershoot before settling, as in the docstring example
return (
f"\\fscx{start_scale}\\fscy{start_scale}"
f"\\t(0,{t1},\\fscx{peak_scale}\\fscy{peak_scale})"
f"\\t({t1},{t2},\\fscx{mid_scale}\\fscy{mid_scale})"
f"\\t({t2},{t3},\\fscx{end_scale}\\fscy{end_scale})"
)
def create_fade_animation(
fade_in_ms: int,
fade_out_ms: int = 0
) -> str:
"""
Create fade in/out animation.
Args:
fade_in_ms: Fade in duration in MILLISECONDS
fade_out_ms: Fade out duration in MILLISECONDS (0 = no fade out)
Returns:
ASS fade tag
Example:
>>> create_fade_animation(300, 200)
'\\fad(300,200)'
"""
return f"\\fad({fade_in_ms},{fade_out_ms})"
def create_color_transition(
start_color_rgb: Tuple[int, int, int],
end_color_rgb: Tuple[int, int, int],
duration_ms: int
) -> str:
"""
Create smooth color transition animation.
Args:
start_color_rgb: Starting RGB color
end_color_rgb: Ending RGB color
duration_ms: Transition duration in MILLISECONDS
Returns:
ASS animation tags
Example:
>>> create_color_transition((255,255,255), (255,0,0), 1000)
'\\c&H00FFFFFF&\\t(0,1000,\\c&H000000FF&)'
"""
    # rgb_to_ass_color returns "&HAABBGGRR"; append a trailing "&" to close the inline override
    start_ass = rgb_to_ass_color(*start_color_rgb) + "&"
    end_ass = rgb_to_ass_color(*end_color_rgb) + "&"
return f"\\c{start_ass}\\t(0,{duration_ms},\\c{end_ass})"
# Example: Complete animated karaoke line
def create_animated_karaoke_word(
word: str,
karaoke_duration_sec: float,
pop_animation: bool = True
) -> str:
"""
Create word with karaoke + pop animation.
Args:
word: Word text
karaoke_duration_sec: Karaoke fill duration in SECONDS
pop_animation: Add scale pop effect
Returns:
ASS karaoke word with animations
"""
karaoke_cs = int(karaoke_duration_sec * 100) # Centiseconds
karaoke_ms = int(karaoke_duration_sec * 1000) # Milliseconds
if pop_animation:
pop = create_scale_animation(karaoke_ms, 90, 115, 100)
return f"{{\\k{karaoke_cs}{pop}}}{word}"
else:
return f"{{\\k{karaoke_cs}}}{word}"
# Usage
animated_line = " ".join([
create_animated_karaoke_word("Never", 0.8, True),
create_animated_karaoke_word("gonna", 0.6, True),
create_animated_karaoke_word("give", 0.6, True),
create_animated_karaoke_word("you", 0.5, True),
create_animated_karaoke_word("up", 0.7, True),
])
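To actually display this line, it still has to be wrapped in an ASS Dialogue event; a minimal sketch reusing format_ass_timestamp from earlier (start time and style name are illustrative):

durations = [0.8, 0.6, 0.6, 0.5, 0.7]            # same per-word durations as above
start = format_ass_timestamp(1.0)                # line starts at 1.0 s (illustrative)
end = format_ass_timestamp(1.0 + sum(durations))
dialogue = f"Dialogue: 0,{start},{end},Karaoke,,0,0,0,,{animated_line}"
# e.g. Dialogue: 0,0:00:01.00,0:00:04.20,Karaoke,,0,0,0,,{\k80\fscx90...}Never ...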
import ffmpeg
from typing import Optional, Union
def apply_drawtext_filter(
input_stream,
text: str,
fontsize: int,
fontcolor: str = "white",
x: Union[str, int] = 10,
y: Union[str, int] = 10,
fontfile: Optional[str] = None,
borderw: int = 0,
bordercolor: str = "black",
shadowx: int = 0,
shadowy: int = 0,
shadowcolor: str = "black",
box: int = 0,
boxcolor: str = "black@0.5",
boxborderw: int = 0,
enable: Optional[str] = None,
alpha: Optional[str] = None
):
"""
Apply drawtext filter with type safety.
Args:
input_stream: ffmpeg input stream
text: Text to display
fontsize: Font size in points (int, 1-999)
fontcolor: Color name or hex string
x: X position (int or expression string)
y: Y position (int or expression string)
fontfile: Path to font file (optional)
borderw: Border width (int, 0-20)
bordercolor: Border color
shadowx: Shadow X offset (int)
shadowy: Shadow Y offset (int)
shadowcolor: Shadow color
box: Enable background box (0 or 1)
boxcolor: Box color with alpha (e.g., "black@0.5")
boxborderw: Box border width
enable: Enable expression (e.g., "gte(t,2)")
alpha: Alpha expression (e.g., "min(1,t)")
Returns:
ffmpeg stream with drawtext filter applied
Raises:
TypeError: If parameters have incorrect types
ValueError: If parameters are out of valid range
"""
# Type validation
if not isinstance(fontsize, int):
raise TypeError(f"fontsize must be int, got {type(fontsize).__name__}")
if not (1 <= fontsize <= 999):
raise ValueError(f"fontsize must be 1-999, got {fontsize}")
if not isinstance(borderw, int) or borderw < 0:
raise ValueError(f"borderw must be non-negative int, got {borderw}")
if not isinstance(box, int) or box not in (0, 1):
raise ValueError(f"box must be 0 or 1, got {box}")
# Build filter parameters
params = {
"text": text,
"fontsize": fontsize,
"fontcolor": fontcolor,
"x": x,
"y": y,
"borderw": borderw,
"bordercolor": bordercolor,
"box": box,
}
# Optional parameters
if fontfile:
params["fontfile"] = fontfile
if shadowx != 0:
params["shadowx"] = shadowx
if shadowy != 0:
params["shadowy"] = shadowy
params["shadowcolor"] = shadowcolor
if box == 1:
params["boxcolor"] = boxcolor
if boxborderw > 0:
params["boxborderw"] = boxborderw
if enable:
params["enable"] = enable
if alpha:
params["alpha"] = alpha
return input_stream.drawtext(**params)
# Example usage
input_file = ffmpeg.input("input.mp4")
output = apply_drawtext_filter(
input_file,
text="Hello World",
fontsize=48,
fontcolor="white",
x="(w-tw)/2",
y="(h-th)/2",
borderw=2,
bordercolor="black",
shadowx=2,
shadowy=2,
enable="between(t,1,5)"
)
output = ffmpeg.output(output, "output.mp4")
ffmpeg.run(output)
import ffmpeg
from pathlib import Path
def add_subtitles_with_audio(
input_video: str,
output_video: str,
subtitle_text: str,
fontsize: int = 48,
crf: int = 18,
audio_codec: str = "aac",
audio_bitrate: str = "192k"
):
"""
Add burned-in subtitles while preserving audio.
CRITICAL: Always explicitly handle audio stream to prevent loss.
Args:
input_video: Input video path
output_video: Output video path
subtitle_text: Text to display
fontsize: Font size (int)
crf: Constant Rate Factor for H.264 (int, 0-51)
audio_codec: Audio codec (default: "aac")
audio_bitrate: Audio bitrate (default: "192k")
"""
# Input
input_file = ffmpeg.input(input_video)
# Video processing
video = input_file.video.drawtext(
text=subtitle_text,
fontsize=fontsize,
fontcolor="white",
x="(w-tw)/2",
y="h-th-50",
borderw=2,
bordercolor="black"
)
# Audio passthrough (CRITICAL)
audio = input_file.audio
# Output with both streams
output = ffmpeg.output(
video,
audio,
output_video,
vcodec="libx264",
crf=crf,
acodec=audio_codec,
audio_bitrate=audio_bitrate
)
# Run
output = output.overwrite_output()
ffmpeg.run(output)
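Example call (file names and text are placeholders):

add_subtitles_with_audio(
    input_video="lecture.mp4",
    output_video="lecture_subtitled.mp4",
    subtitle_text="Chapter 1: Introduction",
    fontsize=40,   # int
    crf=20,        # int, 0-51 for H.264
)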
import subprocess
import numpy as np
from typing import Callable, Generator, Tuple
def read_video_frames(
input_path: str,
width: int,
height: int,
pix_fmt: str = "rgb24"
) -> Generator[np.ndarray, None, None]:
"""
Read video frames using FFmpeg subprocess.
Args:
input_path: Input video file path
width: Frame width
height: Frame height
pix_fmt: Pixel format ("rgb24" or "bgr24")
Yields:
NumPy array frames (height, width, 3)
Example:
>>> for frame in read_video_frames("input.mp4", 1920, 1080):
... # Process frame (RGB format)
... processed = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
"""
# FFmpeg command
cmd = [
"ffmpeg",
"-i", input_path,
"-f", "rawvideo",
"-pix_fmt", pix_fmt,
"-" # Output to stdout
]
process = subprocess.Popen(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.DEVNULL,
bufsize=10**8
)
frame_size = width * height * 3
try:
while True:
raw_frame = process.stdout.read(frame_size)
if len(raw_frame) != frame_size:
break
# Convert to NumPy array
frame = np.frombuffer(raw_frame, dtype=np.uint8)
frame = frame.reshape((height, width, 3))
yield frame
finally:
process.stdout.close()
process.wait()
def write_video_frames(
output_path: str,
width: int,
height: int,
fps: float = 30.0,
pix_fmt: str = "rgb24",
crf: int = 18
) -> subprocess.Popen:
"""
Create FFmpeg process for writing frames.
Args:
output_path: Output video file path
width: Frame width
height: Frame height
fps: Frames per second
pix_fmt: Pixel format ("rgb24" or "bgr24")
crf: Constant Rate Factor (0-51)
Returns:
subprocess.Popen instance (write frames to .stdin)
Example:
>>> writer = write_video_frames("output.mp4", 1920, 1080, 30)
>>> for frame in frames:
... writer.stdin.write(frame.tobytes())
>>> writer.stdin.close()
>>> writer.wait()
"""
cmd = [
"ffmpeg",
"-y", # Overwrite output
"-f", "rawvideo",
"-vcodec", "rawvideo",
"-s", f"{width}x{height}",
"-pix_fmt", pix_fmt,
"-r", str(fps),
"-i", "-", # Read from stdin
"-c:v", "libx264",
"-preset", "fast",
"-crf", str(crf),
"-pix_fmt", "yuv420p",
output_path
]
process = subprocess.Popen(
cmd,
stdin=subprocess.PIPE,
stderr=subprocess.DEVNULL
)
return process
# Complete pipeline example
def process_video_frames(
input_path: str,
output_path: str,
    process_fn: Callable[[np.ndarray], np.ndarray]  # per-frame transform, uint8 in/out
):
"""
Read, process, and write video frames.
Args:
input_path: Input video
output_path: Output video
process_fn: Function to process each frame
"""
# Get video dimensions
import cv2
cap = cv2.VideoCapture(input_path)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS)
cap.release()
# Create writer
writer = write_video_frames(output_path, width, height, fps)
try:
# Process frames
for frame in read_video_frames(input_path, width, height):
processed = process_fn(frame)
writer.stdin.write(processed.tobytes())
finally:
writer.stdin.close()
writer.wait()
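A trivial process_fn for illustration only: it darkens each frame with NumPy while keeping the uint8 dtype and shape the raw-video pipe expects:

import numpy as np

def darken(frame: np.ndarray) -> np.ndarray:
    """Scale brightness to 70%; output stays uint8 with shape (height, width, 3)."""
    return (frame * 0.7).astype(np.uint8)

# process_video_frames("input.mp4", "darker.mp4", darken)  # file names are placeholders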
# ❌ WRONG - passing string for int parameter
import ffmpeg
ffmpeg.input("input.mp4").drawtext(
text="Hello",
fontsize="24" # ❌ Should be int
).output("output.mp4")
# ✅ CORRECT
ffmpeg.input("input.mp4").drawtext(
text="Hello",
fontsize=24 # ✅ int
).output("output.mp4")
# ❌ WRONG - using RGB for ASS color
def wrong_ass_color(r: int, g: int, b: int) -> str:
return f"&H00{r:02X}{g:02X}{b:02X}" # ❌ RGB order
# ✅ CORRECT - BGR order for ASS
def correct_ass_color(r: int, g: int, b: int) -> str:
return f"&H00{b:02X}{g:02X}{r:02X}" # ✅ BGR order
# Example: Pure red
print(wrong_ass_color(255, 0, 0)) # ❌ "&H00FF0000" = Blue in ASS!
print(correct_ass_color(255, 0, 0)) # ✅ "&H000000FF" = Red in ASS
# ❌ WRONG - mixing units
def wrong_karaoke_timing():
# Karaoke tag uses centiseconds
# Animation uses milliseconds - DIFFERENT!
return r"{\k100\t(0,100,\fscx120)}Word" # ❌ Mismatch!
# \k100 = 1 second
# \t(0,100,...) = 0.1 seconds (100ms)
# ✅ CORRECT - consistent timing
def correct_karaoke_timing(duration_sec: float):
cs = int(duration_sec * 100) # Karaoke: centiseconds
ms = int(duration_sec * 1000) # Animation: milliseconds
return rf"{{\k{cs}\t(0,{ms},\fscx120)}}Word"
print(correct_karaoke_timing(1.0))
# Output: {\k100\t(0,1000,\fscx120)}Word
# Both tags now represent 1 second ✅
# ❌ WRONG - expression without quotes
import ffmpeg
ffmpeg.input("input.mp4").drawtext(
text="Hello",
x=(w-tw)/2 # ❌ Python evaluates this, causes NameError
)
# ✅ CORRECT - expression as string
ffmpeg.input("input.mp4").drawtext(
text="Hello",
x="(w-tw)/2" # ✅ FFmpeg evaluates this at runtime
)
# ❌ WRONG - audio is silently dropped
import ffmpeg
input_file = ffmpeg.input("input.mp4")
(
input_file
.filter("scale", 1280, 720) # ❌ Only video stream
.output("output.mp4")
.run()
)
# ✅ CORRECT - explicitly handle audio
input_file = ffmpeg.input("input.mp4")
video = input_file.video.filter("scale", 1280, 720)
audio = input_file.audio # ✅ Preserve audio
ffmpeg.output(video, audio, "output.mp4").run()
# ❌ WRONG - using 1 for bold (works in some renderers, not all)
ass_style = "Style: Default,Arial,48,&H00FFFFFF,...,1,..."   # ❌ May not render as bold
# ✅ CORRECT - the ASS spec (and FFmpeg/libass) expects -1 for bold
ass_style = "Style: Default,Arial,48,&H00FFFFFF,...,-1,..."  # ✅ Proper bold
from typing import Union
import re
def validate_fontsize(size: int) -> int:
"""Validate fontsize parameter."""
if not isinstance(size, int):
raise TypeError(f"fontsize must be int, got {type(size).__name__}")
if not (1 <= size <= 999):
raise ValueError(f"fontsize must be 1-999, got {size}")
return size
def validate_crf(crf: int, codec: str = "h264") -> int:
"""Validate CRF parameter."""
if not isinstance(crf, int):
raise TypeError(f"crf must be int, got {type(crf).__name__}")
ranges = {
"h264": (0, 51),
"h265": (0, 51),
"hevc": (0, 51),
"vp9": (0, 63),
"av1": (0, 63),
}
min_crf, max_crf = ranges.get(codec, (0, 51))
if not (min_crf <= crf <= max_crf):
raise ValueError(f"crf for {codec} must be {min_crf}-{max_crf}, got {crf}")
return crf
def validate_ass_color(color: str) -> str:
"""Validate ASS color format."""
pattern = r"^&H[0-9A-Fa-f]{8}$"
if not re.match(pattern, color):
raise ValueError(f"Invalid ASS color format: {color} (expected &HAABBGGRR)")
return color
def validate_alignment(alignment: int) -> int:
"""Validate ASS alignment (1-9 numpad)."""
if not isinstance(alignment, int):
raise TypeError(f"alignment must be int, got {type(alignment).__name__}")
if not (1 <= alignment <= 9):
raise ValueError(f"alignment must be 1-9, got {alignment}")
return alignment
def validate_ffmpeg_color(color: str) -> str:
"""Validate FFmpeg color (named or hex)."""
named_colors = {
"white", "black", "red", "green", "blue",
"yellow", "cyan", "magenta", "orange", "purple"
}
if color.lower() in named_colors:
return color
# Validate hex format
hex_pattern = r"^(#|0x)?[0-9A-Fa-f]{6}$"
if re.match(hex_pattern, color):
return color
raise ValueError(f"Invalid FFmpeg color: {color}")
def validate_time_expression(expr: str) -> str:
"""Validate FFmpeg time expression syntax."""
    # FFmpeg evaluates expressions at render time; the main Python-side mistake
    # is passing an unquoted expression like (w-tw)/2 instead of a string.
    if not isinstance(expr, str):
        raise TypeError(f"Time expression must be str, got {type(expr).__name__}")
    return expr
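Typical usage is to validate user-supplied values once, up front, so a bad parameter fails with a clear Python error instead of a cryptic FFmpeg message; a brief sketch:

# Validate once before building any FFmpeg command or ASS style
fontsize = validate_fontsize(48)
fontcolor = validate_ffmpeg_color("white")
crf = validate_crf(18, "h264")
alignment = validate_alignment(2)                   # bottom center
primary_colour = validate_ass_color("&H00FFFFFF")   # white, ABGR order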
import ffmpeg
from typing import List, Tuple
from pathlib import Path
class KaraokeGenerator:
"""Type-safe karaoke subtitle generator."""
def __init__(
self,
video_path: str,
output_path: str,
font_size: int = 72,
font_name: str = "Arial Black",
text_color_rgb: Tuple[int, int, int] = (255, 255, 255),
highlight_color_rgb: Tuple[int, int, int] = (255, 0, 0)
):
self.video_path = video_path
self.output_path = output_path
self.font_size = validate_fontsize(font_size)
self.font_name = font_name
self.text_color = rgb_to_ass_color(*text_color_rgb)
self.highlight_color = rgb_to_ass_color(*highlight_color_rgb)
self.lyrics: List[Tuple[float, List[Tuple[str, float]]]] = []
def add_line(
self,
start_time: float,
words: List[str],
durations: List[float]
):
"""
Add karaoke line.
Args:
start_time: Line start time in SECONDS
words: List of words
durations: Duration for each word in SECONDS
"""
if len(words) != len(durations):
raise ValueError("words and durations must have same length")
self.lyrics.append((start_time, list(zip(words, durations))))
def generate_ass(self) -> str:
"""Generate complete ASS subtitle file."""
lines = [
"[Script Info]",
"Title: Karaoke Subtitles",
"ScriptType: v4.00+",
"PlayResX: 1920",
"PlayResY: 1080",
"",
"[V4+ Styles]",
"Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, "
"OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, "
"ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, "
"Alignment, MarginL, MarginR, MarginV, Encoding",
(
f"Style: Karaoke,{self.font_name},{self.font_size},"
f"{self.text_color},{self.highlight_color},"
"&H00000000,&H80000000,"
"-1,0,0,0,100,100,0,0,1,3,2,2,10,10,50,1"
),
"",
"[Events]",
"Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text"
]
# Generate dialogue lines
for start_time, word_durations in self.lyrics:
# Calculate end time
total_duration = sum(dur for _, dur in word_durations)
end_time = start_time + total_duration
# Generate karaoke tags (centiseconds)
karaoke_text = ""
for word, duration_sec in word_durations:
cs = seconds_to_centiseconds(duration_sec)
karaoke_text += f"{{\\k{cs}}}{word} "
karaoke_text = karaoke_text.strip()
# Format timestamps
start_str = format_ass_timestamp(start_time)
end_str = format_ass_timestamp(end_time)
dialogue = f"Dialogue: 0,{start_str},{end_str},Karaoke,,0,0,0,,{karaoke_text}"
lines.append(dialogue)
return "\n".join(lines)
def render(self, crf: int = 18):
"""Render video with karaoke subtitles."""
# Generate ASS file
ass_content = self.generate_ass()
ass_path = Path(self.output_path).with_suffix(".ass")
ass_path.write_text(ass_content, encoding="utf-8")
# Apply subtitles with ffmpeg-python
input_file = ffmpeg.input(self.video_path)
video = input_file.video.filter("ass", str(ass_path))
audio = input_file.audio
output = ffmpeg.output(
video,
audio,
self.output_path,
vcodec="libx264",
crf=validate_crf(crf),
acodec="aac"
)
ffmpeg.run(output.overwrite_output())
print(f"✅ Karaoke video saved: {self.output_path}")
# Usage example
karaoke = KaraokeGenerator(
"input.mp4",
"karaoke_output.mp4",
font_size=80,
text_color_rgb=(255, 255, 255), # White
highlight_color_rgb=(255, 0, 0) # Red
)
# Add lyrics
karaoke.add_line(
start_time=1.0,
words=["Never", "gonna", "give", "you", "up"],
durations=[0.8, 0.6, 0.6, 0.5, 0.7]
)
karaoke.add_line(
start_time=4.3,
words=["Never", "gonna", "let", "you", "down"],
durations=[0.8, 0.6, 0.6, 0.5, 0.7]
)
# Render
karaoke.render(crf=18)
import ffmpeg
from typing import Optional
class TextOverlay:
"""Type-safe text overlay builder."""
def __init__(self, text: str):
self.text = text
self.fontsize: int = 48
self.fontcolor: str = "white"
self.x: str = "10"
self.y: str = "10"
self.borderw: int = 0
self.bordercolor: str = "black"
        self.shadowx: int = 0
        self.shadowy: int = 0
        self.shadowcolor: str = "black"
self.enable: Optional[str] = None
self.alpha: Optional[str] = None
def set_size(self, size: int) -> 'TextOverlay':
"""Set font size (fluent API)."""
self.fontsize = validate_fontsize(size)
return self
def set_color(self, color: str) -> 'TextOverlay':
"""Set font color (fluent API)."""
self.fontcolor = validate_ffmpeg_color(color)
return self
def center(self) -> 'TextOverlay':
"""Center text horizontally and vertically."""
self.x = "(w-tw)/2"
self.y = "(h-th)/2"
return self
def set_border(self, width: int, color: str = "black") -> 'TextOverlay':
"""Add text border."""
if not isinstance(width, int) or width < 0:
raise ValueError(f"Border width must be non-negative int")
self.borderw = width
self.bordercolor = validate_ffmpeg_color(color)
return self
    def set_shadow(self, x: int, y: int, color: str = "black") -> 'TextOverlay':
        """Add text shadow."""
        self.shadowx = x
        self.shadowy = y
        self.shadowcolor = validate_ffmpeg_color(color)
        return self
def fade_in(self, duration: float) -> 'TextOverlay':
"""Add fade-in effect."""
self.alpha = f"min(1,t/{duration})"
return self
def show_between(self, start: float, end: float) -> 'TextOverlay':
"""Show text between specific times."""
self.enable = f"between(t,{start},{end})"
return self
def apply(self, stream) -> ffmpeg.Stream:
"""Apply text overlay to ffmpeg stream."""
params = {
"text": self.text,
"fontsize": self.fontsize,
"fontcolor": self.fontcolor,
"x": self.x,
"y": self.y,
}
if self.borderw > 0:
params["borderw"] = self.borderw
params["bordercolor"] = self.bordercolor
        if self.shadowx != 0 or self.shadowy != 0:
            params["shadowx"] = self.shadowx
            params["shadowy"] = self.shadowy
            params["shadowcolor"] = self.shadowcolor
if self.enable:
params["enable"] = self.enable
if self.alpha:
params["alpha"] = self.alpha
return stream.drawtext(**params)
# Usage with fluent API
input_file = ffmpeg.input("input.mp4")
overlay = (
TextOverlay("Hello World")
.set_size(72)
.set_color("white")
.center()
.set_border(3, "black")
.set_shadow(2, 2, "black")
.fade_in(1.0)
.show_between(1.0, 5.0)
)
output = overlay.apply(input_file)
output = ffmpeg.output(output, "output.mp4", vcodec="libx264", crf=18)
ffmpeg.run(output.overwrite_output())
When integrating FFmpeg with Python:
- fontsize → int (not str)
- crf → int (not float)
- b:v, audio_bitrate → str with unit ("5M", "192k")
- Colors → str (named or hex)
- Position/timing expressions → str (quoted)
- Time → float for seconds, int for frames
- Always preserve the audio stream explicitly (.audio)

This reference ensures type-safe, bug-free Python-FFmpeg integration with accurate parameter mappings and proper unit conversions.
This skill should be used when the user asks to "create a slash command", "add a command", "write a custom command", "define command arguments", "use command frontmatter", "organize commands", "create command with file references", "interactive command", "use AskUserQuestion in command", or needs guidance on slash command structure, YAML frontmatter fields, dynamic arguments, bash execution in commands, user interaction patterns, or command development best practices for Claude Code.
This skill should be used when the user asks to "create an agent", "add an agent", "write a subagent", "agent frontmatter", "when to use description", "agent examples", "agent tools", "agent colors", "autonomous agent", or needs guidance on agent structure, system prompts, triggering conditions, or agent development best practices for Claude Code plugins.
This skill should be used when the user asks to "create a hook", "add a PreToolUse/PostToolUse/Stop hook", "validate tool use", "implement prompt-based hooks", "use ${CLAUDE_PLUGIN_ROOT}", "set up event-driven automation", "block dangerous commands", or mentions hook events (PreToolUse, PostToolUse, Stop, SubagentStop, SessionStart, SessionEnd, UserPromptSubmit, PreCompact, Notification). Provides comprehensive guidance for creating and implementing Claude Code plugin hooks with focus on advanced prompt-based hooks API.