Help us improve
Share bugs, ideas, or general feedback.
From alemtuzlak-skills
Turns raw talking-head/screen-share recordings into finished edited videos with transcript-synced overlays, then generates blog/social/YouTube content.
npx claudepluginhub alemtuzlak/skills --plugin self-improveHow this skill is triggered — by the user, by Claude, or both
Slash command
/alemtuzlak-skills:produce-videoThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Take a recording the user provides and turn it into a finished video. The recording is the spine and drives the timeline; this skill edits (silence + chosen mistakes/pacing) and augments (overlays), then renders and generates launch content. It does NOT author a synthetic video — for PR-driven synthetic promos use `hyperframes-video` instead.
references/frame-analysis.mdreferences/framing-safe-zones.mdreferences/mistake-detection.mdreferences/overlay-triggers.mdscripts/bake-cuts.mjsscripts/bake-cuts.test.mjsscripts/desilence.mjsscripts/desilence.test.mjsscripts/extract-frame.mjsscripts/extract-frame.test.mjstemplates/overlays/CodeSnippet.htmltemplates/overlays/Comparison.htmltemplates/overlays/Diagram.htmltemplates/overlays/ImageEmbed.htmltemplates/overlays/ListSteps.htmltemplates/overlays/OnScreenHighlight.htmltemplates/overlays/PunchInZoom.htmltemplates/overlays/SectionLabel.htmltemplates/overlays/WordHighlight.htmltemplates/project/DESIGN.mdAI-assisted video editing workflows that cut, structure, and augment real footage. Covers the full pipeline from raw capture through FFmpeg, Remotion, ElevenLabs, fal.ai, and final polish in Descript or CapCut.
AI-assisted video editing workflow for cutting, structuring, and enhancing real footage using FFmpeg, Remotion, ElevenLabs, and fal.ai. Activates when users want to edit video, trim clips, make vlogs, or build video content.
Guides end-to-end video production from ideation, scripting, and storyboarding through shooting, editing, sound design, publishing, and performance analysis for YouTube and brand content.
Share bugs, ideas, or general feedback.
Take a recording the user provides and turn it into a finished video. The recording is the spine and drives the timeline; this skill edits (silence + chosen mistakes/pacing) and augments (overlays), then renders and generates launch content. It does NOT author a synthetic video — for PR-driven synthetic promos use hyperframes-video instead.
Quality principle (non-negotiable): every footage step preserves the original quality. Never let a tool re-encode the footage at low quality. De-silence and cuts run at near-lossless CRF 12 from the original (auto-editor is used only to DETECT silence, never to encode — its default crushes footage to ~0.6 Mbps), and the render runs at CRF 12. CRF gives small files for static screencasts; that is full quality, not loss.
/transcribe-video — all transcription (returns word-level transcript.words.json). Invoke it; do not transcribe by hand.hyperframes — composition authoring rules (Visual Identity Gate, Layout Before Animation, timeline contract, video/overlay layering). Invoke before writing/mounting any composition HTML.hyperframes-cli — every npx hyperframes command (init, lint, inspect, preview, render).gsap — GSAP timeline/easing patterns for overlays.youtube-copy — YouTube title/description/tags/chapters (P6).blog-post — the blog post (P6).social-copy — X/Bluesky/LinkedIn/Reddit copy (P6).scripts/desilence.mjs — auto-editor silence removal + before/after duration.scripts/extract-frame.mjs — single-frame grab (framing detection + on-screen highlight).scripts/bake-cuts.mjs — ffmpeg trim/concat cutter (dense keyframes) used in P2.templates/overlays/*.html — 9 overlay sub-composition templates (filled per instance; sub-comps are standalone HTML docs in the installed CLI — read each file's header comment).templates/project/{DESIGN.md,styles.css} — Visual Identity Gate + brand var(--brand-*) tokens the overlays consume.references/ — mistake-detection.md, overlay-triggers.md, framing-safe-zones.md, frame-analysis.md.youtube-copy, blog-post, and social-copy skills (no vendored LLM script).npx hyperframes preview and opens the browser tab itself. NEVER ask the user to run preview or open a URL, and NEVER substitute a terminal-only summary for the live preview."Use sane defaults" / "don't ask questions" skips the configuration questions (P0) and the iteration prompting — it NEVER skips the approval gates, the agent-launched preview, or the explicit-render gate.
Verify the environment before doing any work; warn the user about anything missing and how to fix it.
auto-editor --version. If it is not installed, STOP and tell the user: "auto-editor is not installed - install it with pipx install auto-editor (or pip install auto-editor), then re-run." Do not attempt P1 without it.ffmpeg -version. If missing, STOP and tell the user to install ffmpeg and add it to PATH. The pipeline cannot run without it./transcribe-video): docker info. Warn if the daemon is down (start Docker Desktop).node --version. Warn if older.transcribe-video (P2/P3 transcription), hyperframes + hyperframes-cli + gsap (P2/P4/P5 composition & render), youtube-copy + social-copy (P6). blog-post only if the user opts into a blog (P6, optional).hyperframes Visual Identity Gate — detect from the repo, else ask 3 style questions; write templates/project/DESIGN.md + brand values into styles.css); overlay density (sparse / balanced / rich).<OUT>): everything this skill produces lives under ONE folder, default <recording-dir>/<video-slug>-produced/ (override with a user-supplied path). Create it now. Final structure:
<OUT>/
final.mp4 # rendered video (P5)
hyperframes/ # the HyperFrames project: index.html, styles.css, DESIGN.md, compositions/, renders/ (P4–P5)
transcript.txt # final edited transcript (P3 / P6)
transcript.srt
transcript.words.json
youtube.md # P6
socials.md # P6
blog.md # P6
.work/ # intermediates: 01_desilenced.mp4, 02_edit_N.mp4, frames/ (prunable in P7)
All subsequent phases write inside <OUT> — never scatter outputs elsewhere. <work> below = <OUT>/.work.node scripts/desilence.mjs <input> --out <work>/01_desilenced.mp4 --margin 0.3s
desilence.mjs uses auto-editor ONLY to detect the silent ranges (--export v3), then cuts the ORIGINAL with ffmpeg at near-lossless CRF 12, keeping the original audio. Do NOT let auto-editor encode the output — its default re-encode crushes the footage to ~0.6 Mbps and permanently blurs it (faces especially); no downstream bitrate can recover that. Report before/after duration and seconds removed. This becomes the working copy.
/transcribe-video on <work>/01_desilenced.mp4 with --out <work> → working transcript (intermediate; superseded by the P3 final transcript).references/mistake-detection.md (cuts + speed-ramps), snapped to word boundaries.npx hyperframes preview and opens the browser itself (per references/framing-safe-zones.md). This is the decision surface — never replace it with a terminal table. A one-line terminal note ("3 flags marked in the preview — review and tell me which to cut/ramp/keep") is the most the terminal should carry.<work>/02_edit_N.mp4, then re-preview:
node scripts/bake-cuts.mjs <input> --out <work>/02_edit_N.mp4 --cut s,e --cut s,e … (ffmpeg trim/concat at near-lossless CRF 12 by default — trims video+audio identically in one pass). Do NOT use auto-editor --cut-out for the mistake cuts — its variadic arg mis-assigns the last range as a positional input, it re-encodes lossily, and a select-filter approach desyncs audio.--set-speed-for-range speed,s,e (pitch preserved); pass ONE range per invocation, or place --set-speed-for-range last on the command line, to dodge the same variadic-arg pitfall.<work>/final_cut.mp4.Cuts and ramps are baked into the footage HERE (before P3) so the final transcript's word timestamps match the edited timeline.
Invoke /transcribe-video on <work>/final_cut.mp4 with --out <OUT> so transcript.txt / transcript.srt / transcript.words.json land at the top of the output folder. This word-timestamped transcript is the sync source for all overlays (P4) and the input to content generation (P6).
references/framing-safe-zones.md): sample frames with extract-frame.mjs, classify phases, confirm phases + safe zones with the user.references/overlay-triggers.md): scan the word-timestamped transcript → overlay plan rows {start, est_duration, type, content, safe_zone, why}. For on-screen code, use references/frame-analysis.md (extract frame → locate → normalized box → stability guard). For code cards, repo-if-available-else-synthesize (flag synthesized).hyperframes + gsap first):
npx hyperframes init <OUT>/hyperframes --video <OUT>/.work/final_cut.mp4 (the project lives at <OUT>/hyperframes; base: video track 0 muted playsinline; audio on a separate track at volume 1 so the voice plays). Read the generated index.html to learn this CLI version's sub-composition include syntax.templates/project/styles.css (with the confirmed brand values) into <OUT>/hyperframes.templates/overlays/*.html, give it a unique composition id + data-composition-id, fill its {{tokens}}, set data-start = the transcript timestamp and data-duration = the dwell, mount it above the video. Punch-in zooms scale the video's WRAPPER div (never the <video>).rgba(8,10,14,0.92), rounded, subtle border/shadow); titles/body are white (var(--brand-text)), secondary/kicker/meta are light slate (#CBD5E1). NEVER use var(--brand-accent)/var(--brand-primary) for body/subtitle/kicker text — accent is for bars, borders, marker sweeps, icons, and big display words only (and only when they clear ~4.5:1 on the panel). When unsure, make text white. npx hyperframes validate (WCAG contrast) must pass with zero contrast warnings.npx hyperframes lint + npx hyperframes inspect (overflow + safe zones) + npx hyperframes validate (contrast). Fix until clean.npx hyperframes preview and opens the browser. Freeform iteration loop on overlays.standard quality encodes at ~1.4 Mbps and visibly crushes footage; always pass a low CRF:
npx hyperframes render --output <OUT>/final.mp4 --crf 12
(run from <OUT>/hyperframes. CRF 12 is near-lossless; do NOT use --video-bitrate expecting a big file — for static screencasts CRF produces a small file at full quality, and that is correct, not low quality.) Final mp4 keeps original audio. After render, further edits return to the loop and require a fresh explicit render command.--format webm/mov and ffmpeg-composite them over the cut footage (punch-in zoom then applied via ffmpeg) — rarely needed.Generate the launch content by invoking dedicated skills, each on the final edited transcript (P3 — pass the timestamped transcript.srt so YouTube chapters get accurate times), writing each output into <OUT>:
youtube-copy → <OUT>/youtube.md (click-worthy title, above-the-fold description, tags, and timestamped chapters from the SRT). Default on.social-copy → <OUT>/socials.md (X + thread, Bluesky, LinkedIn, Reddit). Default on.blog-post → <OUT>/blog.md (full blog post from the transcript). Optional — ask the user "Want a blog post too?" and only run it if they say yes (default: skip). A blog is a bigger artifact many videos don't need.Invoke each via the Skill tool with the transcript as the source; they own their own quality rules. Do NOT use a monolithic LLM script for this. If a skill is unavailable, note it and continue with the others (the rendered video is already saved).
Writing rule for ALL generated text (titles, descriptions, chapters, blog, socials): never use em-dashes (— / –), use a hyphen -.
<OUT> contains the consolidated result: final.mp4, hyperframes/, transcript.*, youtube.md, socials.md, blog.md.<OUT>/.work/ (intermediates: de-silenced/cut files, extracted frames). The user owns disposition.<OUT> in the file explorer for the user and report the single folder path.Docker (for /transcribe-video), auto-editor + ffmpeg, Node ≥22, a HyperFrames-capable environment, and an LLM provider for P6.