From creative-skills
Production pipeline for ASCII art video — any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers, generative ASCII art animations, hybrid video+audio reactive, text/lyrics overlays, real-time terminal rendering. Use when users request: ASCII video, text art video, terminal-style video, character art animation, retro text visualization, audio visualizer in ASCII, converting video to ASCII art, matrix-style effects, or any animated ASCII output.
Install: npx claudepluginhub rnben/hermes-skills --plugin creative-skills

This skill uses the workspace's default tool permissions.
This is visual art. ASCII characters are the medium; cinema is the standard.
Before writing a single line of code, articulate the creative concept. What is the mood? What visual story does this tell? What makes THIS project different from every other ASCII video? The user's prompt is a starting point — interpret it with creative ambition, not literal transcription.
First-render excellence is non-negotiable. The output must be visually striking without requiring revision rounds. If something looks generic, flat, or like "AI-generated ASCII art," it is wrong — rethink the creative concept before shipping.
Go beyond the reference vocabulary. The effect catalogs, shader presets, and palette libraries in the references are a starting vocabulary. For every project, combine, modify, and invent new patterns. The catalog is a palette of paints — you write the painting.
Be proactively creative. Extend the skill's vocabulary when the project calls for it. If the references don't have what the vision demands, build it. Include at least one visual moment the user didn't ask for but will appreciate — a transition, an effect, a color choice that elevates the whole piece.
Cohesive aesthetic over technical correctness. All scenes in a video must feel connected by a unifying visual language — shared color temperature, related character palettes, consistent motion vocabulary. A technically correct video where every scene uses a random different effect is an aesthetic failure.
Dense, layered, considered. Every frame should reward viewing. Never flat black backgrounds. Always multi-grid composition. Always per-scene variation. Always intentional color.
| Mode | Input | Output | Reference |
|---|---|---|---|
| Video-to-ASCII | Video file | ASCII recreation of source footage | references/inputs.md § Video Sampling |
| Audio-reactive | Audio file | Generative visuals driven by audio features | references/inputs.md § Audio Analysis |
| Generative | None (or seed params) | Procedural ASCII animation | references/effects.md |
| Hybrid | Video + audio | ASCII video with audio-reactive overlays | Both input refs |
| Lyrics/text | Audio + text/SRT | Timed text with visual effects | references/inputs.md § Text/Lyrics |
| TTS narration | Text quotes + TTS API | Narrated testimonial/quote video with typed text | references/inputs.md § TTS Integration |
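For the audio-reactive modes, the per-frame analysis can be sketched as a windowed FFT split into frequency bands. This is a minimal illustration only — band_energies and the band edges are assumptions for the sketch, not the skill's actual API:

```python
import numpy as np

def band_energies(samples, rate, frame_t, win=2048):
    """Mean spectral energy in bass/mid/treble bands around time frame_t.
    Assumes mono float samples; window size trades latency for resolution."""
    i = int(frame_t * rate)
    chunk = samples[i:i + win]
    if len(chunk) < win:  # pad the final partial window
        chunk = np.pad(chunk, (0, win - len(chunk)))
    spec = np.abs(np.fft.rfft(chunk * np.hanning(win)))
    freqs = np.fft.rfftfreq(win, 1 / rate)
    bands = [(20, 250), (250, 2000), (2000, 8000)]  # bass, mid, treble (Hz)
    return [float(spec[(freqs >= lo) & (freqs < hi)].mean()) for lo, hi in bands]
```

Each scene function can then scale effect parameters (ring radius, particle count, hue shift) off these per-frame values.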
Single self-contained Python script per project. No GPU required.
| Layer | Tool | Purpose |
|---|---|---|
| Core | Python 3.10+, NumPy | Math, array ops, vectorized effects |
| Signal | SciPy | FFT, peak detection (audio modes) |
| Imaging | Pillow (PIL) | Font rasterization, frame decoding, image I/O |
| Video I/O | ffmpeg (CLI) | Decode input, encode output, mux audio |
| Parallel | concurrent.futures | N workers for batch/clip rendering |
| TTS | ElevenLabs API (optional) | Generate narration clips |
| Optional | OpenCV | Video frame sampling, edge detection |
Every mode follows the same 6-stage pipeline:
INPUT → ANALYZE → SCENE_FN → TONEMAP → SHADE → ENCODE
SCENE_FN: returns a canvas (uint8 H,W,3); composes multiple character grids via _render_vf() + pixel blend modes. See references/composition.md.
TONEMAP: adaptive brightness normalization. See references/composition.md § Adaptive Tonemap.
SHADE: ShaderChain + FeedbackBuffer. See references/shaders.md.

| Dimension | Options | Reference |
|---|---|---|
| Character palette | Density ramps, block elements, symbols, scripts (katakana, Greek, runes, braille), project-specific | architecture.md § Palettes |
| Color strategy | HSV, OKLAB/OKLCH, discrete RGB palettes, auto-generated harmony, monochrome, temperature | architecture.md § Color System |
| Background texture | Sine fields, fBM noise, domain warp, voronoi, reaction-diffusion, cellular automata, video | effects.md |
| Primary effects | Rings, spirals, tunnel, vortex, waves, interference, aurora, fire, SDFs, strange attractors | effects.md |
| Particles | Sparks, snow, rain, bubbles, runes, orbits, flocking boids, flow-field followers, trails | effects.md § Particles |
| Shader mood | Retro CRT, clean modern, glitch art, cinematic, dreamy, industrial, psychedelic | shaders.md |
| Grid density | xs(8px) through xxl(40px), mixed per layer | architecture.md § Grid System |
| Coordinate space | Cartesian, polar, tiled, rotated, fisheye, Möbius, domain-warped | effects.md § Transforms |
| Feedback | Zoom tunnel, rainbow trails, ghostly echo, rotating mandala, color evolution | composition.md § Feedback |
| Masking | Circle, ring, gradient, text stencil, animated iris/wipe/dissolve | composition.md § Masking |
| Transitions | Crossfade, wipe, dissolve, glitch cut, iris, mask-based reveal | shaders.md § Transitions |
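Tying the stages together, the per-frame flow through SCENE_FN → TONEMAP → SHADE can be sketched as a single function. Here scene_fn, tonemap, and shader_chain are hypothetical callables standing in for the project's own components:

```python
def render_frame(t, scene_fn, tonemap, shader_chain):
    # One frame at time t: SCENE_FN -> TONEMAP -> SHADE
    canvas = scene_fn(t)             # uint8 (H, W, 3) base render
    canvas = tonemap(canvas)         # adaptive brightness normalization
    return shader_chain(canvas, t)   # feedback + shader post-processing
```

ENCODE then pipes the resulting frames to ffmpeg.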
Never use the same config for the entire video; vary character palette, color strategy, effects, and motion for each section/scene.
For every project, invent at least one element that isn't in the catalog: a new effect, transition, or color treatment.
Don't just pick from the catalog. The catalog is vocabulary — you write the poem.
Before any code, articulate the creative concept:
Map the user's prompt to aesthetic choices. A "chill lo-fi visualizer" demands different everything from a "glitch cyberpunk data stream."
See references/optimization.md for performance guidance.

Single Python file. Components (with references):
Hardware detection and quality profiles: references/optimization.md
Input handling (audio/video/image/text): references/inputs.md
Grid system and fonts: references/architecture.md
Character palettes: references/architecture.md § Palettes
Color system: references/architecture.md § Color
Scene functions returning canvas (uint8 H,W,3): references/scenes.md
Multi-grid composition and blending: references/composition.md
ShaderChain + FeedbackBuffer: references/shaders.md
Renderer and clip orchestration: references/scenes.md

Brightness check: canvas.mean() > 8 for all ASCII content. If dark, lower gamma.

tonemap(), Not Linear Multipliers

This is the #1 visual issue. ASCII on black is inherently dark. Never use canvas * N multipliers — they clip highlights. Use adaptive tonemap:
```python
import numpy as np

def tonemap(canvas, gamma=0.75):
    # Adaptive percentile normalization: stretch the frame's actual
    # dynamic range instead of multiplying (which clips highlights)
    f = canvas.astype(np.float32)
    lo, hi = np.percentile(f[::4, ::4], [1, 99.5])  # subsample for speed
    if hi - lo < 10:
        hi = lo + 10  # guard against near-flat frames
    f = np.clip((f - lo) / (hi - lo), 0, 1) ** gamma
    return (f * 255).astype(np.uint8)
```
Pipeline: scene_fn() → tonemap() → FeedbackBuffer → ShaderChain → ffmpeg
Per-scene gamma: default 0.75, solarize 0.55, posterize 0.50, bright scenes 0.85. Use screen blend (not overlay) for dark layers.
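Screen blend lightens without the clipping that additive or multiplied layers produce. A minimal NumPy sketch of the standard formula; the skill's full 20-mode catalog lives in references/composition.md:

```python
import numpy as np

def screen_blend(base, layer):
    # screen(a, b) = 1 - (1 - a)(1 - b): never darkens, never overflows
    a = base.astype(np.float32) / 255.0
    b = layer.astype(np.float32) / 255.0
    return ((1.0 - (1.0 - a) * (1.0 - b)) * 255.0).astype(np.uint8)
```

Blending a dark particle layer over a dark background with screen keeps both visible; overlay would crush the midtones.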
macOS Pillow: textbbox() returns wrong height. Use font.getmetrics(): cell_height = ascent + descent. See references/troubleshooting.md.
Never stderr=subprocess.PIPE with long-running ffmpeg — buffer fills at 64KB and deadlocks. Redirect to file. See references/troubleshooting.md.
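The safe launch pattern looks like this. A hedged sketch: spawn_logged and its signature are illustrative, not the skill's actual helper:

```python
import subprocess

def spawn_logged(cmd, log_path):
    """Run a long-lived command with stderr redirected to a file.
    stderr=subprocess.PIPE would deadlock once the child's progress
    output fills the 64KB pipe buffer and nothing drains it."""
    with open(log_path, "wb") as log:
        # The child inherits its own copy of the fd, so closing the
        # parent's handle after spawn is safe
        return subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=log)
```

For an encode, cmd would be the ffmpeg argument list; tail the log file to debug failed runs.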
Not all Unicode chars render in all fonts. Validate palettes at init — render each char, check for blank output. See references/troubleshooting.md.
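A validation pass at init can look like this — a sketch assuming Pillow and NumPy are available; validate_palette is an illustrative name, not the skill's API:

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def validate_palette(chars, font):
    """Keep only characters that actually produce pixels in this font."""
    ok = []
    for ch in chars:
        img = Image.new("L", (32, 32), 0)
        ImageDraw.Draw(img).text((4, 4), ch, fill=255, font=font)
        if np.asarray(img).sum() > 0:  # blank render = missing glyph
            ok.append(ch)
    return ok
```

Run this once per (font, palette) pair; a katakana or braille ramp that silently renders as blanks ruins every frame downstream.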
For segmented videos (quotes, scenes, chapters), render each as a separate clip file for parallel rendering and selective re-rendering. See references/scenes.md.
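With one file per clip, parallel rendering reduces to a map over independent specs. In this sketch render_clip is a stub for the real per-clip renderer described in references/scenes.md:

```python
from concurrent.futures import ThreadPoolExecutor

def render_clip(spec):
    # Stub: the real version renders frames and pipes them to ffmpeg,
    # producing one self-contained clip file per scene/quote/chapter
    return f"clip_{spec['index']:03d}.mp4"

def render_all(specs, workers=4):
    # Threads suffice when each worker mostly waits on an ffmpeg
    # subprocess; swap in ProcessPoolExecutor for CPU-bound rendering
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_clip, specs))
```

Because each clip is a separate file, a single bad scene can be re-rendered without touching the rest, then concatenated with ffmpeg.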
| Component | Budget |
|---|---|
| Feature extraction | 1-5ms |
| Effect function | 2-15ms |
| Character render | 80-150ms (bottleneck) |
| Shader pipeline | 5-25ms |
| Total | ~100-200ms/frame |
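To check real timings against the budgets above, a lightweight context-manager timer is enough. A hedged helper, not part of the skill:

```python
import time
from contextlib import contextmanager

@contextmanager
def stage_timer(name, budget_ms, report):
    # Record one stage's wall time and whether it fit its budget
    t0 = time.perf_counter()
    try:
        yield
    finally:
        ms = (time.perf_counter() - t0) * 1000.0
        report[name] = (ms, ms <= budget_ms)
```

Wrap each pipeline stage (feature extraction, effect function, character render, shaders) and inspect the report after a test render.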
| File | Contents |
|---|---|
| references/architecture.md | Grid system, resolution presets, font selection, character palettes (20+), color system (HSV + OKLAB + discrete RGB + harmony generation), _render_vf() helper, GridLayer class |
| references/composition.md | Pixel blend modes (20 modes), blend_canvas(), multi-grid composition, adaptive tonemap(), FeedbackBuffer, PixelBlendStack, masking/stencil system |
| references/effects.md | Effect building blocks: value field generators, hue fields, noise/fBM/domain warp, voronoi, reaction-diffusion, cellular automata, SDFs, strange attractors, particle systems, coordinate transforms, temporal coherence |
| references/shaders.md | ShaderChain, _apply_shader_step() dispatch, 38-shader catalog, audio-reactive scaling, transitions, tint presets, output format encoding, terminal rendering |
| references/scenes.md | Scene protocol, Renderer class, SCENES table, render_clip(), beat-synced cutting, parallel rendering, design patterns (layer hierarchy, directional arcs, visual metaphors, compositional techniques), complete scene examples at every complexity level, scene design checklist |
| references/inputs.md | Audio analysis (FFT, bands, beats), video sampling, image conversion, text/lyrics, TTS integration (ElevenLabs, voice assignment, audio mixing) |
| references/optimization.md | Hardware detection, quality profiles, vectorized patterns, parallel rendering, memory management, performance budgets |
| references/troubleshooting.md | NumPy broadcasting traps, blend mode pitfalls, multiprocessing/pickling, brightness diagnostics, ffmpeg issues, font problems, common mistakes |
If the user asks for creative, experimental, surprising, or unconventional output, select the strategy that best fits and reason through its steps BEFORE generating code.