From visuals-1337
AI image and video generation. Use when: creating artwork, images, illustrations, animations, videos, visual assets, AI art generation, style guidance, choosing image or video models, text-in-image.
`npx claudepluginhub yzavyas/claude-1337 --plugin visuals-1337`

This skill uses the workspace's default tool permissions.
Decision frameworks for AI image and video generation. Not tutorials — corrections, gotchas, and "which tool for which job."
| Feature | V6 | V7 |
|---|---|---|
| Multi-prompt :: weighting | ✅ Works | ⚠️ CHANGED (different behavior) |
| Negative weights ::-0.5 | ✅ Works | ⚠️ Less predictable |
| --cref (Character Ref) | ✅ | ❌ DEPRECATED (use --oref) |
| --stylize scale | 0-1000 | 0-1000 (different results!) |
| --no parameter | ✅ | ✅ |
| --iw range | 0-2 | 0-3 |
| --oref (Omni Reference) | ❌ | ✅ New (2x GPU cost) |
| --draft mode | ❌ | ✅ New (10x faster, half cost) |
| --exp parameter | ❌ | ✅ New (0-100) |
Stylize Scale Migration: V6 --s 100 ≈ V7 --s 300-400 | V6 --s 250 ≈ V7 --s 600-700
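The two anchor points above can be turned into a rough conversion heuristic. This is an assumption on our part: Midjourney publishes no official V6→V7 stylize mapping, so the sketch below is just a linear fit through the midpoints of the two quoted ranges.

```python
def v6_to_v7_stylize(s_v6: float) -> float:
    """Rough V6 -> V7 --stylize conversion (unofficial heuristic).

    Linear fit through the anchors above:
    V6 --s 100 ~ V7 --s 350 (midpoint of 300-400)
    V6 --s 250 ~ V7 --s 650 (midpoint of 600-700)
    Result is clamped to V7's 0-1000 range.
    """
    s_v7 = 2.0 * s_v6 + 150.0
    return max(0.0, min(1000.0, s_v7))

print(v6_to_v7_stylize(100))  # 350.0
print(v6_to_v7_stylize(250))  # 650.0
```

Treat the output as a starting point for test renders, not a guaranteed equivalence; the doc itself notes V7 produces different results at the same stylize value.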
V7 workarounds for changed weighting: use --no for exclusion.

- V6 prompt: cyberpunk::2 nature::1 dystopian::-0.5
- V7 equivalent: cyberpunk city with nature elements, NOT dystopian --no dystopian, grim, dark
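The mechanical part of that migration (move negative-weighted terms into a --no clause) can be sketched as a small converter. The function name is ours, and it only handles single-word :: terms; natural-language rephrasing of positive weights still needs a human.

```python
def v6_prompt_to_v7(prompt: str) -> str:
    """Convert a V6 '::'-weighted multi-prompt toward V7 style.

    Heuristic sketch: keep positively weighted terms in reading order,
    and move negatively weighted terms into a single --no clause.
    Handles single-word terms only.
    """
    positives, negatives = [], []
    for chunk in prompt.split():
        if "::" in chunk:
            term, _, weight = chunk.partition("::")
            (negatives if float(weight) < 0 else positives).append(term)
        else:
            positives.append(chunk)
    result = " ".join(positives)
    if negatives:
        result += " --no " + ", ".join(negatives)
    return result

print(v6_prompt_to_v7("cyberpunk::2 nature::1 dystopian::-0.5"))
# cyberpunk nature --no dystopian
```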
| I want... | Use | Parameter | Version |
|---|---|---|---|
| Composition inspiration + text | Image Prompt | --iw 1-2 | All |
| Same aesthetic, different subject | --sref | --sw 100-300 | All |
| Same character, new pose/outfit | --cref | --cw 0-50 | V6 only |
| Same character, keep everything | --cref | --cw 100 | V6 only |
| Exact object/character preservation | --oref | --ow 100-400 | V7 only |
⚠️ V7 Migration: --cref deprecated in V7. Use --oref instead (works for characters AND objects).
- Image Prompt (--iw)
- Style Reference (--sref)
- Character Reference (--cref) — V6 ONLY; use --oref instead
- Omni Reference (--oref) — V7 ONLY
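The reference-type table can be encoded as a version-aware helper. A sketch under assumptions: the flag names and ranges come from the table above, but the function, its goal keys, and the `<url>` placeholder are hypothetical.

```python
def reference_flags(goal: str, version: int = 7) -> str:
    """Suggest Midjourney reference flags for a goal, per the table above.

    "<url>" is a placeholder for the reference image URL. Raises
    ValueError when the flag is unavailable in the requested version
    (--cref is V6-only; --oref is V7-only).
    """
    table = {
        "composition": ("--iw 2", {6, 7}),           # image prompt weight
        "style": ("--sref <url> --sw 200", {6, 7}),  # style reference
        "character": ("--cref <url> --cw 50", {6}),  # deprecated in V7
        "object": ("--oref <url> --ow 100", {7}),    # V7 replacement
    }
    flags, versions = table[goal]
    if version not in versions:
        raise ValueError(f"{flags.split()[0]} is unavailable in V{version}")
    return f"{flags} --v {version}"

print(reference_flags("object", 7))  # --oref <url> --ow 100 --v 7
```

Asking for `reference_flags("character", 7)` raises, mirroring the migration warning above: character work in V7 goes through --oref.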
| Problem | Cause | Fix |
|---|---|---|
| Reference ignored | --iw too low | Increase to 2.0+ |
| Shape lost, got mandala | Symmetry bias | Add "asymmetrical", use --no symmetric, mandala |
| Character looks different | Using real photo | Use Midjourney-generated source |
| Style overwhelms shape | High --sw, low --iw | Lower --sw OR increase --iw |
| --oref not working | V6 or Draft mode | Switch to V7 standard mode |
| Need | Best Choice | Why | Backup |
|---|---|---|---|
| Photorealism | Flux 2 / Imagen 4 | Best benchmark quality | Midjourney V7 |
| Artistic/stylized | Midjourney V7 | Color harmony, mood, abstract | Leonardo.ai |
| Text in images | Ideogram 3.0 | 85-90% accuracy (best) | GPT Image 1.5 |
| Character consistency | Leonardo.ai | Custom LoRA training | Flux Kontext |
| Technical diagrams | Flux 2 | Text + spatial control | Recraft V3 |
| Speed priority | SDXL / SD4 Turbo | 13 sec/image | Ideogram Turbo |
| Quality priority | Flux 2 Pro | Best 2026 benchmarks | GPT Image 1.5 |
| Commercial safety | Adobe Firefly | Licensed training only | DALL-E 3 |
| Budget (API) | Flux Schnell | $0.003/image | SDXL |
| Open source | Stable Diffusion | 80% market share | HunyuanImage |
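For scripting a model router, the decision table above transcribes directly into a lookup. The picks are straight from the table; the dict and function names are ours.

```python
# Image-model decision table, transcribed from above: need -> (best, backup).
IMAGE_MODEL_PICKS = {
    "photorealism":          ("Flux 2 / Imagen 4", "Midjourney V7"),
    "artistic/stylized":     ("Midjourney V7", "Leonardo.ai"),
    "text in images":        ("Ideogram 3.0", "GPT Image 1.5"),
    "character consistency": ("Leonardo.ai", "Flux Kontext"),
    "technical diagrams":    ("Flux 2", "Recraft V3"),
    "speed":                 ("SDXL / SD4 Turbo", "Ideogram Turbo"),
    "quality":               ("Flux 2 Pro", "GPT Image 1.5"),
    "commercial safety":     ("Adobe Firefly", "DALL-E 3"),
    "budget (API)":          ("Flux Schnell", "SDXL"),
    "open source":           ("Stable Diffusion", "HunyuanImage"),
}

def pick_image_model(need: str) -> str:
    """Return the table's first choice for a need, with its backup."""
    best, backup = IMAGE_MODEL_PICKS[need]
    return f"{best} (backup: {backup})"

print(pick_image_model("text in images"))
# Ideogram 3.0 (backup: GPT Image 1.5)
```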
Best → Worst: Ideogram 3.0 (85-90%) >> GPT Image 1.5 >> Recraft V3 >> Flux 2 (~60%) >> Imagen 4 >> DALL-E 3 >> Midjourney V7 (~15% better than V6, still poor)
Rule: If you need readable text, don't use Midjourney. Use Ideogram, GPT Image, or Flux 2.
| Need | Best Choice | Why | Backup |
|---|---|---|---|
| Highest quality | Runway Gen-4.5 | Benchmark leader (1,247 ELO) | Veo 3.1 |
| With audio sync | Kling 2.6 | Only simultaneous audio-visual | — |
| Longest duration | Kling 2.6 | 3 minutes native | Runway |
| Character consistency | Kling O1 | Unified multimodal | Kling 2.6 |
| Professional color | Luma Ray3 | Only native HDR, 16-bit EXR | Runway |
| Budget | Hailuo 2.3 | Best cost-effectiveness | Kling 2.3 |
| Free/open source | HunyuanVideo | Beats Gen-3 quality | Stable Video |
Audio-visual sync is now a competitive differentiator. Only Kling 2.6 generates video + voiceover + sound effects + ambient audio in a single pass.
- --style raw for tighter adherence

Midjourney defaults to symmetry. Fight it:

- --no symmetric, mandala, radial, mirrored, balanced, centered
- --chaos 6-10

Balance the competing forces:

- --style raw

V7 (recommended):

- --oref with Midjourney-generated source (2x GPU cost)
- --ow 100, increase to 200-400 for facial accuracy

V6 (legacy):

- --cref with Midjourney-generated source (not real photos)
- --cw 0 for face only, --cw 100 for everything

| Need | Load |
|---|---|
| Midjourney reference types detail | midjourney/reference-types.md |
| Midjourney V7 full guide | midjourney/v7-guide.md |
| Midjourney parameters | midjourney/parameters.md |
| Midjourney animation/video | midjourney/animation.md |
| Image model comparison | image-models.md |
| Video model comparison | video-models.md |
Sources: All claims cite official documentation (docs.midjourney.com, vendor APIs) and benchmarks (Artificial Analysis, LM Arena). Full URLs in reference files.