Skill

image-generation

From gr

Enhances image generation prompts with Subject-Context-Style structure, lighting physics, camera terminology, and character consistency patterns. Useful for creating detailed, physically coherent image prompts.

design

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/gr:image-generation

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Enhance every image generation prompt around three core elements:

SKILL.md

143 lines · ~1.6k tokens

Stats

Language: Python

Stars: 21

Forks: 5

Maintenance: Excellent

Last Commit: May 21, 2026

Image Generation Prompt Best Practices

Prompt Structure

Enhance every image generation prompt around three core elements:

1. SUBJECT (What)

The main focus of the image.

Physical characteristics: textures, materials, colors, scale
Actions, poses, expressions if applicable
Distinctive features that define the subject

2. CONTEXT (Where/When)

The environment and conditions.

Setting, background, spatial relationships (foreground, midground, background)
Time of day, weather, atmospheric conditions
Mood and emotional tone of the scene

3. STYLE (How)

The visual treatment.

Artistic or photographic approach: reference specific artists, movements, or styles
Lighting design: direction, quality, color temperature, shadows
Camera/lens choices: specify focal length, aperture, and shooting angle when photographic
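
As a rough illustration of the Subject-Context-Style structure, the sketch below joins the three elements into one flowing description. The `PromptParts` class and its `compose` method are hypothetical names introduced here, not part of the skill, and the sketch assumes all three parts are non-empty.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical container for the three core elements."""

    subject: str  # what: the main focus of the image
    context: str  # where/when: environment and conditions
    style: str    # how: the visual treatment

    def compose(self) -> str:
        # Subject and context read as one scene sentence; style follows it.
        scene = f"{self.subject.strip()} {self.context.strip()}".rstrip(".")
        style = self.style.strip().rstrip(".")
        return f"{scene}. {style[:1].upper()}{style[1:]}."


prompt = PromptParts(
    subject="Golden retriever mid-leap catching a red frisbee",
    context="in a sunlit urban park on a clear morning",
    style="shot from a low angle with an 85mm portrait lens",
).compose()
print(prompt)
```

Keeping the parts separate until the last step makes it easier to see which element the user left thin before enriching it.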

Core Principles

Preserve intent -- Enrich the user's original vision, never override it
Positive descriptions only -- Describe what should be present; rephrase any exclusion as an inclusion
Specific over vague -- "golden hour sunlight at a 15-degree angle" beats "nice lighting"
Natural flow -- Weave elements into a single flowing description, not a bullet list
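
One way to check the "positive descriptions only" principle is to scan a draft prompt for exclusion wording before sending it. This is only a sketch: the word list and the `find_exclusions` helper are assumptions made for illustration, and a flagged word still needs a human (or model) rewrite into an inclusion.

```python
import re

# Assumed list of exclusion cues; extend it for your own prompts.
EXCLUSION_PATTERN = re.compile(
    r"\b(no|not|without|never|don't|avoid)\b", re.IGNORECASE
)


def find_exclusions(prompt: str) -> list[str]:
    """Return the exclusion words found in the prompt, lowercased."""
    return [match.group(0).lower() for match in EXCLUSION_PATTERN.finditer(prompt)]


flagged = find_exclusions("A quiet street with no cars and without people")
clean = find_exclusions("An empty, quiet street at dawn")
print(flagged, clean)
```

The second prompt expresses the same intent as the first, but as a description of what is present.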

Enhancement Patterns

Hyper-Specific Details

Add concrete visual details where the user left gaps:

Lighting: direction, quality, color temperature, shadow behavior. Always name the physical source ("warm afternoon sun through west window", not "warm lighting") -- named sources produce consistent shadows
Textures: surface materials, weathering, reflectivity
Atmosphere: particulates, humidity, depth haze
Scale: relative sizes, distances, proportions

Camera Control Terminology

When a photographic look is appropriate:

Lens type: "shot with 85mm portrait lens", "wide-angle 24mm"
Aperture: "shallow depth of field at f/1.8", "deep focus at f/11"
Angle: "low angle emphasizing height", "bird's eye view"
Motion: "motion blur on the paws", "frozen mid-action"
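
The camera terms above can be assembled from a few numeric choices. The following is a minimal sketch, assuming a hypothetical `camera_phrase` helper; the f-stop thresholds for "shallow" and "deep" are illustrative assumptions, not values from the skill.

```python
def camera_phrase(focal_length_mm: int, f_stop: float, angle: str) -> str:
    """Build a photographic phrase from lens, aperture, and angle."""
    # Assumed thresholds: f/2.8 or wider reads as shallow, f/8 or narrower as deep.
    if f_stop <= 2.8:
        depth = "shallow depth of field"
    elif f_stop >= 8:
        depth = "deep focus"
    else:
        depth = "moderate depth of field"
    return f"shot with {focal_length_mm}mm lens, {depth} at f/{f_stop:g}, {angle}"


portrait = camera_phrase(85, 1.8, "low angle emphasizing height")
landscape = camera_phrase(24, 11, "bird's eye view")
print(portrait)
print(landscape)
```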

Atmospheric Enhancement

Convey mood through environmental details:

Emotional tone: "serene", "ominous", "jubilant"
Light quality: "dappled shadows", "harsh midday sun", "soft diffused overcast"
Weather/air: "morning mist", "dust particles in a sunbeam"

Text in Images

When the image should contain readable text (signs, labels, titles, typography):

Specify the exact text content in quotes: "OPEN 24 HOURS" in bold sans-serif
Describe visual treatment: font style, weight, size relative to the scene
Define placement and integration: "centered on the storefront awning", "hand-lettered on the chalkboard"
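
The three text requirements (exact content, visual treatment, placement) can be captured in one small helper so none is forgotten. `text_element` is a hypothetical name used only for this sketch.

```python
def text_element(content: str, treatment: str, placement: str) -> str:
    """Describe readable text: quoted content, visual treatment, placement."""
    if not content.strip():
        raise ValueError("the exact text content must be specified")
    return f'the text "{content}" in {treatment}, {placement}'


sign = text_element(
    "OPEN 24 HOURS", "bold sans-serif", "centered on the storefront awning"
)
print(sign)
```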

Feature Patterns

Character Consistency

When the same character must be recognizable across multiple images:

Include at least 3 recognizable visual markers (distinctive scar, signature clothing, unique hairstyle, characteristic accessory)
Use anchoring words: "distinctive", "signature", "always wears", "always has"
Be specific: "round tortoiseshell glasses" not just "glasses"
Use inputImagePath to iterate on a base character image until all markers are locked in, then use that locked image as the reference for subsequent generations
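
A reusable character description helps keep the same markers in every prompt. The sketch below assumes a hypothetical `CharacterSheet` class that enforces the three-marker minimum; it only builds prompt text and does not call the image tool or touch `inputImagePath`.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterSheet:
    """Hypothetical record of a character's fixed visual markers."""

    name: str
    markers: list[str] = field(default_factory=list)

    def describe(self) -> str:
        # Enforce the minimum of three recognizable markers.
        if len(self.markers) < 3:
            raise ValueError("include at least 3 recognizable visual markers")
        head = ", ".join(self.markers[:-1])
        return f"{self.name}, who always has {head}, and {self.markers[-1]}"


mara = CharacterSheet(
    name="Mara",
    markers=[
        "round tortoiseshell glasses",
        "a signature red wool scarf",
        "a distinctive scar above the left eyebrow",
    ],
)
print(mara.describe())
```

Reusing `mara.describe()` verbatim in each prompt keeps the anchoring words and specifics identical from one generation to the next.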

Compositional Integration (Multi-Element Blending)

When combining multiple visual elements in one scene:

Define spatial relationships with proportions: "foreground (40% of frame)", "midground", "background"
Use integration language: "seamlessly blending", "harmoniously composed", "naturally integrated"
Specify relative scale and interaction between elements
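
Spatial proportions are easier to keep coherent when they are checked before being written into the prompt. In this sketch, the `layout_phrase` helper is hypothetical, and assigning a percentage to every plane (not just the foreground) is an assumption made for the example.

```python
def layout_phrase(layers: dict[str, tuple[str, int]]) -> str:
    """Turn {plane: (element, percent of frame)} into a spatial description."""
    total = sum(share for _, share in layers.values())
    if total != 100:
        raise ValueError(f"frame shares must sum to 100, got {total}")
    return "; ".join(
        f"{plane} ({share}% of frame): {element}"
        for plane, (element, share) in layers.items()
    )


layout = layout_phrase(
    {
        "foreground": ("weathered fishing boat", 40),
        "midground": ("stone harbor wall", 35),
        "background": ("mist-covered hills", 25),
    }
)
print(layout)
```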

Real-World Accuracy

When depicting real places, cultures, or historical elements:

Use specific terminology: "traditional Edo-period architecture", "authentic Moroccan zellige tilework"
Include culturally accurate details
Reference geographical or historical specifics

Purpose-Driven Enhancement

Tailor the prompt to the intended use:

Purpose	Emphasis
Product photo	Clean background, studio lighting, commercial appeal
UI mockup	Flat design elements, consistent spacing, screen-appropriate
Presentation slide	Bold composition, clear focal point, text-friendly layout
Social media	Eye-catching, vibrant, crop-friendly aspect ratio
Book/album cover	Typography space, dramatic mood, symbolic elements
Video style anchor	Highest quality, 4K resolution, named physical light source, fine surface textures, film-like grain. This image becomes the visual reference for downstream video generation -- maximize detail and lighting consistency
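
The table above maps naturally onto a lookup. The following sketch copies a few rows into a dictionary and appends the emphasis to a prompt; `PURPOSE_EMPHASIS` and `with_purpose` are hypothetical names, and leaving the prompt unchanged for an unknown purpose is an assumption in keeping with "preserve intent".

```python
# A subset of the purpose table, keyed by lowercase purpose name.
PURPOSE_EMPHASIS = {
    "product photo": "clean background, studio lighting, commercial appeal",
    "presentation slide": "bold composition, clear focal point, text-friendly layout",
    "social media": "eye-catching, vibrant, crop-friendly aspect ratio",
    "book/album cover": "typography space, dramatic mood, symbolic elements",
}


def with_purpose(prompt: str, purpose: str) -> str:
    """Append the emphasis for a known purpose; leave the prompt as-is otherwise."""
    emphasis = PURPOSE_EMPHASIS.get(purpose.strip().lower())
    if emphasis is None:
        return prompt
    return f"{prompt.rstrip('.')}, with {emphasis}."


product = with_purpose("Ceramic mug on a walnut table.", "Product photo")
print(product)
```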

Video Style Anchor Pipeline

When generating images that will serve as style references for AI video production:

Use highest quality: request maximum quality and 4K resolution -- the anchor conditions all downstream video
Character consistency: keep the same visual markers when generating multiple hero images for the same scene or character
Iterate before committing: use inputImagePath to refine the hero image until lighting, texture, and composition are exactly right
Blend for composites: combine reference photography with branded elements when building composite anchors
Match video prompt lighting: use the exact same physical light source description in the image prompt that will appear in the video prompt -- shadow direction must be consistent across the chain
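
To keep the light source identical across the chain, it can be defined once and reused in both prompts. This is a sketch under stated assumptions: `anchor_image_prompt` and `video_prompt` are hypothetical helpers that only build strings, and no particular image or video API is implied.

```python
# Single source of truth for the physical light source in both prompts.
LIGHT_SOURCE = "warm afternoon sun through west window"


def anchor_image_prompt(scene: str) -> str:
    """Prompt for the style-anchor image, requesting maximum quality."""
    return f"{scene}, lit by {LIGHT_SOURCE}, highest quality, 4K resolution"


def video_prompt(action: str) -> str:
    """Prompt for the downstream video, reusing the exact light description."""
    return f"{action}, lit by {LIGHT_SOURCE}"


image_text = anchor_image_prompt("A potter shaping clay at a wooden wheel")
video_text = video_prompt("The potter lifts the finished bowl from the wheel")
print(image_text)
print(video_text)
```

Because both strings embed the same constant, a change to the lighting description propagates to the image and video prompts together.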

Image Editing

When modifying an existing image:

Preserve the original's core characteristics: color palette, lighting style, composition
Use anchoring phrases: "maintain the existing...", "preserve the original...", "keep the same..."
Be specific about what to change vs what to keep unchanged
Describe modifications relative to the existing image, not from scratch
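
An edit prompt can state the change and the preserved characteristics side by side. The `edit_prompt` helper below is a hypothetical sketch that builds the text only; how the existing image is passed to the tool is outside its scope.

```python
def edit_prompt(change: str, keep: list[str]) -> str:
    """State the modification first, then what must stay unchanged."""
    if not keep:
        raise ValueError("name at least one characteristic to preserve")
    preserved = ", ".join(keep)
    return f"{change.rstrip('.')}. Preserve the original {preserved}."


edit = edit_prompt(
    "Replace the red frisbee with a blue rubber ball",
    ["color palette", "lighting style", "composition"],
)
print(edit)
```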

Example

Input: "A happy dog in a park"

Enhanced: "Golden retriever mid-leap catching a red frisbee, ears flying, tongue out in joy, in a sunlit urban park. Soft morning light filtering through oak trees creates dappled shadows on emerald grass. Background shows families on picnic blankets, slightly out of focus. Shot from low angle emphasizing the dog's athletic movement, with motion blur on the paws suggesting speed."
