Use when users request image generation, AI art creation, image editing with Gemini models, need help crafting prompts, or want brand-styled imagery. Handles both direct generation and interactive prompt design.
Uses uv heredocs to generate/edit images with Gemini models. Triggers on requests for image creation, editing, or prompt crafting help.
/plugin marketplace add pigfoot/claude-code-hubs
/plugin install superpowers@pigfoot-marketplace

This skill inherits all available tools. When active, it can use any tool Claude has access to.
references/brand-styles.md
references/guide.md
references/slide-deck-styles.md

Quick Python scripting with Gemini image generation using uv heredocs. No files needed for one-off tasks.
Supports two modes:
- Direct Generation Mode - generate immediately from a clear prompt
- Interactive Prompting Mode - help the user craft the prompt first
digraph when_to_use {
"User request" [shape=diamond];
"Explicit prompting request?" [shape=diamond];
"Prompt too vague?" [shape=diamond];
"Interactive Prompting Mode" [shape=box];
"Direct Generation Mode" [shape=box];
"User request" -> "Explicit prompting request?";
"Explicit prompting request?" -> "Interactive Prompting Mode" [label="yes"];
"Explicit prompting request?" -> "Prompt too vague?" [label="no"];
"Prompt too vague?" -> "Interactive Prompting Mode" [label="yes (<5 words)"];
"Prompt too vague?" -> "Direct Generation Mode" [label="no"];
}
Use this skill when:
- The user requests image generation or AI art creation
- The user wants to edit an existing image with Gemini models
- The user asks for help crafting or improving an image prompt
- The user wants brand-styled imagery or presentation slides

Don't use when:
- The user wants code-based generative art (p5.js, flow fields, particle systems) - a separate skill covers that
- The user needs Anthropic brand styling or static poster/visual design - separate skills cover those
- No image output is involved
Check user's message for style specifications:
Structured syntax:
style: "trend" → Trend Micro brand colorsstyle: "notebooklm" or style: "slide" → NotebookLM presentation style (MUST apply visual characteristics)style: "custom" → Ask for custom color preferencesNatural language:
⚠️ CRITICAL for NotebookLM: When detected, apply the style characteristics (see Brand Style Integration section). NEVER include "NotebookLM" brand/logo/name in the Gemini prompt or generated slides - this violates trademark policies. notebooklm is a style trigger for Claude only - translate to descriptive terms like "clean professional presentation aesthetic" in the actual prompt.
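A hypothetical illustration of that translation (exact wording is Claude's choice, not a fixed string):

user_style = 'style: "notebooklm"'  # trigger - seen by Claude only, never sent to Gemini
prompt_style = "clean professional presentation aesthetic"  # what the Gemini prompt says instead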
Style detection precedence by mode:
| Mode | User Provides Style? | Action |
|---|---|---|
| Direct Generation | Yes (inline spec) | Use detected style immediately |
| Direct Generation | No | Generate with no style (default) |
| Interactive Prompting | Yes (inline spec) | Use detected style, skip style question |
| Interactive Prompting | No | Ask user about style preference in Step 2 |
Priority order (first match wins):
1. Structured style: syntax (e.g., style: "trend")
2. Natural language style mentions (e.g., "use trend colors")
3. No style detected → mode default (see table above)
| Task | Pattern |
|---|---|
| Generate image | uv run - << 'EOF' with inline script |
| Edit image | Same, but contents=[prompt, img] |
| Complex workflow | Multiple small scripts, evaluate between |
| Model choice | If NANO_BANANA_MODEL set → use it (don't override). If not set → Claude chooses based on requirements |
| Output format | Default: webp, or NANO_BANANA_FORMAT env var (webp/jpg/png) |
| Output location | NNN-short-name/ (e.g., 001-cute-banana/) |
| Step | Action |
|---|---|
| 1. Gather | Check for reference images, style specs |
| 2. Clarify | Ask 2-4 questions about output type, subject, style |
| 3. Select Technique | Choose from 16+ patterns (see references/guide.md) |
| 4. Generate Prompt | Apply technique, brand style, aspect ratio |
| 5. Present | Show prompt with explanation and variations |
| 6. Execute | Generate image with crafted prompt |
Enter Interactive Prompting Mode when:
- The user explicitly asks for prompting help ("help me craft", "improve this prompt")
- The prompt is too vague to act on (roughly fewer than 5 words)

Use Direct Generation Mode when:
- The request is specific enough to generate from directly
- The user wants an image, not a prompt-design session
Default to heredoc for one-off tasks:
uv run - << 'EOF'
# /// script
# dependencies = ["google-genai", "pillow"]
# ///
import os
import io
from pathlib import Path
from google import genai
from google.genai import types
from PIL import Image as PILImage

# Directory selection logic
# Claude decides based on conversation context and user intent:
# - Continuation of existing work → Specify existing directory
# - New unrelated topic → Use auto-increment
# - Uncertain → Ask user with AskUserQuestion

# Option 1: Reuse existing directory (for continuation)
# OUTPUT_DIR = Path("001-existing-topic")  # Manually specify

# Option 2: Auto-increment for new topic (default)
existing_folders = sorted([d for d in Path(".").iterdir()
                           if d.is_dir() and len(d.name) >= 4
                           and d.name[:3].isdigit() and d.name[3] == '-'])
if existing_folders:
    last_num = int(existing_folders[-1].name[:3])
    next_num = last_num + 1
else:
    next_num = 1
OUTPUT_DIR = Path(f"{next_num:03d}-cute-banana")  # Format: NNN-short-name
OUTPUT_DIR.mkdir(exist_ok=True)
print(f"Using output directory: {OUTPUT_DIR}")

# Configuration from environment variables
# IMPORTANT: If NANO_BANANA_MODEL is set, use it - DO NOT override
model = os.environ.get("NANO_BANANA_MODEL")
if not model:
    # Only choose model when NANO_BANANA_MODEL is not set
    # Claude decides based on user request:
    # - Use "gemini-2.5-flash-image" ONLY if user explicitly mentions speed/budget
    # - Use "gemini-3-pro-image-preview" (default) for quality, slides, or normal requests
    model = "gemini-3-pro-image-preview"  # Replace with appropriate choice

output_format = os.environ.get("NANO_BANANA_FORMAT", "webp").lower()
quality = int(os.environ.get("NANO_BANANA_QUALITY", "90"))

# Detect if lossless format is needed (for diagrams/slides)
# See "Lossless WebP Decision Logic" in Configuration section for complete rules
use_lossless = False  # Set to True for slide deck styles or explicit user request

# Initialize client with optional custom endpoint
base_url = os.environ.get("GOOGLE_GEMINI_BASE_URL")
api_key = os.environ.get("GEMINI_API_KEY") or os.environ.get("GOOGLE_API_KEY")
if not api_key:
    print("Error: GEMINI_API_KEY or GOOGLE_API_KEY environment variable not set")
    exit(1)

try:
    if base_url:
        client = genai.Client(api_key=api_key, http_options={'base_url': base_url})
    else:
        client = genai.Client(api_key=api_key)

    config_params = {
        'response_modalities': ['IMAGE']
    }

    response = client.models.generate_content(
        model=model,
        contents=["A cute banana character with sunglasses"],
        config=types.GenerateContentConfig(**config_params)
    )

    if not response.parts:
        print("Error: No image generated in response")
        exit(1)
except Exception as e:
    print(f"Error during image generation: {e}")
    exit(1)

for part in response.parts:
    if part.inline_data is not None:
        # Get google-genai Image object
        genai_image = part.as_image()
        # Convert to PIL Image from bytes
        pil_image = PILImage.open(io.BytesIO(genai_image.image_bytes))

        # Save with format conversion
        if output_format in ("jpg", "jpeg"):
            output_path = OUTPUT_DIR / "generated.jpg"
            pil_image.convert("RGB").save(output_path, "JPEG", quality=quality)
        elif output_format == "webp":
            output_path = OUTPUT_DIR / "generated.webp"
            if use_lossless:
                # Lossless WebP for slide decks (VP8L encoding)
                # Saves 20-30% vs PNG, zero quality loss (vs lossy: saves 95% but blurs)
                pil_image.save(output_path, "WEBP", lossless=True)
                print(f"Saved: {output_path} (WEBP lossless, optimized for slides)")
            else:
                # Lossy WebP for photos (VP8 encoding)
                pil_image.save(output_path, "WEBP", quality=quality)
                print(f"Saved: {output_path} (WEBP, quality={quality})")
        else:  # png (default fallback)
            output_path = OUTPUT_DIR / "generated.png"
            pil_image.save(output_path, "PNG")
            print(f"Saved: {output_path} (PNG)")
EOF
Key points:
- Use google-genai (NOT google-generativeai)
- Declare dependencies in the # /// script block
- Required: google-genai, pillow (use .as_image() to get image bytes, then convert to PIL for saving)

Save images to numbered directories:
Format: NNN-short-name/
Directory selection - Intent-based decision:
Claude determines whether this is a continuation of existing work or a new topic based on conversation context:
Continuation (reuse existing directory):
- Manually specify it: OUTPUT_DIR = Path("001-existing-topic")

New topic (auto-increment new directory):
- Scan for existing NNN- folders and take the next number (the default logic in the script above)

When uncertain:
- Use AskUserQuestion to clarify: "Should I add this to the existing [topic] directory, or create a new one?"

Examples:
# Example 1: New topic (auto-increment)
# User: "Generate user auth flow"
# → No existing context, creates: 001-user-auth-flow/
# Example 2: Continuation (reuse directory)
# Existing: 001-user-auth-flow/ with login.webp
# User: "Add a signup screen too"
# Claude understands: This is continuation of auth flow topic
# → Reuse: OUTPUT_DIR = Path("001-user-auth-flow")
# → Saves: 001-user-auth-flow/signup.webp
# Example 3: New topic (auto-increment)
# Existing: 001-user-auth-flow/
# User: "Create cute cat illustration"
# Claude understands: Different topic, not related to auth
# → Scans, finds 001, creates: 002-cute-cat/
# Example 4: Uncertain (ask user)
# Existing: 001-japan-trip/ with cover.webp, overview.webp
# User: "Generate a travel conclusion"
# Claude uncertain: Could be for japan-trip or new generic travel content
# → Ask: "Should I add this to the existing japan-trip directory, or create a new one?"
Customize plugin behavior with environment variables:
| Variable | Default | Description |
|---|---|---|
| NANO_BANANA_MODEL | (Claude chooses) | Specify image generation model. If set, Claude will NOT override it. Valid models: gemini-3-pro-image-preview (quality), gemini-2.5-flash-image (speed) |
| NANO_BANANA_FORMAT | webp | Output format: webp, jpg, or png |
| NANO_BANANA_QUALITY | 90 | Image quality (1-100) for webp/jpg |
| GOOGLE_GEMINI_BASE_URL | (official API) | Custom API endpoint (for non-official deployments) |
| GEMINI_API_KEY | (falls back to GOOGLE_API_KEY) | API key (official or custom endpoint) |
Model Selection Guidelines:
When NANO_BANANA_MODEL is NOT set, Claude selects model based on user requirements:
- gemini-3-pro-image-preview - Best quality, accurate colors, good text rendering (recommended for slides)
- gemini-2.5-flash-image - Faster generation, lower cost (ONLY when user explicitly requests speed/budget)

IMPORTANT: These are IMAGE generation models from the gemini-image API series. Do NOT use text generation models like gemini-2.0-flash-exp, gemini-exp-1206, or gemini-2.0-flash-thinking-exp-* - they are incompatible with image generation.
Lossless WebP Decision Logic:
When generating images, set use_lossless based on this priority order:
# Apply rules in this exact order (first match wins):
if "style: trend" in user_message or "style: notebooklm" in user_message:
use_lossless = True # Slide deck styles with text/icons
elif user explicitly requests lossless/highest quality (understand intent in ANY language):
use_lossless = True # Examples: "lossless", "highest quality", "perfect quality", "for printing"
elif user explicitly requests lossy/smaller file (understand intent in ANY language):
use_lossless = False # Examples: "lossy", "compress more", "smaller file", "reduce size"
else:
use_lossless = False # Default for photos and general images
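A runnable sketch of that heuristic. The keyword lists below are illustrative assumptions - the real check is Claude's understanding of intent, which works across languages:

def should_use_lossless(user_message: str) -> bool:
    # Minimal keyword heuristic - illustrative only; Claude's actual decision
    # is based on intent in any language, not these English keywords.
    msg = user_message.lower()
    if "style: trend" in msg or "style: notebooklm" in msg:
        return True  # slide deck styles with text/icons
    if any(k in msg for k in ("lossless", "highest quality", "perfect quality", "for printing")):
        return True
    if any(k in msg for k in ("lossy", "compress more", "smaller file", "reduce size")):
        return False
    return False  # default for photos and general images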
Why lossless matters:
- Slides and diagrams contain sharp text and flat-color icons; lossy compression blurs them
- Lossless WebP (VP8L) is roughly 20-30% smaller than PNG with zero quality loss
- Lossy WebP is far smaller still (~95% savings) but only appropriate for photographic content
Load existing image and include in request.
Directory strategy: Editing an existing image is a continuation of the same topic, so reuse the source image's directory. This keeps all variations together.
uv run - << 'EOF'
# /// script
# dependencies = ["google-genai", "pillow"]
# ///
import os
import io
from pathlib import Path
from google import genai
from google.genai import types
from PIL import Image as PILImage

# Directory selection: Editing existing image = same topic
# Reuse the source image's directory for edited output
OUTPUT_DIR = Path("001-cute-banana")  # Same directory as source image
OUTPUT_DIR.mkdir(exist_ok=True)
print(f"Using output directory: {OUTPUT_DIR}")

# Configuration from environment variables
# IMPORTANT: If NANO_BANANA_MODEL is set, use it - DO NOT override
model = os.environ.get("NANO_BANANA_MODEL")
if not model:
    # Only choose model when NANO_BANANA_MODEL is not set
    model = "gemini-3-pro-image-preview"  # Replace with appropriate choice

output_format = os.environ.get("NANO_BANANA_FORMAT", "webp").lower()
quality = int(os.environ.get("NANO_BANANA_QUALITY", "90"))

# Initialize client
base_url = os.environ.get("GOOGLE_GEMINI_BASE_URL")
api_key = os.environ.get("GEMINI_API_KEY") or os.environ.get("GOOGLE_API_KEY")
if not api_key:
    print("Error: GEMINI_API_KEY or GOOGLE_API_KEY environment variable not set")
    exit(1)

try:
    if base_url:
        client = genai.Client(api_key=api_key, http_options={'base_url': base_url})
    else:
        client = genai.Client(api_key=api_key)

    # Load existing image
    img = PILImage.open("001-cute-banana/generated.webp")

    response = client.models.generate_content(
        model=model,
        contents=[
            "Add a party hat to this character",
            img  # Pass PIL Image directly
        ],
        config=types.GenerateContentConfig(
            response_modalities=['IMAGE']
        )
    )

    if not response.parts:
        print("Error: No image generated in response")
        exit(1)
except FileNotFoundError as e:
    print(f"Error: Input image not found: {e}")
    exit(1)
except Exception as e:
    print(f"Error during image editing: {e}")
    exit(1)

for part in response.parts:
    if part.inline_data is not None:
        genai_image = part.as_image()
        pil_image = PILImage.open(io.BytesIO(genai_image.image_bytes))

        if output_format in ("jpg", "jpeg"):
            output_path = OUTPUT_DIR / "edited.jpg"
            pil_image.convert("RGB").save(output_path, "JPEG", quality=quality)
        elif output_format == "webp":
            output_path = OUTPUT_DIR / "edited.webp"
            pil_image.save(output_path, "WEBP", quality=quality)
        else:  # png
            output_path = OUTPUT_DIR / "edited.png"
            pil_image.save(output_path, "PNG")

        print(f"Saved: {output_path}")
EOF
Aspect ratio and resolution:
config_params = {
    'response_modalities': ['IMAGE'],
    'image_config': types.ImageConfig(
        aspect_ratio="16:9",  # "1:1", "16:9", "9:16", "4:3", "3:4"
        image_size="2K"       # "1K", "2K", "4K" (UPPERCASE required)
    )
}
config = types.GenerateContentConfig(**config_params)
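The config then plugs into the same call used in the scripts above. A sketch, assuming client and model from the earlier setup; the prompt text is illustrative:

response = client.models.generate_content(
    model=model,
    contents=["Title slide for a quarterly review, widescreen composition"],
    config=config,
)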
Common aspect ratios by use case:
| Aspect Ratio | Use Cases | Best For |
|---|---|---|
| 16:9 | Presentation slides, modern displays, YouTube thumbnails | Widescreen presentations, video content |
| 4:3 | Traditional presentations, documents | Classic PowerPoint format, printed slides |
| 1:1 | Social media posts, profile images | Instagram posts, icons, square designs |
| 9:16 | Mobile vertical, stories | Instagram/TikTok stories, mobile-first content |
| 3:4 | Print materials, posters | Printed documents, portrait orientation |
Resolution recommendations:
- 1K for quick previews and iteration
- 2K for presentation slides and most final output
- 4K only when the user needs print quality or heavy cropping
Pattern: Small scripts → Evaluate → Decide next
digraph workflow {
"Write ONE script" [shape=box];
"Run and observe" [shape=box];
"Check saved image" [shape=box];
"Satisfied?" [shape=diamond];
"Done" [shape=box];
"Write ONE script" -> "Run and observe";
"Run and observe" -> "Check saved image";
"Check saved image" -> "Satisfied?";
"Satisfied?" -> "Done" [label="yes"];
"Satisfied?" -> "Write ONE script" [label="no, refine"];
}
Don't:
- Chain multiple generation steps without checking results in between
- Build workflow orchestrators or permanent scripts for one-off tasks
- Generate batches blindly and hope they all turn out right

Do:
- Write ONE small script, run it, look at the saved image
- Refine the prompt based on what you see, then run again
For generating 3+ slides for presentations, use the Hybrid Mode: Plan → Parallel → Review.
IMPORTANT: Always plan before generating multiple slides.
Step 1: Planning (Mandatory)
Before any generation, create a complete plan including:
Define style specification
Create content outline
Slide 1: Title - "Presentation Title"
Slide 2: Overview - 3 key points
Slide 3: Details - Deep dive on point 1
Slide 4: Data - Charts and metrics
Slide 5: Conclusion - Summary and CTA
Pre-plan output directory and file structure
All slides should be saved in a single directory with numbered filenames:
001-presentation-topic/
├── 001-title.webp
├── 002-overview.webp
├── 003-details.webp
├── 004-data-viz.webp
└── 005-conclusion.webp
Critical: Do NOT create separate directories per slide. Use one shared directory with numbered files.
Step 2: Parallel Generation
Use Task agents to generate 3-5 slides simultaneously. Give each agent the shared style spec, its slide's content from the outline, the target filename, and the shared output directory; a sequential fallback sketch follows below.
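When Task agents aren't available, or for a 1-2 slide deck, a sequential sketch shows the same shared-directory layout. Slide names and prompts below are illustrative assumptions:

uv run - << 'EOF'
# /// script
# dependencies = ["google-genai", "pillow"]
# ///
# Sequential fallback sketch - slide names/prompts are illustrative assumptions.
import io
import os
from pathlib import Path
from google import genai
from google.genai import types
from PIL import Image as PILImage

OUTPUT_DIR = Path("001-presentation-topic")  # ONE shared directory for the whole deck
OUTPUT_DIR.mkdir(exist_ok=True)

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
slides = [
    ("001-title", "Title slide: 'Quarterly Review', clean professional presentation aesthetic"),
    ("002-overview", "Overview slide with three key points, same aesthetic and palette"),
]
for name, prompt in slides:
    response = client.models.generate_content(
        model="gemini-3-pro-image-preview",
        contents=[prompt],
        config=types.GenerateContentConfig(
            response_modalities=["IMAGE"],
            image_config=types.ImageConfig(aspect_ratio="16:9", image_size="2K"),
        ),
    )
    for part in response.parts:
        if part.inline_data is not None:
            out = OUTPUT_DIR / f"{name}.webp"  # numbered files, one shared directory
            PILImage.open(io.BytesIO(part.as_image().image_bytes)).save(out, "WEBP", lossless=True)
            print(f"Saved: {out}")
EOF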
Step 3: Review & Adjust
After parallel generation:
- Review every slide for visual consistency (colors, fonts, layout)
- Regenerate any slide that drifts from the plan or style spec
- Confirm filenames are numbered sequentially in the shared directory
Step 4: Adding More Slides (Continuation)
If user requests additional slides after initial generation (e.g., "Add conclusion slide", "Generate the ending too"):
OUTPUT_DIR = Path("001-presentation-topic")Example:
# Initial generation created: 001-japan-trip/ with 001-cover.webp, 002-overview.webp
# User: "Add a conclusion slide"
# Claude understands: This is continuation of japan-trip presentation
# → Reuse directory:
OUTPUT_DIR = Path("001-japan-trip")
# → Save as: 003-conclusion.webp
When to use:
| Slides | Approach | Reason |
|---|---|---|
| 1-2 | Sequential | Faster, no coordination overhead |
| 3-5 | Parallel | Speed benefit outweighs coordination |
| 6+ | Parallel batches (3-5 each) | Split into manageable groups |
For complete multi-slide workflow details, see references/slide-deck-styles.md (lines 411-586).
If you're thinking thoughts like "I should save this as a reusable script," "this deserves a proper workflow file," or "let me build an orchestrator," you're over-engineering. All of these mean: Use heredoc. It's a one-off task.
Step 1: Gather Reference Materials and Detect Style
Before asking questions, check if the user has provided:
style: "trend" or "use style trend"Style Detection:
style: "trend" (case-insensitive) → Set brand_style = "trend"style: "custom" → Set brand_style = "custom"Step 2: Clarify Intent with Questions
Use AskUserQuestion tool to understand user's goal. Ask 2-4 questions:
Core Questions (always ask):
- What type of output? (slide, social post, poster, diagram, photo)
- What is the subject or message?
- Any style preference? (brand style, era, mood)

Technique-Specific Questions (conditional):
- Does the image need rendered text? (titles, labels)
- Multiple panels or views?
- How should each reference image be used? (style vs. subject)
Step 3: Determine Prompt Style
Based on user responses, select technique from references/guide.md:
| User Need | Recommended Style |
|---|---|
| Simple, quick generation | Narrative Prompt (Technique 1) |
| Precise control over details | Structured Prompt (Technique 2) |
| Era-specific aesthetic | Vibe Library + Photography Terms (Techniques 3-4) |
| Magazine/poster with text | Physical Object Framing (Technique 5) |
| Conceptual/interpretive | Perspective Framing (Technique 6) |
| Diagram/infographic | Educational Imagery (Technique 7) |
| Editing existing image | Image Transformation (Technique 8) |
| Multiple views/panels | Multi-Panel Output (Technique 9) |
| Multiple reference images | Reference Role Assignment (Technique 12) |
Step 4: Generate the Prompt
- Read references/guide.md to access technique details
- Apply the selected brand style (see references/brand-styles.md)

Brand Style Integration:
If user selected Trend Micro brand style or NotebookLM style:
- Read references/brand-styles.md (for Trend) or references/slide-deck-styles.md (for NotebookLM) for complete specifications
- Set use_lossless = True (see Configuration section for complete decision logic)

For NotebookLM style (style: "notebooklm"): notebooklm is a style trigger - you MUST apply its visual characteristics from references/slide-deck-styles.md (clean professional presentation aesthetic) while keeping the NotebookLM name out of the actual prompt.
Step 5: Present and Iterate
Present to user:
- The crafted prompt
- A short explanation of the chosen technique
- 1-2 variations to consider
Offer to refine based on feedback.
Step 6: Generate the Image
Execute with the crafted prompt using Direct Generation Mode pattern above.
Important: Apply the Lossless WebP Decision Logic from the Configuration section to determine use_lossless setting.
digraph file_decision {
"User request" [shape=diamond];
"Will be run multiple times?" [shape=diamond];
"Complex, needs iteration?" [shape=diamond];
"User explicitly asks for file?" [shape=diamond];
"Create file" [shape=box];
"Use heredoc" [shape=box];
"User request" -> "Will be run multiple times?";
"Will be run multiple times?" -> "Create file" [label="yes"];
"Will be run multiple times?" -> "Complex, needs iteration?" [label="no"];
"Complex, needs iteration?" -> "Create file" [label="yes"];
"Complex, needs iteration?" -> "User explicitly asks for file?" [label="no"];
"User explicitly asks for file?" -> "Create file" [label="yes"];
"User explicitly asks for file?" -> "Use heredoc" [label="no"];
}
API Key Issues:
"Error: GEMINI_API_KEY or GOOGLE_API_KEY environment variable not set"export GEMINI_API_KEY="your-key"echo $GEMINI_API_KEY to verify it's setModel Name Errors:
"Model not found" or "Invalid model name"gemini-3-pro-image-preview (NOT gemini-3-pro-image)gemini-2.5-flash-image (NOT gemini-flash)-preview or -image suffixgemini-2.0-flash-exp will fail - use image models onlyAspect Ratio Errors:
"Invalid aspect ratio""16:9", "4:3", "1:1", "9:16", "3:4" (with quotes)16:9 (no quotes), "16x9" (wrong separator)Rate Limiting:
"429 Too Many Requests" or "Quota exceeded"gemini-2.5-flash-image for higher rate limitsNo Image Generated:
"Error: No image generated in response"response.parts to see what was returnedImage Size vs Quality Trade-offs:
- Smaller files: quality=70 or lower for webp/jpg
- Sharp text: use_lossless=True for text/diagrams
- Faster previews: image_size="1K"

Import Errors:
"ModuleNotFoundError: No module named 'google.genai'"# dependencies = ["google-genai", "pillow"] in script headergoogle-generativeai (old API) instead of google-genaiFile I/O Errors:
"FileNotFoundError" when editing imagesls -la path/to/image.webppwdecho $GEMINI_API_KEY"A red circle"print(response.parts) before processingls -la NNN-*/file output-dir/generated.webp| Mistake | Fix |
|---|---|
| Creating permanent .py files for one-off tasks | Use heredoc instead |
| Using google-generativeai (old API) | Use google-genai (new API) |
| Using wrong model names | Use gemini-3-pro-image-preview or gemini-2.5-flash-image (image generation models only) |
| Using text generation models | Do NOT use gemini-2.0-flash-exp, gemini-exp-1206, gemini-2.0-flash-thinking-exp-* - they don't generate images |
| Overriding NANO_BANANA_MODEL when set | If user set NANO_BANANA_MODEL, respect it - don't change to "cheaper" model |
| Saving to flat files (output.png) | Use NNN-short-name/ directories |
| Hardcoding PNG format | Use format conversion with NANO_BANANA_FORMAT (default: webp) |
| Creating workflow orchestrators | Write small scripts, iterate manually |
| Not detecting inline style specs | Check for both style: syntax and natural language mentions |
| Skipping prompting when user asks for help | Enter Interactive Mode when user says "help", "craft", "improve" |
| Using PIL to draw/edit images | Use Gemini API with contents=[prompt, img] |
| Writing documentation for simple tasks | Just run scripts and print status |
| Auto-chaining multiple steps | Run one step, evaluate, decide next |
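A minimal smoke test for the debugging checklist above - a sketch, assuming GEMINI_API_KEY is exported and gemini-2.5-flash-image is enabled for your key:

uv run - << 'EOF'
# /// script
# dependencies = ["google-genai"]
# ///
# Smoke test: confirms the key works and the model returns an image part.
import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
response = client.models.generate_content(
    model="gemini-2.5-flash-image",  # assumption: swap in your configured model
    contents=["A red circle"],
    config=types.GenerateContentConfig(response_modalities=["IMAGE"]),
)
print(response.parts)  # expect at least one part with inline_data set
EOF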
For complex workflows (thinking process, Google Search grounding, multi-turn conversations), see references/guide.md.
For complete prompting techniques (16 techniques with examples), see references/guide.md.
For brand style specifications, see references/brand-styles.md.
For slide deck and presentation styles (NotebookLM aesthetic, infographics, data viz), see references/slide-deck-styles.md.