Use when users request image generation, AI art creation, or image editing with Gemini models, need help crafting prompts, or want brand-styled imagery. Handles both direct generation and interactive prompt design.
/plugin marketplace add pigfoot/claude-code-hubs
/plugin install superpowers@pigfoot-marketplace

This skill inherits all available tools. When active, it can use any tool Claude has access to.
- references/brand-styles.md
- references/guide.md
- references/slide-deck-styles.md

Quick Python scripting with Gemini image generation using uv heredocs. No files needed for one-off tasks.
Supports two modes:
digraph when_to_use {
"User request" [shape=diamond];
"Explicit prompting request?" [shape=diamond];
"Prompt too vague?" [shape=diamond];
"Interactive Prompting Mode" [shape=box];
"Direct Generation Mode" [shape=box];
"User request" -> "Explicit prompting request?";
"Explicit prompting request?" -> "Interactive Prompting Mode" [label="yes"];
"Explicit prompting request?" -> "Prompt too vague?" [label="no"];
"Prompt too vague?" -> "Interactive Prompting Mode" [label="yes (<5 words)"];
"Prompt too vague?" -> "Direct Generation Mode" [label="no"];
}
Use this skill when:
Don't use when:
Check user's message for style specifications:
Structured syntax:
- style: "trend" → Trend Micro brand colors
- style: "notebooklm" or style: "slide" → NotebookLM presentation style (MUST apply visual characteristics)
- style: "custom" → Ask for custom color preferences

Natural language:
⚠️ CRITICAL for NotebookLM: When detected, apply the style characteristics (see Brand Style Integration section). NEVER include "NotebookLM" brand/logo/name in the Gemini prompt or generated slides - this violates trademark policies. notebooklm is a style trigger for Claude only - translate to descriptive terms like "clean professional presentation aesthetic" in the actual prompt.
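For example, a sketch of the translation (wording illustrative, not a canonical phrase):

```python
# Claude-side trigger (never sent to the API):  style: "notebooklm"
# What the Gemini prompt says instead - descriptive terms only, no brand names:
prompt = ("A 16:9 presentation slide with a clean professional presentation "
          "aesthetic: generous whitespace, simple flat iconography, muted palette")
```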
Style detection precedence by mode:
| Mode | User Provides Style? | Action |
|---|---|---|
| Direct Generation | Yes (inline spec) | Use detected style immediately |
| Direct Generation | No | Generate with no style (default) |
| Interactive Prompting | Yes (inline spec) | Use detected style, skip style question |
| Interactive Prompting | No | Ask user about style preference in Step 2 |
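A minimal sketch of the structured-syntax check (natural-language detection is left to Claude's own reading of the message; the "slide" alias follows the list above):

```python
import re

def detect_inline_style(user_message: str):
    """Return 'trend', 'notebooklm', or 'custom' if an inline spec is present, else None."""
    m = re.search(r'style:\s*"?(trend|notebooklm|slide|custom)"?', user_message, re.IGNORECASE)
    if not m:
        return None
    style = m.group(1).lower()
    return "notebooklm" if style == "slide" else style  # "slide" aliases the NotebookLM style
```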
Priority order (first match wins):
| Task | Pattern |
|---|---|
| Generate image | uv run - << 'EOF' with inline script |
| Edit image | Same, but contents=[prompt, img] |
| Complex workflow | Multiple small scripts, evaluate between |
| Model choice | NANO_BANANA_MODEL (if set) > Pro (default) > Flash (if budget/fast) |
| Output format | Default: webp, or NANO_BANANA_FORMAT env var (webp/jpg/png) |
| Output location | NNN-short-name/ (e.g., 001-cute-banana/) |
| Step | Action |
|---|---|
| 1. Gather | Check for reference images, style specs |
| 2. Clarify | Ask 2-4 questions about output type, subject, style |
| 3. Select Technique | Choose from 16+ patterns (see references/guide.md) |
| 4. Generate Prompt | Apply technique, brand style, aspect ratio |
| 5. Present | Show prompt with explanation and variations |
| 6. Execute | Generate image with crafted prompt |
Enter Interactive Prompting Mode when:
Use Direct Generation Mode when:
Default to heredoc for one-off tasks:
uv run - << 'EOF'
# /// script
# dependencies = ["google-genai", "pillow"]
# ///
import os
import io
from pathlib import Path
from google import genai
from google.genai import types
from PIL import Image as PILImage
# Auto-increment folder detection
# Scan for existing NNN-* directories and use next available number
existing_folders = sorted([d for d in Path(".").iterdir()
if d.is_dir() and len(d.name) >= 4
and d.name[:3].isdigit() and d.name[3] == '-'])
if existing_folders:
last_num = int(existing_folders[-1].name[:3])
next_num = last_num + 1
else:
next_num = 1
OUTPUT_DIR = Path(f"{next_num:03d}-cute-banana") # Format: NNN-short-name
OUTPUT_DIR.mkdir(exist_ok=True)
print(f"Using output directory: {OUTPUT_DIR}")
# Configuration from environment variables
model = os.environ.get("NANO_BANANA_MODEL")
if not model:
# Model selection logic (Claude decides based on user request):
# Use gemini-2.5-flash-image ONLY if user explicitly mentions:
# - "fast", "quick", "draft", "preview", "budget", "cheap"
# Otherwise, ALWAYS use gemini-3-pro-image-preview (default):
# - Better quality, especially for text rendering in slides
# - More accurate color reproduction
# - Better handling of complex prompts
model = "gemini-3-pro-image-preview" # Default: prioritize quality
output_format = os.environ.get("NANO_BANANA_FORMAT", "webp").lower()
quality = int(os.environ.get("NANO_BANANA_QUALITY", "90"))
# Detect if lossless format is needed (for diagrams/slides)
# See "Lossless WebP Decision Logic" in Configuration section for complete rules
use_lossless = False # Set to True for slide deck styles or explicit user request
# Initialize client with optional custom endpoint
base_url = os.environ.get("GOOGLE_GEMINI_BASE_URL")
api_key = os.environ.get("GEMINI_API_KEY") or os.environ.get("GOOGLE_API_KEY")
if not api_key:
print("Error: GEMINI_API_KEY or GOOGLE_API_KEY environment variable not set")
exit(1)
try:
if base_url:
client = genai.Client(api_key=api_key, http_options={'base_url': base_url})
else:
client = genai.Client(api_key=api_key)
config_params = {
'response_modalities': ['IMAGE']
}
response = client.models.generate_content(
model=model,
contents=["A cute banana character with sunglasses"],
config=types.GenerateContentConfig(**config_params)
)
if not response.parts:
print("Error: No image generated in response")
exit(1)
except Exception as e:
print(f"Error during image generation: {e}")
exit(1)
for part in response.parts:
if part.inline_data is not None:
# Get google-genai Image object
genai_image = part.as_image()
# Convert to PIL Image from bytes
pil_image = PILImage.open(io.BytesIO(genai_image.image_bytes))
# Save with format conversion
if output_format in ("jpg", "jpeg"):
output_path = OUTPUT_DIR / "generated.jpg"
pil_image.convert("RGB").save(output_path, "JPEG", quality=quality)
elif output_format == "webp":
output_path = OUTPUT_DIR / "generated.webp"
if use_lossless:
# Lossless WebP for slide decks (VP8L encoding)
# Saves 20-30% vs PNG, zero quality loss (vs lossy: saves 95% but blurs)
pil_image.save(output_path, "WEBP", lossless=True)
print(f"Saved: {output_path} (WEBP lossless, optimized for slides)")
else:
# Lossy WebP for photos (VP8 encoding)
pil_image.save(output_path, "WEBP", quality=quality)
print(f"Saved: {output_path} (WEBP, quality={quality})")
else: # png (default fallback)
output_path = OUTPUT_DIR / "generated.png"
pil_image.save(output_path, "PNG")
print(f"Saved: {output_path} (PNG)")
EOF
Key points:
- Package: google-genai (NOT google-generativeai)
- Declare dependencies in the # /// script block
- Dependencies: google-genai, pillow (for .as_image() to get image bytes, then convert to PIL for saving)

Save images to numbered directories:
Format: NNN-short-name/
Naming rules:
- Scan for existing NNN-* folders in the current working directory

Examples:
- 001-user-auth-flow/
- 002-cute-cat/
- 003-startup-logo/

Directory scanning example:
# Existing: 001-cute-cat/, 002-logo-design/
# New request: "Generate a sunset landscape"
# → Scans directory, finds 001 and 002, creates: 003-sunset-landscape/
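The same scan, factored into a reusable helper (a sketch equivalent to the inline version in the scripts below):

```python
from pathlib import Path

def next_output_dir(short_name: str, root: Path = Path(".")) -> Path:
    """Create and return the next NNN-short-name directory."""
    nums = [int(d.name[:3]) for d in root.iterdir()
            if d.is_dir() and len(d.name) >= 4
            and d.name[:3].isdigit() and d.name[3] == "-"]
    out = root / f"{max(nums, default=0) + 1:03d}-{short_name}"
    out.mkdir(exist_ok=True)
    return out
```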
Customize plugin behavior with environment variables:
| Variable | Default | Description |
|---|---|---|
| NANO_BANANA_MODEL | (Claude chooses: Pro or Flash) | Force specific model (overrides Claude's choice) |
| NANO_BANANA_FORMAT | webp | Output format: webp, jpg, or png |
| NANO_BANANA_QUALITY | 90 | Image quality (1-100) for webp/jpg |
| GOOGLE_GEMINI_BASE_URL | (official API) | Custom API endpoint (for non-official deployments) |
| GEMINI_API_KEY | (falls back to GOOGLE_API_KEY) | API key (official or custom endpoint) |
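To see what a script will actually pick up, a quick sanity check (sketch):

```python
import os

for var in ("NANO_BANANA_MODEL", "NANO_BANANA_FORMAT",
            "NANO_BANANA_QUALITY", "GOOGLE_GEMINI_BASE_URL"):
    print(f"{var} = {os.environ.get(var, '(unset, default applies)')}")
print("API key set:", bool(os.environ.get("GEMINI_API_KEY") or os.environ.get("GOOGLE_API_KEY")))
```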
Lossless WebP Decision Logic:
When generating images, set use_lossless based on this priority order:
# Apply rules in this exact order (first match wins):
if "style: trend" in user_message or "style: notebooklm" in user_message:
use_lossless = True # Slide deck styles with text/icons
elif user explicitly requests lossless/highest quality (understand intent in ANY language):
use_lossless = True # Examples: "lossless", "highest quality", "perfect quality", "for printing"
elif user explicitly requests lossy/smaller file (understand intent in ANY language):
use_lossless = False # Examples: "lossy", "compress more", "smaller file", "reduce size"
else:
use_lossless = False # Default for photos and general images
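As a runnable Python sketch of the same rules (the keyword lists are illustrative; per the rules above, the real check should understand intent in any language):

```python
def choose_lossless(user_message: str) -> bool:
    """Apply the lossless rules above in order - first match wins."""
    msg = user_message.lower()
    if "style: trend" in msg or "style: notebooklm" in msg:
        return True   # slide deck styles with text/icons
    if any(k in msg for k in ("lossless", "highest quality", "perfect quality", "for printing")):
        return True
    if any(k in msg for k in ("lossy", "compress more", "smaller file", "reduce size")):
        return False
    return False      # default for photos and general images
```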
Why lossless matters:
Load existing image and include in request:
uv run - << 'EOF'
# /// script
# dependencies = ["google-genai", "pillow"]
# ///
import os
import io
from pathlib import Path
from google import genai
from google.genai import types
from PIL import Image as PILImage
# Auto-increment folder detection
existing_folders = sorted([d for d in Path(".").iterdir()
if d.is_dir() and len(d.name) >= 4
and d.name[:3].isdigit() and d.name[3] == '-'])
if existing_folders:
last_num = int(existing_folders[-1].name[:3])
next_num = last_num + 1
else:
next_num = 1
OUTPUT_DIR = Path(f"{next_num:03d}-party-hat")
OUTPUT_DIR.mkdir(exist_ok=True)
print(f"Using output directory: {OUTPUT_DIR}")
# Configuration from environment variables
model = os.environ.get("NANO_BANANA_MODEL")
if not model:
model = "gemini-3-pro-image-preview"
output_format = os.environ.get("NANO_BANANA_FORMAT", "webp").lower()
quality = int(os.environ.get("NANO_BANANA_QUALITY", "90"))
# Initialize client
base_url = os.environ.get("GOOGLE_GEMINI_BASE_URL")
api_key = os.environ.get("GEMINI_API_KEY") or os.environ.get("GOOGLE_API_KEY")
if not api_key:
print("Error: GEMINI_API_KEY or GOOGLE_API_KEY environment variable not set")
exit(1)
try:
if base_url:
client = genai.Client(api_key=api_key, http_options={'base_url': base_url})
else:
client = genai.Client(api_key=api_key)
# Load existing image
img = PILImage.open("001-cute-banana/generated.webp")
response = client.models.generate_content(
model=model,
contents=[
"Add a party hat to this character",
img # Pass PIL Image directly
],
config=types.GenerateContentConfig(
response_modalities=['IMAGE']
)
)
if not response.parts:
print("Error: No image generated in response")
exit(1)
except FileNotFoundError as e:
print(f"Error: Input image not found: {e}")
exit(1)
except Exception as e:
print(f"Error during image editing: {e}")
exit(1)
for part in response.parts:
if part.inline_data is not None:
genai_image = part.as_image()
pil_image = PILImage.open(io.BytesIO(genai_image.image_bytes))
if output_format in ("jpg", "jpeg"):
output_path = OUTPUT_DIR / "edited.jpg"
pil_image.convert("RGB").save(output_path, "JPEG", quality=quality)
elif output_format == "webp":
output_path = OUTPUT_DIR / "edited.webp"
pil_image.save(output_path, "WEBP", quality=quality)
else: # png
output_path = OUTPUT_DIR / "edited.png"
pil_image.save(output_path, "PNG")
print(f"Saved: {output_path}")
EOF
Aspect ratio and resolution:
config_params = {
'response_modalities': ['IMAGE'],
'image_config': types.ImageConfig(
aspect_ratio="16:9", # "1:1", "16:9", "9:16", "4:3", "3:4"
image_size="2K" # "1K", "2K", "4K" (UPPERCASE required)
)
}
config = types.GenerateContentConfig(**config_params)
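Then pass it to the same call as in the scripts above (prompt illustrative):

```python
response = client.models.generate_content(
    model=model,
    contents=["A widescreen title slide background"],
    config=config,
)
```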
Common aspect ratios by use case:
| Aspect Ratio | Use Cases | Best For |
|---|---|---|
| 16:9 | Presentation slides, modern displays, YouTube thumbnails | Widescreen presentations, video content |
| 4:3 | Traditional presentations, documents | Classic PowerPoint format, printed slides |
| 1:1 | Social media posts, profile images | Instagram posts, icons, square designs |
| 9:16 | Mobile vertical, stories | Instagram/TikTok stories, mobile-first content |
| 3:4 | Print materials, posters | Printed documents, portrait orientation |
Resolution recommendations:
Pattern: Small scripts → Evaluate → Decide next
digraph workflow {
"Write ONE script" [shape=box];
"Run and observe" [shape=box];
"Check saved image" [shape=box];
"Satisfied?" [shape=diamond];
"Done" [shape=box];
"Write ONE script" -> "Run and observe";
"Run and observe" -> "Check saved image";
"Check saved image" -> "Satisfied?";
"Satisfied?" -> "Done" [label="yes"];
"Satisfied?" -> "Write ONE script" [label="no, refine"];
}
Don't:
Do:
For generating 3+ slides for presentations, use the Hybrid Mode: Plan → Parallel → Review.
IMPORTANT: Always plan before generating multiple slides.
Step 1: Planning (Mandatory)
Before any generation, create a complete plan including:
Define style specification
Create content outline
Slide 1: Title - "Presentation Title"
Slide 2: Overview - 3 key points
Slide 3: Details - Deep dive on point 1
Slide 4: Data - Charts and metrics
Slide 5: Conclusion - Summary and CTA
Pre-plan output directories
001-title-slide/
002-overview/
003-details/
004-data-viz/
005-conclusion/
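Pre-creating the planned directories is one way to keep parallel agents from racing on the auto-increment scan (a sketch using the plan above):

```python
from pathlib import Path

for name in ("001-title-slide", "002-overview", "003-details",
             "004-data-viz", "005-conclusion"):
    Path(name).mkdir(exist_ok=True)
```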
Step 2: Parallel Generation
Use Task agents to generate 3-5 slides simultaneously:
Step 3: Review & Adjust
After parallel generation:
When to use:
| Slides | Approach | Reason |
|---|---|---|
| 1-2 | Sequential | Faster, no coordination overhead |
| 3-5 | Parallel | Speed benefit outweighs coordination |
| 6+ | Parallel batches (3-5 each) | Split into manageable groups |
For complete multi-slide workflow details, see references/slide-deck-styles.md (lines 411-586), which includes:
If you're thinking any of these thoughts, you're over-engineering:
All of these mean: Use heredoc. It's a one-off task.
Step 1: Gather Reference Materials and Detect Style
Before asking questions, check if the user has provided:
- A style spec like style: "trend" or "use style trend"

Style Detection:
- style: "trend" (case-insensitive) → Set brand_style = "trend"
- style: "custom" → Set brand_style = "custom"

Step 2: Clarify Intent with Questions
Use AskUserQuestion tool to understand user's goal. Ask 2-4 questions:
Core Questions (always ask):
Technique-Specific Questions (conditional):
Step 3: Determine Prompt Style
Based on user responses, select technique from references/guide.md:
| User Need | Recommended Style |
|---|---|
| Simple, quick generation | Narrative Prompt (Technique 1) |
| Precise control over details | Structured Prompt (Technique 2) |
| Era-specific aesthetic | Vibe Library + Photography Terms (Techniques 3-4) |
| Magazine/poster with text | Physical Object Framing (Technique 5) |
| Conceptual/interpretive | Perspective Framing (Technique 6) |
| Diagram/infographic | Educational Imagery (Technique 7) |
| Editing existing image | Image Transformation (Technique 8) |
| Multiple views/panels | Multi-Panel Output (Technique 9) |
| Multiple reference images | Reference Role Assignment (Technique 12) |
Step 4: Generate the Prompt
- Read references/guide.md to access technique details
- Apply the brand style if one was selected (see references/brand-styles.md)

Brand Style Integration:
If user selected Trend Micro brand style or NotebookLM style:
- Read references/brand-styles.md (for Trend) or references/slide-deck-styles.md (for NotebookLM) for complete specifications
- Set use_lossless = True (see Configuration section for complete decision logic)

For NotebookLM style (style: "notebooklm"):
notebooklm is a style trigger - you MUST apply these characteristics:
Step 5: Present and Iterate
Present to user:
Offer to refine based on feedback.
Step 6: Generate the Image
Execute with the crafted prompt using Direct Generation Mode pattern above.
Important: Apply the Lossless WebP Decision Logic from the Configuration section to determine use_lossless setting.
digraph file_decision {
"User request" [shape=diamond];
"Will be run multiple times?" [shape=diamond];
"Complex, needs iteration?" [shape=diamond];
"User explicitly asks for file?" [shape=diamond];
"Create file" [shape=box];
"Use heredoc" [shape=box];
"User request" -> "Will be run multiple times?";
"Will be run multiple times?" -> "Create file" [label="yes"];
"Will be run multiple times?" -> "Complex, needs iteration?" [label="no"];
"Complex, needs iteration?" -> "Create file" [label="yes"];
"Complex, needs iteration?" -> "User explicitly asks for file?" [label="no"];
"User explicitly asks for file?" -> "Create file" [label="yes"];
"User explicitly asks for file?" -> "Use heredoc" [label="no"];
}
API Key Issues:
"Error: GEMINI_API_KEY or GOOGLE_API_KEY environment variable not set"export GEMINI_API_KEY="your-key"echo $GEMINI_API_KEY to verify it's setModel Name Errors:
"Model not found" or "Invalid model name"gemini-3-pro-image-preview (NOT gemini-3-pro-image)gemini-2.5-flash-image (NOT gemini-flash)-preview or -image suffixAspect Ratio Errors:
"Invalid aspect ratio""16:9", "4:3", "1:1", "9:16", "3:4" (with quotes)16:9 (no quotes), "16x9" (wrong separator)Rate Limiting:
"429 Too Many Requests" or "Quota exceeded"gemini-2.5-flash-image for higher rate limitsNo Image Generated:
"Error: No image generated in response"response.parts to see what was returnedImage Size vs Quality Trade-offs:
quality=70 or lower for webp/jpguse_lossless=True for text/diagramsimage_size="1K" for faster previewsImport Errors:
"ModuleNotFoundError: No module named 'google.genai'"# dependencies = ["google-genai", "pillow"] in script headergoogle-generativeai (old API) instead of google-genaiFile I/O Errors:
"FileNotFoundError" when editing imagesls -la path/to/image.webppwdecho $GEMINI_API_KEY"A red circle"print(response.parts) before processingls -la NNN-*/file output-dir/generated.webp| Mistake | Fix |
| Mistake | Fix |
|---|---|
| Creating permanent .py files for one-off tasks | Use heredoc instead |
| Using google-generativeai (old API) | Use google-genai (new API) |
| Using wrong model names | Use gemini-3-pro-image-preview or gemini-2.5-flash-image |
| Saving to flat files (output.png) | Use NNN-short-name/ directories |
| Hardcoding PNG format | Use format conversion with NANO_BANANA_FORMAT (default: webp) |
| Creating workflow orchestrators | Write small scripts, iterate manually |
| Not detecting inline style specs | Check for both style: syntax and natural language mentions |
| Skipping prompting when user asks for help | Enter Interactive Mode when user says "help", "craft", "improve" |
| Using PIL to draw/edit images | Use Gemini API with contents=[prompt, img] |
| Writing documentation for simple tasks | Just run scripts and print status |
| Auto-chaining multiple steps | Run one step, evaluate, decide next |
For complex workflows (thinking process, Google Search grounding, multi-turn conversations), see references/guide.md.
For complete prompting techniques (16 techniques with examples), see references/guide.md.
For brand style specifications, see references/brand-styles.md.
For slide deck and presentation styles (NotebookLM aesthetic, infographics, data viz), see references/slide-deck-styles.md.