Generates and edits images using Google Gemini Pro and Flash models via Python CLI scripts. Handles prompts, image references, aspect ratios, and outputs to Downloads. Activates on image requests.
```shell
npx claudepluginhub kenneth-liao/ai-launchpad-marketplace --plugin creator-stack
```

This skill uses the workspace's default tool permissions.
Generate and edit images using Google Gemini models. Supports two models:
- Pro (`gemini-3-pro-image-preview`) — High quality, complex prompts, thinking mode
- Flash (`gemini-2.5-flash-image`) — Fast, cheap, good for iteration

Required:
- `GEMINI_API_KEY` — Get from Google AI Studio
- `uv` (recommended) or Python 3.10+ with `google-genai` installed

With uv (recommended — zero setup):
Dependencies are declared inline via PEP 723 and auto-installed on first run. Just use `uv run` instead of `python3`.
With pip (fallback):
```shell
pip install -r ${CLAUDE_SKILL_DIR}/requirements.txt
```
Default output: Images save to `~/Downloads/nanobanana_<timestamp>.png` automatically. Do NOT pass `-o` unless the user specifies where to save. If the user provides a filename without a directory (e.g., "save it as robot.png"), use `-o ~/Downloads/robot.png`.
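The output-path rule above can be sketched as a small helper. This is illustrative only — `resolve_output` is our name, not part of the scripts:

```python
import time
from pathlib import Path

def resolve_output(user_arg=None):
    """Sketch of the documented output rule: no argument yields a
    timestamped file in ~/Downloads; a bare filename is placed in
    ~/Downloads; an explicit path is used as-is."""
    downloads = Path.home() / "Downloads"
    if user_arg is None:
        return downloads / f"nanobanana_{int(time.time())}.png"
    p = Path(user_arg).expanduser()
    if p.parent == Path("."):  # bare filename, no directory given
        return downloads / p.name
    return p

print(resolve_output("robot.png"))                  # e.g. /home/you/Downloads/robot.png
print(resolve_output("~/Projects/brand/logo.png"))  # explicit path kept as-is
```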
```shell
# Text-to-image
uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "a cute robot mascot, pixel art style"

# Edit an existing image
uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "make the background blue" -i input.jpg

# Fast iteration with Flash
uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "quick sketch of a cat" --model flash

# Multi-image style transfer
uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "apply the style of the first image to the second" \
  -i style_ref.png subject.jpg

# Aspect ratio and size
uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "cinematic landscape" --ratio 21:9 --size 4K

# Explicit output path
uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "logo design" -o ~/Projects/brand/logo.png
```
| | Pro (default) | Flash |
|---|---|---|
| Speed | Slower | ~2-3x faster |
| Cost | Higher | Lower |
| Text rendering | Good | Unreliable |
| Complex scenes | Excellent | Adequate |
| Thinking mode | Yes | No |
| Best for | Final production images | Exploration, drafts, batch |
Rule of thumb: Use Flash for exploration and batch generation, Pro for final output.
`scripts/generate.py` — Main image generation script.
```
Usage: generate.py [OPTIONS] PROMPT

Arguments:
  PROMPT  Text prompt for image generation

Options:
  -o, --output PATH    Output file path (default: ~/Downloads/nanobanana_<timestamp>.png)
  -i, --input PATH...  Input image(s) for editing / reference (up to 14)
  -m, --model MODEL    Model: 'pro' (default), 'flash', or full model ID
  -r, --ratio RATIO    Aspect ratio (1:1, 16:9, 9:16, 21:9, etc.)
  -s, --size SIZE      Image size: 1K, 2K, or 4K (default: standard)
  --search             Enable Google Search grounding for accuracy
  --retries N          Max retries on rate limit (default: 3)
  -v, --verbose        Show detailed output
```
Supported aspect ratios:
- 1:1 — Square (default)
- 2:3, 3:2 — Portrait/Landscape
- 3:4, 4:3 — Standard
- 4:5, 5:4 — Photo
- 9:16, 16:9 — Widescreen
- 21:9 — Ultra-wide/Cinematic

Image sizes:
- 1K — Fast, lower detail
- 2K — Enhanced detail (2048px)
- 4K — Maximum quality (3840px), best for text rendering

`scripts/batch_generate.py` — Generate multiple images with sequential naming.
```
Usage: batch_generate.py [OPTIONS] PROMPT

Arguments:
  PROMPT  Text prompt for image generation

Options:
  -n, --count N      Number of images to generate (default: 10)
  -d, --dir PATH     Output directory (default: ~/Downloads)
  -p, --prefix STR   Filename prefix (default: "image")
  -m, --model MODEL  Model: 'pro' (default), 'flash', or full model ID
  -r, --ratio RATIO  Aspect ratio
  -s, --size SIZE    Image size (1K/2K/4K)
  --search           Enable Google Search grounding
  --retries N        Max retries per image on rate limit (default: 3)
  --delay SECONDS    Delay between generations (default: 3)
  --parallel N       Concurrent requests (default: 1, max recommended: 5)
  -q, --quiet        Suppress progress output
```
Example:
```shell
uv run ${CLAUDE_SKILL_DIR}/scripts/batch_generate.py "pixel art logo" -n 20 --model flash -d ./logos -p logo
```
Note: When importing as a Python module, `google-genai` must be available in the calling script's environment. If using `uv run`, add a PEP 723 `dependencies` block to your own script (see example in Pattern 2 below).
```python
import sys
from pathlib import Path

sys.path.insert(0, str(Path("${CLAUDE_SKILL_DIR}/scripts")))
from generate import generate_image, edit_image, batch_generate

# Generate image
result = generate_image(
    prompt="a futuristic city at night",
    output_path="city.png",
    aspect_ratio="16:9",
    image_size="4K",
    model="pro",
)

# Edit existing image
result = edit_image(
    prompt="add flying cars to the sky",
    input_path="city.png",
    output_path="city_edited.png",
)

# Multi-image reference
result = generate_image(
    prompt="combine the color palette of the first with the composition of the second",
    input_paths=["palette_ref.png", "composition_ref.png"],
    output_path="combined.png",
)
```
```python
# Return value (all functions):
{
    "success": True,                # or False
    "path": "/path/to/output.png",  # or None on failure
    "error": None,                  # or error message string
    "metadata": {
        "model": "gemini-3-pro-image-preview",
        "prompt": "...",
        "aspect_ratio": "16:9",
        "image_size": "4K",
        "use_search": False,
        "input_images": None,    # or list of paths
        "text_response": "...",  # optional text from model
        "thinking": "...",       # Pro model reasoning (when available)
        "timestamp": "2025-01-26T...",
    }
}
```
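As a sketch of consuming this structure, a caller can branch on `success` before touching `path`. The `summarize_result` helper below is hypothetical, not part of the scripts:

```python
def summarize_result(result):
    """Return a one-line summary of a generate_image-style result dict."""
    if result["success"]:
        model = result["metadata"]["model"]
        return f"saved {result['path']} ({model})"
    return f"failed: {result['error']}"

example = {
    "success": True,
    "path": "/tmp/out.png",
    "error": None,
    "metadata": {"model": "gemini-3-pro-image-preview"},
}
print(summarize_result(example))  # saved /tmp/out.png (gemini-3-pro-image-preview)
```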
```shell
# In your skill's script:
uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "{prompt}" --model flash --ratio 16:9 -o output.png
```
```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#   "google-genai>=1.0.0",
# ]
# ///
import sys
from pathlib import Path

NANOBANANA_DIR = Path("${CLAUDE_SKILL_DIR}/scripts")
sys.path.insert(0, str(NANOBANANA_DIR))
from generate import generate_image

def generate_thumbnail(prompt: str, output_path: str) -> dict:
    """Generate a YouTube thumbnail with project defaults."""
    return generate_image(
        prompt=prompt,
        output_path=output_path,
        aspect_ratio="16:9",
        image_size="2K",
        model="flash",
        max_retries=3,
    )
```
```python
from batch_generate import batch_generate

def on_progress(completed, total, result):
    print(f"Progress: {completed}/{total}")

results = batch_generate(
    prompt="logo concept",
    count=20,
    output_dir="./logos",
    prefix="logo",
    model="flash",
    aspect_ratio="1:1",
    on_progress=on_progress,
)

successful = [r for r in results if r["success"]]
```
When a downstream skill needs multiple consistently-styled images (e.g., newsletter visuals, thumbnail A/B variants), use the anchor-and-reference pattern:
```python
from generate import generate_image

# Step 1: Generate the style anchor
anchor = generate_image(
    prompt="warm illustration style, earth tones, soft gradients, clean lines",
    output_path="anchor.png",
    model="pro",
)

# Step 2: Generate each image in the series, referencing the anchor
subjects = ["laptop on desk with coffee", "person reading a book", "sunrise over mountains"]
series_paths = [anchor["path"]]
for i, subject in enumerate(subjects):
    result = generate_image(
        prompt=f"{subject}, matching the visual style and color palette of the reference image exactly",
        input_paths=[anchor["path"]],  # always include the anchor
        output_path=f"series_{i+1:02d}.png",
        model="pro",
    )
    if result["success"]:
        series_paths.append(result["path"])
```
The full sequential generation patterns are documented in the Sequential Generation section below.
| Variable | Description | Default |
|---|---|---|
| `GEMINI_API_KEY` | Google Gemini API key | Required |
| `IMAGE_OUTPUT_DIR` | Default output directory | `~/Downloads` |
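For example, in your shell profile (the values below are placeholders):

```shell
export GEMINI_API_KEY="your-key-here"
export IMAGE_OUTPUT_DIR="$HOME/Pictures/generated"  # optional; defaults to ~/Downloads
```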
Create images from text descriptions. Both models excel at:
Transform existing images with natural language:
Provide up to 14 reference images for:
Enable `--search` for factually accurate images involving:
Rate limit errors are automatically retried with exponential backoff (default: 3 retries). No action needed from callers.
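The retry behavior is roughly equivalent to the sketch below. The names and the error type are illustrative; the scripts' actual internals may differ:

```python
import random
import time

def with_retries(call, max_retries=3, base_delay=2.0):
    """Retry `call` with exponential backoff plus jitter on rate-limit errors."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RuntimeError:  # stand-in for the API's rate-limit exception
            if attempt == max_retries:
                raise
            # 2s, 4s, 8s, ... plus a little jitter to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

# Demo: fail twice, then succeed
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("rate limit")
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # ok
```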
All images generated by Gemini contain an invisible SynthID digital watermark. This is automatic, cannot be disabled, and survives common transformations (resize, crop, compression). Be aware of this for any use case requiring watermark-free output.
Use sequential generation to maintain visual consistency across a series of images. The core technique: generate an anchor image first, then pass it as a reference (`-i`) for every subsequent image in the series.
Generate a single anchor image that establishes the visual identity for a series. Reference it for all subsequent images.
When to use: Newsletter visual series, A/B thumbnail variants, brand-consistent image batches.
Workflow:
```shell
# 1. Generate the style anchor
uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "modern flat illustration style, warm earth tones, soft gradients, clean lines, \
  minimal detail, cozy atmosphere" \
  --model pro -o anchor.png

# 2. Generate each series image against the anchor
uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "a laptop on a desk with coffee, matching the visual style, color palette, \
  and lighting of the reference image exactly" \
  -i anchor.png --model pro -o image_01.png
```
Tip: Use Flash to draft the anchor quickly, then regenerate with Pro once you find a style you like.
Keep the same character or subject looking consistent across different scenes and poses.
When to use: Mascot in multiple contexts, product photography series, recurring character.
Workflow:
```shell
# 1. Establish the subject with a canonical front view
uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "a friendly robot mascot with round blue body, orange antenna, large expressive eyes, \
  simple geometric design, standing front-facing on white background" \
  --model pro -o subject_front.png

# 2. Re-pose in a new scene, referencing the front view
uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "the same robot character from the reference image, now sitting at a desk typing, \
  same proportions and colors, office background" \
  -i subject_front.png --model pro -o subject_office.png

# 3. Add further scenes, referencing earlier outputs
uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "the same robot character from the reference images, now outdoors in a park, \
  same proportions and colors, waving at the viewer" \
  -i subject_front.png subject_office.png --model pro -o subject_park.png
```
Build a reference pool over a long series, adding each successful output as a reference for the next.
When to use: Series of 5+ images where consistency must compound across the full set.
Workflow:
After each generation, add the strongest outputs to the `-i` list. Drop weaker outputs.

Why cap at 3-4 references: More references dilute the style signal. The model averages across all inputs — too many and the result loses coherence. Keep only the images that best represent the target style.
Reference ordering matters: Place the style anchor first in the `-i` list. The model weights earlier references slightly more.
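The pool-management rules above (anchor first, cap at 3-4 references, drop the oldest) can be sketched as a small helper. `update_reference_pool` is our name, not part of the scripts:

```python
def update_reference_pool(pool, anchor, new_path, cap=4):
    """Add a successful output to the reference pool, keeping the style
    anchor first and at most `cap` references total (oldest dropped)."""
    others = [p for p in pool if p not in (anchor, new_path)]
    others.append(new_path)
    return [anchor] + others[-(cap - 1):]

# Demo: a 5-image series keeps the anchor plus the 3 newest outputs
pool = ["anchor.png"]
for i in range(1, 6):
    pool = update_reference_pool(pool, "anchor.png", f"series_{i:02d}.png")
print(pool)  # ['anchor.png', 'series_03.png', 'series_04.png', 'series_05.png']
```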
Good prompts include:
See references/prompts.md for detailed prompt templates by category and model-specific tips.
- Use `--model flash` for exploration batches (faster, cheaper)
- Use `--delay 5` or `--parallel` with modest concurrency

"uv: command not found"
- Install uv: `curl -LsSf https://astral.sh/uv/install.sh | sh` or `brew install uv`

"Error: google-genai package not installed"
- Use `uv run` instead of `python3` to auto-install dependencies, or run `pip install -r ${CLAUDE_SKILL_DIR}/requirements.txt`

"GEMINI_API_KEY environment variable not set"
- Set `GEMINI_API_KEY` in your environment before running
"Rate limit exceeded after N retries"
Import errors in `batch_generate.py`
Multi-turn conversational editing — The Gemini API supports stateful chat sessions for iterative image editing (e.g., "make it bluer" → "now add a hat" → "zoom out"). This requires a fundamentally different stateful architecture and is not currently implemented. No downstream skill currently needs this.