```shell
npx claudepluginhub kylesnowschwartz/simpleclaude --plugin sc-skills
```

This skill uses the workspace's default tool permissions.
Generate and edit images using Google's Gemini API. The SDK reads `GOOGLE_API_KEY` by default (`GEMINI_API_KEY` as fallback). Or pass a key explicitly to `genai.Client(api_key=...)`.
| Model | Codename | Best For |
|---|---|---|
| `gemini-2.5-flash-image` | Nano Banana | Most use cases, fast, good quality (default) |
| `gemini-3-pro-image-preview` | Nano Banana Pro | High-res (2K/4K), Google Search grounding, precise text |
| `gemini-3.1-flash-image-preview` | Nano Banana 2 | High volume, extended aspect ratios, 512 size |
Start with gemini-2.5-flash-image. Upgrade to Pro for high-res output or search grounding.
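That decision rule can be sketched as a tiny helper (the function name and flags are illustrative, not part of the SDK):

```python
def pick_model(need_high_res: bool = False,
               need_search_grounding: bool = False,
               high_volume: bool = False) -> str:
    """Map the table above to a default model choice."""
    if need_high_res or need_search_grounding:
        return "gemini-3-pro-image-preview"       # Nano Banana Pro
    if high_volume:
        return "gemini-3.1-flash-image-preview"   # Nano Banana 2
    return "gemini-2.5-flash-image"               # Nano Banana (default)
```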
Aspect ratios:
- All models: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
- 3.1 Flash only: 1:4, 4:1, 1:8, 8:1

Image sizes:
- All models: 1K (default), 2K, 4K
- 3.1 Flash only: 512
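These constraints can be captured in a small lookup for pre-flight validation (helper and constant names are illustrative; the SDK does not expose this):

```python
COMMON_RATIOS = {"1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"}
EXTENDED_RATIOS = {"1:4", "4:1", "1:8", "8:1"}  # 3.1 Flash only
COMMON_SIZES = {"1K", "2K", "4K"}

def supports(model: str, aspect_ratio: str, image_size: str = "1K") -> bool:
    """Check a model/aspect-ratio/size combination against the lists above."""
    is_31_flash = model == "gemini-3.1-flash-image-preview"
    ratios = COMMON_RATIOS | (EXTENDED_RATIOS if is_31_flash else set())
    sizes = COMMON_SIZES | ({"512"} if is_31_flash else set())
    return aspect_ratio in ratios and image_size in sizes
```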
```python
from google import genai
from google.genai import types

client = genai.Client()  # Reads GOOGLE_API_KEY (or GEMINI_API_KEY fallback)

response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents="Your prompt here",
)

for part in response.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        image = part.as_image()
        image.save("output.jpg")  # save() takes path only, writes raw bytes
```
Note: `response_modalities` is optional. Omit it to let the model decide. Set `['IMAGE']` for image-only output, or `['TEXT', 'IMAGE']` for interleaved text and images.
```python
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=prompt,
    config=types.GenerateContentConfig(
        image_config=types.ImageConfig(
            aspect_ratio="16:9",
            image_size="2K",
        ),
    ),
)
```
Chat mode is recommended for editing. The SDK handles thought signatures automatically across turns.
```python
from google import genai
from PIL import Image

client = genai.Client()
image = Image.open("input.png")
chat = client.chats.create(model="gemini-2.5-flash-image")

# First edit
response = chat.send_message(["Add a sunset to this scene", image])
for i, part in enumerate(response.candidates[0].content.parts):
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        image = part.as_image()
        image.save(f"edited_{i}.jpg")

# Continue refining
response = chat.send_message("Make the colors warmer")
```
PIL `Image` objects, base64 bytes, and file URIs (via `client.files.upload()`) all work as image inputs.
Generate images informed by real-time data. Requires Pro model.
```python
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Visualize today's weather in Tokyo as an infographic",
    config=types.GenerateContentConfig(
        image_config=types.ImageConfig(
            aspect_ratio="16:9",
            image_size="1K",
        ),
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
```
Image search grounding (searching for reference images) is only available on `gemini-3.1-flash-image-preview`.
Combine elements from multiple sources. Pass PIL Image objects directly in the contents list.
```python
from PIL import Image

response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents=[
        "Create a group photo of these people in an office",
        Image.open("person1.png"),
        Image.open("person2.png"),
        Image.open("person3.png"),
    ],
)
```
Input image limits differ by model.
- Include camera details: lens type, lighting, angle, mood.
  "A photorealistic close-up portrait, 85mm lens, soft golden hour light, shallow depth of field"
- Specify style explicitly:
  "A kawaii-style sticker of a happy red panda, bold outlines, cel-shading, white background"
- Be explicit about font style and placement:
  "Create a logo with text 'Daily Grind' in clean sans-serif, black and white, coffee bean motif"
- Describe lighting setup and surface:
  "Studio-lit product photo on polished concrete, three-point softbox setup, 45-degree angle"
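These tips compose naturally: a subject followed by comma-separated detail clauses. A throwaway helper for assembling such prompts (purely illustrative, not part of the SDK):

```python
def build_prompt(subject: str, *details: str) -> str:
    """Join a subject with style/camera/lighting details, as in the examples above."""
    return ", ".join((subject, *details))

prompt = build_prompt(
    "A photorealistic close-up portrait",
    "85mm lens",
    "soft golden hour light",
    "shallow depth of field",
)
```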
The API returns JPEG in practice. `image.save(path)` writes raw bytes from the API response. It takes only a path string (no `format` kwarg).
```python
# Save as-is (JPEG bytes from the API)
image.save("output.jpg")
```
To convert formats, use PIL on the raw bytes:
```python
from PIL import Image
import io

for part in response.parts:
    if part.inline_data is not None:
        pil_img = Image.open(io.BytesIO(part.inline_data.data))
        pil_img.save("output.png")  # PIL handles the conversion
```
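Because `save()` writes the response bytes unchanged, it can be useful to sniff the actual format from the magic bytes before choosing a file extension. A stdlib-only sketch (the helper name is illustrative):

```python
def sniff_image_format(data: bytes) -> str:
    """Identify common image formats by their magic bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return "unknown"
```

Call it on `part.inline_data.data` before deciding whether the `.jpg` extension is accurate.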
- `save(path)` writes raw bytes; no `format` kwarg exists. Use PIL for format conversion.
- `response_modalities` is optional; omit to let the model decide output format.
- `image_config` is the only modality config.
- A `person_generation` parameter exists on `ImageConfig` for controlling person depiction in outputs.