From oh-my-daily-skills
Builds Next.js App Router image-generation apps using Gemini Nano Banana/Pro models with AI SDK. Covers Server Actions/API routes, multi-turn editing, storage, rate limiting, safety, and cost controls.
Build production-ready web applications powered by Google's Nano Banana image generation APIs—creating everything from simple text-to-image generators to sophisticated iterative editors with multi-turn conversation.
Use ONLY these exact model strings. Do not invent, guess, or add date suffixes.
| Model String (use exactly) | Alias | Use Case |
|---|---|---|
| gemini-2.5-flash-image | Nano Banana | Fast iterations, drafts, high volume |
| gemini-3-pro-image-preview | Nano Banana Pro | Quality output, text rendering, 2K |
Common mistakes to avoid:
- gemini-2.5-flash-preview-05-20 — wrong; date suffixes are for text models
- gemini-2.5-pro-image — wrong; 2.5 Pro doesn't do image generation
- gemini-3-flash-image — wrong; doesn't exist
- gemini-pro-vision — wrong; that's for image input, not generation

The only valid image generation models are gemini-2.5-flash-image and gemini-3-pro-image-preview.
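The whitelist above can be enforced at runtime so a typo or invented model string fails fast instead of surfacing as a confusing API error. A minimal sketch (the guard function `assertImageModel` is illustrative, not part of the AI SDK):

```typescript
// The only two valid Nano Banana image-generation model strings.
const IMAGE_MODELS = ["gemini-2.5-flash-image", "gemini-3-pro-image-preview"] as const;
type ImageModel = (typeof IMAGE_MODELS)[number];

// Throws early if the model string is not in the whitelist.
function assertImageModel(model: string): ImageModel {
  if (!(IMAGE_MODELS as readonly string[]).includes(model)) {
    throw new Error(
      `Invalid image model "${model}". Use one of: ${IMAGE_MODELS.join(", ")}`
    );
  }
  return model as ImageModel;
}
```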
Examples were tested against the versions below; verify the latest AI SDK and Google provider docs before upgrading:
| Package | Minimum Version | Recommended |
|---|---|---|
| ai | 3.4.0+ | ^4.0.0 |
| @ai-sdk/google | 0.0.52+ | ^1.0.0 |
| @ai-sdk/react | 0.0.62+ | ^1.0.0 |
| next | 14.0.0+ | ^15.0.0 |
| react | 18.2.0+ | ^19.0.0 |
Important notes:

- Server Actions require the 'use server' directive

# Check your versions
npm list ai @ai-sdk/google @ai-sdk/react next

# Update to latest
npm update ai @ai-sdk/google @ai-sdk/react
Breaking changes to watch:

- result.files[0] structure may change between major versions
- providerOptions.google namespace for Gemini-specific configs
- useChat hook API from @ai-sdk/react

Nano Banana isn't just another image API—it's conversational by design. The core insight is that image generation works best as a dialogue, not a one-shot prompt.
Think of it as working with an AI art director:
- gemini-2.5-flash-image for speed/iterations, gemini-3-pro-image-preview for quality/complexity

Choose based on use case:
| Use Case | Model | Why |
|---|---|---|
| Rapid iterations, drafts | gemini-2.5-flash-image | Fast (2-5s), lower cost per image |
| Final output, quality | gemini-3-pro-image-preview | Superior quality, thinking, text rendering |
| Text-heavy images | gemini-3-pro-image-preview | Best typography, 2K resolution |
| Multi-turn editing | Either | Both support conversational editing |
| High volume | gemini-2.5-flash-image | Lower cost, faster throughput |
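The selection table above can be encoded as a small helper so call sites never hardcode model strings. A minimal sketch, where `pickModel` and its use-case keys are this sketch's names, not SDK API:

```typescript
type UseCase = "draft" | "final" | "text-heavy" | "high-volume";

// Maps the use cases from the table above to the matching model string.
function pickModel(useCase: UseCase): string {
  switch (useCase) {
    case "final":
    case "text-heavy":
      return "gemini-3-pro-image-preview"; // quality, typography, 2K
    default:
      return "gemini-2.5-flash-image"; // fast, lower cost per image
  }
}
```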
// app/actions/generate.ts
"use server";
import { google } from "@ai-sdk/google";
import { generateText } from "ai";
export async function generateImage(prompt: string) {
const result = await generateText({
model: google("gemini-2.5-flash-image"),
prompt,
providerOptions: {
google: {
responseModalities: ["IMAGE"],
imageConfig: { aspectRatio: "16:9" },
},
},
});
return result.files[0]; // { base64, uint8Array, mediaType }
}
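The `base64`/`mediaType` pair returned above maps directly onto a data URL, which a client component can drop into an `img` tag before (or instead of) uploading the image anywhere. A minimal sketch; `toDataUrl` is this sketch's helper, not an SDK export:

```typescript
// Builds a data URL from the base64 payload returned by the server action,
// e.g. <img src={toDataUrl(file.base64, file.mediaType)} />.
function toDataUrl(base64: string, mediaType: string): string {
  return `data:${mediaType};base64,${base64}`;
}
```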
// app/components/ImageGenerator.tsx
'use client'
import { useChat } from '@ai-sdk/react'
export function ImageGenerator() {
const { append, messages, isLoading } = useChat({
api: '/api/generate'
})
return (
<div>
{messages.map(m => (
<div key={m.id}>
{m.parts?.map((part, i) =>
part.type === 'image' && (
<img key={i} src={part.url} alt="Generated" />
)
)}
</div>
))}
<button
disabled={isLoading}
onClick={() => append({
role: 'user',
content: 'A futuristic cityscape at dusk'
})}
>
Generate
</button>
</div>
)
}
For prompt structure, quality boosters, enhancer utility, negative prompts, and use-case templates, see references/prompt-engineering.md.
For complete implementations, see references/advanced-patterns.md.
For Gemini safety settings, pre-generation prompt filtering, safety block handling, and production best practices, see references/safety-settings.md.
For detailed configuration and operational concerns, see references/configuration.md.
❌ Inventing model names or adding date suffixes:
Why wrong: Image generation models have specific names; date suffixes like -preview-05-20 are for text models only
Better: Use exactly gemini-2.5-flash-image or gemini-3-pro-image-preview — no variations
❌ Using Gemini 2.5 Pro for images:
Why wrong: Gemini 2.5 Pro doesn't generate images directly
Better: Use gemini-2.5-flash-image or gemini-3-pro-image-preview
❌ Storing only base64 in database:
Why wrong: Bloats the database, expensive storage, slow retrieval
Better: Store in object storage (Vercel Blob/S3), save URL only

❌ No rate limit handling:
Why wrong: Will hit 429 errors in production, poor UX
Better: Implement rate limiting with user-friendly error messages

❌ Ignoring multi-turn context:
Why wrong: Wastes Nano Banana's conversational editing strength
Better: Track chat history for iterative refinement

❌ Hardcoding API keys client-side:
Why wrong: Exposes credentials, security risk
Better: Use Server Actions / API routes with environment variables

❌ Using wrong aspect ratio:
Why wrong: 21:9 on a 1:1 request wastes tokens, unexpected crop
Better: Match aspect ratio to intended use case

❌ No loading states:
Why wrong: Image generation takes 5-30s, users think it's broken
Better: Show progress indicators and estimated wait time

❌ Generating on every keystroke:
Why wrong: Wastes quota, slow response
Better: Debounce prompts, require explicit action
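The rate-limit point above can be prototyped with an in-memory fixed-window limiter; in production you'd back this with Redis/Upstash (as in the configuration below), since in-process state doesn't survive restarts or scale across instances. A minimal sketch with illustrative names:

```typescript
// Minimal fixed-window limiter: allow `limit` calls per `windowMs` per key.
// The `now` parameter is injectable to make the logic testable.
function createRateLimiter(limit: number, windowMs: number) {
  const hits = new Map<string, { count: number; windowStart: number }>();
  return (key: string, now: number = Date.now()): boolean => {
    const entry = hits.get(key);
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(key, { count: 1, windowStart: now });
      return true; // new window: allowed
    }
    if (entry.count < limit) {
      entry.count++;
      return true; // under the limit: allowed
    }
    return false; // limit reached in this window: reject with a friendly 429
  };
}
```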
IMPORTANT: Every app should feel uniquely designed for its specific purpose.
Vary across dimensions:
Avoid overused patterns:
Context should drive design:
# .env.local
GEMINI_API_KEY=your_api_key_here
# For Vercel Blob storage
BLOB_READ_WRITE_TOKEN=your_vercel_token
# For S3 (optional)
S3_BUCKET=your-bucket
S3_ENDPOINT=https://your-endpoint.r2.cloudflarestorage.com
S3_ACCESS_KEY_ID=your_key
S3_SECRET_ACCESS_KEY=your_secret
# For Upstash rate limiting (optional)
UPSTASH_REDIS_REST_URL=your_url
UPSTASH_REDIS_REST_TOKEN=your_token
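Since the API key is only ever read server-side, it's worth failing fast at startup if a required variable is missing rather than debugging a 401 later. A minimal sketch; `requireEnv` is this sketch's helper:

```typescript
// Validates that the required environment variables are present,
// returning them as plain strings; throws a clear message otherwise.
function requireEnv(
  env: Record<string, string | undefined>,
  keys: string[]
): Record<string, string> {
  const missing = keys.filter((k) => !env[k]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return Object.fromEntries(keys.map((k) => [k, env[k] as string]));
}
```

Call it once at module load, e.g. `requireEnv(process.env, ["GEMINI_API_KEY"])`.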
# Install dependencies
npm install @ai-sdk/google ai @ai-sdk/react @vercel/blob
# Or, to call the Gemini API directly without the AI SDK
npm install @google/genai
Nano Banana enables conversational image generation that feels like working with a creative partner, not a tool.
The best apps:
You're building more than an image generator—you're creating a creative experience. Design it thoughtfully.