Generates AI images from text descriptions using Gemini 2.5 Flash Image API. Matches: "generate an image of", "create a picture of", "make me a photo of", "generate a photo", "AI image of", "create art of", "draw a [concrete noun]", "make an illustration of", "photorealistic image of", "artistic image of", "picture of a", "image of a", "nano banana", "make it darker", "change the background", "add more detail", "edit this image", "try variations", "generate variations". Do NOT use for: diagrams, flowcharts, charts, mind maps, org charts, architecture diagrams, data visualizations (use visualize) — e.g. "draw a flowchart", "chart this data", "org chart", "diagram this process", brainstorm or ideation (use brainstorm) — e.g. "brainstorm ideas", "explore options", prompt optimization or rewriting (use prompt-master) — e.g. "improve my prompt", "optimize this prompt".
From tandem: npx claudepluginhub binatrixai/tandem-marketplace --plugin tandem
Bundled files: evals/evals.json, references/editing-guide.md, references/presets.md, references/prompt-guide.md, template.md
Generate AI images from text descriptions using the Gemini 2.5 Flash Image API
via Cloudflare AI Gateway. Guide users through structured prompt building for
best results. Save PNGs to ~/Tandem/creative/images/ and auto-open.
Output follows ${CLAUDE_SKILL_DIR}/template.md.
For prompt building reference, load ${CLAUDE_SKILL_DIR}/references/prompt-guide.md
once per conversation during guided mode.
For style and social format presets, load ${CLAUDE_SKILL_DIR}/references/presets.md
when user selects a preset option in guided mode.
For editing and variation workflows, load ${CLAUDE_SKILL_DIR}/references/editing-guide.md
when entering editing or variation mode.
See METHODOLOGY.md language mirror rule. Reply in the user's language.
Detect which mode applies based on the user's request: Guided Mode (interactive prompt building, the default), Direct Mode (the "generate:" prefix), Editing Mode (edit the last generated image), or Variation Mode (variations of the last generated image).
Load ${CLAUDE_SKILL_DIR}/references/prompt-guide.md once per conversation.
Brand detection: Read ~/Tandem/memory/brand.md using the Read tool.
If the file exists: extract brand colors, style guidelines, and constraints.
These will be automatically appended to the prompt in Step 3.
If the user says "ignore brand" or "no brand": skip brand constraints for this generation.
If brand.md does not exist: skip silently -- no error, no mention.
Content safety note (show once per conversation): "Works best for: objects, scenes, abstract art, stylized illustrations. Note: photorealistic people and public figures may be restricted -- illustrated versions work great."
Walk through Visual Descriptor fields via AskUserQuestion:
2a. Subject (required):
AskUserQuestion: "What should be in the image? Be as specific as you can."
2b. Style:
AskUserQuestion: "What style should the image have?"
Options: ["Use a style preset", "Let me describe"]
If "Let me describe": ask for freeform style input.
If "Use a style preset": Load ${CLAUDE_SKILL_DIR}/references/presets.md.
Present the 10 style presets with their "Best for" descriptions:
AskUserQuestion: "Pick a style preset:"
Options: ["Photo-Realistic", "Illustration", "Flat Design", "Watercolor", "Oil Painting", "3D Render", "Anime/Manga", "Pixel Art", "Sketch/Pencil", "Pop Art"]
Apply the selected preset's prompt modifiers, quality markers, and negatives in Step 3 assembly.
2c. Mood/Details (optional):
AskUserQuestion: "Any specific mood, lighting, or environment details? (Skip for smart defaults)"
Options: ["Skip -- use smart defaults", "Let me describe"]
If skipped: apply sensible defaults from prompt-guide.md based on the subject and style (e.g., natural lighting for photorealistic, soft edges for watercolor).
2d. Aspect Ratio:
AskUserQuestion: "What shape should the image be?"
Options: ["Use a social format preset", "Square (1:1)", "Landscape (16:9)", "Portrait (9:16)", "Wide banner (21:9)", "Standard photo (3:2)"]
Map each option to its API value: Square (1:1) -> "1:1", Landscape (16:9) -> "16:9", Portrait (9:16) -> "9:16", Wide banner (21:9) -> "21:9", Standard photo (3:2) -> "3:2".
If "Use a social format preset": Load ${CLAUDE_SKILL_DIR}/references/presets.md (if not already loaded).
AskUserQuestion: "Pick a social format:"
Options: ["Instagram Post (1:1)", "Instagram Story (9:16)", "LinkedIn Banner (16:9)", "Twitter/X Header (3:1)", "Facebook Cover (16:9)", "YouTube Thumbnail (16:9)"]
Apply the selected format's aspect ratio and composition guidance to the prompt.
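The label-to-API-value mapping in 2d can be expressed as a small POSIX shell helper. The labels mirror the AskUserQuestion options above; the function name is illustrative.

```shell
# Map a user-facing aspect-ratio option to the API value.
# Unknown input falls back to the square default.
map_aspect_ratio() {
  case "$1" in
    "Square (1:1)")         echo "1:1" ;;
    "Landscape (16:9)")     echo "16:9" ;;
    "Portrait (9:16)")      echo "9:16" ;;
    "Wide banner (21:9)")   echo "21:9" ;;
    "Standard photo (3:2)") echo "3:2" ;;
    *)                      echo "1:1" ;;
  esac
}
```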
Do NOT ask about Quality or Negative -- apply sensible defaults from prompt-guide.md automatically.
Combine the Visual Descriptor fields into a narrative paragraph following the assembly pattern in prompt-guide.md:
Default quality: "high detail, sharp focus, professional quality"
Default negative: "No text, no watermarks, no distorted features, no blurry elements"
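One possible shape of the assembly, as a sketch: prompt-guide.md is authoritative for the real narrative pattern, and the field ordering and `assemble_prompt` name here are assumptions for illustration.

```shell
# Sketch: combine Visual Descriptor fields plus the default quality and
# negative markers into one narrative prompt string.
assemble_prompt() {
  subject="$1"; style="$2"; mood="$3"
  quality="high detail, sharp focus, professional quality"
  negative="No text, no watermarks, no distorted features, no blurry elements"
  printf '%s, %s. %s. %s. %s\n' "$subject" "$style" "$mood" "$quality" "$negative"
}
```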
Show the assembled prompt to the user for confirmation:
AskUserQuestion: "Here's the assembled prompt. Ready to generate?"
Options: ["Generate this", "Edit prompt first", "Start over"]
If "Edit prompt first": let user modify the text, then re-confirm. If "Start over": return to Step 2.
If this is the FIRST generation in the conversation AND the user did NOT use direct mode ("generate:" prefix), offer prompt optimization:
AskUserQuestion: "Would you like to optimize this prompt with Prompt Master first?"
Options: ["Yes, optimize it", "No, generate as-is"]
If "Yes": Hand off the assembled prompt to prompt-master with context: "The user wants to optimize this image generation prompt for Nano Banana. After optimization, offer to generate the image." Mark as shown -- do NOT repeat on subsequent generations in this session.
If "No" or direct mode: proceed to Step 4.
CRITICAL: No base64 data may ever appear in conversation context. When image generation is available, the entire pipeline (API call + decode + file save) must happen in ONE Bash invocation.
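When generation is available, the single-invocation rule can be honored with a decode-and-save helper like the sketch below, so base64 never leaves the shell. The response field names (`candidates`/`content`/`parts`/`inlineData`) are assumptions modeled on the public Gemini generateContent REST API; the gateway URL in the comment is a placeholder, not a real endpoint.

```shell
# Extract the first inline image from a Gemini-style generateContent JSON
# response on stdin, decode it, and write it straight to the file in $1.
# Field names are assumptions based on the public Gemini API.
save_image_from_response() {
  jq -r '.candidates[0].content.parts[] | select(.inlineData) | .inlineData.data' \
    | base64 -d > "$1"
}

# In real use the whole pipeline runs in ONE Bash invocation, e.g. (untested;
# $GATEWAY_URL and the payload shape are placeholders):
#   curl -sS "$GATEWAY_URL/models/gemini-2.5-flash-image:generateContent" \
#     -H "x-goog-api-key: $GEMINI_API_KEY" -H "Content-Type: application/json" \
#     -d "$PAYLOAD" | save_image_from_response "$OUT" && open "$OUT"
```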
NOTE: Image generation via direct API calls is not available in this environment. The Cowork sandbox blocks outbound HTTP requests (see METHODOLOGY.md -- No Direct API Calls rule).
Pending API Gateway implementation (see docs/architecture/api-gateway-design.md). Image generation through the gateway is planned for v11.0.
Until then, inform the user: "Image generation is not yet available -- it will be enabled in a future update via the API Gateway (v11.0). I can help you craft the perfect prompt now so it's ready."
If the user wants to continue with prompt crafting, proceed to Step 6 (present the assembled prompt details without an image file). If the user wants to stop, acknowledge and end gracefully.
Proactive guidance (once per conversation, during Step 2): covered by the content safety note in the Step 2 intro -- do not repeat it here.
On SAFETY block from API:
Suggest alternatives based on what was requested (e.g., an illustrated or stylized version of the same subject):
Offer to regenerate with the modified prompt:
AskUserQuestion: "Would you like to try with a modified prompt?"
Options: ["Yes, try stylized version", "Try something different", "Done"]
Use the template from ${CLAUDE_SKILL_DIR}/template.md:
## Image Generated
**Prompt:** [assembled prompt]
**Style:** [style used]
**Aspect Ratio:** [ratio]
**File:** `[output file path]`
The image has been saved and opened for viewing.
Track the output file path as last_generated_image for editing mode.
Offer follow-up:
AskUserQuestion: "What would you like to do next?"
Options: ["Generate another", "Edit this image", "Try variations of this", "Different style of same subject", "Done"]
If "Edit this image": go to Editing Mode. If "Try variations of this": go to Variation Mode. If "Different style of same subject": return to Step 2b (style selection) with the same subject, skip 2a.
Log to stats.json:
Read ~/Tandem/stats.json. If it does not exist, create it as []. Append:
{
"type": "image-generation",
"action": "created",
"count": 1,
"timeSavedMinutes": 5,
"description": "Image: [slug]",
"timestamp": "<current ISO 8601 UTC>"
}
Write the updated array back to ~/Tandem/stats.json.
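The read-append-write cycle can be sketched with jq in a single Bash invocation; the slug value here is illustrative.

```shell
# Sketch: append one stats entry to ~/Tandem/stats.json, creating it as []
# if it does not exist. The description slug is a made-up example.
STATS="$HOME/Tandem/stats.json"
mkdir -p "$HOME/Tandem"
[ -f "$STATS" ] || echo '[]' > "$STATS"
jq --arg desc "Image: red-fox-watercolor" \
   --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
   '. + [{type:"image-generation",action:"created",count:1,timeSavedMinutes:5,description:$desc,timestamp:$ts}]' \
   "$STATS" > "$STATS.tmp" && mv "$STATS.tmp" "$STATS"
```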
Run /sync:
Follow the /sync workflow from tandem-skills/core/sync/SKILL.md.
If stats.json write or /sync fails: continue -- the image file is the primary deliverable.
Load ${CLAUDE_SKILL_DIR}/references/editing-guide.md for the complete editing workflow.
Overview:
Requires a previous image (the last_generated_image state).
Apply the requested edit, save the result, and update last_generated_image to the new file.
AskUserQuestion: "What would you like to do next?"
Options: ["Edit again", "Try variations", "Generate something new", "Done"]
CRITICAL: All base64 handling (read source + encode + API call + decode response) happens in ONE Bash invocation. No base64 data in conversation context.
If no previous image exists, tell user: "No recent image to edit. Let's generate one first." and redirect to Step 1.
Load ${CLAUDE_SKILL_DIR}/references/editing-guide.md for the complete variation workflow.
Overview:
AskUserQuestion: "How many variations?"
Options: ["2", "3", "4"]
AskUserQuestion: "Which variation do you prefer?"
Options: [list each filename + "None -- try again"]
Update last_generated_image to the selected variation.
AskUserQuestion: "What would you like to do next?"
Options: ["Edit the selected image", "Generate something new", "Done"]
429 (rate limit / free tier): "Image generation requires a paid Google AI Studio API key. Get one at https://aistudio.google.com/apikey (paid plan required)."
SAFETY finish reason: "This image was blocked by content safety filters." Then suggest an illustrated or stylized alternative based on the subject.
Network error / non-200 response: "Could not reach the API. Check your internet connection and try again."
No inline_data in response: "The API returned text but no image. Try a more specific or different prompt."
API key file missing: API key setup is handled by the API Gateway (v11.0). No local key file needed.
Invalid JSON response: "Received an unexpected response from the API. Try again in a moment."
Source image missing for edit:
If last_generated_image file no longer exists, ask user to generate a new image.
Variation generation partial failure: If some variations succeed and some fail, present the successful ones and note the failures. Offer to retry only the failed variations.
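The succeed-some/retry-failures bookkeeping can be sketched as a small loop; `generate_one` is a hypothetical stand-in for the real single-image pipeline, not an existing helper.

```shell
# Sketch: attempt N variations, keep successes, and report which indices
# failed so only those can be offered for retry.
run_variations() {
  n="$1"; failed=""
  i=1
  while [ "$i" -le "$n" ]; do
    if generate_one "$i"; then :; else failed="$failed $i"; fi
    i=$((i + 1))
  done
  echo "$failed"   # indices to retry; empty means all succeeded
}
```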