From more-ai
Generates images from text, edits existing images, applies style transfers, composes from multiple references, and supports multi-turn refinement using Google's Gemini API via Python scripts. Suited to logos, stickers, and mockups.
Slash command
/more-ai:gemini-imagegen
Generate and edit images using Google's Gemini API.
Check if GEMINI_API_KEY is set in the environment. If not present, ask the user to configure it:
GEMINI_API_KEY=your_key_here in the Environment variables dialog
Use the venv Python for all script invocations. The scripts require Python 3.10+ with google-genai and Pillow:
SCRIPTS_DIR="${CLAUDE_PLUGIN_ROOT}/skills/gemini-imagegen/scripts"
PYTHON="${SCRIPTS_DIR}/.venv/bin/python3"
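The two prerequisites above (API key present, venv interpreter present) could be checked up front with something like the following. This is a minimal sketch, not part of the skill's scripts: the function name `preflight` and the wording of its messages are hypothetical, and the `.venv/bin/python3` layout is taken from the `PYTHON` variable above.

```python
import os
from pathlib import Path


def preflight(env, scripts_dir):
    """Return a list of setup problems; an empty list means ready to run.

    Hypothetical helper for illustration, not shipped with the skill.
    """
    problems = []
    # An empty value is treated the same as an unset variable.
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    # Same interpreter path as the PYTHON variable defined above.
    venv_python = Path(scripts_dir) / ".venv" / "bin" / "python3"
    if not venv_python.exists():
        problems.append(f"venv interpreter not found at {venv_python}")
    return problems


if __name__ == "__main__":
    # "." is only a fallback for this sketch when CLAUDE_PLUGIN_ROOT is unset.
    root = os.environ.get("CLAUDE_PLUGIN_ROOT", ".")
    scripts = Path(root) / "skills" / "gemini-imagegen" / "scripts"
    for problem in preflight(os.environ, scripts):
        print(problem)
```

Passing the environment as an argument keeps the check easy to exercise without touching the real `os.environ`.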
If the venv doesn't exist yet, create it:
python3.12 -m venv "${SCRIPTS_DIR}/.venv"
"${SCRIPTS_DIR}/.venv/bin/pip" install google-genai Pillow
Use these scripts for all image generation tasks. Always invoke with $PYTHON (the venv python from above):
Use when: User wants to create a new image from a text description
$PYTHON "${SCRIPTS_DIR}/generate_image.py" "prompt" output.png [--model MODEL] [--aspect RATIO] [--size SIZE]
Use when: User wants to edit or transform an existing image
$PYTHON "${SCRIPTS_DIR}/edit_image.py" input.png "instruction" output.png [--model MODEL] [--aspect RATIO] [--size SIZE]
Use when: User wants to merge elements from multiple images (up to 14)
$PYTHON "${SCRIPTS_DIR}/compose_images.py" "instruction" output.png image1.png [image2.png ...] [--model MODEL] [--aspect RATIO] [--size SIZE]
Use when: User wants to refine an image through multiple rounds of feedback
$PYTHON "${SCRIPTS_DIR}/multi_turn_chat.py" [--model MODEL] [--output-dir DIR]
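When the one-shot scripts above are driven from Python rather than a shell, the argument lists could be assembled roughly as follows. This is a sketch under assumptions: `build_command` and `compose_command` are hypothetical helper names, and only the positional order and the `--model`, `--aspect`, and `--size` flags shown in the usage lines are relied on. `multi_turn_chat.py` takes different flags and is not covered.

```python
from pathlib import Path

MAX_COMPOSE_IMAGES = 14  # limit stated for compose_images.py above


def build_command(python, scripts_dir, script, positional,
                  model=None, aspect=None, size=None):
    """Return an argv list for generate_image.py, edit_image.py or compose_images.py."""
    argv = [str(python), str(Path(scripts_dir) / script), *map(str, positional)]
    # Flags are appended only when a value is given, so the script defaults apply otherwise.
    for flag, value in (("--model", model), ("--aspect", aspect), ("--size", size)):
        if value is not None:
            argv += [flag, str(value)]
    return argv


def compose_command(python, scripts_dir, instruction, output, images, **options):
    """Build the compose_images.py call, rejecting more than 14 input images."""
    if not 1 <= len(images) <= MAX_COMPOSE_IMAGES:
        raise ValueError(f"expected 1 to {MAX_COMPOSE_IMAGES} images, got {len(images)}")
    return build_command(python, scripts_dir, "compose_images.py",
                         [instruction, output, *images], **options)
```

The resulting list can be handed to `subprocess.run(argv, check=True)`; passing a list rather than a shell string avoids quoting problems with prompts that contain spaces or quotes.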
| Option | Values |
|---|---|
| --model | gemini-3.1-flash-image-preview (default, fastest, best instruction following), gemini-3-pro-image-preview (fallback / best quality, text rendering, 4K), gemini-2.5-flash-image (stable/GA, legacy) |
| --aspect | 1:1, 1:4, 1:8, 2:3, 3:2, 3:4, 4:1, 4:3, 4:5, 5:4, 8:1, 9:16, 16:9, 21:9 |
| --size | 512 (3.1 flash only), 1K, 2K, 4K (pro only) |

Model selection: Try gemini-3.1-flash-image-preview first (fastest, great quality). If it fails or the result is unsatisfactory (e.g., poor text rendering, complex composition), fall back to gemini-3-pro-image-preview, which produces studio-quality output at the cost of speed.
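The option values and the flash-first, pro-as-fallback guidance above can be encoded as a small check before a script is invoked. The sketch below only restates what the table says; the helper names (`size_allowed`, `check_options`, `models_to_try`) are hypothetical, and whether the scripts themselves reject invalid combinations is not known from this page.

```python
FLASH = "gemini-3.1-flash-image-preview"
PRO = "gemini-3-pro-image-preview"
LEGACY = "gemini-2.5-flash-image"

ASPECTS = {"1:1", "1:4", "1:8", "2:3", "3:2", "3:4", "4:1", "4:3",
           "4:5", "5:4", "8:1", "9:16", "16:9", "21:9"}
SIZES = {"512", "1K", "2K", "4K"}


def size_allowed(model, size):
    """Apply the two restrictions from the table: 512 is flash-only, 4K is pro-only."""
    if size == "512":
        return model == FLASH
    if size == "4K":
        return model == PRO
    return True


def check_options(model=FLASH, aspect=None, size=None):
    """Raise ValueError if the combination contradicts the option table."""
    if model not in (FLASH, PRO, LEGACY):
        raise ValueError(f"unknown model: {model}")
    if aspect is not None and aspect not in ASPECTS:
        raise ValueError(f"unsupported aspect ratio: {aspect}")
    if size is not None and size not in SIZES:
        raise ValueError(f"unsupported size: {size}")
    if not size_allowed(model, size):
        raise ValueError(f"size {size} is not available for {model}")


def models_to_try(size=None):
    """Models in the order to attempt them: flash first, then pro as fallback."""
    return [m for m in (FLASH, PRO) if size_allowed(m, size)]
```

For example, `models_to_try("4K")` skips the flash model entirely, since the table lists 4K as pro only.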
Use when: User wants to remove the background from an image (uses macOS Shortcuts)
$PYTHON "${SCRIPTS_DIR}/remove_bg.py" input.jpg [output.png]
If no output path is given, saves to input_nobg.png in the same folder. Requires a one-time setup of a macOS Shortcut named "Remove Background" — the script will print setup instructions if the shortcut is missing.
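The default output naming described above might be computed as follows. This is a sketch that assumes "input" in input_nobg.png stands for the input file's name without its extension; `default_nobg_path` is a hypothetical name, and remove_bg.py may derive the path differently.

```python
from pathlib import Path


def default_nobg_path(input_path):
    """Return <stem>_nobg.png next to the input file (assumed naming rule)."""
    src = Path(input_path)
    # with_name keeps the parent folder and swaps only the file name.
    return src.with_name(f"{src.stem}_nobg.png")
```

Under that assumption, `photos/cat.jpg` maps to `photos/cat_nobg.png`.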
Write prompts the way an experienced Midjourney user would: detailed and specific.
"A photorealistic close-up portrait, 85mm lens, soft golden hour light, shallow depth of field"
"A kawaii-style sticker of a happy red panda, bold outlines, cel-shading, white background"
"Logo with text 'Daily Grind' in clean sans-serif, black and white, coffee bean motif"
"Studio-lit product photo on polished concrete, three-point softbox setup, 45-degree angle"
npx claudepluginhub nityeshaga/claude-home-base --plugin more-ai