Generate, edit, and compose images using Google's Gemini AI API for design workflows and visual content creation
/plugin marketplace add hirefrank/hirefrank-marketplace
/plugin install edge-stack@hirefrank-marketplace
This skill inherits all available tools. When active, it can use any tool Claude has access to.
README.md
package.json
scripts/compose-images.ts
scripts/edit-image.ts
scripts/generate-image.ts
tsconfig.json
This skill provides image generation and manipulation capabilities using Google's Gemini AI API. It's designed for local development workflows where you need to create or modify images using AI assistance.
This skill requires a Gemini API key:
export GEMINI_API_KEY="your-api-key-here"
Get your API key from: https://makersuite.google.com/app/apikey
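The scripts need the key at startup, so it helps to fail fast with a clear message when it is missing. The following is a minimal sketch of such a check; `requireApiKey` is a hypothetical helper name and is not taken from the skill's source.

```typescript
// Hypothetical helper (not part of the skill's scripts): fail fast when
// GEMINI_API_KEY is missing or blank instead of failing later in an API call.
type Env = Record<string, string | undefined>;

function requireApiKey(env: Env): string {
  const key = env["GEMINI_API_KEY"]?.trim();
  if (!key) {
    throw new Error(
      'GEMINI_API_KEY is not set. Run: export GEMINI_API_KEY="your-api-key-here"'
    );
  }
  return key;
}

// Typical call site in a Node script:
//   const apiKey = requireApiKey(process.env);
```

Passing the environment in as a parameter keeps the check easy to test without touching the real process environment.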
scripts/generate-image.ts
Create new images from text descriptions.
Usage:
npx tsx scripts/generate-image.ts <prompt> <output-path> [options]
Arguments:
prompt: Text description of the image to generate
output-path: Where to save the generated image (e.g., ./output.png)
Options:
--width <number>: Image width in pixels (default: 1024)
--height <number>: Image height in pixels (default: 1024)
--model <string>: Gemini model to use (default: 'gemini-2.0-flash-exp')
Examples:
# Basic usage
GEMINI_API_KEY=xxx npx tsx scripts/generate-image.ts "a sunset over mountains" output.png
# Custom size
npx tsx scripts/generate-image.ts "modern office workspace" office.png --width 1920 --height 1080
# Using npm script
npm run generate "futuristic city skyline" city.png
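The argument layout described above (two positional arguments followed by optional flags) can be sketched as a small parser. This is an illustration only: `parseGenerateArgs` and `GenerateArgs` are hypothetical names, and the real script may parse its arguments differently. The defaults mirror the documented ones.

```typescript
// Hypothetical parser for: <prompt> <output-path> [--width n] [--height n] [--model s]
// Not taken from scripts/generate-image.ts; defaults follow the documented values.
interface GenerateArgs {
  prompt: string;
  outputPath: string;
  width: number;
  height: number;
  model: string;
}

function parseGenerateArgs(argv: string[]): GenerateArgs {
  const positional: string[] = [];
  const options = new Map<string, string>();
  let pendingFlag: string | null = null;

  for (const arg of argv) {
    if (pendingFlag !== null) {
      options.set(pendingFlag, arg); // this argument is the flag's value
      pendingFlag = null;
    } else if (arg.startsWith("--")) {
      pendingFlag = arg.slice(2);
    } else {
      positional.push(arg);
    }
  }
  if (pendingFlag !== null) {
    throw new Error(`Missing value for --${pendingFlag}`);
  }

  const [prompt, outputPath] = positional;
  if (positional.length !== 2 || prompt === undefined || outputPath === undefined) {
    throw new Error("Usage: generate-image.ts <prompt> <output-path> [options]");
  }

  // Read a pixel dimension, falling back to the documented default.
  const toPixels = (name: string, fallback: number): number => {
    const raw = options.get(name);
    if (raw === undefined) return fallback;
    const n = Number(raw);
    if (!Number.isInteger(n) || n <= 0) {
      throw new Error(`--${name} must be a positive integer, got "${raw}"`);
    }
    return n;
  };

  return {
    prompt,
    outputPath,
    width: toPixels("width", 1024),
    height: toPixels("height", 1024),
    model: options.get("model") ?? "gemini-2.0-flash-exp",
  };
}
```

In a Node script this would typically be called as `parseGenerateArgs(process.argv.slice(2))`.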
scripts/edit-image.ts
Modify existing images based on text instructions.
Usage:
npx tsx scripts/edit-image.ts <source-image> <prompt> <output-path> [options]
Arguments:
source-image: Path to the image to edit
prompt: Text description of the desired changes
output-path: Where to save the edited image
Options:
--model <string>: Gemini model to use (default: 'gemini-2.0-flash-exp')
Examples:
# Basic editing
GEMINI_API_KEY=xxx npx tsx scripts/edit-image.ts photo.jpg "add a blue sky" edited.jpg
# Style transfer
npx tsx scripts/edit-image.ts portrait.png "make it look like a watercolor painting" artistic.png
# Using npm script
npm run edit photo.jpg "remove background" no-bg.png
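Editing requires sending the source image to the model along with the prompt. With the @google/generative-ai SDK, an image is usually passed as an inline part of the form `{ inlineData: { data, mimeType } }`, where `data` is base64-encoded. The sketch below shows one way to build that part. The helper names and the list of supported extensions are assumptions, not taken from the skill's source.

```typescript
// Sketch only: helper names and the extension table are hypothetical.
// The returned shape follows the inline image part accepted by @google/generative-ai.
interface InlineImagePart {
  inlineData: { data: string; mimeType: string };
}

// Assumed set of supported extensions; the real script may accept others.
const MIME_BY_EXTENSION = new Map<string, string>([
  ["png", "image/png"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["webp", "image/webp"],
]);

function mimeTypeFor(path: string): string {
  const dot = path.lastIndexOf(".");
  const ext = dot === -1 ? "" : path.slice(dot + 1).toLowerCase();
  const mime = MIME_BY_EXTENSION.get(ext);
  if (mime === undefined) {
    throw new Error(`Unsupported image type: ${path}`);
  }
  return mime;
}

function toInlineImagePart(base64Data: string, path: string): InlineImagePart {
  return { inlineData: { data: base64Data, mimeType: mimeTypeFor(path) } };
}

// In a Node script the base64 string would typically come from:
//   readFileSync(path).toString("base64")
```

The resulting part can then be passed next to the text prompt in the request's content array.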
scripts/compose-images.ts
Combine multiple images into a single composition.
Usage:
npx tsx scripts/compose-images.ts <output-path> <image1> <image2> [image3...] [options]
Arguments:
output-path: Where to save the composed image
image1, image2, ...: Paths to images to combine (2-4 images)
Options:
--layout <string>: Layout pattern (horizontal, vertical, grid, custom) (default: 'grid')
--prompt <string>: Additional instructions for composition
--width <number>: Output width in pixels (default: auto)
--height <number>: Output height in pixels (default: auto)
Examples:
# Grid layout
GEMINI_API_KEY=xxx npx tsx scripts/compose-images.ts collage.png img1.jpg img2.jpg img3.jpg img4.jpg
# Horizontal layout
npx tsx scripts/compose-images.ts banner.png left.png right.png --layout horizontal
# Custom composition with prompt
npx tsx scripts/compose-images.ts result.png a.jpg b.jpg --prompt "blend seamlessly with gradient transition"
# Using npm script
npm run compose -- output.png photo1.jpg photo2.jpg photo3.jpg --layout vertical
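The constraints stated for composition (2-4 input images, and a layout of horizontal, vertical, grid, or custom with grid as the default) can be checked before any API call is made. A minimal sketch, assuming hypothetical names `validateCompose` and `Layout` that do not come from the skill's source:

```typescript
// Hypothetical validation for compose inputs; limits and layout names
// follow the documented arguments and options.
const LAYOUTS = ["horizontal", "vertical", "grid", "custom"] as const;
type Layout = (typeof LAYOUTS)[number];

function isLayout(value: string): value is Layout {
  return LAYOUTS.some((layout) => layout === value);
}

function validateCompose(
  images: string[],
  layout: string = "grid"
): { images: string[]; layout: Layout } {
  if (images.length < 2 || images.length > 4) {
    throw new Error(`Expected 2-4 input images, got ${images.length}`);
  }
  if (!isLayout(layout)) {
    throw new Error(
      `Unknown layout "${layout}". Expected one of: ${LAYOUTS.join(", ")}`
    );
  }
  return { images, layout };
}
```

Validating up front keeps a bad argument from costing an API request.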
The package.json includes convenient npm scripts:
npm run generate <prompt> <output> # Generate image from prompt
npm run edit <source> <prompt> <output> # Edit existing image
npm run compose <output> <images...> # Compose multiple images
From the skill directory:
npm install
This installs:
@google/generative-ai: Google's Gemini API SDK
tsx: TypeScript execution runtime
typescript: TypeScript compiler
# Generate hero image
npm run generate -- "modern tech startup hero image, clean, professional" hero.png --width 1920 --height 1080
# Create variations
npm run edit hero.png "change color scheme to blue and green" hero-variant.png
# Compose for social media
npm run compose -- social-post.png hero.png logo.png --layout horizontal
# Generate UI mockup
npm run generate -- "mobile app login screen, minimalist design" mockup.png --width 375 --height 812
# Iterate on design
npm run edit mockup.png "add a gradient background" mockup-v2.png
# Generate illustrations
npm run generate "technical diagram of cloud architecture" diagram.png
# Create composite images
npm run compose -- infographic.png chart1.png chart2.png diagram.png --layout vertical
Common errors and solutions:
Ensure the GEMINI_API_KEY environment variable is set.
This skill runs locally and can be used during development.