npx claudepluginhub first-fluke/oh-my-agent --plugin oma

This skill uses the workspace's default tool permissions.
Generate images and visual assets through authenticated multi-vendor routing while preserving prompt clarity, reference-image handling, cost controls, and reproducible output manifests.
- .agents/results/images/ or requested output directory
- manifest.json with prompt, vendor, model, and reproducibility metadata
- --vendor all is used
- oma image generate CLI and vendor authentication
- resources/vendor-matrix.md, resources/prompt-tips.md, and config/image-config.yaml
- Run oma image generate with selected vendor(s), prompt, references, and options.
- If --vendor all is requested, require every requested vendor to be available.
- oma update.

| Action | SSL primitive | Evidence |
|---|---|---|
| Validate prompt completeness | VALIDATE | Clarification protocol |
| Select vendor strategy | SELECT | Vendor matrix and auth state |
| Read reference images | READ | --reference paths |
| Call generation CLI/API | CALL_TOOL | oma image generate |
| Write image outputs | WRITE | Image files and manifest |
| Validate result | VALIDATE | Exit code, manifest, files |
| Report output | NOTIFY | Final path summary |
oma image generate, oma image doctor, oma image list-vendors

oma image doctor
oma image generate "<prompt>" --vendor auto --size auto --quality auto --format json
With reference images:
oma image generate --reference "<absolute-path>" --vendor codex "<prompt>"
| Scope | Resource target |
|---|---|
| LOCAL_FS | Reference images, generated images, manifests |
| PROCESS | Provider CLIs and image router commands |
| NETWORK | Pollinations/Gemini or provider APIs |
| CREDENTIALS | Provider auth and API keys |
- Clarification Protocol below.
- With --vendor all, every requested vendor must be available (strict).
- $0.20 (configurable). --yes / OMA_IMAGE_YES=1 bypass. Default vendor pollinations (flux/zimage) is free, so auto-triggering on keywords is safe.
- $PWD require --allow-external-out.
- manifest.json next to the images for reproducibility.
- n = 5 (wall-time bound).
- oma search fetch (0, 1, 2=safety, 3=not-found, 4=invalid-input, 5=auth-required, 6=timeout).

Before invoking oma image generate, the calling agent runs this checklist against the user's request. If any answer is "no / unknown", clarify with the user first.
Required signal (must be present or inferable):
Strongly recommended (ask if absent AND not inferable from context):
- 1024x1024), portrait (1024x1536), landscape (1536x1024)?

Amplification shortcut. For brief prompts (e.g. "a red apple"), do not pop clarifying questions if the request is genuinely that simple; instead, amplify inline and show the user the expanded version before invoking:

User: "a red apple"

Agent: "I'll generate this as: a single glossy red apple centered on a clean white background, soft studio lighting, photorealistic, shallow depth of field, 1024×1024. Shall I proceed, or would you like a different style/composition?"
Skip both clarification and amplification when the user has clearly authored a full creative brief (≥ 2 of: subject + style + lighting + composition). Respect their prompt verbatim.
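The three outcomes described above (send verbatim, amplify inline, or ask the user) can be sketched as a small decision function. This is only an illustration: how the calling agent detects each brief dimension is not specified here, so the `signals` mapping and the `is_simple` flag are hypothetical inputs the caller would have to supply.

```python
from enum import Enum


class Action(Enum):
    VERBATIM = "verbatim"  # full creative brief: send the prompt unchanged
    AMPLIFY = "amplify"    # simple prompt: expand inline, then confirm
    CLARIFY = "clarify"    # unclear request: ask the user first


# The four brief dimensions named in the skip rule.
BRIEF_DIMENSIONS = ("subject", "style", "lighting", "composition")


def decide_action(signals: dict, is_simple: bool) -> Action:
    """Pick an action from detected brief dimensions.

    `signals` maps a dimension name to True when the user's prompt covers it
    (hypothetical input; detection itself is out of scope for this sketch).
    """
    present = sum(1 for name in BRIEF_DIMENSIONS if signals.get(name, False))
    if present >= 2:
        return Action.VERBATIM
    return Action.AMPLIFY if is_simple else Action.CLARIFY
```

Under these assumptions, "a red apple" covers only the subject and is simple, so it would be amplified rather than sent as-is.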
Category-specific briefs (app mockup, poster, thumbnail, infographic, comic panel, avatar): consult resources/prompt-tips.md → External Prompt Libraries.
Output language. Generation prompts are sent to the provider in English (image models are trained predominantly on English captions). Translate the user's request if they wrote in another language, and show them the translated version during amplification so they can correct misreadings.
This skill follows oh-my-agent's CLI-first concept: whenever a vendor's native CLI can drive generation (and return raw bytes), the subprocess path is preferred over direct API keys. Direct API is only used as a fallback for vendors whose CLI can't yet emit raw image bytes.
| Vendor | Strategy | Models | Trigger |
|---|---|---|---|
| codex | CLI-first: codex exec via ChatGPT OAuth (codex login), built-in image_gen | gpt-image-2 | Logged in via Codex CLI (no API key) |
| pollinations | Direct HTTP: gen.pollinations.ai/v1/images/generations (free signup for key) | Free: flux, zimage. Credit-gated: qwen-image, wan-image, gpt-image-2, klein, kontext, gptimage, gptimage-large | POLLINATIONS_API_KEY set (free at https://enter.pollinations.ai). No native CLI exists. |
| gemini | CLI-first fallback → direct API. gemini -p (stream) is the preferred path but currently disabled at precheck (CLI's agentic loop does not return raw inlineData bytes on stdout as of Gemini CLI 0.38). Until the CLI exposes a non-agentic image surface, the provider falls back to the direct generativelanguage.googleapis.com API. | gemini-2.5-flash-image, gemini-3.1-flash-image-preview | Preferred: gemini auth login. Fallback: GEMINI_API_KEY + billing. |
/oma-image a red apple on white background
/oma-image --vendor all --size 1536x1024 jeju coastline at sunset
/oma-image -n 3 --quality high --out ./hero "minimalist dashboard hero illustration"
oma image generate "<prompt>" [--vendor auto|codex|pollinations|gemini|all] [-n 1..5] \
[--size 1024x1024|1024x1536|1536x1024|auto] \
[--quality low|medium|high|auto] \
[--out <dir>] [--allow-external-out] \
[-r <path>]... \
[--timeout 180] [-y] [--no-prompt-in-manifest] \
[--dry-run] [--format text|json]
oma image doctor
oma image list-vendors
Gemini-only escalation flag: --strategy mcp,stream,api (overrides vendors.gemini.strategies).
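For callers that assemble the command programmatically, the synopsis above could be wrapped in a helper that rejects out-of-range options before spawning the CLI. This is a minimal sketch, not part of oh-my-agent: `build_generate_argv` is a hypothetical name, it covers only a subset of the flags, and the allowed values are copied from the synopsis.

```python
import shlex

# Allowed values copied from the synopsis above.
VENDORS = ("auto", "codex", "pollinations", "gemini", "all")
SIZES = ("1024x1024", "1024x1536", "1536x1024", "auto")
QUALITIES = ("low", "medium", "high", "auto")


def build_generate_argv(prompt, vendor="auto", n=1, size="auto",
                        quality="auto", out=None, references=(),
                        dry_run=False):
    """Build an argv list for `oma image generate`, rejecting bad options early."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if vendor not in VENDORS:
        raise ValueError(f"unknown vendor: {vendor}")
    if not 1 <= n <= 5:
        raise ValueError("n must be between 1 and 5")
    if size not in SIZES:
        raise ValueError(f"unsupported size: {size}")
    if quality not in QUALITIES:
        raise ValueError(f"unsupported quality: {quality}")

    argv = ["oma", "image", "generate", prompt,
            "--vendor", vendor, "-n", str(n),
            "--size", size, "--quality", quality,
            "--format", "json"]
    if out is not None:
        argv += ["--out", out]
    for ref in references:
        argv += ["-r", ref]  # one -r per reference image
    if dry_run:
        argv.append("--dry-run")
    return argv


if __name__ == "__main__":
    argv = build_generate_argv("jeju coastline at sunset",
                               vendor="all", size="1536x1024", dry_run=True)
    print(shlex.join(argv))  # shell-quoted command line for inspection
```

Returning a list rather than a string keeps the prompt as a single argument, so quotes or spaces in the prompt need no manual escaping when the list is passed to a subprocess call.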
Reference images (-r, --reference)

Attach up to 10 reference images (PNG/JPEG/GIF/WebP, ≤ 5MB each) to guide style, subject identity, or composition. Repeatable or comma-separated.
oma image generate -r ~/Downloads/otter.jpeg "same otter in dramatic lighting"
oma image generate -r a.png -r b.png "blend these two styles"
Supported vendors:
| Vendor | Support | How |
|---|---|---|
| codex (gpt-image-2) | ✅ | Passes -i <path> to codex exec |
| gemini (2.5-flash-image) | ✅ | Inlines base64 inlineData parts in request |
| pollinations | ❌ | Rejected with exit code 4 (requires URL hosting; see PR #2 roadmap) |
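A caller may want to check the reference limits before invoking the CLI. The sketch below mirrors the stated constraints (at most 10 images, PNG/JPEG/GIF/WebP, 5MB each, no references for pollinations). It is an illustration rather than the CLI's own validation: the function names are hypothetical, "5MB" is read as 5 MiB, and the format check looks only at the file extension.

```python
from pathlib import Path

MAX_REFERENCES = 10
MAX_BYTES = 5 * 1024 * 1024  # assumption: "5MB" interpreted as 5 MiB
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def split_references(values):
    """Flatten repeated and comma-separated -r values into one list."""
    refs = []
    for value in values:
        refs.extend(part.strip() for part in value.split(",") if part.strip())
    return refs


def check_references(values, vendor):
    """Return a list of problems; an empty list means the references look usable."""
    refs = split_references(values)
    problems = []
    if refs and vendor == "pollinations":
        problems.append("pollinations does not accept reference images")
    if len(refs) > MAX_REFERENCES:
        problems.append(f"too many references: {len(refs)} > {MAX_REFERENCES}")
    for ref in refs:
        path = Path(ref).expanduser()
        if path.suffix.lower() not in ALLOWED_SUFFIXES:
            problems.append(f"{ref}: unsupported format")
        elif not path.is_file():
            problems.append(f"{ref}: file not found")
        elif path.stat().st_size > MAX_BYTES:
            problems.append(f"{ref}: larger than 5MB")
    return problems
```

Collecting every problem instead of stopping at the first lets the agent report all issues to the user in one message.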
Paths: absolute or relative to $CWD. Host CLIs usually expose attached images via:
- ~/.claude/image-cache/<session>/N.png (surfaced in system messages as [Image: source: <path>])

When ALL of the following are true, the calling agent MUST pass the attached image via --reference <path> automatically. Never describe the image in prose as a workaround.

- [Image: source: <path>], or an Antigravity workspace upload path, or an explicit filesystem path in the user's message.
- codex or gemini).

Required action: invoke oma image generate --reference <absolute-path> --vendor <codex|gemini> "<prompt>". If the user didn't specify a vendor, default to codex (CLI-first, widest availability). Do NOT:

- oma image generate --help to verify.

If the local CLI is outdated (--reference is missing from --help): tell the user to run oma update once, then retry. Do not silently degrade to prose.
If the reference path is from Claude Code's image-cache: note to the user that the path is session-scoped and suggest copying the file to a durable location if they want to reuse it later. Still proceed with the generation.
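The automatic `--reference` rule above can be sketched as follows. The regular expression assumes the marker appears exactly as `[Image: source: <path>]` and that the path inside it is already absolute. `reference_argv` is a hypothetical helper name, and only the Claude Code marker form is handled here.

```python
import re

# Matches the "[Image: source: <path>]" marker surfaced in system messages.
IMAGE_MARKER = re.compile(r"\[Image: source: (?P<path>[^\]]+)\]")
REFERENCE_VENDORS = ("codex", "gemini")


def attached_image_paths(message):
    """Extract every attached image path from a message."""
    return [m.group("path").strip() for m in IMAGE_MARKER.finditer(message)]


def reference_argv(message, prompt, vendor=None):
    """Build the generate command with --reference, or None if no image is attached."""
    paths = attached_image_paths(message)
    if not paths:
        return None
    vendor = vendor or "codex"  # default when the user named no vendor
    if vendor not in REFERENCE_VENDORS:
        raise ValueError(f"vendor {vendor!r} does not support reference images")
    argv = ["oma", "image", "generate"]
    for path in paths:
        argv += ["--reference", path]
    argv += ["--vendor", vendor, prompt]
    return argv
```

Raising on an unsupported vendor, instead of dropping the image, keeps the caller from silently degrading to a prose description.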
Other skills call oma image generate --format json and parse the JSON manifest from stdout.
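A calling skill might wrap that invocation roughly as below. This is a sketch under stated assumptions: stdout is taken to contain only the JSON manifest, the meanings of exit codes 0 and 1 are not spelled out in this document and are guessed from convention, and the manifest's exact fields are not assumed beyond its being a JSON object.

```python
import json
import subprocess

# Exit codes as listed in this document; 0 and 1 are labeled by convention only.
EXIT_MEANINGS = {
    0: "success",          # assumption
    1: "general failure",  # assumption
    2: "safety",
    3: "not-found",
    4: "invalid-input",
    5: "auth-required",
    6: "timeout",
}


def interpret(returncode, stdout):
    """Turn a finished CLI run into a manifest dict or a descriptive error."""
    if returncode != 0:
        meaning = EXIT_MEANINGS.get(returncode, "unknown")
        raise RuntimeError(f"oma image generate failed: exit {returncode} ({meaning})")
    return json.loads(stdout)


def generate(prompt, timeout=180.0):
    """Run the CLI with JSON output and return the parsed manifest."""
    proc = subprocess.run(
        ["oma", "image", "generate", prompt, "--format", "json"],
        capture_output=True, text=True, timeout=timeout,
    )
    return interpret(proc.returncode, proc.stdout)
```

Keeping `interpret` separate from the subprocess call makes the exit-code handling testable without the CLI installed.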
.agents/results/images/
├── 20260424-143052-ab12cd/              # single-vendor run
│   ├── pollinations-flux.jpg            # or codex-gpt-image-2.png
│   └── manifest.json
└── 20260424-143122-7z9kqw-compare/      # --vendor all run
    ├── codex-gpt-image-2.png
    ├── pollinations-flux.jpg
    └── manifest.json
Follow resources/execution-protocol.md step by step.
See resources/vendor-matrix.md for strategy precheck rules.
Use resources/prompt-tips.md for writing effective prompts.
Before submitting, run resources/checklist.md.
Project-specific settings: config/image-config.yaml.
Env vars: OMA_IMAGE_DEFAULT_VENDOR, OMA_IMAGE_DEFAULT_OUT, OMA_IMAGE_YES, POLLINATIONS_API_KEY, GEMINI_API_KEY, OMA_IMAGE_GEMINI_STRATEGIES.
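The interaction between the `$0.20` cost threshold and the `--yes` / `OMA_IMAGE_YES=1` bypass mentioned earlier could look like the sketch below. The comparison rule (confirm when the estimate reaches the threshold) and the idea of an estimated cost passed in by the caller are assumptions, since this document does not state how the CLI computes or compares costs.

```python
import os

DEFAULT_THRESHOLD_USD = 0.20  # documented default; configurable


def confirmation_bypassed(yes_flag, env=None):
    """True when -y/--yes was passed or OMA_IMAGE_YES=1 is set."""
    env = os.environ if env is None else env
    return bool(yes_flag) or env.get("OMA_IMAGE_YES") == "1"


def needs_confirmation(estimated_cost_usd, yes_flag=False, env=None,
                       threshold_usd=DEFAULT_THRESHOLD_USD):
    """Decide whether to ask the user before a paid run.

    Assumption: confirmation is required when the estimate is at or above
    the threshold; the CLI's actual rule may differ.
    """
    if confirmation_bypassed(yes_flag, env):
        return False
    return estimated_cost_usd >= threshold_usd
```

With these assumptions, a free pollinations run (estimate 0.0) never prompts, which matches the note that auto-triggering on the default vendor is safe.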
- resources/execution-protocol.md
- resources/vendor-matrix.md
- resources/prompt-tips.md
- resources/checklist.md
- ../_shared/core/context-loading.md