From gpt-image-2-skill
Generates images from prompts and edits reference images via CLI using OpenAI gpt-image-2 or Codex. Outputs JSON results and progress events for agent parsing, with support for transparent PNGs, masks, and sizes up to 4K.
npx claudepluginhub wangnov/gpt-image-2-skill
This skill uses the workspace's default tool permissions.
Run image generation and editing through one CLI surface that hides provider differences. The Node wrapper at `scripts/gpt_image_2_skill.cjs` resolves an underlying Rust binary (env override → installed binary → Tauri App bundled CLI → repo `cargo run` → cached release → bootstrap download) and forwards every flag.
- Works with `OPENAI_API_KEY`, an OpenAI-compatible base URL, and Codex `auth.json` without changing command shape.
- Shared configuration lives in `$CODEX_HOME/gpt-image-2-skill/config.json` so CLI, App, and Skill use the same default provider.

Always pass `--json` so the result is machine-readable. Add `--json-events` when progress visibility matters.
```bash
# 1. Confirm runtime + provider readiness
node scripts/gpt_image_2_skill.cjs --json config inspect
node scripts/gpt_image_2_skill.cjs --json doctor
node scripts/gpt_image_2_skill.cjs --json auth inspect

# 2. Generate a final transparent PNG deliverable
node scripts/gpt_image_2_skill.cjs --json --json-events \
  transparent generate --prompt "..." --out /tmp/asset.png \
  --size 2K --quality high

# 3. Generate a normal image (auto-selects provider; OpenAI first, then Codex)
node scripts/gpt_image_2_skill.cjs --json --json-events \
  images generate --prompt "..." --out /tmp/out.png \
  --format png --size 2K

# 4. Edit a reference image (OpenAI multipart)
node scripts/gpt_image_2_skill.cjs --json --json-events \
  images edit --prompt "..." --ref-image /tmp/in.png --out /tmp/out.png

# 5. Remove a controlled background from existing source images
node scripts/gpt_image_2_skill.cjs --json \
  transparent extract --input /tmp/source-green.png --out /tmp/asset.png \
  --method chroma --matte-color auto --strict

# 6. Verify the final file before delivery
node scripts/gpt_image_2_skill.cjs --json \
  transparent verify --input /tmp/asset.png --profile icon --strict

# 7. Raw request escape hatch
node scripts/gpt_image_2_skill.cjs --json \
  request create --request-operation generate \
  --body-file /tmp/body.json --out-image /tmp/out.png --expect-image

# 8. Self-test (calls doctor + auth inspect)
node scripts/selftest.cjs
```
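Because every command above is run with `--json`, an agent can drive the wrapper from a small helper. The following is a minimal Python sketch and not part of the skill: `build_command` and `run_json` are hypothetical names, and the sketch assumes stdout carries exactly one JSON document when `--json` is set (the actual schema is described in `references/json-output.md`).

```python
import json
import subprocess

# Wrapper path as given in this document; resolved relative to the skill directory.
WRAPPER = "scripts/gpt_image_2_skill.cjs"


def build_command(args, json_events=False):
    """Prefix a subcommand with the wrapper and the machine-readable flags."""
    cmd = ["node", WRAPPER, "--json"]
    if json_events:
        cmd.append("--json-events")
    return cmd + list(args)


def run_json(args):
    """Run the wrapper and parse stdout as one JSON document (assumed shape)."""
    proc = subprocess.run(build_command(args), capture_output=True, text=True)
    return proc.returncode, json.loads(proc.stdout)
```

For example, `run_json(["doctor"])` would execute the readiness check from step 1 and return the exit code together with the parsed result.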
Force a provider with `--provider openai`, `--provider codex`, or any named provider from `config inspect`; leave the default `--provider auto` to use `default_provider` first. Override the legacy OpenAI base URL with `--openai-api-base https://...`.
Before using newly documented command groups, especially `transparent generate`, `transparent extract`, or `transparent verify`, confirm the actual CLI runtime is fresh enough. Skill files, the Tauri App bundle, the npm global CLI, the cargo-installed CLI, and the wrapper cache can drift independently.
Start by checking the latest upstream CLI version:
```bash
npm view gpt-image-2-skill version
```
Then look for the Tauri App bundled CLI first. Treat the app bundle as user-managed software: if it exists but is older than upstream, ask the user before updating or replacing the app.
```bash
APP_CLI="/Applications/GPT Image 2.app/Contents/Resources/bin/gpt-image-2-skill"
[ -x "$APP_CLI" ] && "$APP_CLI" --version
```
If no app bundle CLI exists, check the shell CLI. If it is missing or older than upstream, silently update the CLI dependency with npm, then verify that the current shell resolves the new binary:
```bash
command -v gpt-image-2-skill || true
gpt-image-2-skill --version || true
npm install -g gpt-image-2-skill@latest
command -v gpt-image-2-skill
gpt-image-2-skill --version
```
After any update, run the skill wrapper smoke checks:
```bash
node scripts/gpt_image_2_skill.cjs --json doctor
node scripts/selftest.cjs
```
If a documented subcommand fails with `unrecognized subcommand`, diagnose stale runtime first rather than changing prompts or providers. Prefer the wrapper in this skill directory for reproducible skill execution, but keep the bare CLI fresh when examples or user commands call `gpt-image-2-skill` directly.
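The freshness check described above reduces to comparing two version strings. Here is a minimal Python sketch, assuming both `npm view gpt-image-2-skill version` and `gpt-image-2-skill --version` print text containing a dotted `MAJOR.MINOR.PATCH` number. That output format is an assumption, and `parse_version` and `is_stale` are hypothetical helper names.

```python
import re


def parse_version(text):
    """Extract the first MAJOR.MINOR.PATCH triple found in the text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_stale(installed_output, upstream_version):
    """True when the installed CLI is older than the upstream release."""
    return parse_version(installed_output) < parse_version(upstream_version)
```

Comparing integer tuples rather than raw strings avoids ranking a hypothetical `0.9.2` above `0.10.0`.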
Use the CLI config surface when the user asks to add or pin a provider:
```bash
node scripts/gpt_image_2_skill.cjs --json config path
node scripts/gpt_image_2_skill.cjs --json config add-provider \
  --name my-image-api \
  --type openai-compatible \
  --api-base https://example.com/v1 \
  --api-key sk-... \
  --set-default
node scripts/gpt_image_2_skill.cjs --json config test-provider my-image-api
```
Credential sources supported by CLI, App, and Skill: file, env, and keychain. File credentials are stored in the shared config file; JSON output redacts them.
Output properties (not "what to draw") are flag-controlled. Putting them in the prompt is unreliable and provider-dependent.
| Property | Use this flag, not the prompt |
|---|---|
| Output background (transparent / opaque / auto) | `--background auto\|transparent\|opaque` |
| Output dimensions | `--size 2K`, `--size 4K`, or `--size WIDTHxHEIGHT` |
| Output container | `--format png\|jpeg\|webp` |
| Compression level | `--compression 0..100` |
| Render quality | `--quality low\|medium\|high\|auto` |
| Number of images | `--n <count>` (OpenAI only) |
| Edit mask region | `--mask <png>` (OpenAI only) |
The prompt is for "what is in the picture"; background, size, format, count, and mask are not. For example, to turn a transparent PNG into a white-background PNG, pass `--background opaque`; describing "white background" only in the prompt is not reliable.
Provider asymmetry: `--background`, `--n`, `--moderation`, `--mask`, and `--input-fidelity` are honored only by OpenAI (and OpenAI-compatible bases that proxy them). Codex `image_generation` does not honor `--background`; the runtime accepts the flag but the upstream tool drops it. The other four return `code: "unsupported_option"` if passed with `--provider codex`.
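An agent can catch this asymmetry before spending a request. The sketch below is a hypothetical pre-flight check in Python, not something the CLI provides. It encodes only the behavior stated above and treats only the literal provider name `codex` as the restricted case.

```python
# Flags the document says are rejected with "unsupported_option" under Codex.
REJECTED_BY_CODEX = {"--n", "--moderation", "--mask", "--input-fidelity"}
# Flag the document says is accepted by the runtime but dropped upstream.
DROPPED_BY_CODEX = {"--background"}


def check_flags(provider, flags):
    """Report which of the given flags would be rejected or silently dropped."""
    if provider != "codex":
        return {"rejected": [], "dropped": []}
    used = set(flags)
    return {
        "rejected": sorted(used & REJECTED_BY_CODEX),
        "dropped": sorted(used & DROPPED_BY_CODEX),
    }
```

A non-empty `dropped` list is the quieter failure mode: the command succeeds, but the output ignores the requested background.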
For transparent output, do not rely on provider-native transparency. Use the `transparent` command group as the Agent-facing tool layer:

- `transparent generate`: prompt-to-final PNG. It generates a controlled matte source, extracts alpha locally, verifies the result, and only succeeds when the final PNG passes transparency checks.
- `transparent extract`: local background removal from controlled source images you generated yourself. It is not a general-purpose background remover for arbitrary photos.
- `transparent verify`: final gate for any PNG before delivery. Use `--strict` and the right `--profile` when the file must be accepted or fail the task.

A transparent deliverable is valid only if the final file has a real PNG alpha channel and passes verification. A visual appearance of transparency, a white background, or a checkerboard pattern is not sufficient.

`--strict` is profile-based:
| Profile | Use for | Extra strictness |
|---|---|---|
| `generic` | common alpha/file checks | does not over-police unusual assets |
| `icon` | clean single-subject icons and props | requires clean opaque core, margin, low stray noise |
| `product` | product/object cutouts | similar to icon, with residue and edge checks |
| `sticker` | decals, badges, multi-detail props | allows more intentional small components than icon |
| `seal` | stamps, seals, logos with inner marks | allows split components such as ring + center symbol |
| `translucent` | glass, liquid, crystal | requires partial alpha |
| `glow` | light ribbons, flame, smoke, particles | requires partial alpha and transparent margin |
| `shadow` | soft shadow assets | requires partial alpha and transparent margin |
| `effect` | hard-alpha particles, bursts, UI effects | transparent margin without requiring partial alpha |
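The partial-alpha column of the profile table can be mirrored in a small lookup if an agent wants to pre-check a report before relying on `--strict`. This Python sketch is illustrative only: the CLI performs the real check, and `partial_alpha_ok` is a hypothetical helper that assumes the `partial_pixels` count has already been read from the verification JSON.

```python
# Profiles the table above marks as requiring partial alpha.
REQUIRES_PARTIAL_ALPHA = {"translucent", "glow", "shadow"}


def partial_alpha_ok(profile, partial_pixels):
    """True unless the profile needs partial alpha and none was found."""
    if profile in REQUIRES_PARTIAL_ALPHA:
        return partial_pixels > 0
    return True
```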
The CLI is intentionally not a material classifier. The Agent should choose generation prompts and extraction methods based on the asset:
| Asset type | Generation guidance | Extraction guidance |
|---|---|---|
| Opaque object, icon, sticker, product | Single isolated subject, clear margin, perfectly flat chroma matte. Pick a matte color absent from the object. | `transparent generate` or `transparent extract --method chroma --matte-color auto` |
| Thin edges, hair, fur, lace, chain, netting | Use high resolution, strong subject/background contrast, no contact shadow, no background-colored details. Try magenta/cyan/green mattes if one contaminates the edge. | Chroma extraction with `--spill-suppression` when needed, then verify with `--expected-matte-color`; retry with a different matte if residue remains. |
| Glass, crystal, liquid, hologram | Ask for a centered asset on flat black and flat white backgrounds, keeping geometry identical. Use reference/edit flow when possible to keep alignment. | `transparent extract --method dual --dark-image black.png --light-image white.png` |
| Glow, flame, smoke, mist, magic particles | Generate dark and light background variants. Avoid textured backgrounds and avoid bloom reaching the image edge unless the edge is intentional. | Prefer dual extraction; verify that `partial_pixels` is non-zero. |
| Shadows | Decide whether the shadow is part of the asset. If not, explicitly forbid contact shadows. If yes, generate on a flat matte with enough margin. | Chroma for opaque shadow silhouettes; dual extraction for soft translucent shadows. |
| Unknown or unusual material | Do not classify it first. Generate controlled source variants, run extraction candidates, and keep the one that passes verification with the cleanest edge. | Use `--report-dir` / `--keep-sources` while iterating, then deliver only the final PNG. |
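The table asks for geometrically identical black and white variants because dual-background extraction needs the same pixel on both backgrounds. As an illustration, here is the standard two-background compositing identity in Python. This is a textbook sketch, not confirmed to be the CLI's algorithm, and `recover_pixel` is a hypothetical name. With colors in 0..1, a pixel of color C and alpha a composites to `a*C` on black and `a*C + (1 - a)` on white, so the difference between the two gives `1 - a`.

```python
def recover_pixel(on_black, on_white):
    """Recover (r, g, b, a) from one pixel seen on pure black and pure white.

    Both inputs are (r, g, b) tuples of floats in 0..1.
    """
    diffs = [white - dark for dark, white in zip(on_black, on_white)]
    # Average the per-channel estimates of (1 - alpha), then clamp to 0..1.
    alpha = 1.0 - sum(diffs) / 3.0
    alpha = min(1.0, max(0.0, alpha))
    if alpha == 0.0:
        return (0.0, 0.0, 0.0, 0.0)
    # On black the composite equals alpha * color, so divide alpha back out.
    color = tuple(min(1.0, dark / alpha) for dark in on_black)
    return color + (alpha,)
```

Any misalignment between the two source images breaks the per-pixel pairing, so keeping geometry identical matters more than the exact prompt wording.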
For chroma extraction, `--matte-color auto` samples the actual flat source background from the image edges. Prefer it when the source was AI-generated, because prompts like "pure #ff00ff" often produce near-matte colors rather than exact RGB values. Use explicit `--matte-color <name|#rrggbb>` only when the source background is known exactly.
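To make the idea of edge sampling concrete, the following Python sketch picks the most common border pixel of an image held as nested lists. It is a simplified illustration under that assumption; the CLI's actual sampling strategy is not documented here, and `sample_matte_color` is a hypothetical name.

```python
from collections import Counter


def sample_matte_color(pixels):
    """Return the most frequent (r, g, b) value along the image border.

    pixels is a list of rows, each row a list of (r, g, b) tuples.
    """
    height, width = len(pixels), len(pixels[0])
    border = []
    for y in range(height):
        for x in range(width):
            # Keep only pixels on the outermost rows and columns.
            if y in (0, height - 1) or x in (0, width - 1):
                border.append(pixels[y][x])
    return Counter(border).most_common(1)[0][0]
```

On a source prompted for "pure #ff00ff", a sampler like this would return whatever near-magenta value the model actually painted, which is why sampling is more robust than assuming the exact RGB value.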
For extraction tuning, use `--material` only as a broad hint, not as a subject classifier: `standard`, `soft-3d`, `flat-icon`, `sticker`, or `glow`. Manual `--threshold`, `--softness`, and `--spill-suppression` override the selected preset.
`transparent generate` is prompt-only, so it cannot take a style reference. For style-locked transparent assets, use a flat RGB reference image with `images edit --ref-image` to create a controlled matte source, then run `transparent extract`. Do not use a transparent PNG as the reference image unless you intentionally want the alpha/composited edge behavior to influence the edit.
Do not ask the image model to render exact UI text, numbers, scores, labels, or logos as part of the bitmap unless distorted text is acceptable. Generate the visual asset without text, then render exact text in the host app or design tool.
Examples:
```bash
# Simple asset: final transparent PNG, sources hidden unless there is a failure
node scripts/gpt_image_2_skill.cjs --json --json-events \
  transparent generate \
  --prompt "a polished fantasy sword game asset, no text, no frame" \
  --out /tmp/sword.png --size 2K --quality high

# Agent-controlled chroma flow
node scripts/gpt_image_2_skill.cjs --json --json-events \
  images generate \
  --prompt "a silver necklace, centered, on a perfectly flat pure magenta background, no shadow" \
  --out /tmp/necklace-magenta.png --format png --size 2K
node scripts/gpt_image_2_skill.cjs --json \
  transparent extract --method chroma \
  --input /tmp/necklace-magenta.png --matte-color auto \
  --out /tmp/necklace.png --material sticker --strict

# Semi-transparent material flow
node scripts/gpt_image_2_skill.cjs --json \
  transparent extract --method dual \
  --dark-image /tmp/glow-on-black.png \
  --light-image /tmp/glow-on-white.png \
  --out /tmp/glow.png --strict
```
Always inspect the JSON verification fields before delivery: `passed`, `alpha_min`, `alpha_max`, `transparent_ratio`, `partial_pixels`, and `warnings`. Also inspect quality fields: `checkerboard_detected`, `touches_edge`, `edge_margin_px`, `stray_pixel_count`, `largest_component_ratio`, `matte_residue_checked`, `matte_residue_score`, `halo_score`, `transparent_rgb_scrubbed`, `alpha_health_score`, `residue_score`, `quality_score`, and `failure_reasons`. If `passed` is false, do not deliver the file as a transparent PNG. If `matte_residue_checked` is false for a chroma-derived PNG, run `transparent verify` again with the source matte via `--expected-matte-color`.
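The two delivery rules above can be expressed as a small gate. This Python sketch assumes the verification fields arrive as a flat dictionary. The real nesting inside the `--json` envelope may differ, and `next_action` with its return labels is a hypothetical helper.

```python
def next_action(report, chroma_derived):
    """Decide what to do with a PNG given its verification report.

    report is a dict of verification fields (flat layout is an assumption).
    chroma_derived is True when the PNG came from chroma extraction.
    """
    # Rule 1: a failed verification is never delivered as a transparent PNG.
    if not report.get("passed", False):
        return "reject"
    # Rule 2: chroma results need a residue check against the source matte.
    if chroma_derived and not report.get("matte_residue_checked", False):
        return "reverify_with_expected_matte_color"
    return "deliver"
```

Treating a missing `passed` field as a failure keeps the gate conservative when the report shape is not what the caller expected.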
- `openai` defaults to `gpt-image-2`; `codex` defaults to `gpt-5.4` and delegates to `image_generation`.
- Shared flags: `--size`, `--quality`, `--format`, `--compression`.
- OpenAI-only flags: `--background`, `--n`, `--moderation`, `--mask`, `--input-fidelity`.
- A Codex `401` triggers one token refresh + one retry.
- `2K` → `2048x2048`, `4K` → `3840x2160`. Custom `WxH` requires both edges to be multiples of 16, max edge 3840, max 8,294,400 pixels, max aspect ratio 3:1.

Load on demand for deeper detail:

- `references/providers.md`: OpenAI / OpenAI-compatible / Codex selection, auth sources, runtime discovery, update policy, and resolution order.
- `references/sizes-and-formats.md`: size aliases, custom constraints, format/quality/compression/background, shared vs OpenAI-only flags.
- `references/transparent-png.md`: Agent playbook for prompt design, controlled mattes, dual-background extraction, verification, and retry loops.
- `references/json-output.md`: `--json` stdout schema, success and error envelopes, per-command shapes.
- `references/json-events.md`: `--json-events` JSONL phases (`request_started`, `multipart_prepared`, `retry_scheduled`) and Codex SSE passthrough.
- `references/troubleshooting.md`: `runtime_unavailable`, `auth_missing`, Codex 401 refresh, retry policy, size rejections, moderation, timeouts.

The companion file `agents/openai.yaml` is read by the Codex Skill runtime only (Claude Code ignores it). Both runtimes execute the commands above with cwd at the skill directory, so relative paths like `scripts/gpt_image_2_skill.cjs` resolve in either harness.
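The size aliases and custom `WxH` constraints listed above can be checked locally before a request is sent. The following Python sketch encodes exactly those stated limits. `size_problems` is a hypothetical helper, and the CLI's own validation and error messages may differ.

```python
# Aliases and limits as stated in this document.
ALIASES = {"2K": (2048, 2048), "4K": (3840, 2160)}
MAX_EDGE = 3840
MAX_PIXELS = 8_294_400
MAX_ASPECT = 3


def size_problems(size):
    """Return a list of constraint violations; an empty list means valid."""
    if size in ALIASES:
        return []
    width_text, sep, height_text = size.lower().partition("x")
    if sep != "x" or not (width_text.isdecimal() and height_text.isdecimal()):
        return ["expected 2K, 4K, or WIDTHxHEIGHT"]
    width, height = int(width_text), int(height_text)
    if width == 0 or height == 0:
        return ["edges must be positive"]
    problems = []
    if width % 16 or height % 16:
        problems.append("both edges must be multiples of 16")
    if max(width, height) > MAX_EDGE:
        problems.append("max edge is 3840")
    if width * height > MAX_PIXELS:
        problems.append("max 8,294,400 pixels")
    if max(width, height) > MAX_ASPECT * min(width, height):
        problems.append("max aspect ratio is 3:1")
    return problems
```

Note that `3840x2160` sits exactly on the pixel limit (3840 × 2160 = 8,294,400), which is why the `4K` alias is the largest landscape size that passes.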