Generates AI images using OpenAI, Google, DashScope, and Replicate APIs. Supports text-to-image prompts, reference images, aspect ratios, and quality settings. Activates on requests to generate, create, or draw images.
```shell
npx claudepluginhub xy121718/baoyu-skills --plugin ai-generation-skills
```

This skill uses the workspace's default tool permissions.
Official API-based image generation. Supports OpenAI, Google, DashScope (阿里通义万象), Replicate, and xheai (中转站) providers.
Agent Execution:
SKILL_DIR = this SKILL.md file's directory. Entry point: `${SKILL_DIR}/scripts/main.ts`

CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer.
Check EXTEND.md existence (priority: project → user):
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"
| Result | Action |
|---|---|
| Found | Load, parse, apply settings. If default_model.[provider] is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup (references/config/first-time-setup.md) → Save EXTEND.md → Then continue |
CRITICAL: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.
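The existence check and fallback above can be combined into a single priority lookup. A minimal sketch (`CONFIG_SCOPE` is a hypothetical variable name for illustration; the paths are the ones this skill documents):

```shell
# Resolve which EXTEND.md applies: project config wins over user config.
# CONFIG_SCOPE is a hypothetical name, not part of the skill's API.
if test -f .baoyu-skills/baoyu-image-gen/EXTEND.md; then
  CONFIG_SCOPE=project
elif test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md"; then
  CONFIG_SCOPE=user
else
  CONFIG_SCOPE=none   # would trigger first-time setup
fi
echo "$CONFIG_SCOPE"
```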
| Path | Location |
|---|---|
| `.baoyu-skills/baoyu-image-gen/EXTEND.md` | Project directory |
| `$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md` | User home |
EXTEND.md Supports: Default provider | Default quality | Default aspect ratio | Default image size | Default models
Schema: references/config/preferences-schema.md
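As an illustration only, a preferences file might look like the sketch below. The key names here are assumptions, not the authoritative schema — that lives in references/config/preferences-schema.md:

```yaml
# Hypothetical EXTEND.md settings; key names are illustrative assumptions.
# Consult references/config/preferences-schema.md for the real schema.
default_provider: google
default_quality: 2k
default_aspect_ratio: "16:9"
default_model:
  google: gemini-3-pro-image-preview
  replicate: google/nano-banana-pro
```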
# ⚠️ For long prompts, use --promptfiles or wrap the prompt in double quotes to avoid newline parsing errors
# Recommended: npx -y bun "${SKILL_DIR}/scripts/main.ts" -p "long prompt" --image out.png
# Or: npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles prompt.txt --image out.png
# Basic
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png
# With aspect ratio
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9
# High quality
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k
# From prompt files
npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png
# With reference images (Google multimodal or OpenAI edits)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
# With reference images (explicit provider/model)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png
# Specific provider
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
# DashScope (Alibaba Tongyi Wanxiang); the example prompt below is Chinese for "a cute cat"
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope
# Replicate (google/nano-banana-pro)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Replicate with specific model
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
# xheai (API relay, OpenAI-compatible format)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider xheai
# xheai with nano-banana-2
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider xheai --model nano-banana-2
| Option | Description |
|---|---|
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated) |
| `--image <path>` | Output image path (required) |
| `--provider <name>` | Force provider: `google`, `openai`, `dashscope`, `replicate`, or `xheai` (default: auto-detect) |
| `--model <id>`, `-m` | Model ID (Google: `gemini-3-pro-image-preview`, `gemini-3.1-flash-image-preview`; OpenAI: `gpt-image-1.5`; xheai: `gemini-3.1-flash-image-preview`, `nano-banana-2`) |
| `--ar <ratio>` | Aspect ratio (e.g., `16:9`, `1:1`, `4:3`) |
| `--size <WxH>` | Size (e.g., `1024x1024`) |
| `--quality <preset>` | Quality preset: `normal` or `2k` (default: `2k`) |
| `--imageSize <size>` | Image size for Google: `1K`, `2K`, or `4K` (default: from quality) |
| `--ref <files...>` | Reference images. Supported by Google multimodal (`gemini-3-pro-image-preview`, `gemini-3-flash-preview`, `gemini-3.1-flash-image-preview`) and OpenAI edits (GPT Image models). If provider omitted: Google first, then OpenAI |
| `--n <count>` | Number of images |
| `--json` | JSON output |
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | OpenAI API key (also used for xheai when `OPENAI_BASE_URL` points to xheai.cc) |
| `GOOGLE_API_KEY` | Google API key |
| `DASHSCOPE_API_KEY` | DashScope API key (Alibaba Cloud) |
| `REPLICATE_API_TOKEN` | Replicate API token |
| `OPENAI_IMAGE_MODEL` | OpenAI/xheai model override |
| `GOOGLE_IMAGE_MODEL` | Google model override |
| `DASHSCOPE_IMAGE_MODEL` | DashScope model override (default: `z-image-turbo`) |
| `REPLICATE_IMAGE_MODEL` | Replicate model override (default: `google/nano-banana-pro`) |
| `OPENAI_BASE_URL` | Custom OpenAI endpoint (set to `https://api.xheai.cc` for xheai) |
| `GOOGLE_BASE_URL` | Custom Google endpoint |
| `DASHSCOPE_BASE_URL` | Custom DashScope endpoint |
| `REPLICATE_BASE_URL` | Custom Replicate endpoint |
| `DEBUG_ENV` | Set to `1` to enable debug output for env loading and provider detection |
Load Priority: CLI args > EXTEND.md > env vars > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env
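A minimal sketch of that precedence chain for a single setting (quality); `CLI_QUALITY` and `EXTEND_QUALITY` are hypothetical stand-ins for already-parsed values, not names this skill defines:

```shell
# Resolve quality by precedence: CLI flag > EXTEND.md > env var > default.
# CLI_QUALITY / EXTEND_QUALITY / IMAGE_QUALITY are illustrative stand-ins.
resolve_quality() {
  if   [ -n "$CLI_QUALITY" ];    then echo "$CLI_QUALITY"
  elif [ -n "$EXTEND_QUALITY" ]; then echo "$EXTEND_QUALITY"
  elif [ -n "$IMAGE_QUALITY" ];  then echo "$IMAGE_QUALITY"
  else echo "2k"   # documented default preset
  fi
}
CLI_QUALITY="" EXTEND_QUALITY="" IMAGE_QUALITY=""
resolve_quality   # with nothing set, prints the default: 2k
```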
Cross-Platform Paths: `~/.baoyu-skills/.env` automatically resolves to:
- Windows: `C:\Users\<username>\.baoyu-skills\.env`
- macOS: `/Users/<username>/.baoyu-skills/.env`
- Linux: `/home/<username>/.baoyu-skills/.env`

Debug Mode: Set `DEBUG_ENV=1` to see which `.env` files are loaded and which API keys are detected:
DEBUG_ENV=1 npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png
Model priority (highest → lowest), applies to all providers:
1. `--model <id>` (CLI flag)
2. `default_model.[provider]` (EXTEND.md)
3. `<PROVIDER>_IMAGE_MODEL` (e.g., `GOOGLE_IMAGE_MODEL`)

EXTEND.md overrides env vars. If both EXTEND.md `default_model.google: "gemini-3-pro-image-preview"` and env var `GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview` exist, EXTEND.md wins.
Agent MUST display model info before each generation:
Using [provider] / [model]

Switch model: `--model <id>` | EXTEND.md `default_model.[provider]` | env `<PROVIDER>_IMAGE_MODEL`

Supported model formats:
- `owner/name` (recommended for official models), e.g. `google/nano-banana-pro`
- `owner/name:version` (community models by version), e.g. `stability-ai/sdxl:<version>`

Examples:
# Use Replicate default model
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Override model explicitly
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
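A quick way to sanity-check the two identifier formats above. This is a sketch only; `is_replicate_model` is a hypothetical helper, not part of this skill:

```shell
# Accepts owner/name or owner/name:version identifiers (hypothetical helper)
is_replicate_model() {
  case "$1" in
    *" "*) return 1 ;;   # no whitespace allowed
    */*:*) return 0 ;;   # owner/name:version (community model pinned by version)
    */*)   return 0 ;;   # owner/name (official model)
    *)     return 1 ;;
  esac
}
```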
- `--ref` provided + no `--provider` → auto-select Google first, then OpenAI, then Replicate
- `--provider` specified → use it (if `--ref`, must be `google`, `openai`, or `replicate`)

| Preset | Google imageSize | OpenAI Size | Use Case |
|---|---|---|---|
| `normal` | 1K | 1024px | Quick previews |
| `2k` (default) | 2K | 2048px | Covers, illustrations, infographics |
Google imageSize: Can be overridden with --imageSize 1K|2K|4K
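The `--ref` provider fallback above can be sketched as follows. `pick_ref_provider` is a hypothetical helper, and keying the choice off which API keys are present is an assumption, not the script's documented logic:

```shell
# Auto-select a --ref-capable provider: Google, then OpenAI, then Replicate.
# Hypothetical helper; env var names are the ones listed in this doc.
pick_ref_provider() {
  if   [ -n "$GOOGLE_API_KEY" ];      then echo google
  elif [ -n "$OPENAI_API_KEY" ];      then echo openai
  elif [ -n "$REPLICATE_API_TOKEN" ]; then echo replicate
  else return 1   # no usable key found
  fi
}
```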
Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1
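Validating a requested ratio against that list can be done with a simple pattern match. A sketch; `is_supported_ar` is a hypothetical helper:

```shell
# Check a --ar value against the supported ratios listed above (hypothetical helper)
is_supported_ar() {
  case "$1" in
    1:1|16:9|9:16|4:3|3:4|2.35:1) return 0 ;;
    *) return 1 ;;
  esac
}
```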
Passed through as `imageConfig.aspectRatio` or an `aspectRatio` parameter, depending on the provider.

Default: Sequential generation (one image at a time). This ensures stable output and easier debugging.
Parallel Generation: Only use when user explicitly requests parallel/concurrent generation.
| Mode | When to Use |
|---|---|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel | User explicitly requests, large batches (10+) |
Parallel Settings (when requested):
| Setting | Value |
|---|---|
| Recommended concurrency | 4 subagents |
| Max concurrency | 8 subagents |
| Use case | Large batch generation when user requests parallel |
Agent Implementation (parallel mode only):
# Launch multiple generations in parallel using Task tool
# Each Task runs as background subagent with run_in_background=true
# Collect results via TaskOutput when all complete
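Outside an agent context, the same throttled-batch idea can be sketched in plain shell. A sketch only: `generate_one` is a stand-in for the real `main.ts` invocation, and the cap mirrors the recommended concurrency of 4:

```shell
# Throttled parallel batch: at most MAX_JOBS generations in flight at once.
generate_one() {   # stand-in; replace body with:
  :                # npx -y bun "${SKILL_DIR}/scripts/main.ts" -p "$1" --image "$2"
}
MAX_JOBS=4
i=0
for prompt in "a cat" "a dog" "a fox"; do
  i=$((i+1))
  generate_one "$prompt" "out_$i.png" &
  # throttle: pause while MAX_JOBS or more background jobs are still running
  while [ "$(jobs -rp | wc -l)" -ge "$MAX_JOBS" ]; do wait; done
done
wait   # block until every generation finishes
```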
Reference images require a multimodal model (Google: `gemini-3-pro-image-preview`, `gemini-3.1-flash-image-preview`; or OpenAI GPT Image edits).

Custom configurations via EXTEND.md. See Preferences section for paths and supported options.