Guides ComfyUI prompt engineering with CLIP text encoding syntax, emphasis weights, BREAK token for long prompts, and embeddings for Stable Diffusion workflows.
From comfy

npx claudepluginhub artokun/comfyui-mcp --plugin comfy

This skill uses the workspace's default tool permissions.
ComfyUI uses CLIP (Contrastive Language-Image Pre-training) text encoders to convert text prompts into conditioning tensors. The CLIPTextEncode node takes a text string and a CLIP model, producing a CONDITIONING output for the KSampler.
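CLIP processes text in 77-token chunks (75 prompt tokens plus start and end markers). Each word is typically 1-3 tokens. Depending on the encoder node, text beyond the first chunk may be truncated or split at an arbitrary point, so use the BREAK token or a multi-clip encoding node to control where chunks end.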
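The wiring described above can be sketched as an API-format workflow fragment. This is a minimal illustration, not a complete workflow: the node ids (`"4"`, `"6"`, `"7"`) are arbitrary labels, the checkpoint filename is hypothetical, and the `conditioning_ref` helper is introduced here only for readability.

```python
import json

# API-format workflow fragment. Node ids are arbitrary string labels.
workflow = {
    "4": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "example_model.safetensors"}},  # hypothetical file
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a knight in ornate armor, dramatic lighting",
                     "clip": ["4", 1]}},  # output index 1 of the loader is CLIP
    "7": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "worst quality, low quality",
                     "clip": ["4", 1]}},
}

def conditioning_ref(node_id):
    """Reference a node's first output (its CONDITIONING) from another node."""
    return [node_id, 0]

# These two entries would go into a KSampler node's "inputs".
sampler_inputs = {
    "positive": conditioning_ref("6"),
    "negative": conditioning_ref("7"),
}
print(json.dumps(sampler_inputs))
```

Each text encoder receives the same CLIP model reference and a different string; the sampler then consumes the two CONDITIONING outputs.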
Adjust how strongly the model attends to specific words or phrases:
| Syntax | Effect | Equivalent Weight |
|---|---|---|
| `(word:1.3)` | Increase emphasis by 30% | Explicit weight 1.3 |
| `(word:0.7)` | Decrease emphasis by 30% | Explicit weight 0.7 |
| `(word)` | Slight increase | `(word:1.1)` |
| `((word))` | Moderate increase | `(word:1.21)`, i.e. 1.1^2 |
| `(((word)))` | Strong increase | `(word:1.331)`, i.e. 1.1^3 |
| `[word]` | Slight decrease | `(word:0.9091)`, i.e. 1/1.1 |
| `[[word]]` | Moderate decrease | `(word:0.8264)`, i.e. 1/1.1^2 |
- `((word))` = 1.1 * 1.1 = `(word:1.21)`
- `(red sports car:1.3)` applies weight to the entire phrase
- `(detailed face:1.4), (blurry background:0.6)`: combine in one prompt

a (beautiful:1.3) woman with (flowing red hair:1.2), wearing a blue dress, (sharp focus:1.1)
(masterpiece:1.4), (best quality:1.3), a knight in (ornate armor:1.2), standing on a cliff, (dramatic lighting:1.1), cinematic
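To make the weight arithmetic concrete, here is a minimal sketch of a parser that applies the rules from the table: each `(...)` multiplies by 1.1, each `[...]` divides by 1.1, and `(text:w)` applies an explicit weight. The function name `parse_emphasis` is hypothetical, and this is not ComfyUI's actual parser; it ignores escapes and other edge cases.

```python
import re

# Tokens: brackets, an explicit ":<number>)" closer, plain text, or a lone colon.
TOKEN_RE = re.compile(r"\(|\)|\[|\]|:\s*([0-9]*\.?[0-9]+)\s*\)|[^()\[\]:]+|:")

def parse_emphasis(prompt, step=1.1):
    """Return (text, weight) segments using the table's weighting rules."""
    segments = []      # [text, weight] pairs, in prompt order
    round_open = []    # segment index at which each "(" was opened
    square_open = []   # segment index at which each "[" was opened

    def scale(start, factor):
        # Multiply the weight of every segment opened since `start`.
        for seg in segments[start:]:
            seg[1] *= factor

    for match in TOKEN_RE.finditer(prompt):
        tok = match.group(0)
        if tok == "(":
            round_open.append(len(segments))
        elif tok == "[":
            square_open.append(len(segments))
        elif match.group(1) is not None and round_open:
            scale(round_open.pop(), float(match.group(1)))  # (text:1.3)
        elif tok == ")" and round_open:
            scale(round_open.pop(), step)                   # (text)
        elif tok == "]" and square_open:
            scale(square_open.pop(), 1 / step)              # [text]
        else:
            segments.append([tok, 1.0])                     # plain text
    return [(text, round(weight, 4)) for text, weight in segments]

print(parse_emphasis("((word))"))   # [('word', 1.21)]
print(parse_emphasis("[[word]]"))   # [('word', 0.8264)]
```

Nested brackets compound because each closing bracket rescales everything opened since its matching opener.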
The BREAK keyword forces CLIP to end the current 77-token chunk and start processing subsequent text in a new chunk. This is critical for long prompts.
masterpiece, best quality, a beautiful Japanese garden with cherry blossoms,
stone lanterns, koi pond, traditional wooden bridge, morning mist
BREAK
highly detailed, 8k uhd, photorealistic, volumetric lighting,
depth of field, golden hour, award-winning photography
Each chunk is encoded independently and then concatenated as conditioning, ensuring all tokens are processed.
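The chunk boundaries can be illustrated with a short sketch that splits a prompt at standalone `BREAK` keywords. The helper name `split_on_break` is hypothetical; it only shows where the text divides and does not perform any encoding. Encoding each piece with its own `CLIPTextEncode` node and joining the results with a `ConditioningConcat` node is, as far as I understand, the manual equivalent.

```python
import re

def split_on_break(prompt):
    """Split a prompt into chunks at standalone BREAK keywords."""
    # \b keeps words such as "BREAKFAST" from being treated as a separator.
    parts = re.split(r"\bBREAK\b", prompt)
    # Collapse line breaks and extra spaces, and drop stray commas at the edges.
    cleaned = [" ".join(part.split()).strip(", ") for part in parts]
    return [chunk for chunk in cleaned if chunk]

prompt = """masterpiece, best quality, a beautiful Japanese garden,
morning mist
BREAK
highly detailed, 8k uhd, photorealistic"""

for i, chunk in enumerate(split_on_break(prompt), start=1):
    print(i, chunk)
```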
Embeddings (textual inversions) are pre-trained token sets that encode complex concepts into a single trigger word.
embedding:easynegative
embedding:badhandv4
embedding:bad-image-v2-39000
- The `.safetensors` or `.pt` file must be in `models/embeddings/`

| Embedding | Best For | Description |
|---|---|---|
| `easynegative` | SD 1.5 | General quality improvement |
| `badhandv4` | SD 1.5 | Fixes hand deformities |
| `bad-image-v2-39000` | SD 1.5 | Reduces artifacts |
| `negativeXL_D` | SDXL | SDXL-specific negative embedding |
| `ac_neg1` | SDXL | Alternative SDXL negative |
Positive: a portrait of a woman, masterpiece, best quality
Negative: embedding:easynegative, embedding:badhandv4, worst quality, low quality
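Because an `embedding:name` reference only works when the file is present, a pre-flight check can be useful. The following is a sketch under stated assumptions: `missing_embeddings` is a hypothetical helper, the regular expression is my own approximation of what counts as an embedding name, and the directory you pass should be your install's `models/embeddings/` folder.

```python
import re
from pathlib import Path

# Approximate pattern for names such as "easynegative" or "bad-image-v2-39000".
EMBEDDING_RE = re.compile(r"embedding:([\w-]+(?:\.[\w-]+)*)")
EXTENSIONS = (".safetensors", ".pt")

def missing_embeddings(prompt, embeddings_dir):
    """Return embedding names used in the prompt that have no matching file."""
    folder = Path(embeddings_dir)
    missing = []
    for name in EMBEDDING_RE.findall(prompt):
        # Accept the name as written, or with either known extension appended.
        candidates = [folder / name] + [folder / (name + ext) for ext in EXTENSIONS]
        if not any(path.is_file() for path in candidates):
            missing.append(name)
    return missing
```

For example, `missing_embeddings(negative_prompt, "models/embeddings")` returns an empty list when every referenced embedding is installed.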
Negative prompt: IMPORTANT — SD 1.5 is very sensitive to negatives.
Positive prompt structure:
(masterpiece:1.2), (best quality:1.2), subject description, details, style tags
Recommended negative prompt:
worst quality, low quality, normal quality, lowres, watermark, signature,
text, jpeg artifacts, blurry, bad anatomy, bad hands, extra fingers,
missing fingers, extra limbs, deformed, disfigured, mutation, ugly
Key notes:
- `masterpiece, best quality` significantly affect output
- Example tag-style prompt: `1girl, long hair, blue eyes, school uniform`
- Negative embeddings (`easynegative`) are very effective

Negative prompt: Moderate importance. SDXL is less sensitive to negatives than SD 1.5.
Positive prompt structure:
subject description with natural language, detailed description of scene and style
Recommended negative prompt:
blurry, low quality, deformed, ugly, bad anatomy, disfigured, poorly drawn face,
mutation, mutated, extra limbs, watermark, text
Key notes:
- Use `CLIPTextEncodeSDXL` for separate control
- `CLIPTextEncodeSDXL` has separate `text_g` (global description) and `text_l` (local details) fields

Negative prompt: NOT USED. Flux operates at CFG=1.0 with no negative conditioning.
Positive prompt structure:
Detailed natural language description. Flux excels with descriptive sentences
rather than comma-separated tags. Describe the scene as if writing a paragraph.
Key notes:
A serene Japanese garden in autumn. A stone path leads through a grove of maple
trees with bright red and orange leaves. A small wooden bridge crosses a koi pond
where golden fish swim beneath the surface. Morning mist rises from the water,
and soft sunlight filters through the canopy. The scene is photorealistic with
warm, natural lighting and shallow depth of field.
Negative prompt: Minimal — SD3 needs very little negative guidance.
Positive prompt structure:
Natural language description, supports very long detailed prompts thanks to T5-XXL
Key notes:
- `low quality, blurry` is usually sufficient
- Use the `CLIPTextEncodeSD3` node for model-specific encoding if available

1. Quality: `masterpiece, best quality, highly detailed`
2. Subject: `a young woman`, `a cyberpunk cityscape`, `a golden retriever`
3. Details: `with long flowing red hair, wearing a white dress`
4. Action: `standing in a field`, `looking at the camera`, `running`
5. Setting: `in a sunlit meadow`, `at night in a neon-lit street`
6. Composition: `close-up portrait`, `full body shot`, `wide angle`
7. Lighting: `dramatic lighting`, `soft natural light`, `studio lighting`, `golden hour`
8. Style: `oil painting`, `photograph`, `digital art`, `watercolor`, `anime`
9. Technical: `8k, uhd, photorealistic, sharp focus, depth of field`

These tokens generally improve output quality across SD 1.5 and SDXL:
masterpiece, best quality, highly detailed, 8k, photorealistic,
ultra-detailed, sharp focus, professional, award-winning
For photorealism specifically:
photorealistic, hyperrealistic, RAW photo, DSLR, 8k uhd,
film grain, Fujifilm XT3, sharp focus, natural lighting
For anime/illustration:
masterpiece, best quality, highly detailed, anime,
beautiful detailed eyes, detailed face, illustration
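Assembling a tag-style prompt from components (quality tags, subject, details, setting, lighting, style) can be sketched as a small helper. The function `build_prompt` and its `model` values are hypothetical names for illustration; the only model-specific rule encoded here is that quality tags are dropped for Flux, which expects natural-language descriptions instead.

```python
def build_prompt(subject, details="", setting="", lighting="", style="",
                 quality_tags="masterpiece, best quality", model="sd15"):
    """Join non-empty components with commas; omit quality tags for Flux."""
    parts = [subject, details, setting, lighting, style]
    if model != "flux":
        # Quality tags lead the prompt for SD 1.5 and SDXL style prompting.
        parts.insert(0, quality_tags)
    return ", ".join(p.strip() for p in parts if p and p.strip())

print(build_prompt("a young woman",
                   details="with long flowing red hair",
                   lighting="golden hour"))
# masterpiece, best quality, a young woman, with long flowing red hair, golden hour
```

For Flux, a full descriptive paragraph is usually a better input than a comma-joined string, so treat the `model="flux"` branch as a guard against stray tags rather than a prompt writer.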
LoRA (Low-Rank Adaptation) models are fine-tuned on specific concepts and require their trigger words to activate the learned concept.
- `a photo of ohwx woman in a garden` (where `ohwx` is the trigger)
- `in the style of pixar3d`
- LoRA strength (set in the `LoraLoader` node) interacts with prompt weight; usually keep one at default

# Character LoRA
a photo of sks person, wearing casual clothes, in a park
# Style LoRA
a landscape painting, autumn forest, in the style of impressionism, masterpiece
# Concept LoRA
a character wearing mecha_armor, standing in a battlefield, detailed
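Since a LoRA's concept may not activate without its trigger word, a quick check before queueing a generation can catch omissions. This is a minimal sketch: `has_trigger` is a hypothetical helper, and the case-insensitive match is an assumption based on CLIP tokenizers lowercasing their input.

```python
import re

def has_trigger(prompt, trigger):
    """True if the trigger appears as a whole word in the prompt."""
    # Lookarounds prevent matching inside a longer word (e.g. "sks" in "risks").
    pattern = r"(?<!\w)" + re.escape(trigger) + r"(?!\w)"
    return re.search(pattern, prompt, flags=re.IGNORECASE) is not None

print(has_trigger("a photo of sks person, wearing casual clothes", "sks"))  # True
print(has_trigger("a photo of a person who takes risks", "sks"))            # False
```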
If ComfyUI-Impact-Pack or a wildcard node pack is installed, you can use dynamic prompt syntax:
a {red|blue|green|yellow} car parked on a {sunny|rainy|snowy} street
Each {option1|option2|option3} randomly selects one option per generation.
Wildcard .txt files (one option per line) can be referenced:
a __haircolor__ haired woman wearing a __clothing__ in __location__
Where haircolor.txt, clothing.txt, and location.txt are in the wildcards directory.
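The two substitution rules can be sketched in plain Python. This is an illustration of the syntax, not the code used by any node pack: `load_wildcards` and `expand_dynamic` are hypothetical names, and real implementations support extra features (weights, nesting) that are omitted here.

```python
import random
import re
from pathlib import Path

def load_wildcards(directory):
    """Map each .txt file's stem to its non-empty lines (one option per line)."""
    result = {}
    for path in Path(directory).glob("*.txt"):
        lines = path.read_text(encoding="utf-8").splitlines()
        result[path.stem] = [line.strip() for line in lines if line.strip()]
    return result

def expand_dynamic(prompt, wildcards=None, rng=None):
    """Resolve __name__ wildcards and {a|b|c} choices with one random pick each."""
    if rng is None:
        rng = random.Random()
    wildcards = wildcards or {}

    def pick_wildcard(match):
        options = wildcards.get(match.group(1))
        # Unknown wildcards are left untouched rather than guessed.
        return rng.choice(options) if options else match.group(0)

    def pick_option(match):
        return rng.choice(match.group(1).split("|"))

    prompt = re.sub(r"__([A-Za-z0-9_-]+?)__", pick_wildcard, prompt)
    return re.sub(r"\{([^{}]*)\}", pick_option, prompt)

print(expand_dynamic("a {red|blue|green} car on a __weather__ street",
                     {"weather": ["sunny", "rainy", "snowy"]}))
```

Passing a seeded `random.Random` makes the selection reproducible, which helps when comparing generations.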
| Node | Use Case | Notes |
|---|---|---|
| `CLIPTextEncode` | Standard single-CLIP encoding | Works with all models |
| `CLIPTextEncodeSDXL` | SDXL dual-CLIP with separate G/L fields | Better SDXL control |
| `CLIPTextEncodeSD3` | SD3 triple-CLIP encoding | For SD3/SD3.5 models |
| `CLIPTextEncodeFlux` | Flux T5-based encoding | For Flux models |
| `ConditioningCombine` | Merge two conditionings | Stack different prompt aspects |
| `ConditioningSetArea` | Regional prompting | Apply conditioning to specific image areas |
| `ConditioningSetMask` | Mask-based conditioning | Apply prompt only where mask is active |
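As one example from the table, an API-format `CLIPTextEncodeSDXL` node with separate `text_g` and `text_l` fields might be built as follows. This is a sketch: `sdxl_encode_node` is a hypothetical helper, and the size and crop input names reflect my understanding of the core node, so verify them against your installed ComfyUI version.

```python
def sdxl_encode_node(clip_ref, text_g, text_l, width=1024, height=1024):
    """Build an API-format CLIPTextEncodeSDXL node dict (illustrative only)."""
    return {
        "class_type": "CLIPTextEncodeSDXL",
        "inputs": {
            "clip": clip_ref,      # e.g. ["4", 1]: CLIP output of a loader node
            "text_g": text_g,      # global description
            "text_l": text_l,      # local details
            "width": width,
            "height": height,
            "crop_w": 0,
            "crop_h": 0,
            "target_width": width,
            "target_height": height,
        },
    }

node = sdxl_encode_node(
    ["4", 1],
    text_g="a knight standing on a cliff at sunset, cinematic",
    text_l="ornate armor, engraved steel, red cape",
)
print(node["inputs"]["text_g"])
```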
- Contradictory emphasis such as `(bright:1.3) (dark:1.3)` confuses the model
- Using `embedding:name` without the `.safetensors` file installed causes errors
- `masterpiece, best quality` are meaningless for Flux; describe quality naturally