Use this skill when the user asks about GLM models, GLM-5, GLM-4.7, GLM-4.6, GLM-4.5, GLM-4V, ChatGLM, CogView, CogVideoX, z.ai model capabilities, model selection for different tasks, or comparing GLM models.
```shell
npx claudepluginhub nsheaps/ai-mktpl --plugin zai-glm
```

This skill uses the workspace's default tool permissions.
The GLM (General Language Model) family is developed by z.ai (formerly Zhipu AI / 智谱AI). These models support text generation, vision, code, embeddings, image generation, and video generation. All recent models are open-weight under MIT license.
Base URL: `https://api.z.ai/api/paas/v4/`

| Model | Architecture | Context | Key Features |
|---|---|---|---|
| glm-5 | ~745B MoE (44B active) | 200K in / 128K out | Agentic engineering, tool streaming, long-horizon tasks, MIT |
| glm-5-turbo | Same, optimized | 200K in / 128K out | Improved stability for long-chain agent tasks |
| glm-4.7 | ~400B MoE | 200K in / 128K out | Coding-focused, Preserved Thinking, Turn-level Thinking, MIT |
| glm-4.7-flash | Lightweight | Reduced | Free tier, lighter capability |
| glm-4.6 | 355B total | 200K | Strong code benchmarks, agent frameworks, MIT |
| glm-4.5 | 355B / 32B active | 128K | Hybrid reasoning (thinking/non-thinking modes), deep thinking |
| glm-4.5-x | Premium tier | 128K | Higher capability, premium pricing |
| glm-4.5-air | 106B / 12B active | 128K | Compact variant of GLM-4.5 |
| glm-4.5-flash | Lightweight | 128K | Free tier |
GLM-4.5 and later models support hybrid reasoning: toggle between deep thinking and instant response.

```json
{
  "model": "glm-4.7",
  "messages": [{ "role": "user", "content": "Solve this step by step" }],
  "thinking": { "type": "enabled" }
}
```
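A minimal Python sketch of toggling the thinking mode when building the request body (the helper function is illustrative, not part of an official SDK; `"disabled"` as the off value is an assumption to verify against the z.ai docs):

```python
import json

def build_chat_request(model: str, prompt: str, thinking: bool) -> str:
    """Build a chat-completions payload; "thinking" toggles deep reasoning."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # "enabled" requests deep thinking; "disabled" forces instant responses.
        "thinking": {"type": "enabled" if thinking else "disabled"},
    }
    return json.dumps(payload)

body = build_chat_request("glm-4.7", "Solve this step by step", thinking=True)
```

POST the resulting string to `chat/completions` with your API key in the `Authorization` header.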
GLM-5 supports streaming tool-call output (enable with `tool_stream: true`).

Vision models:

| Model | Parameters | Context | Description |
|---|---|---|---|
| glm-4.6v | 106B / 12B active | 128K | Vision understanding, function calling |
| glm-4.6v-flash | 9B | — | Free, open weights, commercial license |
| glm-4.5v | 106B VLM | — | Vision-language model |
```shell
curl "https://api.z.ai/api/paas/v4/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.6v",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
      ]
    }]
  }'
```
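For local files, a common pattern with OpenAI-style vision APIs is to inline the image as a base64 data URL in the `image_url` field instead of a public link. This sketch assumes the z.ai endpoint accepts data URLs (verify against the docs):

```python
import base64

def image_to_data_url(path: str, mime: str = "image/jpeg") -> str:
    """Read a local image and encode it as a data URL for the image_url field."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{b64}"
```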
Image, video, and OCR models:

| Model | Category | Description |
|---|---|---|
| glm-image | Image generation | Text-to-image (Jan 2026) |
| glm-ocr | OCR | Document and image OCR |
| cogview-3-plus | Image generation | High-quality text-to-image |
| cogvideox | Video generation | Text-to-video generation |
| cogvideox-flash | Video generation | Fast video generation |
Embedding models:

| Model | Dimensions | Description |
|---|---|---|
| embedding-3 | 2048 | General-purpose text embeddings |
| embedding-2 | 1024 | Previous-generation embeddings |
```shell
curl "https://api.z.ai/api/paas/v4/embeddings" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "embedding-3",
    "input": "What is machine learning?"
  }'
```
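For search, you rank documents by cosine similarity between their embeddings and the query embedding. A self-contained sketch with made-up 3-dimensional vectors (real embedding-3 vectors have 2048 dimensions and come back in the API response):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy vectors standing in for embedding-3 output.
query = [0.9, 0.1, 0.0]
docs = {"ml-intro": [0.8, 0.2, 0.1], "cooking": [0.0, 0.1, 0.9]}

# Most similar document first.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
```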
| Use Case | Recommended Model | Why |
|---|---|---|
| Agentic tasks | glm-5 | Tool streaming, long-horizon planning |
| Coding | glm-4.7 | Coding-focused, Preserved Thinking |
| Complex reasoning | glm-4.5 | Hybrid reasoning with deep thinking |
| General chat | glm-4.5-flash | Free, good quality |
| High throughput | glm-4.5-air | Compact, fast inference |
| Image understanding | glm-4.6v | Best vision model with function calling |
| Embeddings/search | embedding-3 | Latest generation |
| Image creation | glm-image | Latest generation (Jan 2026) |
| Budget-conscious | glm-4.5-flash | Free tier available |
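The recommendations above can be encoded as a simple lookup table (the mapping mirrors the table; the fallback to the free flash tier for unknown use cases is an assumption):

```python
# Use case -> recommended model, taken from the selection table above.
RECOMMENDED = {
    "agentic": "glm-5",
    "coding": "glm-4.7",
    "reasoning": "glm-4.5",
    "chat": "glm-4.5-flash",
    "throughput": "glm-4.5-air",
    "vision": "glm-4.6v",
    "embeddings": "embedding-3",
    "image-gen": "glm-image",
}

def pick_model(use_case: str) -> str:
    # Default to the free flash tier when the use case is unrecognized.
    return RECOMMENDED.get(use_case, "glm-4.5-flash")
```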
When using z.ai's Anthropic-compatible endpoint with Claude Code, map models to slots:
| Claude Code Slot | Recommended GLM Model | Rationale |
|---|---|---|
| Opus | glm-5 | Most capable, agentic |
| Sonnet | glm-4.7 | Strong coding, balanced cost |
| Haiku | glm-4.5-air | Fast, cost-effective |
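One way to wire this up is through Claude Code's environment variables. The endpoint path and the exact variable names below are assumptions to verify against z.ai's Claude Code integration guide:

```shell
# Point Claude Code at z.ai's Anthropic-compatible endpoint (path is an assumption).
export ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic"
export ANTHROPIC_AUTH_TOKEN="YOUR_API_KEY"

# Map model slots to GLM models per the table above.
export ANTHROPIC_MODEL="glm-5"                    # main (Opus/Sonnet) slot
export ANTHROPIC_SMALL_FAST_MODEL="glm-4.5-air"   # fast (Haiku) slot
```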
Pricing (approximate USD per 1M tokens):

| Model | Input | Output |
|---|---|---|
| glm-5 | ~$1.00 | ~$3.20 |
| glm-4.7 | $0.60 | $2.20 |
| glm-4.7-flash | Free | Free |
| glm-4.5 | ~$0.20 | ~$1.10 |
| glm-4.5-x | — | $8.90 |
| glm-4.5-flash | Free | Free |
| glm-4.6v | ~$0.14 | ~$0.41 |
| glm-4.6v-flash | Free | Free |
Prices are approximate; see docs.z.ai/guides/overview/pricing for current rates. The Batch API is available at 50% of standard cost. The flash variants (glm-4.5-flash, glm-4.7-flash, and glm-4.6v-flash) are free.
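A small calculator for estimating per-request cost from the table above (rates are hardcoded from the approximate figures, assuming per-1M-token pricing; verify against current rates):

```python
# Approximate $ per 1M tokens (input, output), from the pricing table above.
RATES = {"glm-5": (1.00, 3.20), "glm-4.7": (0.60, 2.20), "glm-4.5": (0.20, 1.10)}

def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  batch: bool = False) -> float:
    """Estimate USD cost for one request; Batch API runs at 50% of list price."""
    in_rate, out_rate = RATES[model]
    cost = (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    return cost / 2 if batch else cost
```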