Build with Claude Messages API using structured outputs for guaranteed JSON schema validation. Covers prompt caching (90% savings), streaming SSE, tool use, and model deprecations. Prevents 12 documented errors. Use when: building chatbots/agents, troubleshooting rate_limit_error, prompt caching issues, or streaming SSE parsing errors.
/plugin marketplace add jezweb/claude-skills
/plugin install jezweb-tooling-skills@jezweb/claude-skills
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Files: README.md, references/api-reference.md, references/prompt-caching-guide.md, references/rate-limits.md, references/tool-use-patterns.md, references/top-errors.md, references/vision-capabilities.md, rules/claude-api.md, scripts/check-versions.sh, templates/basic-chat.ts, templates/cloudflare-worker.ts, templates/error-handling.ts, templates/extended-thinking.ts, templates/nextjs-api-route.ts, templates/nodejs-example.ts, templates/package.json, templates/prompt-caching.ts, templates/streaming-chat.ts, templates/tool-use-advanced.ts, templates/tool-use-basic.ts
Package: @anthropic-ai/sdk@0.71.2
Breaking Changes: Oct 2025 - Claude 3.5/3.7 models retired, Nov 2025 - Structured outputs beta
Last Updated: 2026-01-09
Major Features:
Guaranteed JSON schema conformance - Claude's responses strictly follow your JSON schema with two modes:
JSON Outputs (output_format) - For data extraction and formatting:
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});
// Requests that pass a betas array go through the SDK's beta namespace
const message = await anthropic.beta.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Extract contact info: John Doe, john@example.com, 555-1234' }],
  betas: ['structured-outputs-2025-11-13'],
  output_format: {
    type: 'json_schema',
    // The schema sits directly under output_format (no name/strict wrapper)
    schema: {
      type: 'object',
      properties: {
        name: { type: 'string' },
        email: { type: 'string' },
        phone: { type: 'string' }
      },
      required: ['name', 'email', 'phone'],
      additionalProperties: false
    }
  }
});
// JSON text matching the schema (a max_tokens cutoff can still truncate it, so check stop_reason)
const first = message.content[0];
if (first.type !== 'text') throw new Error('Expected a text block');
const contact = JSON.parse(first.text);
console.log(contact.name); // "John Doe"
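The parsing step above can be made a little more defensive. The following is a minimal sketch, not part of the SDK: `ContentBlock`, `Contact`, `isContact`, and `parseContact` are hypothetical names, and the block type is a simplified local stand-in for the SDK's content block union. It looks for the first text block rather than assuming index 0, and re-checks the shape at runtime.

```typescript
// Simplified local stand-in for the SDK's content block union (assumption:
// only the fields used here are modelled).
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown };

interface Contact {
  name: string;
  email: string;
  phone: string;
}

// Runtime check mirroring the three required string fields of the schema.
function isContact(value: unknown): value is Contact {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.name === 'string' &&
    typeof v.email === 'string' &&
    typeof v.phone === 'string'
  );
}

// Finds the first text block, parses it as JSON, and validates the shape.
function parseContact(content: ContentBlock[]): Contact {
  const textBlock = content.find(
    (b): b is Extract<ContentBlock, { type: 'text' }> => b.type === 'text',
  );
  if (!textBlock) throw new Error('No text block in response');
  const parsed: unknown = JSON.parse(textBlock.text);
  if (!isContact(parsed)) throw new Error('Response does not match Contact shape');
  return parsed;
}
```

With schema-constrained output the runtime check should rarely fail; it mainly guards against truncated responses and against the schema and the TypeScript type drifting apart.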
Strict Tool Use (strict: true) - For validated function parameters:
// The betas array requires the SDK's beta namespace
const message = await anthropic.beta.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Get weather for San Francisco' }],
  betas: ['structured-outputs-2025-11-13'],
  tools: [{
    name: 'get_weather',
    description: 'Get current weather',
    input_schema: {
      type: 'object',
      properties: {
        location: { type: 'string' },
        unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
      },
      required: ['location'],
      additionalProperties: false
    },
    strict: true // ← Guarantees schema compliance
  }]
});
Requirements:
structured-outputs-2025-11-13 (via betas array)
Limitations:
minimum, maximum)
When to Use:
Retired (return errors):
Active Models (Nov 2025):
| Model | ID | Context | Best For | Cost (per MTok) |
|---|---|---|---|---|
| Claude Sonnet 4.5 | claude-sonnet-4-5-20250929 | 200k | Balanced performance | $3/$15 (in/out) |
| Claude Opus 4 | claude-opus-4-20250514 | 200k | Highest capability | $15/$75 |
| Claude Haiku 4.5 | claude-haiku-4-5-20251001 | 200k | Near-frontier, fast | $1/$5 |
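To compare models on cost, the per-MTok prices in the table can be turned into a quick estimate. This is a rough sketch: `PRICES`, `ModelLabel`, and `estimateCostUSD` are hypothetical names, the keys are illustrative labels rather than API model IDs, and the figures are copied from the table above, so they should be re-checked against current pricing.

```typescript
// Prices in USD per million tokens, copied from the table above.
// The keys are hypothetical labels, not API model IDs.
const PRICES = {
  'sonnet-4.5': { input: 3, output: 15 },
  'opus-4': { input: 15, output: 75 },
  'haiku-4.5': { input: 1, output: 5 },
} as const;

type ModelLabel = keyof typeof PRICES;

// Estimates the cost of one request, ignoring caching and batch discounts.
function estimateCostUSD(
  model: ModelLabel,
  inputTokens: number,
  outputTokens: number,
): number {
  const price = PRICES[model];
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}
```

For example, 10,000 input tokens and 2,000 output tokens on the Sonnet 4.5 row comes to roughly $0.06, while the same request on the Opus 4 row is about five times that.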
Clear Thinking Blocks - Automatic thinking block cleanup:
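// clear_thinking_20251015 is a context-editing strategy rather than a beta flag;
// it is enabled through the context-management beta and needs thinking turned on
const message = await anthropic.beta.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 4096,
  thinking: { type: 'enabled', budget_tokens: 2048 },
  messages: [{ role: 'user', content: 'Solve complex problem' }],
  betas: ['context-management-2025-06-27'],
  context_management: { edits: [{ type: 'clear_thinking_20251015' }] }
});
// Thinking blocks automatically managed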
Pre-built skills for Office files (PowerPoint, Excel, Word, PDF):
// The betas array requires the SDK's beta namespace
const message = await anthropic.beta.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Analyze this spreadsheet' }],
  betas: ['skills-2025-10-02'],
  // Requires code execution tool enabled
});
📚 Docs: https://platform.claude.com/docs/en/build-with-claude/structured-outputs
CRITICAL Error Pattern - Errors occur AFTER initial 200 response:
const stream = anthropic.messages.stream({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello' }],
});
stream
  .on('error', (error) => {
    // Error can occur AFTER stream starts
    console.error('Stream error:', error);
    // Implement fallback or retry logic
  })
  .on('abort', (error) => {
    console.warn('Stream aborted:', error);
  });
Why this matters: Unlike regular HTTP errors, SSE errors can arrive mid-stream, after the 200 OK has already been sent, so they never show up as a failed status code and only surface through error event listeners.
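To make the mid-stream failure mode concrete, here is a minimal sketch of what an SSE consumer has to do when it cannot rely on the SDK helpers. `SseEvent`, `SseBuffer`, `parseEvent`, and `isStreamError` are hypothetical names, and the parser is deliberately simplified (it assumes `\n` line endings and ignores comment and `id:` lines). It buffers partial chunks until a blank line completes an event, then flags `error` events that arrive after the stream has started.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Parses one complete event (the text between two blank lines).
function parseEvent(raw: string): SseEvent | null {
  let event = 'message';
  const dataLines: string[] = [];
  for (const line of raw.split('\n')) {
    if (line.startsWith('event:')) event = line.slice(6).trim();
    else if (line.startsWith('data:')) dataLines.push(line.slice(5).trim());
  }
  return dataLines.length > 0 ? { event, data: dataLines.join('\n') } : null;
}

class SseBuffer {
  private pending = '';

  // Accepts a raw text chunk and returns only the events that are complete;
  // any trailing partial event stays buffered until the next chunk.
  push(chunk: string): SseEvent[] {
    this.pending += chunk;
    const events: SseEvent[] = [];
    let boundary = this.pending.indexOf('\n\n');
    while (boundary !== -1) {
      const raw = this.pending.slice(0, boundary);
      this.pending = this.pending.slice(boundary + 2);
      const parsed = parseEvent(raw);
      if (parsed) events.push(parsed);
      boundary = this.pending.indexOf('\n\n');
    }
    return events;
  }
}

// True for an error event delivered inside an otherwise successful stream.
function isStreamError(e: SseEvent): boolean {
  return e.event === 'error';
}
```

In practice the SDK's `messages.stream()` helper does this buffering for you; the sketch only illustrates why a chunk boundary in the middle of an event is not itself an error, while an `error` event is.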
CRITICAL Rule - cache_control MUST be on the LAST block of the content you want cached (everything up to and including that block becomes the cached prefix):
const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  system: [
    {
      type: 'text',
      text: 'System instructions...',
    },
    {
      type: 'text',
      text: LARGE_CODEBASE, // 50k tokens
      cache_control: { type: 'ephemeral' }, // ← MUST be on LAST block
    },
  ],
  messages: [{ role: 'user', content: 'Explain auth module' }],
});
// Monitor cache usage
console.log('Cache reads:', message.usage.cache_read_input_tokens);
console.log('Cache writes:', message.usage.cache_creation_input_tokens);
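The two usage fields logged above can be turned into a hit ratio and an approximate savings figure. This is a sketch under stated assumptions: `CacheUsage`, `summarizeCache`, and the two multiplier constants are hypothetical names, cache reads are assumed to be billed at 0.1x the base input price (the "90% savings" figure) and 5-minute cache writes at 1.25x, and `input_tokens` is assumed to count only uncached input.

```typescript
interface CacheUsage {
  input_tokens: number;
  cache_read_input_tokens?: number | null;
  cache_creation_input_tokens?: number | null;
}

// Assumed multipliers relative to the base input price.
const READ_MULTIPLIER = 0.1;
const WRITE_MULTIPLIER = 1.25;

// Returns the share of input served from cache and the fraction of input cost
// saved compared with sending every token uncached.
function summarizeCache(usage: CacheUsage): { hitRatio: number; inputSavings: number } {
  const reads = usage.cache_read_input_tokens ?? 0;
  const writes = usage.cache_creation_input_tokens ?? 0;
  const total = usage.input_tokens + reads + writes;
  if (total === 0) return { hitRatio: 0, inputSavings: 0 };
  const billedEquivalent =
    usage.input_tokens + reads * READ_MULTIPLIER + writes * WRITE_MULTIPLIER;
  return {
    hitRatio: reads / total,
    inputSavings: 1 - billedEquivalent / total,
  };
}
```

A negative `inputSavings` on the first request is expected, since that request pays the write premium; a hit ratio that stays at zero on later requests suggests the cached prefix is changing between calls or the breakpoint is misplaced.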
Minimum requirements:
CRITICAL Patterns:
Strict Tool Use (with structured outputs):
// The betas array requires the SDK's beta namespace
const message = await anthropic.beta.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  betas: ['structured-outputs-2025-11-13'],
  tools: [{
    name: 'get_weather',
    description: 'Get weather data',
    input_schema: {
      type: 'object',
      properties: {
        location: { type: 'string' },
        unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
      },
      required: ['location'],
      additionalProperties: false
    },
    strict: true // ← Guarantees schema compliance
  }],
  messages: [{ role: 'user', content: 'Weather in NYC?' }]
});
Tool Result Pattern - tool_use_id MUST match:
const toolResults = [];
for (const block of response.content) {
  if (block.type === 'tool_use') {
    const result = await executeToolFunction(block.name, block.input);
    toolResults.push({
      type: 'tool_result',
      tool_use_id: block.id, // ← MUST match tool_use block id
      content: JSON.stringify(result),
    });
  }
}
messages.push({
  role: 'user',
  content: toolResults,
});
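Because a mismatched or missing `tool_use_id` is rejected by the API, it can help to check the pairing before sending the next request. A minimal sketch, with `Block` as a simplified local stand-in for the SDK's block types and `findUnansweredToolUses` as a hypothetical helper name:

```typescript
type Block =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown }
  | { type: 'tool_result'; tool_use_id: string; content: string; is_error?: boolean };

// Returns the ids of tool_use blocks in the assistant turn that have no
// tool_result with the same tool_use_id in the user turn that follows.
function findUnansweredToolUses(assistantContent: Block[], userContent: Block[]): string[] {
  const answered = new Set<string>();
  for (const block of userContent) {
    if (block.type === 'tool_result') answered.add(block.tool_use_id);
  }
  const missing: string[] = [];
  for (const block of assistantContent) {
    if (block.type === 'tool_use' && !answered.has(block.id)) missing.push(block.id);
  }
  return missing;
}
```

An empty array means every tool call has a matching result; anything else is worth logging or throwing on before the request goes out.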
Error Handling - Handle tool execution failures:
try {
  const result = await executeToolFunction(block.name, block.input);
  toolResults.push({
    type: 'tool_result',
    tool_use_id: block.id,
    content: JSON.stringify(result),
  });
} catch (error) {
  // Return error to Claude for handling
  // (caught values are typed unknown in TypeScript, so narrow before reading .message)
  const reason = error instanceof Error ? error.message : String(error);
  toolResults.push({
    type: 'tool_result',
    tool_use_id: block.id,
    is_error: true,
    content: `Tool execution failed: ${reason}`,
  });
}
CRITICAL Rules:
Format validation - Check before encoding:
const validFormats = ['image/jpeg', 'image/png', 'image/webp', 'image/gif'];
if (!validFormats.includes(mimeType)) {
  throw new Error(`Unsupported format: ${mimeType}`);
}
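The format check above can be combined with building the image content block itself. This is a small sketch rather than SDK code: `VALID_IMAGE_TYPES`, `ImageBlock`, `isValidImageType`, and `toImageBlock` are hypothetical names. It also strips a `data:` URL prefix, on the assumption that the caller may pass a data URL while the API expects the bare base64 payload.

```typescript
const VALID_IMAGE_TYPES = ['image/jpeg', 'image/png', 'image/webp', 'image/gif'] as const;
type ImageMediaType = (typeof VALID_IMAGE_TYPES)[number];

interface ImageBlock {
  type: 'image';
  source: { type: 'base64'; media_type: ImageMediaType; data: string };
}

// Narrows an arbitrary MIME string to one of the four supported types.
function isValidImageType(mimeType: string): mimeType is ImageMediaType {
  return (VALID_IMAGE_TYPES as readonly string[]).includes(mimeType);
}

// Validates the format and returns a base64 image content block.
function toImageBlock(mimeType: string, base64Data: string): ImageBlock {
  if (!isValidImageType(mimeType)) {
    throw new Error(`Unsupported format: ${mimeType}`);
  }
  // Drop a leading "data:<type>;base64," prefix if one is present.
  const data = base64Data.replace(/^data:[^;]+;base64,/, '');
  return { type: 'image', source: { type: 'base64', media_type: mimeType, data } };
}
```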
⚠️ Model Compatibility:
CRITICAL:
max_tokens (thinking consumes tokens)
CRITICAL Pattern - Respect retry-after header with exponential backoff:
async function makeRequestWithRetry<T>(
  requestFn: () => Promise<T>,
  maxRetries = 3,
  baseDelay = 1000
): Promise<T> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await requestFn();
    } catch (error) {
      // Only rate-limit errors are retried here; everything else is rethrown
      if (!(error instanceof Anthropic.APIError) || error.status !== 429) {
        throw error;
      }
      // CRITICAL: Use retry-after header (in seconds) if present
      const retryAfter = Number(error.headers?.get('retry-after'));
      const delay = Number.isFinite(retryAfter) && retryAfter > 0
        ? retryAfter * 1000
        : baseDelay * Math.pow(2, attempt);
      console.warn(`Rate limited. Retrying in ${delay}ms...`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw new Error('Max retries exceeded');
}
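The delay calculation is the part of this pattern most worth testing in isolation. Below is a minimal sketch of it as a pure function; `retryDelayMs` and its `maxDelayMs` cap are hypothetical additions, and it only understands a retry-after value given in seconds (not an HTTP date).

```typescript
// Picks the wait before the next attempt: the server's retry-after hint when
// it is a usable number of seconds, otherwise capped exponential backoff.
function retryDelayMs(
  retryAfter: string | null | undefined,
  attempt: number,
  baseDelayMs = 1000,
  maxDelayMs = 60_000,
): number {
  const hasHint = retryAfter != null && retryAfter.trim() !== '';
  const seconds = hasHint ? Number(retryAfter) : NaN;
  if (Number.isFinite(seconds) && seconds >= 0) {
    // The server hint is used as given and is not capped.
    return seconds * 1000;
  }
  return Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
}
```

Adding random jitter to the backoff branch is a common refinement when many clients retry at once; it is left out here so the function stays deterministic.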
Rate limit headers:
anthropic-ratelimit-requests-limit - Total RPM allowed
anthropic-ratelimit-requests-remaining - Remaining requests
anthropic-ratelimit-requests-reset - Reset timestamp
Common Error Codes:
| Status | Error Type | Cause | Solution |
|---|---|---|---|
| 400 | invalid_request_error | Bad parameters | Validate request body |
| 401 | authentication_error | Invalid API key | Check env variable |
| 403 | permission_error | No access to feature | Check account tier |
| 404 | not_found_error | Invalid endpoint | Check API version |
| 429 | rate_limit_error | Too many requests | Implement retry logic |
| 500 | api_error | Internal error | Retry with backoff |
| 529 | overloaded_error | System overloaded | Retry later |
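The table can be encoded as a lookup so that retry decisions are made in one place. A minimal sketch: `ERROR_TABLE` and `classifyStatus` are hypothetical names, the retryable flags follow the Solution column above (429, 500, and 529 are retried), and treating unlisted 5xx codes as retryable is an assumption of this sketch.

```typescript
interface ErrorInfo {
  type: string;
  retryable: boolean;
}

// Status code to error type, mirroring the table above.
const ERROR_TABLE: Record<number, ErrorInfo> = {
  400: { type: 'invalid_request_error', retryable: false },
  401: { type: 'authentication_error', retryable: false },
  403: { type: 'permission_error', retryable: false },
  404: { type: 'not_found_error', retryable: false },
  429: { type: 'rate_limit_error', retryable: true },
  500: { type: 'api_error', retryable: true },
  529: { type: 'overloaded_error', retryable: true },
};

// Looks up a status code; unlisted 5xx codes are assumed to be retryable.
function classifyStatus(status: number): ErrorInfo {
  return ERROR_TABLE[status] ?? { type: 'unknown', retryable: status >= 500 };
}
```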
CRITICAL:
retry-after header on 429 errors
This skill prevents 12 documented issues:
Error: 429 Too Many Requests: Number of request tokens has exceeded your per-minute rate limit
Source: https://docs.claude.com/en/api/errors
Why It Happens: Exceeding RPM, TPM, or daily token limits
Prevention: Implement exponential backoff with retry-after header respect
Error: Incomplete chunks, malformed SSE events
Source: Common SDK issue (GitHub #323)
Why It Happens: Network interruptions, improper event parsing
Prevention: Use SDK stream helpers, implement error event listeners
Error: High costs despite cache_control blocks
Source: https://platform.claude.com/docs/en/build-with-claude/prompt-caching
Why It Happens: cache_control placed incorrectly (must be at END)
Prevention: Always place cache_control on LAST block of cacheable content
Error: invalid_request_error: tools[0].input_schema is invalid
Source: API validation errors
Why It Happens: Invalid JSON Schema, missing required fields
Prevention: Validate schemas with JSON Schema validator, test thoroughly
Error: invalid_request_error: image source must be base64 or url
Source: API documentation
Why It Happens: Incorrect encoding, unsupported formats
Prevention: Validate format (JPEG/PNG/WebP/GIF), proper base64 encoding
Error: Unexpected high costs, context window exceeded
Source: Token counting differences
Why It Happens: Not accounting for special tokens, formatting
Prevention: Use official token counter, monitor usage headers
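Error: System prompt ignored or overridden
Source: API behavior
Why It Happens: System instructions sent inside the messages array instead of the top-level system parameter
Prevention: ALWAYS pass the system prompt through the top-level system parameter, separate from messages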
Error: invalid_request_error: messages: too many tokens
Source: Model limits
Why It Happens: Long conversations without pruning
Prevention: Implement message history pruning, use caching
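History pruning can be as simple as keeping the most recent turns that fit a token budget. The sketch below is illustrative only: `ChatMessage`, `estimateTokens`, and `pruneHistory` are hypothetical names, the four-characters-per-token estimate is a rough heuristic rather than the official token counter, and message content is simplified to plain strings.

```typescript
interface ChatMessage {
  role: 'user' | 'assistant';
  content: string;
}

// Rough heuristic only (assumption): about four characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Keeps the newest messages that fit within maxTokens, then drops any leading
// assistant turns so the pruned history still opens with a user message.
function pruneHistory(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i].content);
    if (used + cost > maxTokens) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  while (kept.length > 0 && kept[0].role !== 'user') kept.shift();
  return kept;
}
```

For production use, the estimate would be replaced with real token counts, and tool_use/tool_result pairs would need to be kept or dropped together.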
Error: No thinking blocks in response
Source: Model capabilities
Why It Happens: Using retired/deprecated models (3.5/3.7 Sonnet)
Prevention: Only use extended thinking with Claude Sonnet 4.5 or Claude Opus 4
Error: CORS errors, security vulnerability
Source: Security best practices
Why It Happens: Making API calls from browser
Prevention: Server-side only, use environment variables
Error: Lower limits than expected
Source: Account tier system
Why It Happens: Not understanding tier progression
Prevention: Check Console for current tier, auto-scales with usage
Error: invalid_request_error: unknown parameter: batches
Source: Beta API requirements
Why It Happens: Missing anthropic-beta header
Prevention: Include anthropic-beta: message-batches-2024-09-24 header
Latest: @anthropic-ai/sdk@0.71.2
{
  "dependencies": {
    "@anthropic-ai/sdk": "^0.71.2"
  },
  "devDependencies": {
    "@types/node": "^20.0.0",
    "typescript": "^5.3.0",
    "zod": "^3.23.0"
  }
}
Token Efficiency:
Errors prevented: 12 documented issues with exact solutions
Key value: Structured outputs (v0.69.0+), model deprecations (Oct 2025), prompt caching edge cases, streaming error patterns, rate limit retry logic
Last verified: 2026-01-09 | Skill version: 2.0.1 | Changes: Updated SDK version to 0.71.2