Complete guide to extended thinking modes and thinking budget configuration.
From the `claude-code-expert` plugin. Install: `npx claudepluginhub markus41/claude --plugin claude-code-expert`

This skill uses the workspace's default tool permissions.
Extended thinking lets Claude reason through a problem before it responds; "ultrathink" is the name of its highest level. The reasoning consumes additional tokens, billed as output tokens, and tends to improve quality on complex tasks.
Use these phrases in your prompts to activate thinking levels:
| Phrase | Thinking Tokens | Approx Cost |
|---|---|---|
| think | ~4,000 tokens | ~$0.06 |
| think hard / megathink | ~10,000 tokens | ~$0.15 |
| ultrathink | ~32,000 tokens | ~$0.48 |
Example: "ultrathink about how to refactor the auth module"
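The cost column follows from simple arithmetic on the token counts. The sketch below reproduces it under one assumption: a rate of $15 per million output tokens, which is inferred from the table (4,000 tokens for about $0.06) and is not a quoted price. The helper name `estimateThinkingCostUsd` is hypothetical.

```typescript
// Assumption: $15 per million output tokens, inferred from the table above
// (4,000 tokens ≈ $0.06). Check current pricing for the model you use.
const ASSUMED_USD_PER_MILLION_OUTPUT_TOKENS = 15;

// Approximate thinking tokens per trigger phrase, copied from the table.
const THINKING_TOKENS: Record<string, number> = {
  "think": 4_000,
  "think hard": 10_000,
  "megathink": 10_000,
  "ultrathink": 32_000,
};

// Hypothetical helper: approximate cost of the thinking portion of one request.
function estimateThinkingCostUsd(phrase: string): number {
  const tokens = THINKING_TOKENS[phrase];
  if (tokens === undefined) {
    throw new Error(`Unknown thinking phrase: ${phrase}`);
  }
  return (tokens / 1_000_000) * ASSUMED_USD_PER_MILLION_OUTPUT_TOKENS;
}

console.log(estimateThinkingCostUsd("think").toFixed(2));      // "0.06"
console.log(estimateThinkingCostUsd("ultrathink").toFixed(2)); // "0.48"
```

These figures cover only the thinking tokens; the visible answer and the input tokens are charged on top.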
- `Option+T`: toggle extended thinking on/off in interactive mode
- `CLAUDE_CODE_EFFORT_LEVEL=high`: set the effort level (`low`, `medium`, or `high`)
Standard reasoning — Claude thinks internally as needed.
More deliberate reasoning with configurable thinking budget.
Maximum reasoning depth for the most complex problems. Best for architecture decisions, debugging complex issues, and novel problem-solving.
```typescript
import Anthropic from "@anthropic-ai/sdk";

// Reads the API key from the ANTHROPIC_API_KEY environment variable.
const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 16000, // Must be larger than budget_tokens
  thinking: {
    type: "enabled",
    budget_tokens: 10000 // Tokens allocated for thinking
  },
  messages: [
    { role: "user", content: "Solve this complex architecture problem..." }
  ]
});

// Access thinking content
for (const block of response.content) {
  if (block.type === "thinking") {
    console.log("Thinking:", block.thinking);
  } else if (block.type === "text") {
    console.log("Response:", block.text);
  }
}
```
| Budget | Use Case |
|---|---|
| 2,000-5,000 | Simple analysis, code review |
| 5,000-10,000 | Architecture decisions, debugging |
| 10,000-20,000 | Complex multi-step reasoning |
| 20,000+ | Deep research, novel problem solving |
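One way to apply the table is a small lookup that turns a task category into request parameters. The sketch below is illustrative only: the category names, the `buildThinkingParams` helper, and the 32,000 figure for the open-ended "20,000+" row are assumptions, not part of any SDK. It relies on two constraints from the Messages API documentation as I understand it: `budget_tokens` has a 1,024-token minimum and must be smaller than `max_tokens`.

```typescript
// Hypothetical task categories, one per row of the budget table.
type TaskKind = "simple" | "architecture" | "multi_step" | "deep_research";

// Upper end of each range in the table.
const BUDGET_BY_TASK: Record<TaskKind, number> = {
  simple: 5_000,
  architecture: 10_000,
  multi_step: 20_000,
  deep_research: 32_000, // Assumption: the table only says "20,000+"
};

// Documented minimum thinking budget (verify against current API docs).
const MIN_BUDGET_TOKENS = 1_024;

interface ThinkingParams {
  max_tokens: number;
  thinking: { type: "enabled"; budget_tokens: number };
}

// Builds the thinking-related fields of a request, leaving room for the answer.
function buildThinkingParams(task: TaskKind, answerTokens = 4_000): ThinkingParams {
  if (!Number.isInteger(answerTokens) || answerTokens < 1) {
    throw new RangeError("answerTokens must be a positive integer");
  }
  const budget = Math.max(BUDGET_BY_TASK[task], MIN_BUDGET_TOKENS);
  return {
    // max_tokens covers thinking plus the visible answer, so it stays above the budget.
    max_tokens: budget + answerTokens,
    thinking: { type: "enabled", budget_tokens: budget },
  };
}

console.log(buildThinkingParams("architecture"));
```

The result can be spread into a `client.messages.create({ model, messages, ...params })` call. The sketch does not check the model's maximum output limit, so very large budgets may still be rejected.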
```typescript
const stream = await client.messages.stream({
  model: "claude-sonnet-4-6",
  max_tokens: 16000,
  thinking: {
    type: "enabled",
    budget_tokens: 10000
  },
  messages: [{ role: "user", content: "..." }]
});

for await (const event of stream) {
  // A new content block begins; thinking blocks arrive before text blocks.
  if (event.type === "content_block_start") {
    if (event.content_block.type === "thinking") {
      console.log("--- Thinking started ---");
    }
  }
  // Incremental output: thinking deltas and text deltas use different fields.
  if (event.type === "content_block_delta") {
    if (event.delta.type === "thinking_delta") {
      process.stdout.write(event.delta.thinking);
    } else if (event.delta.type === "text_delta") {
      process.stdout.write(event.delta.text);
    }
  }
}
```
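The streaming loop above prints deltas as they arrive. If the thinking and the answer need to be kept apart, for logging or display, the same events can be folded into two buffers. The sketch below is a minimal version under stated assumptions: `StreamEventLike` is a simplified, hypothetical stand-in for the SDK's event types, `applyEvent` is an illustrative name, and the sample events are hand-written rather than real API output.

```typescript
// Simplified, hypothetical stand-in for the SDK's stream event types.
interface StreamEventLike {
  type: string;
  delta?: { type?: string; thinking?: string; text?: string };
}

interface Transcript {
  thinking: string;
  text: string;
}

// Folds one event into the running transcript; unrelated events are ignored.
function applyEvent(acc: Transcript, event: StreamEventLike): Transcript {
  if (event.type !== "content_block_delta" || !event.delta) return acc;
  if (event.delta.type === "thinking_delta") {
    return { ...acc, thinking: acc.thinking + (event.delta.thinking ?? "") };
  }
  if (event.delta.type === "text_delta") {
    return { ...acc, text: acc.text + (event.delta.text ?? "") };
  }
  return acc;
}

// Hand-written sample events, not real API output.
const sampleEvents: StreamEventLike[] = [
  { type: "content_block_start" },
  { type: "content_block_delta", delta: { type: "thinking_delta", thinking: "Compare options. " } },
  { type: "content_block_delta", delta: { type: "thinking_delta", thinking: "Pick B." } },
  { type: "content_block_stop" },
  { type: "content_block_delta", delta: { type: "text_delta", text: "Use option B." } },
];

const transcript = sampleEvents.reduce(applyEvent, { thinking: "", text: "" });
console.log(transcript.thinking); // "Compare options. Pick B."
console.log(transcript.text);     // "Use option B."
```

Inside a real `for await` loop, the same function could be called once per event; a type cast may be needed because the SDK's event union is stricter than this simplified shape.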
Within Claude Code sessions, extended thinking is managed automatically based on:
| Model | Extended Thinking |
|---|---|
| Claude Opus 4.6 | Full support |
| Claude Sonnet 4.6 | Full support |
| Claude Haiku 4.5 | Limited support |
- `/cost` to monitor thinking token usage
- `/compact` preserves thinking conclusions but drops raw thinking

```json
{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...\n1. First consideration...\n2. Second consideration..."
    },
    {
      "type": "text",
      "text": "Based on my analysis, here's what I recommend..."
    }
  ],
  "usage": {
    "input_tokens": 500,
    "output_tokens": 3000,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0
  }
}
```
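A response of this shape can be split into its reasoning and its answer with a few lines of code. The sketch below assumes only the fields shown in the example above; `MessageLike` and `splitResponse` are hypothetical names, and the sample object is abbreviated from that example. Note that `usage.output_tokens` counts thinking tokens together with the visible answer, so it cannot be used on its own to tell the two apart.

```typescript
// Minimal structural types covering only the fields in the example response.
interface ContentBlockLike {
  type: string;
  thinking?: string;
  text?: string;
}

interface MessageLike {
  content: ContentBlockLike[];
  usage: { input_tokens: number; output_tokens: number };
}

// Hypothetical helper: separates reasoning from the answer and reports usage.
function splitResponse(message: MessageLike) {
  const thinking = message.content
    .filter((block) => block.type === "thinking")
    .map((block) => block.thinking ?? "")
    .join("\n");
  const text = message.content
    .filter((block) => block.type === "text")
    .map((block) => block.text ?? "")
    .join("\n");
  return { thinking, text, outputTokens: message.usage.output_tokens };
}

// Abbreviated copy of the example response above.
const sampleMessage: MessageLike = {
  content: [
    { type: "thinking", thinking: "Let me analyze this step by step..." },
    { type: "text", text: "Based on my analysis, here's what I recommend..." },
  ],
  usage: { input_tokens: 500, output_tokens: 3000 },
};

const parts = splitResponse(sampleMessage);
console.log(parts.text);         // "Based on my analysis, here's what I recommend..."
console.log(parts.outputTokens); // 3000
```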