Use when diagnosing `openai_harmony.HarmonyError` or gpt-oss tool calling failures with vLLM. Identifies whether errors originate from the vLLM server or the client, maps token-mismatch messages to known GitHub issues, and provides configuration fixes for `--tool-call-parser` and model files.
Reference files:

- reference/known-issues.md
- reference/model-updates.md
- reference/tool-calling-setup.md

Invoke this skill when you encounter:
- `openai_harmony.HarmonyError` messages in any context

**IMPORTANT:** `openai_harmony.HarmonyError` messages originate from the vLLM server, NOT from client applications (such as llama-stack or LangChain).
Check the error origin:
- If the traceback contains `openai_harmony.HarmonyError`, it's from vLLM's serving layer, not the client.

Correct investigation path:
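To confirm the origin, search the server's captured output rather than the client's logs. A minimal sketch — the `vllm.log` path is an assumption (point `grep` at wherever your server output actually goes), and the sample lines below merely stand in for real server output:

```shell
# Sample lines standing in for captured vLLM server output; in practice,
# the log path depends on how you launched the server.
printf '%s\n' \
  'INFO:     Application startup complete.' \
  'openai_harmony.HarmonyError: Unexpected token 12606 while expecting start token 200006' \
  > vllm.log

# A match here confirms the error is raised server-side, not by the client.
grep -n "openai_harmony.HarmonyError" vllm.log
```

If the string only appears in the client's traceback and never in the server output, the problem is more likely in the client's request handling.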
Error pattern: `Unexpected token X while expecting start token Y`

Example: `Unexpected token 12606 while expecting start token 200006`
Meaning:
Known Issues:
Fixes:
Symptoms:

- Empty `tool_calls=[]` arrays

Root Causes:
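A quick way to confirm the symptom is to inspect a saved response body. The sketch below builds a sample payload for illustration; in practice, point `jq` at your own saved response (the `response.json` path is an assumption, and the payload follows the OpenAI chat-completions schema):

```shell
# Sample response standing in for a saved chat-completions reply whose
# tool_calls array came back empty.
cat > response.json <<'EOF'
{"choices":[{"message":{"role":"assistant","content":"...","tool_calls":[]}}]}
EOF

# Count tool calls in the first choice; 0 means the model made no tool call.
jq '.choices[0].message.tool_calls // [] | length' response.json
```

A count of 0 on a request that should have triggered a tool call points at the server-side parser configuration below rather than at the client.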
Configuration Requirements:

The vLLM server must be started with:

```shell
--tool-call-parser openai --enable-auto-tool-choice
```
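Putting the flags together, a launch command might look like the sketch below; the model name and port are assumptions for illustration, not a prescribed deployment:

```shell
# Example launch (sketch): adjust model and port for your deployment.
vllm serve openai/gpt-oss-20b \
  --tool-call-parser openai \
  --enable-auto-tool-choice \
  --port 8000
```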
For the demo tool server:

```shell
--tool-server demo
```

For MCP tool servers:

```shell
--tool-server ip-1:port-1,ip-2:port-2
```

**Important:** Only `tool_choice='auto'` is supported.
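With the server configured, a tool call can be exercised end to end. A hedged sketch of a chat-completions request — the model name, port, and tool schema are assumptions for illustration:

```shell
# Exercise tool calling against a local vLLM server (sketch: model name,
# port, and the get_weather tool are illustrative assumptions).
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-20b",
    "messages": [{"role": "user", "content": "What is the weather in Boston?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }],
    "tool_choice": "auto"
  }'
```

A correctly configured server should return a non-empty `tool_calls` array for a request like this.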
Diagnostic steps:

1. Identify the exact error message
2. Search vLLM GitHub issues for that message
3. Check the model configuration
4. Review the server configuration
5. Check the vLLM version
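Searching vLLM's issue tracker for the exact message can be done from the terminal. A sketch assuming the GitHub CLI (`gh`) is installed and authenticated:

```shell
# Search vLLM's issue tracker for the error text (requires gh auth).
gh search issues "Unexpected token while expecting start token" \
  --repo vllm-project/vllm
```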
Check vLLM server health:

```shell
curl http://localhost:8000/health
```

List available models:

```shell
curl http://localhost:8000/v1/models
```

Check the installed vLLM version:

```shell
pip show vllm
```
For detailed information, see the reference files listed above.

After implementing fixes, restart the vLLM server and re-run the failing request to confirm the error is resolved.

If errors persist, search or file an issue on the vLLM GitHub tracker, including the exact error message and your server configuration.