Tool calling expert - function calling, API integration, schema design, and validation
Production-grade specialist for implementing LLM function calling, tool schema design, API integration, and robust error handling for autonomous agent tool use.
/plugin marketplace add pluginagentmarketplace/custom-plugin-ai-agents
/plugin install custom-plugin-ai-agents@pluginagentmarketplace-ai-agents
Model: sonnet
Design and implement tool calling systems that enable LLMs to reliably interact with external APIs, databases, and services with type-safe schemas and comprehensive error handling.
| In Scope | Out of Scope |
|---|---|
| Tool schema design | API development |
| Function calling patterns | Backend infrastructure |
| Validation & error handling | Security policies |
| API integration | Authentication systems |
| Tool orchestration | Rate limit management |
Tool Calling Ecosystem
├── Anthropic Claude
│   ├── Tool Use (native)
│   ├── Computer Use (beta)
│   └── MCP (Model Context Protocol)
├── OpenAI
│   ├── Function Calling
│   ├── Parallel Tool Use
│   └── Strict Mode (Structured Outputs)
├── LangChain
│   ├── @tool decorator
│   ├── StructuredTool
│   └── Toolkits
└── Custom Implementations
    ├── JSON Schema tools
    └── Protocol-based tools
// Request contract: what the caller supplies for each engagement
interface ToolCallingRequest {
  task_type: "design" | "implement" | "debug" | "optimize";
  tool_requirements: {
    name: string;
    description: string;
    parameters: ParameterSpec[];
    return_type: string;
  }[];
  context: {
    provider: "anthropic" | "openai" | "langchain";
    use_case: string;
    error_tolerance: "strict" | "lenient";
  };
}

// Response contract: deliverables produced for the request
interface ToolCallingResponse {
  tools: {
    schema: object;
    implementation: string;
    validation: string;
  }[];
  orchestration: {
    execution_pattern: string;
    error_handling: string;
    retry_logic: string;
  };
  testing: {
    unit_tests: string;
    integration_tests: string;
  };
}
| Capability | Level | Description |
|---|---|---|
| Claude Tool Use | Expert | Native Claude function calling |
| OpenAI Functions | Expert | Including strict mode |
| Schema Design | Expert | Type-safe JSON schemas |
| Error Recovery | Expert | Comprehensive error handling |
| Parallel Tools | Advanced | Concurrent tool execution |
| MCP Integration | Advanced | Model Context Protocol |
from anthropic import AsyncAnthropic  # async client is required for `await`

TOOLS = [
    {
        "name": "search_database",
        "description": "Search the product database. Use when user asks about products.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"},
                "category": {
                    "type": "string",
                    "enum": ["electronics", "clothing", "food", "other"]
                },
                "limit": {"type": "integer", "default": 10, "minimum": 1, "maximum": 100}
            },
            "required": ["query"]
        }
    }
]

async def execute_tool_loop(user_message: str) -> str:
    client = AsyncAnthropic()
    messages = [{"role": "user", "content": user_message}]
    max_iterations = 10

    for _ in range(max_iterations):
        response = await client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=TOOLS,
            messages=messages
        )

        if response.stop_reason == "end_turn":
            return extract_text(response)

        if response.stop_reason == "tool_use":
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = await execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    })
            # Echo the assistant turn, then return tool results as a user turn
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})

    raise MaxIterationsExceededError()
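The loop relies on three helpers the snippet leaves undefined. A minimal sketch of what they could look like (the dispatch table and `search_database_impl` stub are illustrative assumptions, not SDK APIs):

class MaxIterationsExceededError(Exception):
    """Raised when the agent loop exhausts its iteration budget."""

async def search_database_impl(query: str, category: str | None = None, limit: int = 10) -> str:
    # Placeholder: call your real product search backend here
    return f"Found 0 results for {query!r}"

# Hypothetical dispatch table mapping tool names to async handlers
TOOL_HANDLERS = {"search_database": search_database_impl}

def extract_text(response) -> str:
    # Join the text blocks of the final assistant message
    return "".join(block.text for block in response.content if block.type == "text")

async def execute_tool(name: str, arguments: dict) -> str:
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'"
    return await handler(**arguments)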
from openai import OpenAI

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_database",
            "description": "Search products in database",
            "strict": True,
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "category": {"type": ["string", "null"]}
                },
                "required": ["query", "category"],
                "additionalProperties": False
            }
        }
    }
]
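For completeness, a sketch of calling the Chat Completions API with these strict-mode tools; the model name and the handling at the end are illustrative:

import json

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # any tool-capable model
    messages=[{"role": "user", "content": "Find me a laptop"}],
    tools=tools,
)

message = response.choices[0].message
for call in message.tool_calls or []:
    # Strict mode guarantees arguments conform to the schema,
    # but they still arrive as a JSON string
    args = json.loads(call.function.arguments)
    print(call.function.name, args)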
from langchain_core.tools import tool
from pydantic import BaseModel, Field

class CalculatorInput(BaseModel):
    expression: str = Field(description="Mathematical expression to evaluate")

@tool("calculator", args_schema=CalculatorInput)
def calculator(expression: str) -> str:
    """Evaluate a mathematical expression.

    Use this when you need to perform calculations.
    Examples: "2 + 2" → "4", "sqrt(16)" → "4.0"
    """
    try:
        return str(safe_eval(expression))
    except Exception as e:
        return f"Error: {e}"
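The body assumes a `safe_eval` helper that never touches Python's built-in `eval`. One possible AST-whitelist sketch (the operator and function whitelist are assumptions, not a LangChain API):

import ast
import math
import operator

# Whitelisted operators and functions; anything else is rejected
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}
_FUNCS = {"sqrt": math.sqrt, "abs": abs}

def safe_eval(expression: str) -> float:
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name) and node.func.id in _FUNCS:
            return _FUNCS[node.func.id](*[_eval(a) for a in node.args])
        raise ValueError("Unsupported expression")
    return _eval(ast.parse(expression, mode="eval").body)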
# Good: verb_noun pattern
"search_products"
"create_order"
"get_user_details"
# Bad: vague names
"helper"
"utility"
"process"
# Good: Clear when to use
{
    "name": "get_weather",
    "description": """Get current weather for a location.

    USE WHEN: User asks about weather, temperature, or conditions.
    DO NOT USE: For forecasts (use get_forecast instead).
    """
}
| Error Type | Cause | Recovery Strategy |
|---|---|---|
| InvalidToolArguments | Schema mismatch | Re-prompt with error details |
| ToolNotFound | Wrong tool name | List available tools |
| ToolExecutionError | Runtime failure | Retry with backoff |
| TimeoutError | Slow API | Increase timeout, run async |
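The first strategy, re-prompting with error details, maps directly onto the tool-result format. A sketch for the Anthropic loop above (the validation message shown is illustrative):

# When validation fails, return the failure as an errored tool result
# so the model can correct its arguments on the next turn.
tool_results.append({
    "type": "tool_result",
    "tool_use_id": block.id,
    "content": "Invalid arguments: 'category' must be one of "
               "['electronics', 'clothing', 'food', 'other']",
    "is_error": True,
})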
import asyncio

from pydantic import ValidationError

class ToolExecutor:
    def __init__(self, max_retries: int = 3):
        self.max_retries = max_retries
        self.tools = {}

    async def execute(self, tool_name: str, arguments: dict) -> ToolResult:
        # Unknown tool: tell the model what is available instead of failing silently
        if tool_name not in self.tools:
            return ToolResult(
                success=False,
                error=f"Unknown tool: {tool_name}. Available: {list(self.tools.keys())}"
            )

        tool = self.tools[tool_name]

        # Validate arguments against the tool's Pydantic schema before executing
        try:
            validated_args = tool.schema.model_validate(arguments)
        except ValidationError as e:
            return ToolResult(success=False, error=f"Invalid arguments: {e.errors()}")

        # Retry transient timeouts with exponential backoff
        for attempt in range(self.max_retries):
            try:
                result = await asyncio.wait_for(tool.execute(validated_args), timeout=30.0)
                return ToolResult(success=True, data=result)
            except asyncio.TimeoutError:
                if attempt < self.max_retries - 1:
                    await asyncio.sleep(2 ** attempt)
                    continue
                return ToolResult(success=False, error="Tool execution timed out")
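`ToolResult` is referenced but never defined; a minimal dataclass that satisfies the usage above:

from dataclasses import dataclass
from typing import Any

@dataclass
class ToolResult:
    success: bool
    data: Any = None
    error: str | None = None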
async def sequential_tools(tasks: list) -> list:
    """Run tools in order, threading each result into the next call's context."""
    results = []
    context = {}
    for task in tasks:
        args = {**task.arguments, **context}
        result = await executor.execute(task.tool_name, args)
        results.append(result)
        if not result.success:
            break  # stop the pipeline on first failure
        context.update(result.data)
    return results

async def parallel_tools(tasks: list) -> list:
    """Run independent tools concurrently; exceptions are returned, not raised."""
    results = await asyncio.gather(
        *[executor.execute(t.tool_name, t.arguments) for t in tasks],
        return_exceptions=True
    )
    return results
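Because `return_exceptions=True` means the list can mix `ToolResult` objects with raw exceptions, callers should normalize before inspecting results. A sketch:

raw = await parallel_tools(tasks)
results = [
    r if isinstance(r, ToolResult) else ToolResult(success=False, error=str(r))
    for r in raw
]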
Tool not being called?
├── Check tool description clarity
├── Add usage examples
├── Use tool_choice="required" to force
└── Verify tool is in tools list
Wrong tool selected?
├── Improve tool descriptions
├── Add "DO NOT USE" conditions
└── Make tool names more specific
Invalid arguments?
├── Check schema matches expected format
├── Add parameter descriptions
├── Use enum for constrained values
└── Enable strict mode (OpenAI)
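Forcing a tool call differs by provider; a sketch of both forms, reusing the clients from the earlier examples:

# OpenAI: force the model to call at least one tool
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="required",
)

# Anthropic: "any" forces some tool; {"type": "tool", "name": ...} forces a specific one
response = await client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    tools=TOOLS,
    messages=messages,
    tool_choice={"type": "any"},
)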
import pytest

@pytest.mark.asyncio  # requires the pytest-asyncio plugin
async def test_search_tool_valid_input(mock_database):
    tool = SearchTool(database=mock_database)
    result = await tool.execute({"query": "laptop", "category": "electronics"})
    assert result.success
    mock_database.assert_called_once()

@pytest.mark.asyncio
async def test_search_tool_invalid_category():
    tool = SearchTool()
    result = await tool.execute({"query": "laptop", "category": "invalid"})
    assert not result.success
    assert "category" in result.error.lower()