Use when starting a new project with llmring, building an application using LLMs, making basic chat completions, or sending messages to OpenAI, Anthropic, Google, or Ollama - covers lockfile creation (MANDATORY first step), semantic alias usage, unified interface for all providers with consistent message structure and response handling
Send chat requests to OpenAI, Anthropic, Google, or Ollama using semantic model aliases. Use when starting new projects with llmring or making basic chat completions.
/plugin marketplace add juanre/llmring
/plugin install llmring@juanre-ai-tools

This skill inherits all available tools. When active, it can use any tool Claude has access to.
# With uv (recommended)
uv add llmring
# With pip
pip install llmring
Provider SDKs (install what you need):
uv add "openai>=1.0"      # OpenAI
uv add "anthropic>=0.67"  # Anthropic
uv add google-genai       # Google Gemini
uv add "ollama>=0.4"      # Ollama
This skill covers:
- LLMRing - Main service class
- LLMRequest - Request configuration
- LLMResponse - Response structure
- Message - Message format

FIRST: Create your lockfile (required for all real applications):
# Initialize lockfile
llmring lock init
# Check available models (get current names from registry):
llmring list --provider openai
llmring list --provider anthropic
# Bind aliases using CURRENT model names:
llmring bind summarizer anthropic:claude-3-5-haiku-20241022
# Or use interactive configuration (recommended - knows current models):
llmring lock chat
⚠️ Important: Check llmring list for current model names. Providers rename and retire models over time, so don't rely on names from memory.
THEN: Use in code:
from llmring import LLMRing, LLMRequest, Message
# Use context manager for automatic resource cleanup
async with LLMRing() as service:
    request = LLMRequest(
        model="summarizer",  # YOUR semantic alias (defined in llmring.lock)
        messages=[
            Message(role="system", content="You are a helpful assistant."),
            Message(role="user", content="Hello!")
        ]
    )
    response = await service.chat(request)
    print(response.content)
⚠️ Important: The bundled lockfile that ships with llmring is ONLY for running llmring lock chat. Real applications must create their own lockfile.
The library enforces a 60-second timeout by default. Override it when processing large documents, running expensive reasoning chains, or forwarding calls to slower local models.
async with LLMRing(timeout=300.0) as service:  # 300-second default for every request made through this service
    request = LLMRequest(
        model="summarizer",
        messages=[Message(role="user", content=huge_thread)],
        timeout=None,  # disable the timeout for this request only
    )
    response = await service.chat(request)
You can also set LLMRING_PROVIDER_TIMEOUT_S=120 in the environment to establish a default when you don't pass the constructor argument.
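For example, a minimal sketch of relying on the environment variable rather than the constructor argument (the 120-second value and the ask() helper are illustrative, and this assumes the variable is read when the service is constructed):

import os
from llmring import LLMRing, LLMRequest, Message

os.environ["LLMRING_PROVIDER_TIMEOUT_S"] = "120"  # assumed to be picked up when LLMRing() is created

async def ask(question: str) -> str:
    async with LLMRing() as service:  # no timeout argument, so the 120s default applies
        request = LLMRequest(
            model="summarizer",  # your alias from llmring.lock
            messages=[Message(role="user", content=question)],
        )
        response = await service.chat(request)
        return response.content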
LLMRing: Main service class that manages providers and routes requests.
Constructor:
LLMRing(
    origin: str = "llmring",
    registry_url: Optional[str] = None,
    lockfile_path: Optional[str] = None,
    server_url: Optional[str] = None,
    api_key: Optional[str] = None,
    log_metadata: bool = True,
    log_conversations: bool = False,
    alias_cache_size: int = 100,
    alias_cache_ttl: int = 3600,
    timeout: Optional[float] = 60.0
)
Parameters:
- origin (str, default: "llmring"): Origin identifier for tracking
- registry_url (str, optional): Custom registry URL for model information
- lockfile_path (str, optional): Path to lockfile for alias configuration
- server_url (str, optional): llmring-server URL for usage logging
- api_key (str, optional): API key for llmring-server
- log_metadata (bool, default: True): Enable logging of usage metadata (requires server_url)
- log_conversations (bool, default: False): Enable logging of full conversations (requires server_url)
- alias_cache_size (int, default: 100): Maximum cached alias resolutions
- alias_cache_ttl (int, default: 3600): Cache TTL in seconds
- timeout (float | None, default: 60.0): Default request timeout in seconds (None disables)

Example:
from llmring import LLMRing
# Basic initialization (uses environment variables for API keys)
async with LLMRing() as service:
    response = await service.chat(request)
# With custom lockfile
async with LLMRing(lockfile_path="./my-llmring.lock") as service:
    response = await service.chat(request)
chat(): Send a chat completion request and get a response.
Signature:
async def chat(
    request: LLMRequest,
    profile: Optional[str] = None
) -> LLMResponse
Parameters:
- request (LLMRequest): Request configuration with messages and parameters
- profile (str, optional): Profile name for environment-specific configuration (e.g., "dev", "prod")

Returns:
- LLMResponse: Response with content, usage, and metadata

Raises:
- ProviderNotFoundError: If provider is not configured
- ModelNotFoundError: If model is not available
- ProviderAuthenticationError: If API key is invalid
- ProviderRateLimitError: If rate limit exceeded

Example:
from llmring import LLMRing, LLMRequest, Message
async with LLMRing() as service:
    request = LLMRequest(
        model="responder",  # Your alias for responses
        messages=[
            Message(role="user", content="What is 2+2?")
        ],
        temperature=0.7,
        max_tokens=100
    )
    response = await service.chat(request)

    print(f"Response: {response.content}")
    print(f"Tokens: {response.total_tokens}")
    print(f"Model: {response.model}")
LLMRequest: Configuration for a chat completion request.
Constructor:
LLMRequest(
    messages: List[Message],
    model: Optional[str] = None,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    reasoning_tokens: Optional[int] = None,
    response_format: Optional[Dict[str, Any]] = None,
    tools: Optional[List[Dict[str, Any]]] = None,
    tool_choice: Optional[Union[str, Dict[str, Any]]] = None,
    cache: Optional[Dict[str, Any]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    json_response: Optional[bool] = None,
    timeout: Optional[float] = None,
    extra_params: Dict[str, Any] = {}
)
Parameters:
- messages (List[Message], required): Conversation messages
- model (str, optional): Model alias (e.g., "fast") or provider:model reference (e.g., "openai:gpt-4o")
- temperature (float, optional): Sampling temperature (0.0-2.0). Higher = more random
- max_tokens (int, optional): Maximum tokens to generate
- reasoning_tokens (int, optional): Token budget for reasoning models (o1, etc.)
- response_format (dict, optional): Structured output format (see llmring-structured skill)
- tools (list, optional): Available functions (see llmring-tools skill)
- tool_choice (str/dict, optional): Tool selection strategy
- cache (dict, optional): Caching configuration
- metadata (dict, optional): Request metadata
- json_response (bool, optional): Request JSON format response
- timeout (float | None, optional): Override service-level timeout; None waits indefinitely
- extra_params (dict, default: {}): Provider-specific parameters

Example:
from llmring import LLMRequest, Message
# Simple request
request = LLMRequest(
    model="summarizer",  # Your domain-specific alias
    messages=[Message(role="user", content="Hello")]
)

# With parameters
request = LLMRequest(
    model="explainer",  # Another semantic alias you define
    messages=[
        Message(role="system", content="You are a helpful assistant."),
        Message(role="user", content="Explain quantum computing")
    ],
    temperature=0.3,
    max_tokens=500
)
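extra_params passes provider-specific options straight through, so the accepted keys depend on the provider the alias resolves to. For example (seed is an OpenAI-specific option used purely as an illustration, not something llmring defines):

# With provider-specific parameters
request = LLMRequest(
    model="openai:gpt-4o",
    messages=[Message(role="user", content="Pick a random number")],
    extra_params={"seed": 42},  # OpenAI-specific; other providers accept different keys
)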
Message: A message in a conversation.
Constructor:
Message(
    role: Literal["system", "user", "assistant", "tool"],
    content: Any,
    tool_calls: Optional[List[Dict[str, Any]]] = None,
    tool_call_id: Optional[str] = None,
    timestamp: Optional[datetime] = None,
    metadata: Optional[Dict[str, Any]] = None
)
Parameters:
- role (str, required): Message role - "system", "user", "assistant", or "tool"
- content (Any, required): Message content (string or structured content for multimodal)
- tool_calls (list, optional): Tool calls made by assistant
- tool_call_id (str, optional): ID for tool result messages
- timestamp (datetime, optional): Message timestamp
- metadata (dict, optional): Provider-specific metadata (e.g., cache_control for Anthropic)

Example:
from llmring import Message
# System message
system_msg = Message(
    role="system",
    content="You are a helpful assistant."
)

# User message
user_msg = Message(
    role="user",
    content="What is the capital of France?"
)

# Assistant response
assistant_msg = Message(
    role="assistant",
    content="The capital of France is Paris."
)

# Anthropic prompt caching
cached_msg = Message(
    role="system",
    content="Very long system prompt...",
    metadata={"cache_control": {"type": "ephemeral"}}
)
LLMResponse: Response from a chat completion.
Attributes:
- content (str): Generated text content
- model (str): Model that generated the response
- usage (dict, optional): Token usage statistics
- finish_reason (str, optional): Why generation stopped ("stop", "length", "tool_calls")
- tool_calls (list, optional): Tool calls made by model
- parsed (dict, optional): Parsed JSON when response_format used

Properties:
- total_tokens (int, optional): Total tokens used (prompt + completion)

Example:
response = await service.chat(request)
print(response.content) # "The capital is Paris."
print(response.model) # "anthropic:claude-sonnet-4-5-20250929"
print(response.total_tokens) # 45
print(response.finish_reason) # "stop"
print(response.usage) # {"prompt_tokens": 20, "completion_tokens": 25}
Required environment variables (set API keys for providers you want to use):
# Add to .env file or export
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_GEMINI_API_KEY=AIza...
OLLAMA_BASE_URL=http://localhost:11434 # Optional, default shown
LLMRing automatically initializes providers based on available API keys.
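If you keep your keys in a .env file, load it before constructing the service. A minimal sketch assuming python-dotenv is installed (it is not an llmring dependency):

import asyncio
from dotenv import load_dotenv  # assumption: python-dotenv is installed
from llmring import LLMRing, LLMRequest, Message

load_dotenv()  # makes OPENAI_API_KEY, ANTHROPIC_API_KEY, etc. visible before LLMRing() runs

async def main():
    async with LLMRing() as service:
        request = LLMRequest(
            model="chatbot",  # your alias
            messages=[Message(role="user", content="Hello")]
        )
        response = await service.chat(request)
        print(response.content)

asyncio.run(main())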
Always use context manager for automatic cleanup:
from llmring import LLMRing, LLMRequest, Message
# Context manager handles cleanup automatically
async with LLMRing() as service:
    request = LLMRequest(
        model="chatbot",  # Your alias for conversational AI
        messages=[Message(role="user", content="Hello")]
    )
    response = await service.chat(request)
# Resources cleaned up when exiting context
If you can't use context manager:
service = LLMRing()
try:
    response = await service.chat(request)
finally:
    await service.close()  # MUST call close()
from llmring import LLMRing, LLMRequest, Message
async with LLMRing() as service:
    messages = [
        Message(role="system", content="You are a helpful assistant."),
        Message(role="user", content="What is Python?")
    ]

    # First turn
    request = LLMRequest(model="assistant", messages=messages)
    response = await service.chat(request)

    # Add assistant response to history
    messages.append(Message(role="assistant", content=response.content))

    # Second turn
    messages.append(Message(role="user", content="What about JavaScript?"))
    request = LLMRequest(model="assistant", messages=messages)
    response = await service.chat(request)
    print(response.content)
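The same pattern generalizes to a loop that keeps appending turns. A minimal sketch (the converse() helper and the "assistant" alias are illustrative, not part of llmring):

async def converse(service: LLMRing, turns: list[str]) -> list[str]:
    # One growing message list: send it every turn, then append the reply.
    messages = [Message(role="system", content="You are a helpful assistant.")]
    replies = []
    for user_text in turns:
        messages.append(Message(role="user", content=user_text))
        response = await service.chat(LLMRequest(model="assistant", messages=messages))
        messages.append(Message(role="assistant", content=response.content))
        replies.append(response.content)
    return replies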
# Semantic aliases YOU define in your lockfile
request = LLMRequest(
    model="summarizer",  # Alias you configured for this task
    messages=[Message(role="user", content="Hello")]
)
# Use task-based names:
# model="code-reviewer" - For code review tasks
# model="sql-generator" - For generating SQL
# model="extractor" - For extracting structured data
# model="analyzer" - For analysis tasks
# Direct provider:model format (escape hatch)
request = LLMRequest(
    model="anthropic:claude-sonnet-4-5-20250929",
    messages=[Message(role="user", content="Hello")]
)

# Or specific versions
request = LLMRequest(
    model="openai:gpt-4o",
    messages=[Message(role="user", content="Hello")]
)
# Creative writing (higher temperature)
request = LLMRequest(
model="creative-writer", # Your alias for creative tasks
messages=[Message(role="user", content="Write a poem")],
temperature=1.2 # More random/creative
)
# Factual responses (lower temperature)
request = LLMRequest(
model="factual-responder", # Your alias for factual tasks
messages=[Message(role="user", content="What is 2+2?")],
temperature=0.2 # More deterministic
)
# Limit response length
request = LLMRequest(
model="summarizer", # Your summarization alias
messages=[Message(role="user", content="Summarize this...")],
max_tokens=100 # Cap at 100 tokens
)
from llmring import (
    LLMRing,
    LLMRequest,
    Message,
    ProviderAuthenticationError,
    ModelNotFoundError,
    ProviderRateLimitError,
    ProviderTimeoutError,
    ProviderNotFoundError
)
async with LLMRing() as service:
    try:
        request = LLMRequest(
            model="chatbot",  # Your conversational alias
            messages=[Message(role="user", content="Hello")]
        )
        response = await service.chat(request)
    except ProviderAuthenticationError:
        print("Invalid API key - check environment variables")
    except ModelNotFoundError as e:
        print(f"Model not available: {e}")
    except ProviderRateLimitError as e:
        print(f"Rate limited - retry after {e.retry_after}s")
    except ProviderTimeoutError:
        print("Request timed out")
    except ProviderNotFoundError:
        print("Provider not configured - check API keys")
# DON'T DO THIS - resources not cleaned up
service = LLMRing()
response = await service.chat(request)
# Forgot to call close()!
Right: Use Context Manager
# DO THIS - automatic cleanup
async with LLMRing() as service:
    response = await service.chat(request)
# DON'T DO THIS - invalid role
message = Message(role="admin", content="Hello")
Right: Use Valid Roles
# DO THIS - valid roles only
message = Message(role="user", content="Hello")
# Valid: "system", "user", "assistant", "tool"
# DON'T DO THIS - no model specified and no lockfile
request = LLMRequest(
    messages=[Message(role="user", content="Hello")]
)
Right: Use Semantic Alias from Lockfile
# DO THIS - use your semantic alias
request = LLMRequest(
    model="chatbot",  # or "anthropic:claude-sonnet-4-5-20250929" for direct reference
    messages=[Message(role="user", content="Hello")]
)
Use different models for different environments:
# Set profile via environment variable
# export LLMRING_PROFILE=dev
# Or in code
async with LLMRing() as service:
    # Uses 'dev' profile bindings (cheaper models)
    response = await service.chat(request, profile="dev")

    # Uses 'prod' profile bindings (higher quality)
    response = await service.chat(request, profile="prod")
See llmring-lockfile skill for full profile documentation.
- llmring-streaming - Stream responses for real-time output
- llmring-tools - Function calling and tool use
- llmring-structured - JSON schema for structured output
- llmring-lockfile - Configure aliases and profiles
- llmring-providers - Multi-provider patterns and raw SDK access

| Provider | Initialization | Example |
|---|---|---|
| OpenAI | Set OPENAI_API_KEY | model="openai:gpt-4o" |
| Anthropic | Set ANTHROPIC_API_KEY | model="anthropic:claude-sonnet-4-5-20250929" |
| Google | Set GOOGLE_GEMINI_API_KEY | model="google:gemini-2.5-pro" |
| Ollama | Runs automatically | model="ollama:llama3" |
All providers work with the same unified API - no code changes needed to switch providers.
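Because the request shape is identical everywhere, switching providers is a one-string change. A minimal sketch comparing the same prompt across the models listed above (only providers whose API keys are configured will succeed):

from llmring import LLMRing, LLMRequest, Message

MODELS = ["openai:gpt-4o", "anthropic:claude-sonnet-4-5-20250929", "ollama:llama3"]

async def compare(prompt: str):
    async with LLMRing() as service:
        for model in MODELS:
            request = LLMRequest(model=model, messages=[Message(role="user", content=prompt)])
            response = await service.chat(request)
            print(f"{model}: {response.content[:80]}")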