Skill

ai-chat-studio

AI Chat Studio provides a multi-LLM chat orchestration framework with 300+ assistant presets, intelligent model routing, and conversation management.

npx claudepluginhub itallstartedwithaidea/claude-googleadsagent

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/googleadsagent:ai-chat-studio

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Part of [Agent Skills™](https://github.com/itallstartedwithaidea/agent-skills) by [googleadsagent.ai™](https://googleadsagent.ai)

SKILL.md

167 lines · ~1.7k tokens

Stats

Language: JavaScript

Stars: 0

Maintenance: Good

Last Commit: May 21, 2026

AI Chat Studio

Part of Agent Skills™ by googleadsagent.ai™

Description

AI Chat Studio provides a multi-LLM chat orchestration framework with 300+ assistant presets, intelligent model routing, and conversation management. The agent configures and manages interactions across multiple language model providers—OpenAI, Anthropic, Google, open-source models—selecting the optimal model for each task based on capability, cost, and latency requirements.

Not every task needs the most powerful model. A code review benefits from a reasoning-heavy model; a translation task runs well on a mid-tier model; a simple reformatting task needs nothing beyond a fast, cheap model, and anything larger wastes money. This skill implements routing that matches task characteristics to model capabilities, with a stated cost reduction of 40-60% while maintaining quality where it matters.

The 300+ assistant presets encode domain-specific system prompts, temperature settings, and output format constraints for common tasks: code generation, technical writing, data analysis, creative ideation, customer support, legal review, and more. Each preset is tested against a quality benchmark and tagged with the models it performs best on.
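
To make the preset idea concrete, here is a minimal sketch of what one preset and its model tags might look like. The `Preset` shape is an assumption based on the fields this description lists, and the `code-review` preset, its prompt text, and the `pickModel` helper are invented for illustration; they are not taken from the skill's actual preset library.

```typescript
// Hypothetical preset shape, based on the fields the description mentions.
interface Preset {
  id: string;
  name: string;
  systemPrompt: string;
  temperature: number;
  preferredModels: string[]; // models the preset is tagged as performing best on
  tags: string[];
}

// An illustrative preset; the prompt text and values are made up for this example.
const codeReviewPreset: Preset = {
  id: "code-review",
  name: "Code Reviewer",
  systemPrompt:
    "You are a careful code reviewer. Report bugs first, then style issues, as a bulleted list.",
  temperature: 0.2,
  preferredModels: ["claude-sonnet-4-20250514"],
  tags: ["code", "review"],
};

// Pick the first preferred model that is currently available, else use the given fallback.
function pickModel(preset: Preset, available: string[], fallback: string): string {
  return preset.preferredModels.find(m => available.includes(m)) ?? fallback;
}

console.log(pickModel(codeReviewPreset, ["gpt-4o-mini", "claude-sonnet-4-20250514"], "gpt-4o-mini"));
```

A low temperature and an explicit output format are plausible choices for a review task, where consistency matters more than variety; other domains would likely tune these differently.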

Use When

Configuring multi-provider LLM access in an application
Routing tasks to the optimal model by cost-quality trade-off
Managing conversation history and context windows
Deploying domain-specific AI assistants with curated presets
Building chat interfaces with streaming responses
Comparing model outputs for the same prompt across providers

How It Works

graph TD
    A[User Message] --> B[Task Classifier]
    B --> C{Task Type}
    C -->|Complex Reasoning| D[Claude 4 / GPT-4o]
    C -->|Code Generation| E[Claude 4 / Codestral]
    C -->|Translation| F[GPT-4o-mini / Gemini Flash]
    C -->|Simple Format| G[Haiku / Flash]
    D --> H[Apply Preset: System Prompt + Params]
    E --> H
    F --> H
    G --> H
    H --> I[Manage Context Window]
    I --> J[Stream Response]
    J --> K[Log Usage + Cost]

The task classifier analyzes the incoming message to determine complexity and domain, then routes to the most cost-effective model capable of handling it. Presets provide domain-specific system prompts and parameter tuning.
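
The last step in the flow above logs usage and cost. One way that step could work is sketched below, assuming per-1,000-token prices like those in the model registry. The `Pricing` and `UsageRecord` types and the `estimateCost` and `recordUsage` functions are hypothetical names introduced for this example, and the token counts are made up.

```typescript
// Hypothetical pricing shape: USD per 1,000 tokens, split by direction.
interface Pricing {
  costPer1kInput: number;
  costPer1kOutput: number;
}

// Hypothetical log entry for one completed request.
interface UsageRecord {
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

// Convert token counts into an estimated dollar cost.
function estimateCost(pricing: Pricing, inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1000) * pricing.costPer1kInput +
    (outputTokens / 1000) * pricing.costPer1kOutput
  );
}

// Build the record; a real logger would persist it alongside the routing decision.
function recordUsage(
  model: string,
  pricing: Pricing,
  inputTokens: number,
  outputTokens: number,
): UsageRecord {
  return { model, inputTokens, outputTokens, costUsd: estimateCost(pricing, inputTokens, outputTokens) };
}

// Example: 2,000 input tokens and 500 output tokens at $0.003 / $0.015 per 1k.
const entry = recordUsage(
  "claude-sonnet-4-20250514",
  { costPer1kInput: 0.003, costPer1kOutput: 0.015 },
  2000,
  500,
);
console.log(entry.model, entry.costUsd.toFixed(4));
```

Recording the cost next to the chosen model makes it possible to compare routes later and check whether cheaper models are in fact handling the traffic they were meant to.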

Implementation

interface ModelConfig {
  provider: "openai" | "anthropic" | "google" | "ollama";
  model: string;
  maxTokens: number;
  costPer1kInput: number;
  costPer1kOutput: number;
  capabilities: string[];
}

const MODEL_REGISTRY: ModelConfig[] = [
  { provider: "anthropic", model: "claude-sonnet-4-20250514", maxTokens: 8192,
    costPer1kInput: 0.003, costPer1kOutput: 0.015, capabilities: ["reasoning", "code", "analysis"] },
  { provider: "openai", model: "gpt-4o-mini", maxTokens: 4096,
    costPer1kInput: 0.00015, costPer1kOutput: 0.0006, capabilities: ["general", "translation", "format"] },
  { provider: "google", model: "gemini-2.0-flash", maxTokens: 8192,
    costPer1kInput: 0.0001, costPer1kOutput: 0.0004, capabilities: ["general", "fast", "multimodal"] },
];

interface AssistantPreset {
  id: string;
  name: string;
  systemPrompt: string;
  temperature: number;
  preferredModels: string[];
  tags: string[];
}

class ChatRouter {
  constructor(private models: ModelConfig[], private presets: Map<string, AssistantPreset>) {}

  route(message: string, presetId?: string): { model: ModelConfig; preset?: AssistantPreset } {
    const preset = presetId ? this.presets.get(presetId) : undefined;
    const taskType = this.classifyTask(message);

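    // Keep models that offer at least one capability the task needs.
    const candidates = this.models.filter(m =>
      m.capabilities.some(c => taskType.requiredCapabilities.includes(c))
    );
    if (candidates.length === 0) {
      throw new Error(`No model supports task type "${taskType.type}"`);
    }

    // Pick the cheapest candidate by input price (first one wins on ties).
    const selected = candidates.reduce((best, m) => (m.costPer1kInput < best.costPer1kInput ? m : best));
    return { model: selected, preset };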
  }

  private classifyTask(message: string): { type: string; requiredCapabilities: string[] } {
    const lower = message.toLowerCase();
    if (lower.includes("debug") || lower.includes("refactor") || lower.includes("architect"))
      return { type: "complex", requiredCapabilities: ["reasoning", "code"] };
    if (lower.includes("translate") || lower.includes("rewrite"))
      return { type: "simple", requiredCapabilities: ["general", "translation"] };
    return { type: "general", requiredCapabilities: ["general"] };
  }
}

class ConversationManager {
  private history: Array<{ role: string; content: string }> = [];
  private maxContextTokens: number;

  constructor(maxContextTokens: number = 100_000) {
    this.maxContextTokens = maxContextTokens;
  }

  addMessage(role: string, content: string): void {
    this.history.push({ role, content });
    this.trimToContextWindow();
  }

  getHistory(): Array<{ role: string; content: string }> {
    return [...this.history];
  }

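  // Drop the oldest message after the first until the estimate fits.
  // Assumes history[0] is the system prompt; the first and the latest message are always kept.
  private trimToContextWindow(): void {
    while (this.estimateTokens() > this.maxContextTokens && this.history.length > 2) {
      this.history.splice(1, 1);
    }
  }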

  // Rough heuristic of about 4 characters per token; not a real tokenizer count.
  private estimateTokens(): number {
    return this.history.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
  }
}
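
The code above selects a model and a preset but stops short of the "Apply Preset" step. The sketch below shows one way the two could be combined into a request. The `ChatRequest` shape is a hypothetical, provider-agnostic assumption (real provider SDKs each have their own request formats), and `PresetLike`, `buildRequest`, and the default temperature of 0.7 are names and values introduced only for this example.

```typescript
// Minimal local types so the sketch stands alone; they mirror a subset of the fields above.
interface PresetLike {
  systemPrompt: string;
  temperature: number;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical provider-agnostic request shape.
interface ChatRequest {
  model: string;
  temperature: number;
  maxTokens: number;
  messages: ChatMessage[];
  stream: boolean;
}

const DEFAULT_TEMPERATURE = 0.7; // assumed default when no preset is given

function buildRequest(
  model: { model: string; maxTokens: number },
  history: ChatMessage[],
  preset?: PresetLike,
): ChatRequest {
  // With a preset, drop system messages already in the history so the preset's prompt is the only one.
  const turns = preset ? history.filter(m => m.role !== "system") : history;
  const messages: ChatMessage[] = preset
    ? [{ role: "system", content: preset.systemPrompt }, ...turns]
    : [...turns];
  return {
    model: model.model,
    temperature: preset?.temperature ?? DEFAULT_TEMPERATURE,
    maxTokens: model.maxTokens,
    messages,
    stream: true, // streaming requested by default, per the flow above
  };
}

const request = buildRequest(
  { model: "gpt-4o-mini", maxTokens: 4096 },
  [{ role: "user", content: "Translate 'hello' to French." }],
  { systemPrompt: "You are a concise translator.", temperature: 0.3 },
);
console.log(request.messages.map(m => m.role).join(",")); // prints "system,user"
```

Keeping request construction in one function means the router and the conversation manager stay independent of any single provider's payload format.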

Best Practices

Route simple tasks to cheaper models—80% of queries do not need frontier models
Implement streaming responses for all chat interactions to improve perceived latency
Trim conversation history from the middle, preserving the system prompt and recent messages
Log model selection decisions alongside cost to optimize routing rules over time
Test presets against a benchmark dataset before deploying to production
Provide fallback models for every route in case the primary provider is unavailable
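
The fallback practice in the list above can be sketched as a loop over an ordered list of models. This is a minimal illustration under stated assumptions: `ModelCall` is a hypothetical function type standing in for a real provider call, `callWithFallback` and `FallbackResult` are names introduced here, and the `flaky` stub simulates an outage instead of contacting any provider.

```typescript
// Hypothetical call signature: takes a model name, resolves to the reply text.
type ModelCall = (model: string) => Promise<string>;

interface FallbackResult {
  model: string;    // the model that actually answered
  reply: string;
  failed: string[]; // models that were tried and failed before it, in order
}

// Try each model in order and return the first successful reply.
async function callWithFallback(models: string[], call: ModelCall): Promise<FallbackResult> {
  const failed: string[] = [];
  for (const model of models) {
    try {
      const reply = await call(model);
      return { model, reply, failed };
    } catch {
      failed.push(model); // remember the failure and move to the next model
    }
  }
  throw new Error(`All models failed: ${failed.join(", ")}`);
}

// Stub provider for demonstration: the first model is treated as unavailable.
const flaky: ModelCall = async (model) => {
  if (model === "claude-sonnet-4-20250514") throw new Error("provider unavailable");
  return `reply from ${model}`;
};

callWithFallback(["claude-sonnet-4-20250514", "gpt-4o-mini"], flaky)
  .then(r => console.log(r.model, r.failed));
```

Returning the list of failed models alongside the reply lets the usage log show how often a route falls back, which feeds the practice of tuning routing rules over time.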

Platform Compatibility

Platform	Support	Notes
Cursor	Full	Multi-model configuration
VS Code	Full	Extension-based LLM access
Windsurf	Full	Built-in model routing
Claude Code	Full	Multi-provider support
Cline	Full	Model selection config
aider	Full	Multiple model backends

Keywords

ai-chat multi-llm model-routing assistant-presets conversation-management streaming cost-optimization chat-studio
