Build AI features with the Vercel AI SDK (ai-sdk.dev) — streaming text and structured output, model adapters across providers, tool calling, RSC integrations, useChat / useCompletion / useObject hooks, and the wiring patterns that pair the SDK cleanly with the AI Gateway, Fluid Compute, and Vercel-managed storage. Use this skill any time an AI feature is being built on Next.js, when the team is asking "how do we ship a chat UI", when streaming or tool calling is in scope, when RSC + AI is the question, or when an existing AI integration needs to be standardized. Trigger any time AI SDK or `ai` package usage is implicated.
Install: npx claudepluginhub bpainter/composable-dxp-claude-marketplace --plugin vercel
Slash command: /vercel:vercel-ai-sdk
The Vercel AI SDK (`ai-sdk.dev`) is the developer-facing SDK for building AI features in Next.js (and other frameworks). Streaming-first, model-agnostic, RSC-aware. This skill owns the SDK integration patterns.
Pair with vercel-ai-gateway (route SDK calls through the Gateway for multi-provider + observability + cost control), vercel-fluid-compute (maxDuration configuration for streaming routes), vercel-storage (chat history, embeddings, RAG context), vercel-observability (token / cost dashboards), and software-engineering-ai-engineer for application-side AI architecture.
- Use @ai-sdk/openai, @ai-sdk/anthropic, etc. through the same ai core API. Switching providers is a one-line change.
- Route through the AI Gateway (vercel-ai-gateway) on every Slalom build. Gives you observability + fallback + cost control without touching app code.
pnpm add ai @ai-sdk/openai @ai-sdk/anthropic
# Plus zod for tool/structured-output schemas
pnpm add zod
Provider adapters install separately. Common: @ai-sdk/openai, @ai-sdk/anthropic, @ai-sdk/google, @ai-sdk/mistral, @ai-sdk/xai.
For AI Gateway routing, point the @ai-sdk/openai adapter at the Gateway's OpenAI-compatible endpoint, or use the dedicated AI Gateway provider (see vercel-ai-gateway).
// app/api/chat/route.ts
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export const maxDuration = 300; // function config; see vercel-fluid-compute

export async function POST(req: Request) {
  const { messages } = await req.json();
  const result = streamText({
    model: openai("gpt-4o"),
    system: "You are a helpful assistant for Slalom Composable DXP work.",
    messages,
    maxOutputTokens: 2000,
    temperature: 0.7,
  });
  return result.toDataStreamResponse();
}
Client side:
// app/(chat)/chat/page.tsx
"use client";
import { useChat } from "ai/react";

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat();
  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>{m.role}: {m.content}</div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
  );
}
useChat handles all the streaming wiring — message append, in-flight state, completions, errors. For finer control, streamText exposes the raw stream.
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const { text } = await generateText({
  model: openai("gpt-4o-mini"),
  prompt: "Summarize this in one sentence: ...",
});
Use generateText when you don't need streaming (background jobs, server-side enrichment).
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// article: the source text, assumed to be in scope
const { object } = await generateObject({
  model: openai("gpt-4o"),
  schema: z.object({
    title: z.string(),
    tags: z.array(z.string()).max(5),
    audience: z.enum(["founder", "investor", "general-counsel"]),
    summary: z.string().max(280),
  }),
  prompt: `Extract metadata from this article: ${article}`,
});
// object is typed: { title: string; tags: string[]; audience: ...; summary: string }
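generateObject validates the model output against the Zod schema and returns a typed object; output that cannot be parsed or fails validation surfaces as an error rather than as silently malformed data. It replaces most hand-rolled "parse JSON, validate, retry" loops.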
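For comparison, a hand-rolled "parse JSON, validate, retry" loop of the kind mentioned above might look like the sketch below. It is dependency-free and synchronous for brevity. `validateMetadata`, `generateWithRetry`, and the `produce` callback (a stand-in for a model call) are hypothetical names, not SDK APIs, and the schema is reduced to two fields.

```typescript
type Metadata = { title: string; tags: string[] };

type Validation =
  | { ok: true; value: Metadata }
  | { ok: false; error: string };

// Parse raw model output and check it against the expected shape by hand.
function validateMetadata(raw: string): Validation {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, error: "not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, error: "expected an object" };
  }
  const candidate = parsed as Record<string, unknown>;
  if (typeof candidate["title"] !== "string") {
    return { ok: false, error: "title must be a string" };
  }
  const tags = candidate["tags"];
  if (!Array.isArray(tags) || !tags.every((t) => typeof t === "string")) {
    return { ok: false, error: "tags must be an array of strings" };
  }
  if (tags.length > 5) {
    return { ok: false, error: "tags must have at most 5 items" };
  }
  return { ok: true, value: { title: candidate["title"], tags: tags as string[] } };
}

// Call the (hypothetical) model until its output validates, passing the
// previous validation error back as feedback for the next attempt.
function generateWithRetry(
  produce: (feedback: string | null) => string,
  maxAttempts: number,
): Metadata {
  let feedback: string | null = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = validateMetadata(produce(feedback));
    if (result.ok) return result.value;
    feedback = result.error;
  }
  throw new Error(`no valid object after ${maxAttempts} attempts: ${feedback}`);
}
```

Every route that needs structured output would otherwise carry some version of this boilerplate, which is the argument for declaring the schema once and letting the SDK own parsing and validation.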
For streaming structured output, use streamObject:
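import { streamObject } from "ai";
import { openai } from "@ai-sdk/openai";

// ProductSchema is a Zod schema assumed to be defined elsewhere;
// streamProduct is an illustrative wrapper so that yield is valid.
async function* streamProduct() {
  const { partialObjectStream } = streamObject({
    model: openai("gpt-4o"),
    schema: ProductSchema,
    prompt: "...",
  });
  for await (const partial of partialObjectStream) {
    // partial is the in-progress object as fields stream in
    yield partial;
  }
}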
import { streamText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";
import { searchProducts } from "@/lib/products";
import { sendEmail } from "@/lib/email";

export async function POST(req: Request) {
  const { messages } = await req.json();
  const result = streamText({
    model: openai("gpt-4o"),
    messages,
    tools: {
      searchProducts: tool({
        description: "Search the product catalog for items matching a query",
        parameters: z.object({
          query: z.string(),
          limit: z.number().min(1).max(20).default(10),
        }),
        execute: async ({ query, limit }) => {
          return await searchProducts(query, limit);
        },
      }),
      sendEmail: tool({
        description: "Send an email to a customer",
        parameters: z.object({
          to: z.string().email(),
          subject: z.string(),
          body: z.string(),
        }),
        execute: async ({ to, subject, body }) => {
          await sendEmail(to, subject, body);
          return { sent: true };
        },
      }),
    },
    maxSteps: 5, // safety bound on tool-call iterations
  });
  return result.toDataStreamResponse();
}
The LLM picks tools based on the conversation; the SDK runs each tool's execute function, feeds the result back, and the LLM continues. maxSteps bounds the number of iterations, which prevents runaway loops.
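The loop described above can be modeled in a few lines. This is a simplified, synchronous sketch of the control flow only, not the SDK's implementation; `runToolLoop`, `FakeModel`, and the turn shapes are hypothetical names introduced for illustration.

```typescript
// A model turn is either a request to call a tool or a final text answer.
type ModelTurn =
  | { kind: "tool"; name: string; args: Record<string, unknown> }
  | { kind: "text"; text: string };

type ToolFn = (args: Record<string, unknown>) => unknown;

// Hypothetical stand-in for the model: it sees prior tool results
// and decides what to do next.
type FakeModel = (toolResults: string[]) => ModelTurn;

function runToolLoop(
  model: FakeModel,
  tools: Record<string, ToolFn>,
  maxSteps: number,
): { text: string | null; steps: number } {
  const toolResults: string[] = [];
  for (let step = 1; step <= maxSteps; step++) {
    const turn = model(toolResults);
    if (turn.kind === "text") {
      // The model answered; the loop ends normally.
      return { text: turn.text, steps: step };
    }
    const tool = tools[turn.name];
    const result = tool ? tool(turn.args) : { error: `unknown tool: ${turn.name}` };
    // Feed the tool result back so the next turn can use it.
    toolResults.push(`${turn.name}: ${JSON.stringify(result)}`);
  }
  // The step budget ran out before the model produced a final answer.
  return { text: null, steps: maxSteps };
}

// A model that never stops calling tools is cut off at maxSteps.
const runaway = runToolLoop(
  () => ({ kind: "tool", name: "searchProducts", args: { query: "shoes" } }),
  { searchProducts: () => [] },
  5,
);
console.log(runaway); // { text: null, steps: 5 }
```

Without the bound, the runaway case would keep issuing tool calls (and spending tokens) indefinitely.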
For destructive tools (sendEmail, writeToDatabase), require human confirmation:
sendEmail: tool({
  description: "...",
  parameters: z.object({...}),
  // No execute: the LLM proposes the tool call, and the UI surfaces it for human approval
}),
The useChat UI can render tool-call previews and let users approve before execution.
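One way to structure the approval step on the application side is to split proposed tool calls into those safe to run automatically and those held for a human. The sketch below is an assumption about app-level design, not an SDK feature; `ProposedCall`, `DESTRUCTIVE_TOOLS`, `splitByApproval`, and `approvedCalls` are hypothetical names.

```typescript
type ProposedCall = { id: string; toolName: string; args: Record<string, unknown> };

// Assumption: the app maintains its own list of tools that must not run unattended.
const DESTRUCTIVE_TOOLS: ReadonlySet<string> = new Set(["sendEmail", "writeToDatabase"]);

// Partition proposed calls: read-only tools run immediately,
// destructive ones wait for explicit approval.
function splitByApproval(calls: ProposedCall[]): {
  autoRun: ProposedCall[];
  needsApproval: ProposedCall[];
} {
  const autoRun: ProposedCall[] = [];
  const needsApproval: ProposedCall[] = [];
  for (const call of calls) {
    if (DESTRUCTIVE_TOOLS.has(call.toolName)) {
      needsApproval.push(call);
    } else {
      autoRun.push(call);
    }
  }
  return { autoRun, needsApproval };
}

// Only release a held call once a human has explicitly approved its id.
function approvedCalls(
  pending: ProposedCall[],
  approvedIds: ReadonlySet<string>,
): ProposedCall[] {
  return pending.filter((call) => approvedIds.has(call.id));
}
```

Keeping the destructive list in one place makes it easy to audit which tools can never execute without a click.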
Generative UI (@ai-sdk/rsc): the LLM streams React components, not just text.
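// app/actions.tsx
"use server";
import { streamUI } from "@ai-sdk/rsc";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";
// ProductCardSkeleton and getProduct were not imported in the original;
// the two import paths below are assumptions.
import { ProductCard, ProductCardSkeleton } from "@/components/blocks/product-card";
import { getProduct } from "@/lib/products";

export async function generateAnswer(query: string) {
  const ui = await streamUI({
    model: openai("gpt-4o"),
    prompt: query,
    text: ({ content }) => <p>{content}</p>,
    tools: {
      showProduct: {
        description: "Show a product card",
        parameters: z.object({ id: z.string() }),
        generate: async function* ({ id }) {
          yield <ProductCardSkeleton />; // placeholder while the product loads
          const product = await getProduct(id);
          return <ProductCard {...product} />;
        },
      },
    },
  });
  // streamUI resolves to a result object; the streamable UI is on .value
  return ui.value;
}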
Server streams the rendered components; client receives them as RSC payloads. Powerful for AI assistants that need to show tables, charts, product cards, not just text.
When to reach for RSC integration:
When to skip:
- useChat + text is simpler).
Retrieval-Augmented Generation pairs the AI SDK with a vector DB:
import { embed, streamText } from "ai";
import { openai } from "@ai-sdk/openai";

// 1. Embed the query
const { embedding } = await embed({
  model: openai.embedding("text-embedding-3-small"),
  value: userQuery,
});

// 2. Find similar documents (Pinecone / pgvector / Weaviate / etc.)
// vectorDB is a placeholder for whichever vector store client the project uses
const docs = await vectorDB.query({ vector: embedding, topK: 5 });

// 3. Generate with context
const result = streamText({
  model: openai("gpt-4o"),
  system: `Use this context to answer: ${docs.map(d => d.text).join("\n\n")}`,
  prompt: userQuery,
});
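Step 2 above is delegated to a vector DB, but the underlying operation is a nearest-neighbor search over embeddings. The sketch below shows that search in memory using cosine similarity, which is one common metric (a given vector store may use another). `Doc`, `cosineSimilarity`, and `topK` are hypothetical names, and a brute-force scan like this is only reasonable for small corpora.

```typescript
type Doc = { id: string; text: string; embedding: number[] };

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every document against the query and keep the k most similar.
function topK(query: number[], docs: Doc[], k: number): Doc[] {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.doc);
}
```

A managed vector store replaces the linear scan with an approximate index, but the contract stays the same: an embedding goes in, and the k closest documents come out.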
For Slalom RAG defaults:
- text-embedding-3-small (cheap, fast) or text-embedding-3-large (better recall).
For deep RAG patterns, see software-engineering-ai-engineer.
AI streaming routes typically need:
// Per-route (App Router)
export const maxDuration = 300; // up to plan limit
export const runtime = "nodejs"; // Edge for tiny calls; Node for tool-heavy

// OR via vercel.json
{
  "functions": {
    "app/api/chat/route.ts": { "maxDuration": 300, "memory": 1024 }
  }
}
Why:
- maxDuration (Pro: 60s) is too short for serious chat.
See vercel-fluid-compute.
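A rough back-of-the-envelope check shows why a short limit is easy to exceed. The sketch below treats the output cap as applying to each tool-call step, and every input (throughput, per-step overhead) is a planning assumption rather than a measured value. `estimateStreamSeconds` and `fitsWithin` are hypothetical helpers.

```typescript
// Worst-case wall-clock estimate for one streaming request.
function estimateStreamSeconds(opts: {
  maxOutputTokens: number; // output cap, assumed to apply per step
  tokensPerSecond: number; // assumed model throughput
  maxSteps: number; // tool-call iterations allowed
  perStepOverheadSeconds: number; // assumed time to first token plus tool execution
}): number {
  const perStep =
    opts.maxOutputTokens / opts.tokensPerSecond + opts.perStepOverheadSeconds;
  return perStep * opts.maxSteps;
}

function fitsWithin(maxDurationSeconds: number, estimateSeconds: number): boolean {
  return estimateSeconds <= maxDurationSeconds;
}

// Illustrative numbers only: 2000 tokens at 50 tokens/s is 40s per step,
// plus 2s overhead, times 5 steps.
const worstCase = estimateStreamSeconds({
  maxOutputTokens: 2000,
  tokensPerSecond: 50,
  maxSteps: 5,
  perStepOverheadSeconds: 2,
});
console.log(worstCase, fitsWithin(60, worstCase), fitsWithin(300, worstCase)); // 210 false true
```

Under these assumptions a tool-heavy request overshoots a 60s limit by a wide margin while still fitting in 300s, which is the case for setting maxDuration per route rather than relying on a default.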
Default Slalom posture: route every AI SDK call through the AI Gateway.
import { createOpenAI } from "@ai-sdk/openai";

const openai = createOpenAI({
  baseURL: process.env.AI_GATEWAY_URL,
  apiKey: process.env.AI_GATEWAY_KEY,
});
// Now every call through this client lands in the Gateway dashboard
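Because both settings come from the environment, a missing variable would otherwise surface only when the first request fails. A small fail-fast check is one option; the sketch below is dependency-free, and `GatewayConfig` and `readGatewayConfig` are hypothetical names. Only the two variable names are taken from the snippet above.

```typescript
type GatewayConfig = { baseURL: string; apiKey: string };

// Read the Gateway settings from an env-like record and fail fast
// with a message naming every missing variable.
function readGatewayConfig(env: Record<string, string | undefined>): GatewayConfig {
  const baseURL = env["AI_GATEWAY_URL"];
  const apiKey = env["AI_GATEWAY_KEY"];
  const missing: string[] = [];
  if (!baseURL) missing.push("AI_GATEWAY_URL");
  if (!apiKey) missing.push("AI_GATEWAY_KEY");
  if (!baseURL || !apiKey) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return { baseURL, apiKey };
}
```

In a route module the result could be passed straight through, for example createOpenAI(readGatewayConfig(process.env)), so a misconfigured deployment fails at startup instead of mid-conversation.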
Or use the AI Gateway provider directly (@ai-sdk/gateway or equivalent):
import { streamText } from "ai";
import { gateway } from "@ai-sdk/gateway";

const result = streamText({
  model: gateway("openai/gpt-4o"),
  // ...
});
Multi-provider routing, fallback, observability — all configured at the Gateway layer, not in app code. See vercel-ai-gateway.
Per request, AI cost has three drivers:
- Input tokens: messages + system + tool definitions. Long conversations balloon this.
- Output tokens: maxOutputTokens caps this.
- Tool-call steps: maxSteps caps this.
Optimization order:
- maxOutputTokens deliberately set per route.
- gpt-4o-mini for routine; gpt-4o for hard cases. AI Gateway makes the swap easy.
For visibility into cost, see vercel-ai-gateway.
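The three drivers combine into a simple upper-bound estimate. The sketch below assumes every step resends the same input and produces the full capped output, which overstates typical usage. `Pricing` and `estimateRequestCost` are hypothetical names, and the prices in the usage example are placeholders, not real rates; take actual figures from the provider or the Gateway.

```typescript
// Prices in USD per million tokens, supplied by the caller.
type Pricing = { inputPerMillion: number; outputPerMillion: number };

// Upper-bound cost of one request: (input + output) per step, times steps.
function estimateRequestCost(opts: {
  inputTokens: number; // messages + system + tool definitions
  outputTokensPerStep: number; // typically the maxOutputTokens cap
  steps: number; // typically the maxSteps cap
  pricing: Pricing;
}): number {
  const totalInput = opts.inputTokens * opts.steps;
  const totalOutput = opts.outputTokensPerStep * opts.steps;
  return (
    (totalInput * opts.pricing.inputPerMillion +
      totalOutput * opts.pricing.outputPerMillion) /
    1e6
  );
}

// Placeholder prices for illustration only.
const placeholderPricing: Pricing = { inputPerMillion: 1, outputPerMillion: 4 };

const singleStep = estimateRequestCost({
  inputTokens: 3000,
  outputTokensPerStep: 2000,
  steps: 1,
  pricing: placeholderPricing,
});
const fiveSteps = estimateRequestCost({
  inputTokens: 3000,
  outputTokensPerStep: 2000,
  steps: 5,
  pricing: placeholderPricing,
});
console.log(singleStep, fiveSteps);
```

Multiplying the per-request bound by expected daily volume gives a first number for the monthly estimate, and it makes visible how much of the bill comes from the step cap alone.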
- maxDuration too short. Bump it.
- maxSteps not set. Cap it.
- generateText + manual parse. Use generateObject instead.
- useChat re-render loop. Usually missing memoization on the messages array.
- vercel-fluid-compute.
- console.log. Don't log full prompts in production; PII risk.
# AI Feature: [Name]
## What
{user-facing capability}
## Models
- Primary: {provider/model via Gateway}
- Fallback: {alternative}
## Patterns
- Streaming: {yes/no}
- Tool calling: {tools list, with destructive ones flagged}
- Structured output: {schemas}
- RAG: {if yes — embeddings model, vector DB}
## Function config
- maxDuration: {N}s
- memory: {MB}
- runtime: {nodejs | edge}
## Cost shape
- Expected tokens per call: {input ~N, output ~M}
- Volume: {calls per day}
- Estimated $/month: {N}
- Cap: {hard limit via Gateway}
## Observability
- Gateway dashboard: ✓
- App-side logging: structured JSON, no full prompt
## Failure modes & mitigations
- Provider down → fallback via Gateway
- Token cost spike → Gateway alerts, app-side rate limiting
- Tool error → graceful UI message, retry from user
## Risks
{prompt injection, PII, hallucination, others}
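The template's observability line ("structured JSON, no full prompt") can be met by logging metadata derived from the prompt rather than the prompt itself. This is a minimal sketch of that idea; `ChatLogFields`, `toLogLine`, and the field names are hypothetical, and a real deployment may need additional fields or stricter redaction.

```typescript
type ChatLogFields = {
  route: string;
  model: string;
  prompt: string; // full prompt, used only to derive safe metadata
  inputTokens: number;
  outputTokens: number;
  durationMs: number;
};

// Build one structured JSON log line that records the prompt's size
// but never its contents.
function toLogLine(fields: ChatLogFields): string {
  const { prompt, ...safe } = fields;
  return JSON.stringify({ ...safe, promptChars: prompt.length });
}
```

A line produced this way is still enough to debug latency and cost per route, while keeping user-supplied text out of the log pipeline.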
- vercel-ai-gateway (multi-provider, observability, cost).
- vercel-fluid-compute.
- vercel-storage.
- vercel-observability.
- vercel-security.
- vercel-agent-runtime.
- software-engineering-ai-engineer.
- software-engineering-agentic-workflow-engineer.
- ../../references/api-surface.md