From vercel
AI generation persistence patterns — unique IDs, addressable URLs, database storage, and cost tracking for every LLM generation
npx claudepluginhub robinebers/converted-plugins --plugin vercel
Slash command: /vercel:ai-generation-persistence
**AI generations are expensive, non-reproducible assets. Never discard them.**
Every call to an LLM costs real money and produces unique output that cannot be exactly reproduced. Treat generations like database records — assign an ID, persist immediately, and make them retrievable.
- Generate IDs with nanoid() or createId() from @paralleldrive/cuid2.
- Give every generation an addressable URL: /chat/[id], /generate/[id], /image/[id].

The standard UX flow for AI features: create the resource first, then redirect to its page.
// app/api/chat/route.ts
import { nanoid } from "nanoid";
import { redirect } from "next/navigation";
import { db } from "@/lib/db";
import { generations } from "@/lib/db/schema";

export async function POST(req: Request) {
  const { prompt, model } = await req.json();
  const id = nanoid();

  // Create the record BEFORE generation starts
  await db.insert(generations).values({
    id,
    prompt,
    model,
    status: "pending",
    createdAt: new Date(),
  });

  // Redirect to the generation page, which handles streaming
  redirect(`/chat/${id}`);
}
// app/chat/[id]/page.tsx
import { eq } from "drizzle-orm";
import { notFound } from "next/navigation";
import { db } from "@/lib/db";
import { generations } from "@/lib/db/schema";
// ChatView is your own client component; its import is not shown in the original

export default async function ChatPage({ params }: { params: Promise<{ id: string }> }) {
  const { id } = await params;

  const generation = await db.query.generations.findFirst({
    where: eq(generations.id, id),
  });
  if (!generation) notFound();

  // Render with streaming if still pending, or show saved result
  return <ChatView generation={generation} />;
}
This gives you: shareable URLs, back-button support, multi-tab sessions, and generation history for free.
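The create-first flow can be sketched without any framework. The example below is an illustration under assumptions, not the skill's own implementation: an in-memory Map stands in for the generations table, and the `generate` callback is a hypothetical stand-in for the model call. It shows the record existing before generation starts, and the outcome being persisted whether the call succeeds or fails.

```typescript
// Sketch only: an in-memory Map stands in for the database table.
type Status = "pending" | "streaming" | "complete" | "error";

interface Generation {
  id: string;
  prompt: string;
  model: string;
  status: Status;
  result?: string;
  error?: string;
}

const store = new Map<string, Generation>();

// Step 1: persist the record before any model call, so /chat/[id] resolves immediately.
function createGeneration(id: string, prompt: string, model: string): Generation {
  const record: Generation = { id, prompt, model, status: "pending" };
  store.set(id, record);
  return record;
}

// Step 2: run the model call and persist the outcome, success or failure.
async function runGeneration(
  id: string,
  generate: (prompt: string, model: string) => Promise<string>, // hypothetical model call
): Promise<Generation> {
  const record = store.get(id);
  if (!record) throw new Error(`Unknown generation: ${id}`);
  // A finished generation is never re-run; the saved result is returned as-is.
  if (record.status === "complete") return record;

  store.set(id, { ...record, status: "streaming" });
  try {
    const result = await generate(record.prompt, record.model);
    const done: Generation = { ...record, status: "complete", result };
    store.set(id, done);
    return done;
  } catch (err) {
    const failed: Generation = {
      ...record,
      status: "error",
      error: err instanceof Error ? err.message : String(err),
    };
    store.set(id, failed);
    return failed;
  }
}
```

Because a completed record short-circuits, reloading the page or opening it in a second tab reads the saved result instead of paying for a second generation.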
// lib/db/schema.ts
import { pgTable, text, integer, timestamp, jsonb } from "drizzle-orm/pg-core";

export const generations = pgTable("generations", {
  id: text("id").primaryKey(), // nanoid
  userId: text("user_id"), // auth user
  model: text("model").notNull(), // "openai/gpt-5.4"
  prompt: text("prompt"), // input text
  result: text("result"), // generated output
  imageUrls: jsonb("image_urls"), // Blob URLs for generated images
  tokenUsage: jsonb("token_usage"), // AI SDK 5+: { inputTokens, outputTokens, totalTokens }
  estimatedCostCents: integer("estimated_cost_cents"),
  status: text("status").default("pending"), // pending | streaming | complete | error
  createdAt: timestamp("created_at").defaultNow(),
});
| Data Type | Storage | Why |
|---|---|---|
| Text, metadata, history | Neon Postgres via Drizzle | Queryable, relational, supports search |
| Generated images & files | Vercel Blob (@vercel/blob) | Permanent URLs, CDN-backed, no expiry |
| Prompt dedup cache | Upstash Redis | Fast lookup, TTL-based expiry |
Never serve generated images as ephemeral base64 or temporary URLs. Save to Blob immediately:
import { put } from "@vercel/blob";
import { generateText } from "ai";
import { eq } from "drizzle-orm";
import { db } from "@/lib/db";
import { generations } from "@/lib/db/schema";

// model, prompt, and generationId come from the surrounding request handler
const result = await generateText({ model, prompt });

// Save every generated image to permanent storage
const imageUrls: string[] = [];
for (const [index, file] of (result.files ?? []).entries()) {
  if (file.mediaType?.startsWith("image/")) {
    const ext = file.mediaType.split("/")[1] || "png";
    // Include the index so several images from one generation don't share a pathname
    const blob = await put(`generations/${generationId}-${index}.${ext}`, file.uint8Array, {
      access: "public",
      contentType: file.mediaType,
    });
    imageUrls.push(blob.url);
  }
}

// Update the generation record with permanent URLs
await db.update(generations)
  .set({ imageUrls, status: "complete" })
  .where(eq(generations.id, generationId));
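Deriving the file extension by splitting the media type works for common cases but yields odd results for types such as image/svg+xml. A small helper can make the Blob pathname predictable. The following is a hedged sketch: `blobPathFor` and the `EXTENSIONS` table are hypothetical names introduced here, and the list of media types is an assumption you would extend for your own models.

```typescript
// Hypothetical lookup table: media type -> file extension.
const EXTENSIONS: Record<string, string> = {
  "image/png": "png",
  "image/jpeg": "jpg",
  "image/webp": "webp",
  "image/gif": "gif",
  "image/svg+xml": "svg",
};

// Hypothetical helper: builds a distinct Blob pathname for each image of a generation.
function blobPathFor(generationId: string, index: number, mediaType: string): string {
  // Drop parameters such as "; charset=utf-8" and normalize case before the lookup.
  const base = (mediaType.split(";")[0] ?? "").trim().toLowerCase();
  const ext = EXTENSIONS[base] ?? "png"; // same "png" fallback as the snippet above
  return `generations/${generationId}-${index}.${ext}`;
}
```

Including the image index in the pathname keeps a generation that returns several images from writing them all to the same key.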
Extract usage from every generation and store it. This enables billing, budgeting, and abuse detection:
const result = await generateText({ model, prompt });
// AI SDK 5+: { inputTokens, outputTokens, totalTokens }; AI SDK 4 used promptTokens / completionTokens
const usage = result.usage;
const estimatedCostCents = estimateCost(model, usage); // estimateCost is your own pricing helper

await db.update(generations).set({
  result: result.text,
  tokenUsage: usage,
  estimatedCostCents,
  status: "complete",
}).where(eq(generations.id, generationId));
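The snippet above calls `estimateCost` without defining it. One possible shape is sketched below, under loud assumptions: the model names and per-million-token rates in `PRICING` are placeholders rather than real prices, and the usage type accepts both the older (`promptTokens`/`completionTokens`) and newer (`inputTokens`/`outputTokens`) field names since these differ between AI SDK versions.

```typescript
// Token counts as reported by the SDK; all fields optional because providers may omit them.
interface TokenUsage {
  inputTokens?: number; // AI SDK 5 naming
  outputTokens?: number;
  promptTokens?: number; // AI SDK 4 naming
  completionTokens?: number;
}

// PLACEHOLDER rates in cents per one million tokens. These are not real prices.
const PRICING: Record<string, { inputCentsPerM: number; outputCentsPerM: number }> = {
  "example/model-a": { inputCentsPerM: 250, outputCentsPerM: 1000 },
  "example/model-b": { inputCentsPerM: 15, outputCentsPerM: 60 },
};

// Returns the estimated cost in whole cents (rounded up), or null for an unpriced model.
function estimateCost(model: string, usage: TokenUsage): number | null {
  const rates = PRICING[model];
  if (!rates) return null;

  const input = usage.inputTokens ?? usage.promptTokens ?? 0;
  const output = usage.outputTokens ?? usage.completionTokens ?? 0;

  const cents = (input * rates.inputCentsPerM + output * rates.outputCentsPerM) / 1_000_000;
  return Math.ceil(cents);
}
```

Rounding up to whole cents fits the integer column in the schema but overstates very small generations; storing a finer unit is an option if per-request precision matters. Returning null for unknown models keeps missing prices visible instead of recording them as free.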
Avoid paying for identical generations. Cache by content hash:
import { Redis } from "@upstash/redis";
import { createHash } from "crypto";

const redis = Redis.fromEnv();

function hashPrompt(model: string, prompt: string): string {
  return createHash("sha256").update(`${model}:${prompt}`).digest("hex");
}

// Check cache before generating
const cacheKey = `gen:${hashPrompt(model, prompt)}`;
const cached = await redis.get<string>(cacheKey);
if (cached) return cached; // Return cached generation ID

// After generation, cache the result
await redis.set(cacheKey, generationId, { ex: 3600 }); // 1hr TTL
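The check-then-store steps above can be folded into one function so callers cannot forget the second half. This is a minimal sketch, assuming a small cache interface of my own (`GenerationCache`) with an in-memory stand-in for Redis; `getOrCreateGenerationId` and the `generate` callback are hypothetical names, not part of any library.

```typescript
import { createHash } from "node:crypto";

// Minimal cache contract; a Redis client could be adapted to it.
interface GenerationCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

function hashPrompt(model: string, prompt: string): string {
  return createHash("sha256").update(`${model}:${prompt}`).digest("hex");
}

// Returns the ID of a cached generation for this (model, prompt), or creates a new one.
async function getOrCreateGenerationId(
  cache: GenerationCache,
  model: string,
  prompt: string,
  generate: () => Promise<string>, // hypothetical: runs the model, persists, returns the new ID
  ttlSeconds = 3600,
): Promise<{ generationId: string; cached: boolean }> {
  const cacheKey = `gen:${hashPrompt(model, prompt)}`;
  const hit = await cache.get(cacheKey);
  if (hit !== null) return { generationId: hit, cached: true };

  const generationId = await generate();
  await cache.set(cacheKey, generationId, ttlSeconds);
  return { generationId, cached: false };
}

// In-memory stand-in for tests and local runs (ignores the TTL).
function createMemoryCache(): GenerationCache {
  const entries = new Map<string, string>();
  return {
    async get(key) {
      return entries.get(key) ?? null;
    },
    async set(key, value) {
      entries.set(key, value);
    },
  };
}
```

Two identical requests arriving at the same moment can both miss the cache and both generate; the sketch accepts that rather than adding a lock.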
Routes without [id] segments: /api/chat with no ID means generations aren't addressable. Use /chat/[id].