Integrate Gemini API with correct current SDK (@google/genai v1.27+, NOT deprecated @google/generative-ai). Supports text generation, multimodal (images/video/audio/PDFs), function calling, and thinking mode. 1M input tokens. Use when: integrating Gemini API, implementing multimodal AI, using thinking mode for reasoning, function calling with parallel execution, streaming responses, deploying to Cloudflare Workers, building chat, or troubleshooting SDK deprecation, context window, model not found, function calling, or multimodal format errors. Keywords: gemini api, @google/genai, gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite, gemini-3-pro-preview, multimodal gemini, thinking mode, google ai, genai sdk, function calling gemini, streaming gemini, gemini vision, gemini video, gemini audio, gemini pdf, system instructions, multi-turn chat, DEPRECATED @google/generative-ai, gemini context window, gemini models 2025, gemini 1m tokens, gemini tool use, parallel function calling, compositional function calling, gemini 3
Inherits all available tools
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Additional assets for this skill
README.md
references/code-execution-patterns.md
references/context-caching-guide.md
references/function-calling-patterns.md
references/generation-config.md
references/grounding-guide.md
references/models-guide.md
references/multimodal-guide.md
references/sdk-migration-guide.md
references/streaming-patterns.md
references/thinking-mode-guide.md
references/top-errors.md
rules/google-gemini-api.md
scripts/check-versions.sh
templates/cloudflare-worker.ts
templates/code-execution.ts
templates/combined-advanced.ts
templates/context-caching.ts
templates/function-calling-basic.ts
templates/function-calling-parallel.ts
name: google-gemini-api
description: |
Integrate Gemini API with correct current SDK (@google/genai v1.27+, NOT deprecated @google/generative-ai). Supports text generation, multimodal (images/video/audio/PDFs), function calling, and thinking mode. 1M input tokens.
Use when: integrating Gemini API, implementing multimodal AI, using thinking mode for reasoning, function calling with parallel execution, streaming responses, deploying to Cloudflare Workers, building chat, or troubleshooting SDK deprecation, context window, model not found, function calling, or multimodal format errors.
Version: Phase 2 Complete + Gemini 3 ✅
Package: @google/genai@1.30.0 (⚠️ NOT @google/generative-ai)
Last Updated: 2025-11-26 (Package update + FileSearch preview)
DEPRECATED SDK: @google/generative-ai (sunset November 30, 2025)
CURRENT SDK: @google/genai v1.27+
If you see code using @google/generative-ai, it's outdated!
This skill uses the correct current SDK and provides a complete migration guide.
✅ Phase 1 Complete:
✅ Phase 2 Complete:
📦 Separate Skills:
- google-gemini-embeddings skill for text-embedding-004
Phase 1 - Core Features:
Phase 2 - Advanced Features:
12. Context Caching
13. Code Execution
14. Grounding with Google Search
Common Reference:
15. Error Handling
16. Rate Limits
17. SDK Migration Guide
18. Production Best Practices
CORRECT SDK:
npm install @google/genai@1.30.0
❌ WRONG (DEPRECATED):
npm install @google/generative-ai # DO NOT USE!
export GEMINI_API_KEY="..."
Or create .env file:
GEMINI_API_KEY=...
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Explain quantum computing in simple terms'
});
console.log(response.text);
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: 'Explain quantum computing in simple terms' }] }]
}),
}
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
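The fetch examples in this guide repeat the same URL and header boilerplate. A minimal sketch of factoring it out follows; `buildGenerateContentRequest` and `GeminiRequest` are hypothetical names for illustration, not part of `@google/genai`.

```typescript
// Hypothetical helper (not part of @google/genai): builds the URL and
// fetch options for a single-prompt generateContent call.
interface GeminiRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

const GEMINI_BASE_URL = 'https://generativelanguage.googleapis.com/v1beta';

function buildGenerateContentRequest(
  model: string,
  prompt: string,
  apiKey: string,
): GeminiRequest {
  return {
    url: `${GEMINI_BASE_URL}/models/${model}:generateContent`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'x-goog-api-key': apiKey,
      },
      // Same body shape as the fetch example above: one content with one text part.
      body: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
    },
  };
}
```

The result can be passed straight to fetch as `fetch(request.url, request.init)`.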
| Feature | 3-Pro (Preview) | 2.5-Pro | 2.5-Flash | 2.5-Flash-Lite |
|---|---|---|---|---|
| Thinking Mode | TBD | ✅ Default ON | ✅ Default ON | ✅ Default ON |
| Function Calling | ✅ | ✅ | ✅ | ✅ |
| Multimodal | ✅ Enhanced | ✅ | ✅ | ✅ |
| Streaming | ✅ | ✅ | ✅ | ✅ |
| System Instructions | ✅ | ✅ | ✅ | ✅ |
| Context Window | TBD | 1,048,576 in | 1,048,576 in | 1,048,576 in |
| Output Tokens | TBD | 65,536 max | 65,536 max | 65,536 max |
| Status | Preview | Stable | Stable | Stable |
ACCURATE (Gemini 2.5): Gemini 2.5 models support 1,048,576 input tokens (NOT 2M!)
OUTDATED: Only Gemini 1.5 Pro (previous generation) had 2M token context window
GEMINI 3: Context window specifications pending official documentation
Common mistake: Claiming Gemini 2.5 has 2M tokens. It doesn't. This skill prevents this error.
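The 1,048,576-token limit can be enforced before a request is sent. This is a minimal sketch: `fitsContextWindow` is a hypothetical helper, and the token count is assumed to come from elsewhere (for example, `usageMetadata.promptTokenCount` of an earlier response).

```typescript
// Input limit for Gemini 2.5 models, as stated above (1,048,576 tokens).
const GEMINI_25_MAX_INPUT_TOKENS = 1_048_576;

// Hypothetical helper: true when the prompt fits, leaving optional headroom
// (reserveTokens) for content that will be appended later.
function fitsContextWindow(promptTokens: number, reserveTokens: number = 0): boolean {
  if (promptTokens < 0 || reserveTokens < 0) {
    throw new RangeError('token counts must be non-negative');
  }
  return promptTokens + reserveTokens <= GEMINI_25_MAX_INPUT_TOKENS;
}
```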
Pros:
Cons:
Use when: Building Node.js apps, Next.js Server Actions/Components, or any environment with Node.js compatibility
Pros:
Cons:
Use when: Deploying to Cloudflare Workers, browser clients, or lightweight edge runtimes
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Write a haiku about artificial intelligence'
});
console.log(response.text);
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{
parts: [
{ text: 'Write a haiku about artificial intelligence' }
]
}
]
}),
}
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
{
text: string, // Convenience accessor for text content
candidates: [
{
content: {
parts: [
{ text: string } // Generated text
],
role: string // "model"
},
finishReason: string, // "STOP" | "MAX_TOKENS" | "SAFETY" | "OTHER"
index: number
}
],
usageMetadata: {
promptTokenCount: number,
candidatesTokenCount: number,
totalTokenCount: number
}
}
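With fetch there is no `text` convenience accessor, so the text has to be collected from the parts. The sketch below assumes the response shape shown above; `extractText` and the interface names are hypothetical.

```typescript
interface GeminiPart { text?: string }
interface GeminiCandidate {
  content?: { parts?: GeminiPart[]; role?: string };
  finishReason?: string;
}
interface GeminiResponseBody { candidates?: GeminiCandidate[] }

// Hypothetical helper: joins all text parts of the first candidate and
// reports whether generation stopped because of the token limit.
function extractText(body: GeminiResponseBody): { text: string; truncated: boolean } {
  const candidate = body.candidates?.[0];
  const parts = candidate?.content?.parts ?? [];
  const text = parts.map((part) => part.text ?? '').join('');
  return { text, truncated: candidate?.finishReason === 'MAX_TOKENS' };
}
```

Unlike indexing `parts[0]` directly, this does not throw when the candidate list or the parts are missing.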
const response = await ai.models.generateContentStream({
model: 'gemini-2.5-flash',
contents: 'Write a 200-word story about time travel'
});
for await (const chunk of response) {
process.stdout.write(chunk.text ?? ''); // chunk.text can be undefined for chunks without text
}
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse`, // alt=sse is required for "data: ..." lines
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: 'Write a 200-word story about time travel' }] }]
}),
}
);
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() || '';
for (const line of lines) {
if (line.trim() === '' || line.startsWith('data: [DONE]')) continue;
if (!line.startsWith('data: ')) continue;
try {
const data = JSON.parse(line.slice(6));
const text = data.candidates[0]?.content?.parts[0]?.text;
if (text) {
process.stdout.write(text);
}
} catch (e) {
// Skip invalid JSON
}
}
}
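The parsing loop above can also be factored into a pure function, which makes it testable without a network connection. A minimal sketch, assuming the same `data: {json}` line format; `parseSseBuffer` is a hypothetical name.

```typescript
// Hypothetical helper: consumes the complete lines in `buffer` and returns the
// text fragments found, plus the trailing partial line to carry over.
function parseSseBuffer(buffer: string): { texts: string[]; rest: string } {
  const lines = buffer.split('\n');
  const rest = lines.pop() ?? ''; // last element may be an incomplete line
  const texts: string[] = [];
  for (const rawLine of lines) {
    const line = rawLine.trim();
    if (!line.startsWith('data: ') || line === 'data: [DONE]') continue;
    try {
      const data = JSON.parse(line.slice(6));
      const text = data?.candidates?.[0]?.content?.parts?.[0]?.text;
      if (typeof text === 'string') texts.push(text);
    } catch {
      // Skip lines that are not valid JSON.
    }
  }
  return { texts, rest };
}
```

In the read loop, the caller would append each decoded chunk to `rest` and call the function again.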
Key Points:
- streamGenerateContent endpoint (not generateContent)
- Server-sent event lines: data: {json}\n\n
- [DONE] markers
Gemini 2.5 models support text + images + video + audio + PDFs in the same request.
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// From file
const imageData = fs.readFileSync('/path/to/image.jpg');
const base64Image = imageData.toString('base64');
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: [
{
parts: [
{ text: 'What is in this image?' },
{
inlineData: {
data: base64Image,
mimeType: 'image/jpeg'
}
}
]
}
]
});
console.log(response.text);
const imageData = fs.readFileSync('/path/to/image.jpg');
const base64Image = imageData.toString('base64');
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{
parts: [
{ text: 'What is in this image?' },
{
inlineData: {
data: base64Image,
mimeType: 'image/jpeg'
}
}
]
}
]
}),
}
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
Supported Image Formats:
- JPEG (.jpg, .jpeg)
- PNG (.png)
- WebP (.webp)
- HEIC (.heic)
- HEIF (.heif)
Max Image Size: 20MB per image
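The format list and size limit above can be checked before building an `inlineData` part. This is a sketch under assumptions: the helper names are hypothetical, the MIME strings follow the usual `image/<format>` convention, and "20MB" is interpreted as 20 × 1024 × 1024 bytes.

```typescript
// Extension-to-MIME map for the formats listed above.
const IMAGE_MIME_TYPES = new Map<string, string>([
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg'],
  ['png', 'image/png'],
  ['webp', 'image/webp'],
  ['heic', 'image/heic'],
  ['heif', 'image/heif'],
]);
const MAX_IMAGE_BYTES = 20 * 1024 * 1024; // assumption: 20MB means 20 MiB

interface InlineDataPart { inlineData: { data: string; mimeType: string } }

// Hypothetical helper: MIME type for a supported image, else undefined.
function imageMimeType(filename: string): string | undefined {
  const dot = filename.lastIndexOf('.');
  if (dot < 0) return undefined;
  return IMAGE_MIME_TYPES.get(filename.slice(dot + 1).toLowerCase());
}

// Hypothetical helper: validates format and size, then builds the part.
function toInlineImagePart(filename: string, base64Data: string): InlineDataPart {
  const mimeType = imageMimeType(filename);
  if (mimeType === undefined) throw new Error(`Unsupported image format: ${filename}`);
  // Padded base64 encodes 3 bytes per 4 characters, minus the padding.
  const padding = base64Data.endsWith('==') ? 2 : base64Data.endsWith('=') ? 1 : 0;
  const decodedBytes = (base64Data.length / 4) * 3 - padding;
  if (decodedBytes > MAX_IMAGE_BYTES) throw new Error('Image exceeds 20MB limit');
  return { inlineData: { data: base64Data, mimeType } };
}
```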
// Video must be < 2 minutes for inline data
const videoData = fs.readFileSync('/path/to/video.mp4');
const base64Video = videoData.toString('base64');
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: [
{
parts: [
{ text: 'Describe what happens in this video' },
{
inlineData: {
data: base64Video,
mimeType: 'video/mp4'
}
}
]
}
]
});
console.log(response.text);
Supported Video Formats:
- MP4 (.mp4)
- MPEG (.mpeg)
- MOV (.mov)
- AVI (.avi)
- FLV (.flv)
- MPG (.mpg)
- WebM (.webm)
- WMV (.wmv)
Max Video Length (inline): 2 minutes
Max Video Size: 2GB (use File API for larger files - Phase 2)
const audioData = fs.readFileSync('/path/to/audio.mp3');
const base64Audio = audioData.toString('base64');
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: [
{
parts: [
{ text: 'Transcribe and summarize this audio' },
{
inlineData: {
data: base64Audio,
mimeType: 'audio/mp3'
}
}
]
}
]
});
console.log(response.text);
Supported Audio Formats:
- MP3 (.mp3)
- WAV (.wav)
- FLAC (.flac)
- AAC (.aac)
- OGG (.ogg)
- OPUS (.opus)
Max Audio Size: 20MB
const pdfData = fs.readFileSync('/path/to/document.pdf');
const base64Pdf = pdfData.toString('base64');
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: [
{
parts: [
{ text: 'Summarize the key points in this PDF' },
{
inlineData: {
data: base64Pdf,
mimeType: 'application/pdf'
}
}
]
}
]
});
console.log(response.text);
Max PDF Size: 30MB
PDF Limitations: Text-based PDFs work best; scanned images may have lower accuracy
You can combine multiple modalities in one request:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: [
{
parts: [
{ text: 'Compare these two images and describe the differences:' },
{ inlineData: { data: base64Image1, mimeType: 'image/jpeg' } },
{ inlineData: { data: base64Image2, mimeType: 'image/jpeg' } }
]
}
]
});
Gemini supports function calling (tool use) to connect models with external APIs and systems.
import { GoogleGenAI, FunctionCallingConfigMode } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// Define function declarations
const getCurrentWeather = {
name: 'get_current_weather',
description: 'Get the current weather for a location',
parametersJsonSchema: {
type: 'object',
properties: {
location: {
type: 'string',
description: 'City name, e.g. San Francisco'
},
unit: {
type: 'string',
enum: ['celsius', 'fahrenheit']
}
},
required: ['location']
}
};
// Make request with tools
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What\'s the weather in Tokyo?',
config: {
tools: [
{ functionDeclarations: [getCurrentWeather] }
]
}
});
// Check if model wants to call a function
const functionCall = response.candidates[0].content.parts[0].functionCall;
if (functionCall) {
console.log('Function to call:', functionCall.name);
console.log('Arguments:', functionCall.args);
// Execute the function (your implementation)
const weatherData = await fetchWeather(functionCall.args.location);
// Send function result back to model
const finalResponse = await ai.models.generateContent({
model: 'gemini-2.5-flash',
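contents: [
{ role: 'user', parts: [{ text: 'What\'s the weather in Tokyo?' }] }, // Plain strings cannot be mixed with Content objects
response.candidates[0].content, // Original model turn containing the function call
{
role: 'user',
parts: [
{
functionResponse: {
name: functionCall.name,
response: { result: weatherData } // response must be an object
}
}
]
}
],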
config: {
tools: [
{ functionDeclarations: [getCurrentWeather] }
]
}
});
console.log(finalResponse.text);
}
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{ parts: [{ text: 'What\'s the weather in Tokyo?' }] }
],
tools: [
{
functionDeclarations: [
{
name: 'get_current_weather',
description: 'Get the current weather for a location',
parameters: {
type: 'object',
properties: {
location: {
type: 'string',
description: 'City name'
}
},
required: ['location']
}
}
]
}
]
}),
}
);
const data = await response.json();
const functionCall = data.candidates[0]?.content?.parts[0]?.functionCall;
if (functionCall) {
// Execute function and send result back (same flow as SDK)
}
Gemini can call multiple independent functions simultaneously:
const tools = [
{
functionDeclarations: [
{
name: 'get_weather',
description: 'Get weather for a location',
parametersJsonSchema: {
type: 'object',
properties: {
location: { type: 'string' }
},
required: ['location']
}
},
{
name: 'get_population',
description: 'Get population of a city',
parametersJsonSchema: {
type: 'object',
properties: {
city: { type: 'string' }
},
required: ['city']
}
}
]
}
];
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What is the weather and population of Tokyo?',
config: { tools }
});
// Model may return MULTIPLE function calls in parallel
const functionCalls = response.candidates[0].content.parts.filter(
part => part.functionCall
);
console.log(`Model wants to call ${functionCalls.length} functions in parallel`);
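Once the model returns several function calls, they can be executed concurrently and their results converted to `functionResponse` parts for the follow-up request. The sketch below is one possible shape; `runFunctionCalls`, `toFunctionResponsePart`, and the handler registry are hypothetical, and wrapping results under a `result` key is an assumption rather than an API requirement.

```typescript
interface ModelFunctionCall { name: string; args: Record<string, unknown> }
interface FunctionResponsePart {
  functionResponse: { name: string; response: Record<string, unknown> };
}
type FunctionHandler = (args: Record<string, unknown>) => unknown;

// Wraps a handler result in the functionResponse part shape used above.
function toFunctionResponsePart(name: string, result: unknown): FunctionResponsePart {
  return { functionResponse: { name, response: { result } } };
}

// Runs all calls concurrently; output order matches the order of `calls`.
async function runFunctionCalls(
  calls: ModelFunctionCall[],
  handlers: Record<string, FunctionHandler>,
): Promise<FunctionResponsePart[]> {
  return Promise.all(
    calls.map(async (call): Promise<FunctionResponsePart> => {
      const handler = handlers[call.name];
      if (!handler) {
        // Report unknown functions back to the model instead of throwing.
        return {
          functionResponse: {
            name: call.name,
            response: { error: `Unknown function: ${call.name}` },
          },
        };
      }
      return toFunctionResponsePart(call.name, await handler(call.args));
    }),
  );
}
```

The calls would come from the `functionCall` field of each filtered part, and the returned parts go into a single `user` turn of the next request.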
import { FunctionCallingConfigMode } from '@google/genai';
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What\'s the weather?',
config: {
tools: [{ functionDeclarations: [getCurrentWeather] }],
toolConfig: {
functionCallingConfig: {
mode: FunctionCallingConfigMode.ANY, // Force function call
// mode: FunctionCallingConfigMode.AUTO, // Model decides (default)
// mode: FunctionCallingConfigMode.NONE, // Never call functions
allowedFunctionNames: ['get_current_weather'] // Optional: restrict to specific functions
}
}
}
});
Modes:
- AUTO (default): Model decides whether to call functions
- ANY: Force model to call at least one function
- NONE: Disable function calling for this request
System instructions guide the model's behavior and set context. They are separate from the conversation messages.
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Explain what a database is',
config: {
// In @google/genai, systemInstruction belongs inside config
systemInstruction: 'You are a helpful AI assistant that always responds in the style of a pirate. Use nautical terminology and end sentences with "arrr".'
}
});
console.log(response.text);
// Output: "Ahoy there! A database be like a treasure chest..."
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
systemInstruction: {
parts: [
{ text: 'You are a helpful AI assistant that always responds in the style of a pirate.' }
]
},
contents: [
{ parts: [{ text: 'Explain what a database is' }] }
]
}),
}
);
Key Points:
- contents array
For conversations with history, use the SDK's chat helpers or manually manage conversation state.
// Chats are created via ai.chats.create (there is no ai.models.createChat)
const chat = ai.chats.create({
model: 'gemini-2.5-flash',
config: {
systemInstruction: 'You are a helpful coding assistant.'
},
history: [] // Start empty or with previous messages
});
// Send first message (sendMessage takes an object with a message field)
const response1 = await chat.sendMessage({ message: 'What is TypeScript?' });
console.log('Assistant:', response1.text);
// Send follow-up (context is automatically maintained)
const response2 = await chat.sendMessage({ message: 'How do I install it?' });
console.log('Assistant:', response2.text);
// Get full chat history
const history = chat.getHistory();
console.log('Full conversation:', history);
const conversationHistory = [];
// First turn
const response1 = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{
role: 'user',
parts: [{ text: 'What is TypeScript?' }]
}
]
}),
}
);
const data1 = await response1.json();
const assistantReply1 = data1.candidates[0].content.parts[0].text;
// Add to history
conversationHistory.push(
{ role: 'user', parts: [{ text: 'What is TypeScript?' }] },
{ role: 'model', parts: [{ text: assistantReply1 }] }
);
// Second turn (include full history)
const response2 = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
...conversationHistory,
{ role: 'user', parts: [{ text: 'How do I install it?' }] }
]
}),
}
);
Message Roles:
- user: User messages
- model: Assistant responses
⚠️ Important: Chat helpers are SDK-only. With fetch, you must manually manage conversation history.
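Manual history management for the fetch path can be wrapped in a small class so that each turn only supplies the new user message. A minimal sketch; `ConversationHistory` is a hypothetical helper, not an SDK type.

```typescript
type ChatRole = 'user' | 'model';
interface ChatContent { role: ChatRole; parts: { text: string }[] }

// Hypothetical helper: keeps the turn list and builds request bodies from it.
class ConversationHistory {
  private readonly turns: ChatContent[] = [];

  // Builds the JSON body for the next turn without mutating the history.
  buildRequestBody(nextUserMessage: string): string {
    const contents: ChatContent[] = [
      ...this.turns,
      { role: 'user', parts: [{ text: nextUserMessage }] },
    ];
    return JSON.stringify({ contents });
  }

  // Records a completed exchange once the model reply has arrived.
  record(userMessage: string, modelReply: string): void {
    this.turns.push(
      { role: 'user', parts: [{ text: userMessage }] },
      { role: 'model', parts: [{ text: modelReply }] },
    );
  }
}
```

Recording only after a successful reply keeps failed requests out of the history, so a retry sends the same body.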
Gemini 2.5 models have thinking mode enabled by default for enhanced quality. You can configure the thinking budget.
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Solve this complex math problem: ...',
config: {
thinkingConfig: {
thinkingBudget: 8192 // Max tokens for thinking (default: model-dependent)
}
}
});
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: 'Solve this complex math problem: ...' }] }],
generationConfig: {
thinkingConfig: {
thinkingBudget: 8192
}
}
}),
}
);
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Solve this complex problem: ...',
config: {
thinkingConfig: {
thinkingLevel: 'MEDIUM' // 'LOW' | 'MEDIUM' | 'HIGH'
}
}
});
Thinking Levels:
- LOW: Minimal internal reasoning (faster, lower quality)
- MEDIUM: Balanced reasoning (default)
- HIGH: Maximum reasoning depth (slower, higher quality)
Key Points:
- thinkingLevel provides simpler control than thinkingBudget (new in v1.30.0)
Customize model behavior with generation parameters.
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Write a creative story',
config: {
temperature: 0.9, // Randomness (0.0-2.0, default: 1.0)
topP: 0.95, // Nucleus sampling (0.0-1.0)
topK: 40, // Top-k sampling
maxOutputTokens: 2048, // Max tokens to generate
stopSequences: ['END'], // Stop generation if these appear
responseMimeType: 'text/plain', // Or 'application/json' for JSON mode
candidateCount: 1 // Number of response candidates (usually 1)
}
});
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: 'Write a creative story' }] }],
generationConfig: {
temperature: 0.9,
topP: 0.95,
topK: 40,
maxOutputTokens: 2048,
stopSequences: ['END'],
responseMimeType: 'text/plain',
candidateCount: 1
}
}),
}
);
| Parameter | Range | Default | Use Case |
|---|---|---|---|
| temperature | 0.0-2.0 | 1.0 | Lower = more focused, higher = more creative |
| topP | 0.0-1.0 | 0.95 | Nucleus sampling threshold |
| topK | 1-100+ | 40 | Limit to top K tokens |
| maxOutputTokens | 1-65536 | Model max | Control response length |
| stopSequences | Array | None | Stop generation at specific strings |
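The ranges in the table can be checked client-side before sending a request. The sketch below assumes the table's bounds; `validateGenerationConfig` and `SamplingConfig` are hypothetical names.

```typescript
interface SamplingConfig {
  temperature?: number;
  topP?: number;
  topK?: number;
  maxOutputTokens?: number;
}

// Hypothetical helper: returns one message per out-of-range parameter
// (an empty array means the config is within the documented ranges).
function validateGenerationConfig(config: SamplingConfig): string[] {
  const problems: string[] = [];
  const { temperature, topP, topK, maxOutputTokens } = config;
  if (temperature !== undefined && (temperature < 0 || temperature > 2)) {
    problems.push('temperature must be between 0.0 and 2.0');
  }
  if (topP !== undefined && (topP < 0 || topP > 1)) {
    problems.push('topP must be between 0.0 and 1.0');
  }
  if (topK !== undefined && (!Number.isInteger(topK) || topK < 1)) {
    problems.push('topK must be an integer >= 1');
  }
  if (
    maxOutputTokens !== undefined &&
    (!Number.isInteger(maxOutputTokens) || maxOutputTokens < 1 || maxOutputTokens > 65536)
  ) {
    problems.push('maxOutputTokens must be an integer between 1 and 65536');
  }
  return problems;
}
```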
Tips:
Context caching allows you to cache frequently used content (like system instructions, large documents, or video files) to reduce costs by up to 90% and improve latency.
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// Create a cache for a large document
const documentText = fs.readFileSync('./large-document.txt', 'utf-8');
const cache = await ai.caches.create({
model: 'gemini-2.5-flash',
config: {
displayName: 'large-doc-cache', // Identifier for the cache
systemInstruction: 'You are an expert at analyzing legal documents.',
contents: documentText,
ttl: '3600s', // Cache for 1 hour
}
});
console.log('Cache created:', cache.name);
console.log('Expires at:', cache.expireTime);
const response = await fetch(
'https://generativelanguage.googleapis.com/v1beta/cachedContents',
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
model: 'models/gemini-2.5-flash',
displayName: 'large-doc-cache',
systemInstruction: {
parts: [{ text: 'You are an expert at analyzing legal documents.' }]
},
contents: [
{ parts: [{ text: documentText }] }
],
ttl: '3600s'
}),
}
);
const cache = await response.json();
console.log('Cache created:', cache.name);
// Generate content using the cache
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash', // Same model the cache was created with
contents: 'Summarize the key points in the document',
config: {
cachedContent: cache.name // Reference the cache by its resource name
}
});
console.log(response.text);
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
cachedContent: cache.name, // Cache resource name returned at creation
contents: [
{ role: 'user', parts: [{ text: 'Summarize the key points in the document' }] }
]
}),
}
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
import { UpdateCachedContentConfig } from '@google/genai';
await ai.caches.update({
name: cache.name,
config: {
ttl: '7200s' // Extend to 2 hours
}
});
// Set a specific expiration time (passed as an RFC 3339 timestamp string)
const in10Minutes = new Date(Date.now() + 10 * 60 * 1000);
await ai.caches.update({
name: cache.name,
config: {
expireTime: in10Minutes.toISOString()
}
});
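The `ttl` and `expireTime` values can be derived from a number of seconds rather than written by hand. A minimal sketch, assuming `ttl` is a seconds string such as '3600s' (as in the examples above) and that `expireTime` accepts an RFC 3339 timestamp string; both helper names are hypothetical.

```typescript
// Hypothetical helper: formats a duration as the "<seconds>s" string used for ttl.
function ttlFromSeconds(seconds: number): string {
  if (!Number.isFinite(seconds) || seconds <= 0) {
    throw new RangeError('ttl must be a positive number of seconds');
  }
  return `${Math.round(seconds)}s`;
}

// Hypothetical helper: absolute expiry as a UTC timestamp string.
// `now` is injectable so the function is deterministic in tests.
function expireTimeFromNow(seconds: number, now: Date = new Date()): string {
  return new Date(now.getTime() + seconds * 1000).toISOString();
}
```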
// List all caches
const caches = await ai.caches.list(); // Returns an async-iterable pager
for await (const cachedItem of caches) {
console.log(cachedItem.name, cachedItem.displayName);
}
// Delete a specific cache
await ai.caches.delete({ name: cache.name });
import { GoogleGenAI, FileState, createUserContent, createPartFromUri } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// Upload video file (in Node.js the SDK accepts a file path)
let videoFile = await ai.files.upload({
file: './video.mp4',
config: { mimeType: 'video/mp4' }
});
// Wait for processing (videoFile is reassigned, so it is declared with let)
while (videoFile.state === FileState.PROCESSING) {
await new Promise(resolve => setTimeout(resolve, 2000));
videoFile = await ai.files.get({ name: videoFile.name });
}
// Create cache with video, referenced by its uploaded URI
const cache = await ai.caches.create({
model: 'gemini-2.5-flash',
config: {
displayName: 'video-analysis-cache',
systemInstruction: 'You are an expert video analyzer.',
contents: createUserContent(createPartFromUri(videoFile.uri, videoFile.mimeType)),
ttl: '300s' // 5 minutes
}
});
// Use cache for multiple queries
const response1 = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What happens in the first minute?',
config: { cachedContent: cache.name }
});
const response2 = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Describe the main characters',
config: { cachedContent: cache.name }
});
When to Use Caching:
TTL Guidelines:
Cost Savings:
Important:
- gemini-2.5-flash-001, NOT just gemini-2.5-flash)
Gemini models can generate and execute Python code to solve problems requiring computation, data analysis, or visualization.
Standard Library:
- math, statistics, random, datetime, json, csv, re
- collections, itertools, functools
Data Science:
- numpy, pandas, scipy
Visualization:
- matplotlib, seaborn
Note: Limited package availability compared to full Python environment
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What is the sum of the first 50 prime numbers? Generate and run code for the calculation.',
config: {
tools: [{ codeExecution: {} }]
}
});
// Parse response parts
for (const part of response.candidates[0].content.parts) {
if (part.text) {
console.log('Text:', part.text);
}
if (part.executableCode) {
console.log('Generated Code:', part.executableCode.code);
}
if (part.codeExecutionResult) {
console.log('Execution Output:', part.codeExecutionResult.output);
}
}
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
tools: [{ code_execution: {} }],
contents: [
{
parts: [
{ text: 'What is the sum of the first 50 prime numbers? Generate and run code.' }
]
}
]
}),
}
);
const data = await response.json();
for (const part of data.candidates[0].content.parts) {
if (part.text) {
console.log('Text:', part.text);
}
if (part.executableCode) {
console.log('Code:', part.executableCode.code);
}
if (part.codeExecutionResult) {
console.log('Result:', part.codeExecutionResult.output);
}
}
const chat = await ai.chats.create({
model: 'gemini-2.5-flash',
config: {
tools: [{ codeExecution: {} }]
}
});
let response = await chat.sendMessage({ message: 'I have a math question for you.' });
console.log(response.text);
response = await chat.sendMessage({
message: 'Calculate the Fibonacci sequence up to the 20th number and sum them.'
});
// Model will generate and execute code, then provide answer
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('Code:', part.executableCode.code);
if (part.codeExecutionResult) console.log('Output:', part.codeExecutionResult.output);
}
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: `
Analyze this sales data and calculate:
1. Total revenue
2. Average sale price
3. Best-selling month
Data (CSV format):
month,sales,revenue
Jan,150,45000
Feb,200,62000
Mar,175,53000
Apr,220,68000
`,
config: {
tools: [{ codeExecution: {} }]
}
});
// Model will generate pandas/numpy code to analyze data
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('Analysis Code:', part.executableCode.code);
if (part.codeExecutionResult) console.log('Results:', part.codeExecutionResult.output);
}
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Create a bar chart showing the distribution of prime numbers under 100 by their last digit. Generate the chart and describe the pattern.',
config: {
tools: [{ codeExecution: {} }]
}
});
// Model generates matplotlib code, executes it, and describes results
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('Chart Code:', part.executableCode.code);
if (part.codeExecutionResult) {
// Note: Chart image data would be in output
console.log('Execution completed');
}
}
{
candidates: [
{
content: {
parts: [
{ text: "I'll calculate that for you." },
{
executableCode: {
language: "PYTHON",
code: "def is_prime(n):\n if n <= 1:\n return False\n ..."
}
},
{
codeExecutionResult: {
outcome: "OUTCOME_OK", // or "OUTCOME_FAILED"
output: "5117\n"
}
},
{ text: "The sum of the first 50 prime numbers is 5117." }
]
}
}
]
}
for (const part of response.candidates[0].content.parts) {
if (part.codeExecutionResult) {
if (part.codeExecutionResult.outcome === 'OUTCOME_FAILED') {
console.error('Code execution failed:', part.codeExecutionResult.output);
} else {
console.log('Success:', part.codeExecutionResult.output);
}
}
}
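The per-part checks above can be gathered into a single summary object, which is convenient when the caller only needs the final text and a success flag. A sketch assuming the part shape shown in the response structure; `summarizeCodeExecution` is a hypothetical helper.

```typescript
interface CodeExecutionPart {
  text?: string;
  executableCode?: { language?: string; code: string };
  codeExecutionResult?: { outcome: string; output?: string };
}
interface CodeExecutionSummary {
  text: string;      // all text parts, concatenated in order
  code: string[];    // each generated code snippet
  outputs: string[]; // output of each execution
  failed: boolean;   // true if any outcome was not OUTCOME_OK
}

// Hypothetical helper: walks the parts once and collects code, output, and text.
function summarizeCodeExecution(parts: CodeExecutionPart[]): CodeExecutionSummary {
  const summary: CodeExecutionSummary = { text: '', code: [], outputs: [], failed: false };
  for (const part of parts) {
    if (part.text) summary.text += part.text;
    if (part.executableCode) summary.code.push(part.executableCode.code);
    if (part.codeExecutionResult) {
      summary.outputs.push(part.codeExecutionResult.output ?? '');
      if (part.codeExecutionResult.outcome !== 'OUTCOME_OK') summary.failed = true;
    }
  }
  return summary;
}
```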
When to Use Code Execution:
Limitations:
Best Practices:
- Check the outcome field for errors
Important:
Grounding connects the model to real-time web information, reducing hallucinations and providing up-to-date, fact-checked responses with citations.
Google Search (googleSearch) - Recommended for Gemini 2.5
const groundingTool = {
googleSearch: {}
};
Features:
const fileSearchTool = {
fileSearch: {
fileSearchStoreId: 'store-id-here' // Created via FileSearchStore APIs
}
};
Features:
Note: See FileSearch documentation for store creation and management.
Google Search Retrieval (googleSearchRetrieval) - Legacy (Gemini 1.5)
const retrievalTool = {
googleSearchRetrieval: {
dynamicRetrievalConfig: {
mode: 'MODE_DYNAMIC',
dynamicThreshold: 0.7 // Only search if confidence < 70%
}
}
};
Features:
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Who won the euro 2024?',
config: {
tools: [{ googleSearch: {} }]
}
});
console.log(response.text);
// Check if grounding was used
if (response.candidates[0].groundingMetadata) {
console.log('Search was performed!');
console.log('Sources:', response.candidates[0].groundingMetadata);
}
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{ parts: [{ text: 'Who won the euro 2024?' }] }
],
tools: [
{ google_search: {} }
]
}),
}
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
if (data.candidates[0].groundingMetadata) {
console.log('Grounding metadata:', data.candidates[0].groundingMetadata);
}
import { GoogleGenAI, DynamicRetrievalConfigMode } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-1.5-flash',
contents: 'Who won the euro 2024?',
config: {
tools: [
{
googleSearchRetrieval: {
dynamicRetrievalConfig: {
mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
dynamicThreshold: 0.7 // Search only if confidence < 70%
}
}
}
]
}
});
console.log(response.text);
if (!response.candidates[0].groundingMetadata) {
console.log('Model answered from its own knowledge (high confidence)');
}
{
groundingMetadata: {
webSearchQueries: ["euro 2024 winner"],
groundingChunks: [
{
web: {
uri: "https://example.com/euro-2024-results",
title: "UEFA Euro 2024 Final Results"
}
}
],
groundingSupports: [
{
segment: {
startIndex: 42,
endIndex: 47
},
groundingChunkIndices: [0] // Indexes into groundingChunks
}
]
}
}
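Source links can be pulled out of the grounding metadata for display as citations. This sketch assumes the metadata exposes a `groundingChunks` array whose entries carry `web.uri` and `web.title`; `listSources` is a hypothetical helper, and the field names should be checked against an actual response.

```typescript
interface GroundingChunk { web?: { uri?: string; title?: string } }
interface GroundingMetadata {
  webSearchQueries?: string[];
  groundingChunks?: GroundingChunk[];
}
interface SourceLink { title: string; uri: string }

// Hypothetical helper: unique web sources in first-seen order.
// Chunks without a URI are skipped; a missing title falls back to the URI.
function listSources(metadata: GroundingMetadata | undefined): SourceLink[] {
  const seen = new Set<string>();
  const sources: SourceLink[] = [];
  for (const chunk of metadata?.groundingChunks ?? []) {
    const uri = chunk.web?.uri;
    if (!uri || seen.has(uri)) continue;
    seen.add(uri);
    sources.push({ title: chunk.web?.title ?? uri, uri });
  }
  return sources;
}
```

Passing `undefined` returns an empty list, which covers responses where no search was performed.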
const chat = await ai.chats.create({
model: 'gemini-2.5-flash',
config: {
tools: [{ googleSearch: {} }]
}
});
let response = await chat.sendMessage({ message: 'What are the latest developments in quantum computing?' });
console.log(response.text);
// Check grounding sources
if (response.candidates[0].groundingMetadata) {
const sources = response.candidates[0].groundingMetadata.groundingChunks || [];
console.log(`Sources used: ${sources.length}`);
sources.forEach(source => {
console.log(`- ${source.web?.title}: ${source.web?.uri}`);
});
}
// Follow-up still has grounding enabled
response = await chat.sendMessage({ message: 'Which company made the biggest breakthrough?' });
console.log(response.text);
const weatherFunction = {
name: 'get_current_weather',
description: 'Get current weather for a location',
parametersJsonSchema: {
type: 'object',
properties: {
location: { type: 'string', description: 'City name' }
},
required: ['location']
}
};
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What is the weather like in the city that won Euro 2024?',
config: {
tools: [
{ googleSearch: {} },
{ functionDeclarations: [weatherFunction] }
]
}
});
// Model will:
// 1. Use Google Search to find Euro 2024 winner
// 2. Call get_current_weather function with the city
// 3. Combine both results in response
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What is 2+2?', // Model knows this without search
config: {
tools: [{ googleSearch: {} }]
}
});
if (!response.candidates[0].groundingMetadata) {
console.log('Model answered from its own knowledge (no search needed)');
} else {
console.log('Search was performed');
}
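The checks above index `candidates[0]` directly, which throws if the response has no candidates (for example, when a prompt is blocked). A defensive version might look like the following sketch; `wasSearchUsed` and the `GroundedResponse` type are hypothetical names covering only the fields this check needs.

```typescript
// Minimal slice of the response shape that this check needs.
type GroundedResponse = {
  candidates?: Array<{ groundingMetadata?: unknown }>;
};

// Hypothetical helper: true when the first candidate carries grounding metadata.
function wasSearchUsed(response: GroundedResponse): boolean {
  return Boolean(response.candidates?.[0]?.groundingMetadata);
}
```

Usage: `console.log(wasSearchUsed(response) ? 'Search was performed' : 'Answered from model knowledge');`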
When to Use Grounding:
When NOT to Use:
Cost Considerations:
- Use dynamicThreshold to control when searches happen (Gemini 1.5)
Important Notes:
Gemini 2.5 vs 1.5:
- Gemini 2.5: googleSearch (simple, recommended)
- Gemini 1.5: googleSearchRetrieval with dynamicThreshold
Best Practices:
- Check groundingMetadata to see if search was used
{
error: {
code: 401,
message: 'API key not valid. Please pass a valid API key.',
status: 'UNAUTHENTICATED'
}
}
Solution: Verify GEMINI_API_KEY environment variable is set correctly.
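Failing fast at startup makes a missing key easier to diagnose than a 401 at request time. This is a minimal sketch; `requireApiKey` is a hypothetical helper, and it takes an env-like object so it works with `process.env` in Node or the `env` binding in a Cloudflare Worker.

```typescript
// Hypothetical helper: read the key from an env-like object and fail fast.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env['GEMINI_API_KEY']?.trim();
  if (!key) {
    throw new Error('GEMINI_API_KEY is not set or is empty');
  }
  return key;
}
```

The result can then be passed to the client constructor, for example `new GoogleGenAI({ apiKey: requireApiKey(process.env) })`.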
{
error: {
code: 429,
message: 'Resource has been exhausted (e.g. check quota).',
status: 'RESOURCE_EXHAUSTED'
}
}
Solution: Implement exponential backoff retry strategy.
{
error: {
code: 404,
message: 'models/gemini-3.0-flash is not found',
status: 'NOT_FOUND'
}
}
Solution: Use correct model names: gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
{
error: {
code: 400,
message: 'Request payload size exceeds the limit',
status: 'INVALID_ARGUMENT'
}
}
Solution: Reduce input size. Gemini 2.5 models support 1,048,576 input tokens max.
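A pre-flight check can catch oversized prompts before the request is sent. The sketch below assumes the token count has already been obtained, typically from `totalTokens` in the response of `ai.models.countTokens({ model, contents })`; `checkInputBudget` and `BudgetCheck` are hypothetical names.

```typescript
// Input limit for Gemini 2.5 models, as stated above.
const MAX_INPUT_TOKENS = 1_048_576;

type BudgetCheck = { ok: boolean; overBy: number };

// Hypothetical helper: compare a token count against the input limit.
function checkInputBudget(
  totalTokens: number,
  limit: number = MAX_INPUT_TOKENS
): BudgetCheck {
  const overBy = Math.max(0, totalTokens - limit);
  return { ok: overBy === 0, overBy };
}
```

When `ok` is false, `overBy` tells you how many tokens to trim or move into a separate request.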
async function generateWithRetry(request, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await ai.models.generateContent(request);
} catch (error) {
if (error.status === 429 && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
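The retry loop above waits a fixed 1s, 2s, 4s. When many clients hit the limit at once, adding random jitter and a cap helps keep them from retrying in lockstep. The following is a sketch of one common variant ("full jitter"), not something the SDK provides; `backoffDelayMs` and its default values are assumptions.

```typescript
// Hypothetical helper: delay for a 0-based attempt number, capped, with full jitter.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 30_000,
  random: () => number = Math.random
): number {
  // Exponential growth: base, 2x base, 4x base, ... up to the cap.
  const exponential = Math.min(capMs, baseMs * Math.pow(2, attempt));
  // Full jitter: pick uniformly in [0, exponential).
  return Math.floor(random() * exponential);
}
```

In `generateWithRetry`, the fixed delay could be swapped for `backoffDelayMs(i)`. The `random` parameter exists only so the function can be tested deterministically.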
Rate limits vary by model:
Gemini 2.5 Pro:
Gemini 2.5 Flash:
Gemini 2.5 Flash-Lite:
Requires billing account linked to your Google Cloud project.
Gemini 2.5 Pro:
Gemini 2.5 Flash:
Gemini 2.5 Flash-Lite:
Tier 2 (requires $250+ spending and 30-day wait):
Tier 3 (requires $1,000+ spending and 30-day wait):
Tips:
# Remove deprecated SDK
npm uninstall @google/generative-ai
# Install current SDK
npm install @google/genai@1.30.0
Old (DEPRECATED):
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(apiKey);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });
New (CURRENT):
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey });
// Use ai.models.generateContent() directly
Old:
const result = await model.generateContent(prompt);
const response = await result.response;
const text = response.text();
New:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: prompt
});
const text = response.text;
Old:
const result = await model.generateContentStream(prompt);
for await (const chunk of result.stream) {
console.log(chunk.text());
}
New:
const response = await ai.models.generateContentStream({
model: 'gemini-2.5-flash',
contents: prompt
});
for await (const chunk of response) {
console.log(chunk.text);
}
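With the new SDK, `chunk.text` is a property and can be undefined for chunks that carry no text parts. If you need both incremental output and the full text, a small collector can handle that. This is a sketch; `collectStream` and `TextChunk` are hypothetical names, and the function accepts any async iterable of chunks with an optional `text` field.

```typescript
// Minimal chunk shape: text can be undefined for chunks without text parts.
type TextChunk = { text?: string };

// Hypothetical helper: forward each piece to a callback and return the full text.
async function collectStream(
  stream: AsyncIterable<TextChunk>,
  onText: (piece: string) => void = () => {}
): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    const piece = chunk.text ?? '';
    if (piece) {
      onText(piece);
      full += piece;
    }
  }
  return full;
}
```

Usage: `const text = await collectStream(response, piece => console.log(piece));`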
Old:
const chat = model.startChat();
const result = await chat.sendMessage(message);
const response = await result.response;
New:
const chat = ai.chats.create({ model: 'gemini-2.5-flash' });
const response = await chat.sendMessage({ message });
// response.text is directly available
✅ Use @google/genai (NOT @google/generative-ai)
✅ Set maxOutputTokens to prevent excessive generation
✅ Implement rate limit handling with exponential backoff
✅ Use environment variables for API keys (never hardcode)
✅ Validate inputs before sending to API (save costs)
✅ Use streaming for better UX on long responses
✅ Choose the right model based on your needs (Pro for complex reasoning, Flash for balance, Flash-Lite for speed)
✅ Handle errors gracefully with try-catch
✅ Monitor token usage for cost control
✅ Use correct model names: gemini-2.5-pro/flash/flash-lite
❌ Never use @google/generative-ai (deprecated!)
❌ Never hardcode API keys in code
❌ Never claim 2M context for Gemini 2.5 (it's 1,048,576 input tokens)
❌ Never expose API keys in client-side code
❌ Never skip error handling (always try-catch)
❌ Never use generic rate limits (each model has different limits; check official docs)
❌ Never send PII without user consent
❌ Never trust user input without validation
❌ Never ignore rate limits (will get 429 errors)
❌ Never use old model names like gemini-1.5-pro (use 2.5 models)
npm install @google/genai@1.30.0
export GEMINI_API_KEY="..."
- gemini-2.5-pro (1,048,576 in / 65,536 out) - Best for complex reasoning
- gemini-2.5-flash (1,048,576 in / 65,536 out) - Best price-performance balance
- gemini-2.5-flash-lite (1,048,576 in / 65,536 out) - Fastest, most cost-effective
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Your prompt here'
});
console.log(response.text);
const response = await ai.models.generateContentStream({...});
for await (const chunk of response) {
console.log(chunk.text);
}
contents: [
{
parts: [
{ text: 'What is this?' },
{ inlineData: { data: base64Image, mimeType: 'image/jpeg' } }
]
}
]
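The `base64Image` value above has to be produced from the raw file bytes. One portable way to do that, sketched below, uses `btoa`, which is available in browsers, Cloudflare Workers, and recent Node versions. `toInlineDataPart` is a hypothetical helper, and this approach is intended for small inline files rather than large uploads.

```typescript
type InlineDataPart = { inlineData: { data: string; mimeType: string } };

// Hypothetical helper: base64-encode raw bytes into an inlineData part.
function toInlineDataPart(bytes: Uint8Array, mimeType: string): InlineDataPart {
  let binary = '';
  // Build a binary string one byte at a time so btoa can encode it.
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return { inlineData: { data: btoa(binary), mimeType } };
}
```

Usage: `parts: [{ text: 'What is this?' }, toInlineDataPart(imageBytes, 'image/jpeg')]`, where `imageBytes` is a `Uint8Array` you have already read from disk or a request body.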
config: {
tools: [{ functionDeclarations: [...] }]
}
Last Updated: 2025-11-26
Production Validated: All features tested with @google/genai@1.30.0
Phase: 2 Complete ✅ (All Core + Advanced Features)