Migrate from OpenAI/Anthropic/other LLM providers to Cohere, or vice versa. Use when switching LLM providers, migrating embeddings between models, or re-platforming existing AI integrations to Cohere API v2. Trigger with phrases like "migrate to cohere", "switch from openai to cohere", "cohere migration", "replace openai with cohere", "cohere replatform".
npx claudepluginhub flight505/skill-forge --plugin cohere-pack
Comprehensive guide for migrating to Cohere from OpenAI, Anthropic, or other LLM providers, including embedding re-vectorization, prompt adaptation, and gradual traffic shifting.
| From | Complexity | Duration | Key Challenge |
|---|---|---|---|
| OpenAI → Cohere | Medium | 1-2 weeks | Prompt adaptation, embedding migration |
| Anthropic → Cohere | Medium | 1-2 weeks | Message format, tool definitions |
| Custom/OSS → Cohere | Low | Days | SDK integration |
| Embedding migration | High | 2-4 weeks | Re-vectorize entire corpus |
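Before touching call sites, pin down which Cohere model replaces each model you currently use. A small lookup keeps the swap in one place (a sketch; the pairings below are approximate equivalences to verify against current model documentation, not official mappings):

```typescript
// Approximate OpenAI -> Cohere model equivalences used in this guide.
// These pairings are assumptions; verify against current model docs.
const MODEL_MAP: Record<string, string> = {
  'gpt-4o': 'command-a-03-2025',          // flagship chat model
  'gpt-4o-mini': 'command-r7b-12-2024',   // small/fast tier
  'text-embedding-3-large': 'embed-v4.0', // embeddings (dimensions differ; see the embedding section)
};

function toCohereModel(openaiModel: string): string {
  const mapped = MODEL_MAP[openaiModel];
  if (!mapped) throw new Error(`No Cohere mapping for model: ${openaiModel}`);
  return mapped;
}
```

Centralizing the mapping means a rollback only has to flip which adapter is constructed, rather than hunting for model strings scattered through the codebase.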
// --- OpenAI (before) ---
import OpenAI from 'openai';
const openai = new OpenAI();
const response = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{ role: 'system', content: 'You are helpful.' },
{ role: 'user', content: 'Hello' },
],
max_tokens: 500,
temperature: 0.7,
});
const text = response.choices[0].message.content;
// --- Cohere (after) ---
import { CohereClientV2 } from 'cohere-ai';
const cohere = new CohereClientV2(); // reads CO_API_KEY from the environment
const response = await cohere.chat({
model: 'command-a-03-2025', // GPT-4o equivalent
messages: [
{ role: 'system', content: 'You are helpful.' }, // Same format!
{ role: 'user', content: 'Hello' },
],
maxTokens: 500, // camelCase, not snake_case
temperature: 0.7,
});
const text = response.message?.content?.[0]?.text; // Different response shape
// OpenAI embeddings: 3072 dims (text-embedding-3-large)
// Cohere embeddings: embed-v4.0 defaults to 1536 dims (Matryoshka: 256/512/1024/1536 via output_dimension)
// IMPORTANT: You CANNOT mix embeddings from different models (or different dimensions) in the same vector DB
// Migration plan:
// 1. Create new vector collection with Cohere dimensions
// 2. Re-embed all documents with Cohere
// 3. Switch queries to new collection
// 4. Delete old collection
async function migrateEmbeddings(
documents: Array<{ id: string; text: string }>,
  batchSize = 96 // Cohere's embed endpoint accepts at most 96 texts per request
) {
const cohere = new CohereClientV2();
let processed = 0;
for (let i = 0; i < documents.length; i += batchSize) {
const batch = documents.slice(i, i + batchSize);
const response = await cohere.embed({
model: 'embed-v4.0',
texts: batch.map(d => d.text),
inputType: 'search_document',
embeddingTypes: ['float'],
});
    // Upsert to the new collection (vectorDB is a placeholder for your vector store client, e.g. Pinecone or Qdrant)
for (let j = 0; j < batch.length; j++) {
await vectorDB.upsert({
collection: 'docs-cohere', // New collection
id: batch[j].id,
vector: response.embeddings.float[j],
metadata: { text: batch[j].text },
});
}
processed += batch.length;
console.log(`Migrated ${processed}/${documents.length} embeddings`);
}
}
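Step 3 of the plan is switching queries over to the new collection. The sketch below hides both clients behind minimal interfaces so it stays self-contained (the interfaces and the `searchCohere` name are illustrative; real code would call the cohere-ai SDK and your vector store directly). The one detail that genuinely matters: documents were embedded with `inputType: 'search_document'`, so queries must use `'search_query'`.

```typescript
// Minimal shapes of the two clients this sketch touches.
interface EmbedClient {
  embed(req: {
    model: string;
    texts: string[];
    inputType: string;
    embeddingTypes: string[];
  }): Promise<{ embeddings: { float: number[][] } }>;
}

interface VectorStore {
  query(req: { collection: string; vector: number[]; topK: number }): Promise<Array<{ id: string; score: number }>>;
}

// Step 3: point queries at the new, Cohere-dimensioned collection.
// Queries use 'search_query', not the 'search_document' used at index time.
async function searchCohere(
  cohere: EmbedClient,
  vectorDB: VectorStore,
  query: string,
  topK = 10
) {
  const { embeddings } = await cohere.embed({
    model: 'embed-v4.0',
    texts: [query],
    inputType: 'search_query', // asymmetric retrieval: NOT 'search_document'
    embeddingTypes: ['float'],
  });
  return vectorDB.query({
    collection: 'docs-cohere', // the new collection created in step 1
    vector: embeddings.float[0],
    topK,
  });
}
```

Once queries read from `docs-cohere` and error rates hold steady, the old collection can be deleted (step 4).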
// --- OpenAI tools ---
const openaiTools = [{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather',
parameters: {
type: 'object',
properties: { city: { type: 'string' } },
required: ['city'],
},
},
}];
// --- Cohere tools (same format in v2!) ---
const cohereTools = [{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather',
parameters: {
type: 'object',
properties: { city: { type: 'string' } },
required: ['city'],
},
},
}];
// Tool definitions are identical! The difference is in response handling.
// OpenAI: response.choices[0].message.tool_calls
// Cohere: response.message?.toolCalls
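Since only the location of tool calls differs, a small normalizer can hide the difference from downstream code (a hypothetical helper; the response interfaces below are reduced to just the fields it reads):

```typescript
// Response shapes simplified to the fields this helper uses.
interface OpenAIChatResponse {
  choices: Array<{ message: { tool_calls?: Array<{ id: string; function: { name: string; arguments: string } }> } }>;
}
interface CohereChatResponse {
  message?: { toolCalls?: Array<{ id?: string; function?: { name?: string; arguments?: string } }> };
}

interface NormalizedToolCall { id: string; name: string; args: unknown }
type RawToolCall = { id?: string; function?: { name?: string; arguments?: string } };

function extractToolCalls(resp: OpenAIChatResponse | CohereChatResponse): NormalizedToolCall[] {
  // OpenAI nests tool calls under choices[0].message.tool_calls;
  // Cohere v2 puts them on message.toolCalls.
  const raw: RawToolCall[] = 'choices' in resp
    ? resp.choices[0]?.message.tool_calls ?? []
    : resp.message?.toolCalls ?? [];
  return raw.map(tc => ({
    id: tc.id ?? '',
    name: tc.function?.name ?? '',
    args: JSON.parse(tc.function?.arguments ?? '{}'), // arguments arrive as a JSON string on both sides
  }));
}
```

With this in place, the tool-execution loop itself does not need to know which provider produced the response.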
// --- OpenAI streaming ---
const openaiStream = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [...],
stream: true,
});
for await (const chunk of openaiStream) {
process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
// --- Cohere streaming ---
const cohereStream = await cohere.chatStream({
model: 'command-a-03-2025',
messages: [...],
});
for await (const event of cohereStream) {
if (event.type === 'content-delta') {
process.stdout.write(event.delta?.message?.content?.text ?? '');
}
}
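The two loops differ only in event shape, so during the migration window a normalizing async generator (a hypothetical helper, not part of either SDK) can serve both:

```typescript
// Yields plain text deltas from either provider's stream events.
// Accepts a plain Iterable too, so it can be exercised without a live stream.
async function* textDeltas(
  stream: AsyncIterable<any> | Iterable<any>,
  provider: 'openai' | 'cohere'
): AsyncGenerator<string> {
  for await (const event of stream) {
    const text = provider === 'openai'
      ? event.choices?.[0]?.delta?.content          // OpenAI delta shape
      : event.type === 'content-delta'
        ? event.delta?.message?.content?.text       // Cohere v2 content-delta shape
        : undefined;                                // skip message-start/end etc.
    if (text) yield text;
  }
}
```

Usage: `for await (const t of textDeltas(cohereStream, 'cohere')) process.stdout.write(t);`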
interface LLMAdapter {
chat(message: string, options?: { system?: string; maxTokens?: number }): Promise<string>;
embed(texts: string[]): Promise<number[][]>;
rerank(query: string, docs: string[], topN?: number): Promise<Array<{ index: number; score: number }>>;
}
class CohereAdapter implements LLMAdapter {
private client = new CohereClientV2();
async chat(message: string, options?: { system?: string; maxTokens?: number }): Promise<string> {
const messages: any[] = [];
if (options?.system) messages.push({ role: 'system', content: options.system });
messages.push({ role: 'user', content: message });
const response = await this.client.chat({
model: 'command-a-03-2025',
messages,
maxTokens: options?.maxTokens,
});
return response.message?.content?.[0]?.text ?? '';
}
async embed(texts: string[]): Promise<number[][]> {
const response = await this.client.embed({
model: 'embed-v4.0',
texts,
inputType: 'search_document',
embeddingTypes: ['float'],
});
return response.embeddings.float;
}
async rerank(query: string, docs: string[], topN = 5): Promise<Array<{ index: number; score: number }>> {
const response = await this.client.rerank({
model: 'rerank-v3.5',
query,
documents: docs,
topN,
});
return response.results.map(r => ({ index: r.index, score: r.relevanceScore }));
}
}
class OpenAIAdapter implements LLMAdapter {
// ... OpenAI implementation
}
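One way to fill in that stub (a sketch: the OpenAI client is injected behind a minimal interface so the adapter can be unit-tested without a network call, and the `implements` clause is dropped to keep the snippet self-contained; in production pass `new OpenAI()` from the 'openai' package):

```typescript
// Minimal shape of the OpenAI client methods this adapter uses.
interface OpenAIClientLike {
  chat: { completions: { create(req: any): Promise<{ choices: Array<{ message: { content: string | null } }> }> } };
  embeddings: { create(req: { model: string; input: string[] }): Promise<{ data: Array<{ embedding: number[] }> }> };
}

class OpenAIAdapter {
  constructor(private client: OpenAIClientLike) {}

  async chat(message: string, options?: { system?: string; maxTokens?: number }): Promise<string> {
    const messages: Array<{ role: string; content: string }> = [];
    if (options?.system) messages.push({ role: 'system', content: options.system });
    messages.push({ role: 'user', content: message });
    const response = await this.client.chat.completions.create({
      model: 'gpt-4o',
      messages,
      max_tokens: options?.maxTokens, // snake_case on the OpenAI side
    });
    return response.choices[0]?.message.content ?? '';
  }

  async embed(texts: string[]): Promise<number[][]> {
    const response = await this.client.embeddings.create({
      model: 'text-embedding-3-large',
      input: texts,
    });
    return response.data.map(d => d.embedding);
  }

  async rerank(_query: string, _docs: string[], _topN = 5): Promise<Array<{ index: number; score: number }>> {
    // OpenAI has no native rerank endpoint; either throw, or fall back
    // to embedding-similarity scoring if you need parity during migration.
    throw new Error('rerank is not supported by the OpenAI adapter');
  }
}
```

Injecting the client is also what makes the traffic-splitting and comparison code below testable in isolation.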
// Traffic splitting via feature flag (getFeatureFlag is a placeholder for your flag/config service)
function getLLMAdapter(): LLMAdapter {
const coherePercentage = getFeatureFlag('cohere_migration_pct'); // 0-100
if (Math.random() * 100 < coherePercentage) {
return new CohereAdapter();
}
return new OpenAIAdapter();
}
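One caveat with `Math.random()`: every request is an independent coin flip, so a single user can bounce between providers mid-session. A sticky variant hashes a stable user id into a 0-99 bucket instead (a sketch; FNV-1a is used here purely as a cheap, dependency-free hash):

```typescript
// Map a stable user id to a deterministic 0-99 bucket (FNV-1a, 32-bit).
function bucketOf(userId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < userId.length; i++) {
    hash ^= userId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit int
  }
  return hash % 100;
}

// Same user always gets the same provider at a given rollout percentage,
// and raising the percentage only ever moves users toward Cohere.
function useCohereFor(userId: string, coherePercentage: number): boolean {
  return bucketOf(userId) < coherePercentage;
}
```

This also makes incidents easier to debug: a user's provider assignment is reproducible from their id and the flag value at the time.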
// Assumes openaiAdapter and cohereAdapter are instances of the adapters above
async function compareOutputs(message: string): Promise<{
openai: string;
cohere: string;
latencyMs: { openai: number; cohere: number };
}> {
const startOpenAI = Date.now();
const openaiResult = await openaiAdapter.chat(message);
const openaiLatency = Date.now() - startOpenAI;
const startCohere = Date.now();
const cohereResult = await cohereAdapter.chat(message);
const cohereLatency = Date.now() - startCohere;
return {
openai: openaiResult,
cohere: cohereResult,
latencyMs: { openai: openaiLatency, cohere: cohereLatency },
};
}
// Run comparison on sample queries during migration
const testQueries = ['Summarize this text', 'Translate to French', 'Extract key points'];
for (const q of testQueries) {
const result = await compareOutputs(q);
console.log(`Query: ${q}`);
console.log(`OpenAI (${result.latencyMs.openai}ms): ${result.openai.slice(0, 100)}`);
console.log(`Cohere (${result.latencyMs.cohere}ms): ${result.cohere.slice(0, 100)}`);
}
| Feature | Cohere | OpenAI |
|---|---|---|
| Built-in Rerank | cohere.rerank() | Not available |
| RAG with citations | documents param + citations | Manual implementation |
| Connectors (data sources) | connectors param | Not available |
| Classify endpoint | cohere.classify() | Not available |
| Safety modes | safetyMode param | Moderation API (separate) |
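The "RAG with citations" row is worth a concrete sketch: Cohere's chat accepts a `documents` param and returns grounded citations on the message. Field names below follow the v2 SDK as used elsewhere in this guide, but the client is stubbed behind an interface (hypothetical) so the snippet stays self-contained:

```typescript
// Minimal shape of the chat call this sketch uses.
interface CitingChatClient {
  chat(req: {
    model: string;
    messages: Array<{ role: string; content: string }>;
    documents: Array<{ id: string; data: { text: string } }>;
  }): Promise<{
    message?: {
      content?: Array<{ text?: string }>;
      citations?: Array<{ start?: number; end?: number; sources?: Array<{ id?: string }> }>;
    };
  }>;
}

// Pass source documents directly; the response carries character-offset
// citations pointing back at document ids -- no manual citation plumbing.
async function askWithCitations(
  cohere: CitingChatClient,
  question: string,
  docs: Array<{ id: string; text: string }>
) {
  const response = await cohere.chat({
    model: 'command-a-03-2025',
    messages: [{ role: 'user', content: question }],
    documents: docs.map(d => ({ id: d.id, data: { text: d.text } })),
  });
  return {
    answer: response.message?.content?.[0]?.text ?? '',
    citations: response.message?.citations ?? [],
  };
}
```

Replicating this on OpenAI means stuffing documents into the prompt and parsing citations out of free text yourself, which is one of the stronger arguments for this migration in RAG-heavy workloads.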
# Set feature flag to 0% Cohere traffic (flag-service URL is illustrative)
curl -X POST https://flagservice/flags/cohere_migration_pct -d '{"value": 0}'
# Verify traffic is back on old provider
# Monitor error rates for 15 minutes
# If stable, migration is paused safely
| Issue | Cause | Solution |
|---|---|---|
| Embedding dimension mismatch | Mixed providers in same DB | Separate collections per provider |
| Response shape different | Provider-specific format | Use adapter pattern |
| Higher latency on Cohere | Different model size | Try command-r7b for speed |
| Quality difference | Different model strengths | Tune system prompts per provider |
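The first row of the table can be enforced mechanically rather than discovered at query time. A guard like this turns a silent corruption into a loud error at upsert (a sketch; collection names and dimensions are illustrative, so match them to the model and `output_dimension` you actually embed with):

```typescript
// Expected vector dimension per collection. Example values only:
// set these to match the embedding model/output_dimension you actually use.
const COLLECTION_DIMS: Record<string, number> = {
  'docs-openai': 3072, // text-embedding-3-large
  'docs-cohere': 1024, // e.g. embed-v4.0 with output_dimension set to 1024
};

function assertDimension(collection: string, vector: number[]): void {
  const expected = COLLECTION_DIMS[collection];
  if (expected === undefined) throw new Error(`Unknown collection: ${collection}`);
  if (vector.length !== expected) {
    throw new Error(
      `Dimension mismatch for ${collection}: got ${vector.length}, expected ${expected}`
    );
  }
}
```

Calling this inside the upsert loop of `migrateEmbeddings` catches a mixed-provider write immediately instead of poisoning the collection.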
For Cohere-specific architecture patterns, see cohere-reference-architecture.