From cohere-pack
Implements RAG pipelines with Cohere Embed for vectorization, Rerank for relevance, and Chat for grounded answers in TypeScript/Node.js. For retrieval-augmented generation and search-enhanced Q&A.
`npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin cohere-pack`
End-to-end Retrieval-Augmented Generation using Cohere's three core endpoints: Embed (vectorize), Rerank (sort by relevance), Chat (generate grounded answer with citations).
Requires the cohere-install-auth setup and the `cohere-ai` package installed.

```typescript
import { CohereClientV2 } from 'cohere-ai';

const cohere = new CohereClientV2();

// Your knowledge base
const documents = [
  { id: 'doc1', text: 'Cohere Command A has 256K context and supports tool use.' },
  { id: 'doc2', text: 'Embed v4 generates 1024-dim vectors with 128K token context.' },
  { id: 'doc3', text: 'Rerank v3.5 scores relevance from 0 to 1 across 100+ languages.' },
  { id: 'doc4', text: 'The Chat API v2 requires model as a mandatory parameter.' },
  { id: 'doc5', text: 'Cohere supports structured JSON output via response_format.' },
];

// Embed documents for storage
const docEmbeddings = await cohere.embed({
  model: 'embed-v4.0',
  texts: documents.map(d => d.text),
  inputType: 'search_document',
  embeddingTypes: ['float'],
});

// Store vectors alongside document text in your vector DB
const vectors = docEmbeddings.embeddings.float;
console.log(`Embedded ${vectors.length} docs, ${vectors[0].length} dimensions each`);
```
```typescript
async function searchDocuments(query: string, topK = 10) {
  // Embed the query (note: inputType is 'search_query', not 'search_document')
  const queryEmbedding = await cohere.embed({
    model: 'embed-v4.0',
    texts: [query],
    inputType: 'search_query',
    embeddingTypes: ['float'],
  });
  const queryVector = queryEmbedding.embeddings.float[0];

  // Cosine similarity search (replace with your vector DB query)
  const scores = vectors.map((vec, i) => ({
    index: i,
    score: cosineSimilarity(queryVector, vec),
  }));

  return scores
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(s => documents[s.index]);
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}
```
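As a quick sanity check on the similarity math (hypothetical vectors, not Cohere output): parallel vectors score 1 and orthogonal vectors score 0.

```typescript
// Same cosine similarity as above, repeated so this snippet runs standalone.
function cosine(a: number[], b: number[]): number {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

console.log(cosine([1, 2], [2, 4])); // parallel vectors → 1
console.log(cosine([1, 0], [0, 1])); // orthogonal vectors → 0
```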
```typescript
async function rerankResults(query: string, candidates: typeof documents) {
  const response = await cohere.rerank({
    model: 'rerank-v3.5',
    query,
    documents: candidates.map(d => d.text),
    topN: 3,
  });
  return response.results.map(r => ({
    ...candidates[r.index],
    relevanceScore: r.relevanceScore,
  }));
}
```
```typescript
async function ragAnswer(query: string) {
  // 1. Retrieve
  const candidates = await searchDocuments(query);

  // 2. Rerank
  const topDocs = await rerankResults(query, candidates);

  // 3. Generate with inline citations
  const response = await cohere.chat({
    model: 'command-a-03-2025',
    messages: [{ role: 'user', content: query }],
    documents: topDocs.map(d => ({
      id: d.id,
      data: { text: d.text },
    })),
  });

  const answer = response.message?.content?.[0]?.text ?? '';
  const citations = response.message?.citations ?? [];
  return { answer, citations, sources: topDocs };
}

// Usage
const result = await ragAnswer('What context length does Command A support?');
console.log('Answer:', result.answer);
console.log('Citations:', result.citations.length);
```
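The citations Chat returns reference character spans in the answer text. Below is a sketch of rendering them as inline source markers; the `start`/`end`/`sources` shape is an assumption for illustration, so check the response object your SDK version actually returns:

```typescript
// Assumed citation shape (verify against your SDK's response types).
interface Citation {
  start: number;
  end: number;
  sources: { id?: string }[];
}

// Insert [id] markers after each cited span, walking right-to-left
// so earlier character offsets stay valid as text is inserted.
function annotateCitations(answer: string, citations: Citation[]): string {
  let out = answer;
  [...citations]
    .sort((a, b) => b.end - a.end)
    .forEach(c => {
      const ids = c.sources.map(s => s.id ?? '?').join(',');
      out = out.slice(0, c.end) + ` [${ids}]` + out.slice(c.end);
    });
  return out;
}

console.log(
  annotateCitations('Command A has 256K context.', [
    { start: 14, end: 18, sources: [{ id: 'doc1' }] },
  ])
);
// → Command A has 256K [doc1] context.
```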
Minimal variant for small corpora — rerank the knowledge base directly and skip the embed step:

```typescript
import { CohereClientV2 } from 'cohere-ai';

const cohere = new CohereClientV2();

async function rag(query: string, knowledgeBase: string[]) {
  // 1. Rerank the knowledge base directly (skip embed for small corpora)
  const ranked = await cohere.rerank({
    model: 'rerank-v3.5',
    query,
    documents: knowledgeBase,
    topN: 5,
  });

  // 2. Feed top docs to Chat for grounded answer
  const docs = ranked.results.map((r, i) => ({
    id: `doc-${i}`,
    data: { text: knowledgeBase[r.index] },
  }));

  const response = await cohere.chat({
    model: 'command-a-03-2025',
    messages: [{ role: 'user', content: query }],
    documents: docs,
  });

  return response.message?.content?.[0]?.text ?? '';
}
```
| Error | Cause | Solution |
|---|---|---|
| `input_type is required` | Missing embed `inputType` | Use `search_document` or `search_query` |
| `embedding_types required` | Missing for v3+ models | Add `embeddingTypes: ['float']` |
| Empty citations | Docs too short/irrelevant | Improve document quality or chunking |
| `too many documents` | >1000 rerank docs | Batch into groups of 1000 |
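To stay under the rerank document limit, batch client-side and merge results by score. A minimal sketch of the batching step — `batchDocuments` is a hypothetical helper, not part of the SDK:

```typescript
// Split an array into chunks of at most `size` items
// (rerank accepts up to 1000 documents per call).
function batchDocuments<T>(items: T[], size = 1000): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch would then be reranked independently and the per-batch results merged by `relevanceScore` before taking the global top N.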
For tool-use and agent workflows, see cohere-core-workflow-b.