Embedding model configurations and cost calculators
Selects and configures embedding models for RAG pipelines, calculating costs for OpenAI, Cohere, and HuggingFace models. Provides setup scripts and optimization strategies for balancing performance, cost, and privacy requirements.
Install via the plugin marketplace:

```
/plugin marketplace add vanman2024/ai-dev-marketplace
/plugin install rag-pipeline@ai-dev-marketplace
```

This skill is limited to using the following tools:

Bundled files:

- README.md
- examples/batch-embedding-generation.py
- examples/embedding-cache.py
- scripts/calculate-embedding-costs.py
- scripts/setup-cohere-embeddings.sh
- scripts/setup-huggingface-embeddings.sh
- scripts/setup-openai-embeddings.sh
- templates/custom-embedding-model.py
- templates/huggingface-embedding-config.py
- templates/openai-embedding-config.py

Embedding model selection, configuration, and cost optimization for RAG pipelines.
OpenAI Embeddings:

- text-embedding-3-small - 1536 dims, $0.02/1M tokens, balanced performance
- text-embedding-3-large - 3072 dims, $0.13/1M tokens, highest quality
- text-embedding-ada-002 - 1536 dims, $0.10/1M tokens, legacy model

Cohere Embeddings:

- embed-english-v3.0 - 1024 dims, English-optimized retrieval
- embed-english-light-v3.0 - 384 dims, faster/cheaper
- embed-multilingual-v3.0 - 1024 dims, 100+ languages

Sentence Transformers:

- all-MiniLM-L6-v2 - 384 dims, 80MB, fast and efficient
- all-mpnet-base-v2 - 768 dims, 420MB, high quality
- multi-qa-mpnet-base-dot-v1 - 768 dims, optimized for Q&A
- paraphrase-multilingual-mpnet-base-v2 - 768 dims, 50+ languages

Specialized Models:

- BAAI/bge-small-en-v1.5 - 384 dims, SOTA small model
- BAAI/bge-base-en-v1.5 - 768 dims, excellent retrieval
- BAAI/bge-large-en-v1.5 - 1024 dims, top performance
- intfloat/e5-base-v2 - 768 dims, strong general purpose

Use the cost calculator script to estimate embedding costs:
```bash
# Calculate costs for different models and volumes
python scripts/calculate-embedding-costs.py \
  --documents 100000 \
  --avg-tokens 500 \
  --model text-embedding-3-small

# Compare multiple models
python scripts/calculate-embedding-costs.py \
  --documents 100000 \
  --avg-tokens 500 \
  --compare
```
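The arithmetic behind the calculator is simple: total tokens = documents × average tokens per document, billed at the model's per-million-token rate. A minimal sketch of that math (rates taken from the model list above; the function name is illustrative, not the script's actual API):

```python
# Illustrative cost math; prices per 1M tokens from the model list above.
PRICE_PER_1M = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "text-embedding-ada-002": 0.10,
}

def embedding_cost(documents: int, avg_tokens: int, model: str) -> float:
    """Estimated embedding cost in USD for a document collection."""
    total_tokens = documents * avg_tokens
    return total_tokens / 1_000_000 * PRICE_PER_1M[model]

# 100k docs at 500 tokens each = 50M tokens
print(f"${embedding_cost(100_000, 500, 'text-embedding-3-small'):.2f}")  # $1.00
```

At that volume the large model would cost $6.50, a 6.5× premium for the quality gain.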
```bash
bash scripts/setup-openai-embeddings.sh
```

Configures the OpenAI embedding client with API key management and retry logic.

```bash
bash scripts/setup-huggingface-embeddings.sh
```

Downloads and configures sentence-transformers models locally.

```bash
bash scripts/setup-cohere-embeddings.sh
```

Sets up the Cohere embedding client with API credentials.
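The retry logic these setup scripts wire in is typically exponential backoff with jitter on transient errors such as rate limits. A generic sketch of that pattern (delays and the helper name are placeholders; the scripts' actual implementation may differ):

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
) -> T:
    """Retry `fn` with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            # back off 1s, 2s, 4s, ... scaled by base_delay, with jitter
            time.sleep(base_delay * (2 ** attempt + random.random()))
    raise RuntimeError("unreachable")
```

In production you would catch only the provider's rate-limit/timeout exceptions rather than bare `Exception`.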
```python
# templates/openai-embedding-config.py
from openai import OpenAI

client = OpenAI(api_key="your-key")
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=["Your text here"],
)
embeddings = [item.embedding for item in response.data]
```
```python
# templates/huggingface-embedding-config.py
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
embeddings = model.encode(["Your text here"])
```
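Whichever provider produces the vectors, retrieval ranks documents by similarity between the query vector and document vectors, most commonly cosine similarity. A provider-agnostic sketch in pure Python (a real pipeline would vectorize this with NumPy):

```python
import math
from typing import List

def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 (identical direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```

Note that some models (e.g. multi-qa-mpnet-base-dot-v1) are trained for dot-product rather than cosine scoring; match the metric to the model.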
```python
# templates/custom-embedding-model.py
# Wrapper for any embedding model with consistent interface
```
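The point of the wrapper template is to put every provider behind one interface so the rest of the pipeline never cares which model is in use. A minimal sketch of that idea (class and method names are illustrative, not the template's actual contents):

```python
from typing import Callable, List, Protocol

class Embedder(Protocol):
    """Common interface: a batch of texts in, a batch of vectors out."""
    def embed(self, texts: List[str]) -> List[List[float]]: ...

class CallableEmbedder:
    """Adapts any batch-embedding function to the common interface."""

    def __init__(self, fn: Callable[[List[str]], List[List[float]]]):
        self._fn = fn

    def embed(self, texts: List[str]) -> List[List[float]]:
        return self._fn(texts)

# Example: wrap a toy model that "embeds" by character count.
toy = CallableEmbedder(lambda texts: [[float(len(t))] for t in texts])
print(toy.embed(["hi", "hello"]))  # [[2.0], [5.0]]
```

The same adapter would wrap `model.encode` from sentence-transformers or a lambda around the OpenAI client, letting you swap providers without touching downstream code.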
Cost Optimization:

- Batch documents per request to cut per-call API overhead (see examples/batch-embedding-generation.py)
- Cache embeddings so unchanged text is never re-embedded (see examples/embedding-cache.py)
- Move high-volume workloads to free local models (sentence-transformers, BGE)

Performance Optimization:

- Prefer smaller-dimension models (384 dims) where quality allows; they embed and search faster
- Keep local models loaded in memory between batches instead of reloading per call
- Match the model to the task (e.g. multi-qa-mpnet-base-dot-v1 for Q&A retrieval)
| Model | Dimensions | Size | Speed | Quality | Cost |
|---|---|---|---|---|---|
| text-embedding-3-small | 1536 | API | Fast | Good | $0.02/1M |
| text-embedding-3-large | 3072 | API | Medium | Excellent | $0.13/1M |
| all-MiniLM-L6-v2 | 384 | 80MB | Very Fast | Good | Free |
| all-mpnet-base-v2 | 768 | 420MB | Fast | Excellent | Free |
| bge-base-en-v1.5 | 768 | 420MB | Fast | Excellent | Free |
| embed-english-v3.0 | 1024 | API | Fast | Excellent | $0.10/1M |
Batch Embedding Generation:

```python
# examples/batch-embedding-generation.py
# Process large document collections efficiently
```
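The core pattern in batch generation is chunking the collection and embedding one chunk per call, rather than one request per document. A sketch of that loop (the chunk size and the `embed_fn` parameter are placeholders, not the example file's actual code):

```python
from typing import Callable, Iterator, List

def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield successive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def embed_all(
    texts: List[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
    batch_size: int = 96,  # many APIs cap the number of inputs per request
) -> List[List[float]]:
    """Embed a collection with one provider call per chunk."""
    vectors: List[List[float]] = []
    for chunk in batched(texts, batch_size):
        vectors.extend(embed_fn(chunk))
    return vectors
```

For 100k documents this turns 100,000 round trips into roughly a thousand, which is where most of the wall-clock savings come from.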
Embedding Cache:

```python
# examples/embedding-cache.py
# Cache embeddings to avoid redundant API calls
```
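A cache keys on a hash of the text so repeated or unchanged documents never hit the API twice. A minimal in-memory sketch (a real cache would persist to disk or a database; class and method names are illustrative):

```python
import hashlib
from typing import Callable, Dict, List

class EmbeddingCache:
    """Memoizes embeddings keyed by SHA-256 of the input text."""

    def __init__(self, embed_fn: Callable[[List[str]], List[List[float]]]):
        self._embed_fn = embed_fn
        self._store: Dict[str, List[float]] = {}

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def embed(self, texts: List[str]) -> List[List[float]]:
        misses = [t for t in texts if self._key(t) not in self._store]
        if misses:  # only uncached texts reach the provider
            for text, vec in zip(misses, self._embed_fn(misses)):
                self._store[self._key(text)] = vec
        return [self._store[self._key(t)] for t in texts]
```

On a re-ingestion run where most documents are unchanged, this reduces API spend to only the delta.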
Use OpenAI when:

- You want high quality with no infrastructure to manage
- Volume is moderate and the $0.02/1M rate of text-embedding-3-small fits the budget

Use Cohere when:

- You need broad multilingual coverage (embed-multilingual-v3.0 supports 100+ languages)
- You want a lighter, cheaper hosted option (embed-english-light-v3.0)

Use HuggingFace/Local when:

- Privacy requires documents to stay on your own hardware
- Volume is high and free local models eliminate per-token costs
- You can run sentence-transformers or BGE models on your own infrastructure