This skill should be used when the user wants to generate text embeddings, rerank a list of candidates by relevance, tokenize text, or inspect the loaded model on their Hugging Face Text Embeddings Inference (TEI) server. Triggers include: "embed this text", "convert to a vector", "rerank these results", "what model is my TEI server running", "tokenize this sentence", "check TEI health".
Slash command: /tei:tei
Hugging Face Text Embeddings Inference server — embed text, rerank candidates, tokenize. Talk to it directly over its HTTP API.
Read the base URL from ~/.lab/.env, then curl the TEI API:
TEI_URL=$(grep -E '^TEI_URL=' ~/.lab/.env | cut -d= -f2-)
TEI runs unauthenticated by default. If your deployment is behind auth, add the appropriate header.
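The one-liner above returns the value exactly as written in the file, so a quoted value or a trailing slash would leak into every request URL. A minimal sketch of a more tolerant reader follows; `read_tei_url` is a hypothetical helper name, and the optional file argument is an assumption added for illustration.

```shell
# Sketch: read TEI_URL from a dotenv-style file.
# read_tei_url is a hypothetical helper name, not part of TEI.
read_tei_url() {
  local env_file="${1:-$HOME/.lab/.env}"
  local value
  # Take the last TEI_URL= line so a later override in the file wins.
  value=$(grep -E '^TEI_URL=' "$env_file" | tail -n 1 | cut -d= -f2-)
  # Strip one pair of surrounding double or single quotes, if present.
  value=${value%\"}
  value=${value#\"}
  value=${value%\'}
  value=${value#\'}
  # Drop a trailing slash so "$TEI_URL/health" does not become "//health".
  value=${value%/}
  if [ -z "$value" ]; then
    echo "TEI_URL not set in $env_file" >&2
    return 1
  fi
  printf '%s\n' "$value"
}
```

Usage: `TEI_URL=$(read_tei_url) || exit 1`, after which the requests in the table work unchanged.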
| Intent | Request |
|---|---|
| Health | curl -sS "$TEI_URL/health" -w '\nHTTP %{http_code}\n' |
| Loaded model / runtime info | curl -sS "$TEI_URL/info" |
| Embed text | curl -sS -X POST "$TEI_URL/embed" -H 'Content-Type: application/json' -d '{"inputs":"hello world"}' |
| Embed (batch) | curl -sS -X POST "$TEI_URL/embed" -H 'Content-Type: application/json' -d '{"inputs":["a","b"]}' |
| Sparse embeddings (SPLADE) | curl -sS -X POST "$TEI_URL/embed_sparse" -H 'Content-Type: application/json' -d '{"inputs":"hello"}' |
| Rerank against a query | curl -sS -X POST "$TEI_URL/rerank" -H 'Content-Type: application/json' -d '{"query":"fruit","texts":["apple","car"]}' |
| Tokenize | curl -sS -X POST "$TEI_URL/tokenize" -H 'Content-Type: application/json' -d '{"inputs":"hello world"}' |
| OpenAI-compatible embeddings | curl -sS -X POST "$TEI_URL/v1/embeddings" -H 'Content-Type: application/json' -d '{"input":"hello","model":"tei"}' |
/embed and /rerank depend on the loaded model. An embedding model serves /embed, and calling /rerank against it returns a 424 "model is not a re-ranker" error; a reranker model serves /rerank. Check /info to see which is loaded. /rerank accepts at most 100 texts per call, so split larger batches across requests.
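Splitting a large candidate list across requests could look like the sketch below. It assumes `jq` is installed, that the candidates sit one per line in a text file, and that `TEI_URL` is already set; `batch_ranges` and `rerank_file` are hypothetical helper names. The default batch size of 100 is taken from the limit stated above and can be overridden if your server is configured differently.

```shell
# Sketch only: batch_ranges and rerank_file are hypothetical helpers.
# Assumes jq is installed and TEI_URL is set in the environment.

# Print "start end" pairs (1-based, inclusive) covering `total` items.
batch_ranges() {
  local total="$1" size="${2:-100}"
  local start=1 end
  while [ "$start" -le "$total" ]; do
    end=$((start + size - 1))
    if [ "$end" -gt "$total" ]; then end=$total; fi
    echo "$start $end"
    start=$((end + 1))
  done
}

# Rerank every line of a file against a query, one request per batch.
# Emits one JSON object per text; "index" is rewritten from the
# batch-relative position to the 0-based line position in the file.
rerank_file() {
  local query="$1" file="$2" size="${3:-100}"
  local total
  total=$(grep -c '' "$file")   # line count, even without a final newline
  batch_ranges "$total" "$size" | while read -r start end; do
    sed -n "${start},${end}p" "$file" |
      jq -R . | jq -s --arg q "$query" '{query: $q, texts: .}' |
      curl -sS -X POST "$TEI_URL/rerank" \
        -H 'Content-Type: application/json' --data-binary @- |
      jq -c --argjson off "$((start - 1))" '.[] | .index += $off'
  done
}
```

Usage: `rerank_file "fruit" texts.txt | jq -s 'sort_by(-.score)'` merges the per-batch results into one ranking. Each request returns indices relative to its own batch, which is why the offset is added before merging.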
Full API reference: https://huggingface.github.io/text-embeddings-inference/
TEI_URL lives in ~/.lab/.env. Verify connectivity:
curl -sS "$TEI_URL/health" -w '\nHTTP %{http_code}\n'
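When the check needs to gate a script instead of being read by eye, the same request can be wrapped so the exit status reflects the HTTP code. This is a minimal sketch: `tei_is_healthy` is a hypothetical helper name, and the 5-second timeout is an arbitrary choice.

```shell
# Sketch: tei_is_healthy is a hypothetical helper, not a TEI command.
# Succeeds only when GET /health answers with HTTP 200.
tei_is_healthy() {
  local code
  # -o /dev/null discards the body; -w prints only the status code.
  # --max-time keeps an unreachable host from hanging the script.
  code=$(curl -sS -o /dev/null --max-time 5 -w '%{http_code}' "$TEI_URL/health") || return 1
  [ "$code" = "200" ]
}
```

Usage: `tei_is_healthy || { echo "TEI not reachable at $TEI_URL" >&2; exit 1; }`.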