Qdrant vector database pattern library — open-source vector DB written in Rust for high throughput and low latency, self-hosted or Qdrant Cloud, collections + points + payloads (metadata) architecture, HNSW indexing with tunable parameters (m, ef_construct, ef), exact vs approximate search, payload indexing for fast filtering (keyword, integer, float, geo, datetime, text), distance metrics (Cosine, Dot, Euclidean, Manhattan), quantization (Scalar / Product / Binary) for 4-32x storage reduction, named vectors for multi-perspective embeddings, sparse vectors for hybrid retrieval, multi-tenancy via payload isolation or separate collections, snapshots for backup, replication factor for HA, sharding for horizontal scale, gRPC + HTTP REST APIs, Qdrant Web UI for visual exploration, integrations with LangChain / LlamaIndex / Haystack, and migration from Pinecone or Weaviate. Use when needing maximum throughput on commodity hardware, self-hosting requirement (data sovereignty, on-prem), tight cost control (open source vs Pinecone managed), or building Rust-native applications. Differentiates from Pinecone by self-hostable + open source + lower cost at scale, and from Weaviate by raw performance + simpler API + no GraphQL overhead.
```bash
npx claudepluginhub arnwaldn/atum-plugins-collection --plugin atum-ai-ml
```

This skill uses the workspace's default tool permissions.
Canonical patterns for using **Qdrant**, particularly for **self-hosting**, **maximum performance**, and **cost control**.
```bash
docker run -p 6333:6333 -p 6334:6334 \
  -v $(pwd)/qdrant_storage:/qdrant/storage \
  qdrant/qdrant:latest
```
```python
import os

from qdrant_client import QdrantClient

client = QdrantClient(
    url="https://your-cluster.qdrant.tech",
    api_key=os.environ["QDRANT_API_KEY"],
)
```
```python
from qdrant_client.models import Distance, VectorParams, HnswConfigDiff

client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(
        size=1536,
        distance=Distance.COSINE,
    ),
    hnsw_config=HnswConfigDiff(
        m=16,
        ef_construct=200,
    ),
    on_disk_payload=True,  # store payloads on disk instead of RAM
)
```
```python
from qdrant_client.models import PointStruct

client.upsert(
    collection_name="documents",
    points=[
        PointStruct(
            id=1,
            vector=embedding_1,
            payload={
                "text": "Comment configurer Postgres",
                "source": "guide.pdf",
                "page": 12,
                "category": "tutorial",
                "language": "fr",
                "published_at": "2026-04-08",
            },
        ),
        # ... more points
    ],
)
```
```python
from qdrant_client.models import Filter, FieldCondition, MatchValue

results = client.search(
    collection_name="documents",
    query_vector=query_embedding,
    query_filter=Filter(
        must=[
            FieldCondition(key="category", match=MatchValue(value="tutorial")),
            FieldCondition(key="language", match=MatchValue(value="fr")),
        ],
    ),
    limit=10,
    with_payload=True,
)

for hit in results:
    print(f"{hit.score:.3f} - {hit.payload['source']}")
```
```python
from qdrant_client.models import PayloadSchemaType

# Keyword index for exact-match filters
client.create_payload_index(
    collection_name="documents",
    field_name="category",
    field_schema=PayloadSchemaType.KEYWORD,
)

# Full-text index for search within the text payload
client.create_payload_index(
    collection_name="documents",
    field_name="text",
    field_schema=PayloadSchemaType.TEXT,
)

# Datetime index for range queries
client.create_payload_index(
    collection_name="documents",
    field_name="published_at",
    field_schema=PayloadSchemaType.DATETIME,
)
```
Without a payload index, filters run in O(N), which is slow on large collections; with an index, O(log N).
```python
from qdrant_client.models import ScalarQuantization, ScalarQuantizationConfig, ScalarType

# Scalar quantization: int8, ~4x storage reduction, ~5% quality loss
client.update_collection(
    collection_name="documents",
    quantization_config=ScalarQuantization(
        scalar=ScalarQuantizationConfig(
            type=ScalarType.INT8,
            always_ram=True,  # keep quantized vectors in RAM for speed
        ),
    ),
)
```
| Quantization | Storage reduction | Quality loss |
|---|---|---|
| Scalar (int8) | 4x | ~5% |
| Product (PQ) | 4-64x | ~10-30% depending on parameters |
| Binary (BQ) | 32x | ~15-30% |

Rule of thumb: default to scalar int8; it is accurate enough for 95% of cases.
```python
from qdrant_client.models import SparseVectorParams, SparseVector

# Collection schema with both dense and sparse vectors
client.recreate_collection(
    collection_name="hybrid_docs",
    vectors_config={
        "dense": VectorParams(size=1536, distance=Distance.COSINE),
    },
    sparse_vectors_config={
        "bm25": SparseVectorParams(),
    },
)
```
```python
# Insert points carrying both vector types
client.upsert(
    collection_name="hybrid_docs",
    points=[
        PointStruct(
            id=1,
            vector={
                "dense": dense_emb,
                "bm25": SparseVector(indices=[10, 45, 234], values=[0.5, 0.8, 0.3]),
            },
            payload={"text": "..."},
        ),
    ],
)
```
```python
from qdrant_client.models import Prefetch, FusionQuery, Fusion

# Query with native Reciprocal Rank Fusion over both branches
results = client.query_points(
    collection_name="hybrid_docs",
    prefetch=[
        Prefetch(query=dense_query, using="dense", limit=20),
        Prefetch(query=SparseVector(indices=qids, values=qvalues), using="bm25", limit=20),
    ],
    query=FusionQuery(fusion=Fusion.RRF),
    limit=10,
)
```
```python
client.upsert(
    collection_name="multi_tenant_docs",
    points=[
        PointStruct(
            id=1,
            vector=embedding,
            payload={"tenant_id": "acme-corp", "text": "..."},
        ),
    ],
)
```
```python
# Query with a strict tenant filter
results = client.search(
    collection_name="multi_tenant_docs",
    query_vector=query_embedding,
    query_filter=Filter(
        must=[FieldCondition(key="tenant_id", match=MatchValue(value="acme-corp"))],
    ),
    limit=10,
)
```
```python
client.create_collection(collection_name=f"docs_{tenant_id}", ...)
```

Safer isolation, but each collection adds per-collection overhead.
```python
# Create a snapshot
snapshot_info = client.create_snapshot(collection_name="documents")
print(snapshot_info.name)

# Restore from a snapshot
client.recover_snapshot(
    collection_name="documents",
    location="https://my-bucket.s3.amazonaws.com/snapshots/documents-2026-04-08.snapshot",
)
```
Anti-patterns:

- `on_disk_payload=False` on a large collection: saturates RAM
- `exact=True` by default: slow (use approximate HNSW search)
- `m` too low (<8): poor recall
- `m` too high (>32): excessive RAM usage
- Runtime `ef` too low (<32): poor recall

Related skills:

- rag-architect (this plugin)
- pinecone-patterns (this plugin)
- weaviate-patterns (this plugin)
- supabase-patterns (atum-stack-backend)