pgvector vs Qdrant vs Weaviate: Which Vector Database Holds Up at 100M Vectors?

Most "vector database benchmark" articles top out at 1 million vectors and call it a day. That's fine for a side project. It's not what production RAG looks like. A real production index for an enterprise documentation corpus, a content recommendation system, or a customer-support knowledge base routinely sits between 50M and 500M vectors. That's where pgvector, Qdrant, and Weaviate stop behaving like equivalent options.

This article walks through what actually changes when you push these three to 100M vectors. No synthetic micro-benchmarks where everything looks great. Just the structural reasons each one wins or breaks at that scale.

The three contestants in one paragraph each

pgvector is a Postgres extension. You get vector search inside the same database that holds your application data. It has solid HNSW indexing (added in 0.5.0) alongside IVFFlat. Operationally, it's the simplest thing in the world: there's no new database to run, no new SDK to learn, no new on-call rotation. It's just Postgres.

Qdrant is a Rust-built, purpose-designed vector database. It treats vectors as first-class citizens, has excellent payload filtering with attribute indexing, supports on-disk persistence by default, and scales horizontally with sharding. It's what you reach for when you've outgrown bolting vectors onto Postgres.

Weaviate is a Go-based vector database with a GraphQL API, native multi-tenancy, hybrid search (vector + keyword), and a heavier feature surface than Qdrant. Strong fit for content-platform use cases where you need filters, full-text search, and vectors in one query.

The 100M vector scenario

For this comparison, assume:

  • 100 million vectors at 768 dimensions (typical of BERT-base-class embedding models like all-mpnet-base-v2; note that all-MiniLM-L6-v2 actually outputs 384 dimensions, and OpenAI's text-embedding-ada-002 outputs 1536)
  • HNSW indexing with sane parameters (M=16, ef_construction=200)
  • Mixed workload: 1,000 queries per second, ~10K writes per second, occasional bulk reindexing
  • Filtered queries: most production lookups need at least one metadata filter (tenant ID, document type, date range)

At this scale, the things that matter aren't raw query speed in isolation. They're memory footprint, filter performance, indexing time, and what happens when you need to add the 100,000,001st vector.

Where each one wins (and breaks)

pgvector at 100M

Memory math first. HNSW costs roughly dim * 4 bytes for the float32 vector plus about M * 8 bytes of graph links per vector (a lower-bound estimate; real index tuples carry extra overhead). For 100M × 768-dim vectors with M=16, that's around 320 GB just for the index in RAM. The index can spill to disk through shared_buffers and the OS page cache, but query latency degrades fast once the working set no longer fits in memory.
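The back-of-the-envelope math above can be sketched as a quick script. The per-vector link cost is a rough lower bound; pgvector's actual on-disk tuples carry additional per-row overhead, so treat the result as a floor, not a budget:

```python
def hnsw_index_ram_gb(n_vectors: int, dim: int, m: int) -> float:
    """Rough lower-bound RAM estimate for an HNSW index.

    Per vector: dim * 4 bytes for a float32 vector, plus roughly
    m * 8 bytes for graph neighbor links.
    """
    bytes_per_vector = dim * 4 + m * 8
    return n_vectors * bytes_per_vector / 1e9  # decimal GB

# The scenario from this article: 100M x 768-dim vectors, M=16
print(hnsw_index_ram_gb(100_000_000, 768, 16))  # -> 320.0
```

Re-running this with 384-dim vectors (a MiniLM-class model) roughly halves the footprint, which is one of the cheapest scaling levers available.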

Where pgvector wins: if your query patterns are dominated by metadata filters that already match good Postgres indexes, the planner can use B-tree indexes first and HNSW second on a smaller subset. For a multi-tenant SaaS with strong tenant isolation, this is gold: most queries effectively run against 100K-1M vectors per tenant, not the whole index.
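The filter-first pattern can be sketched as a parameterized query like the one below. The table and column names are hypothetical, `<=>` is pgvector's cosine-distance operator, and whether the planner actually takes the B-tree path first depends on filter selectivity and your pgvector version:

```python
def tenant_knn_sql(k: int = 10) -> str:
    """Build a parameterized pgvector query that filters by tenant
    before ranking by vector distance. With a B-tree index on
    tenant_id, Postgres can narrow to one tenant's rows and run
    the vector ranking over that much smaller subset.
    """
    return (
        "SELECT id, content "
        "FROM documents "                        # hypothetical table
        "WHERE tenant_id = %(tenant)s "          # B-tree-indexed filter
        "ORDER BY embedding <=> %(query_vec)s "  # cosine distance
        f"LIMIT {k}"
    )

sql = tenant_knn_sql(5)
```

The values are left as psycopg-style placeholders so the query vector and tenant ID are passed as parameters rather than interpolated into the SQL string.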

Where pgvector breaks: unfiltered nearest-neighbor queries across the full 100M. The HNSW probe cost grows with index size, and Postgres connection pooling becomes the bottleneck before the index does. Also, reindexing a 100M-row table is a multi-hour operation.

Qdrant at 100M

Built for this. Qdrant's storage layer separates payload, vectors, and the HNSW graph, with explicit memory-mapped disk support. You can put vectors on disk and keep only the graph and frequently accessed payload in RAM, which can cut memory needs by roughly 3-5x compared to fully in-memory pgvector.
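As an illustration, a collection configured this way through Qdrant's REST API would use a request body along these lines. This is a sketch against the API as documented; verify the field names against the docs for your Qdrant version:

```python
import json

def create_collection_body(dim: int = 768) -> dict:
    """Request body for PUT /collections/{name}: vectors stored
    on disk (memory-mapped), HNSW built with this article's
    parameters. on_disk=True is what trades RAM for mmap reads."""
    return {
        "vectors": {"size": dim, "distance": "Cosine", "on_disk": True},
        "hnsw_config": {"m": 16, "ef_construct": 200},
    }

body = create_collection_body()
print(json.dumps(body, indent=2))
```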

Where Qdrant wins: filtered search at scale. Qdrant's payload indexing builds dedicated indexes for common filter fields (tenant ID, type, timestamp ranges) that integrate tightly with HNSW traversal. Filter-then-search is fast even on 100M vectors. Horizontal sharding is straightforward when you outgrow a single node.
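A filtered search against such a collection sends a body like the following (the payload field name `tenant_id` is hypothetical; the shape follows Qdrant's search endpoint):

```python
def filtered_search_body(query_vec: list, tenant: str, k: int = 10) -> dict:
    """Body for POST /collections/{name}/points/search. The filter
    is applied during HNSW traversal rather than as a post-filter,
    which is why selective filters stay fast at 100M points."""
    return {
        "vector": query_vec,
        "filter": {
            "must": [
                {"key": "tenant_id", "match": {"value": tenant}}
            ]
        },
        "limit": k,
    }

body = filtered_search_body([0.1, 0.2, 0.3], "acme", 5)
```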

Where Qdrant breaks: if you need full-text search alongside vector search, you'll end up bolting on Elasticsearch or running keyword search elsewhere. Qdrant's hybrid search is improving but not yet a replacement for Weaviate's BM25 integration.

Weaviate at 100M

Hybrid search is the differentiator. Weaviate combines vector similarity with BM25 keyword scoring in a single query, which matters a lot for content platforms where exact-term matches need to outrank semantic-only matches.
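To make the idea concrete, a hybrid query in Weaviate's GraphQL API looks roughly like the sketch below. The class name `Article` and its properties are hypothetical; `alpha` blends the two scores (0 = pure BM25 keyword, 1 = pure vector):

```python
def hybrid_query(text: str, alpha: float = 0.5, k: int = 10) -> str:
    """Build a Weaviate GraphQL hybrid query: one request scores
    results by both BM25 keyword match and vector similarity,
    blended by alpha."""
    return f'''
    {{
      Get {{
        Article(hybrid: {{query: "{text}", alpha: {alpha}}}, limit: {k}) {{
          title
          _additional {{ score }}
        }}
      }}
    }}'''

q = hybrid_query("postgres vacuum tuning", alpha=0.25)
```

Dropping alpha toward 0 is the knob for "exact-term matches must outrank semantic-only matches" described above.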

Where Weaviate wins: multi-tenancy. Native tenant isolation means each tenant's index lives in a separate shard, with independent HNSW graphs. For a 100M-vector multi-tenant app where each tenant has 50K-500K vectors, this is the cleanest model on the list.

Where Weaviate breaks: raw single-tenant query throughput. Weaviate's per-query overhead is higher than Qdrant's because of the GraphQL layer and the hybrid search machinery. If your workload is "give me k nearest neighbors as fast as possible, no filters, no full-text," Qdrant wins on latency.

The comparison

Capability at 100M     | pgvector                       | Qdrant                                | Weaviate
Memory efficiency      | Full index must fit in RAM     | Disk-backed, low RAM                  | Moderate (depends on tenant model)
Filtered queries       | Strong (uses Postgres indexes) | Strongest (payload index integration) | Strong (with tenant isolation)
Hybrid search          | Manual (Postgres FTS)          | Improving                             | Native BM25 + vector
Multi-tenancy          | Schema- or row-level           | Collection per tenant                 | Native tenant isolation
Operational complexity | Lowest (it's Postgres)         | Medium                                | Medium-high
Horizontal sharding    | Citus or partitioning          | Built-in                              | Built-in

Practical recommendations

Pick pgvector when: you already run Postgres, your workload is heavily multi-tenant with small per-tenant vector counts, and operational simplicity beats raw performance. Most SaaS RAG apps fit this profile up to about 50M vectors.

Pick Qdrant when: you need filtered queries at scale, your workload is read-heavy with tight latency budgets, or you've already hit the wall on pgvector. Qdrant is the default "graduated from pgvector" choice in 2026.

Pick Weaviate when: hybrid search (vector + keyword) is core to your product, you need strong multi-tenancy, or your team prefers GraphQL over REST. Common in content platforms and internal knowledge tools.

Run any of them on Elestio

pgvector (via Postgres), Qdrant, and Weaviate are all in Elestio's catalog. Managed Postgres for pgvector, dedicated VMs for Qdrant or Weaviate, automated daily backups, and one-click upgrades. Pick a VM size that matches your scale and skip the operational work.

Thanks for reading ❤️ See you in the next one 👋