Why This Comparison Exists
Choosing a vector database is one of those decisions that feels reversible but is not. Migration costs are real — re-indexing millions of embeddings, rewriting query logic, updating infrastructure. At TwilightCore, we have deployed all three of these solutions across different projects, and the right choice depends entirely on your constraints.
This is not a feature checklist. It is an honest account of what we have experienced running these systems in production with real workloads.
The Contenders
pgvector is a PostgreSQL extension that adds vector similarity search to your existing Postgres database. It is the "no new infrastructure" option.
Pinecone is a fully managed vector database service. You get an API, a dashboard, and zero operational burden.
Qdrant is an open-source vector search engine written in Rust. You can self-host it or use their managed cloud offering.
Benchmark Results
We ran benchmarks on a standardized workload: 1 million vectors at 1536 dimensions (OpenAI text-embedding-3-small output size), measured on comparable hardware.
| Metric | pgvector (HNSW) | Pinecone (s1) | Qdrant (self-hosted) |
|---|---|---|---|
| Index build time | 47 min | N/A (managed) | 12 min |
| Query latency P50 | 8ms | 22ms | 4ms |
| Query latency P99 | 45ms | 89ms | 18ms |
| Recall@10 (ef=128) | 0.97 | 0.98 | 0.99 |
| Memory usage | 4.2 GB | N/A | 3.1 GB |
| Throughput (QPS) | 850 | 400 | 2,200 |
| Filtered query P50 | 15ms | 28ms | 6ms |
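The Recall@10 row compares each engine's approximate top-10 results against exact brute-force search. A minimal sketch of that measurement (pure Python, with hypothetical result lists) looks like:

```python
# Sketch of how Recall@k is typically measured: compare an approximate
# index's top-k result IDs against the exact (brute-force) nearest neighbors.

def recall_at_k(ann_ids: list, exact_ids: list, k: int = 10) -> float:
    """Fraction of the true top-k neighbors that the ANN index returned."""
    return len(set(ann_ids[:k]) & set(exact_ids[:k])) / k

# Example: the index missed one of the ten true neighbors
ann = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d99"]
exact = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
print(recall_at_k(ann, exact))  # 0.9
```

In practice you average this over a few hundred held-out query vectors.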
Benchmarks Are Context-Dependent
These numbers reflect our specific workload, hardware, and tuning. Pinecone's latency includes network round-trip since it is a remote service — for a fair comparison of raw engine performance, subtract ~15ms. Your results will vary based on dimensionality, index parameters, and query patterns.
pgvector: When Your Database Is Enough
Setup
```sql
-- Enable the extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create a table with a vector column
CREATE TABLE documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content TEXT NOT NULL,
    embedding vector(1536),
    metadata JSONB DEFAULT '{}',
    created_at TIMESTAMPTZ DEFAULT now()
);

-- HNSW index — this is where the magic happens
CREATE INDEX ON documents
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 200);

-- Supporting indexes for filtered queries
CREATE INDEX ON documents (created_at);
CREATE INDEX ON documents USING gin (metadata jsonb_path_ops);

-- Query with metadata filtering
SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE metadata @> '{"category": "engineering"}'
  AND created_at > now() - INTERVAL '30 days'
ORDER BY embedding <=> $1::vector
LIMIT 10;
```
Strengths
pgvector's killer advantage is no new infrastructure. If you already run Postgres — and you almost certainly do — you can add vector search without introducing a new database, new backups, new monitoring, or new failure modes. Your vectors live alongside your relational data, which means joins are trivial and transactional consistency is free.
Where It Struggles
pgvector starts to strain above 5-10 million vectors on a single instance. HNSW index builds are slow and memory-intensive. There is no built-in sharding — if you outgrow a single machine, you are on your own with Citus or manual partitioning.
Pinecone: Maximum Convenience
Setup
```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-api-key")

# Create a serverless index
pc.create_index(
    name="documents",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

index = pc.Index("documents")

# Upsert vectors with metadata ("embedding" is a 1536-dim list of floats)
index.upsert(
    vectors=[
        {
            "id": "doc-001",
            "values": embedding,
            "metadata": {
                "category": "engineering",
                "author": "twilightcore-team",
                "word_count": 1500,
            },
        }
    ],
    namespace="blog-posts",
)

# Query with metadata filtering
results = index.query(
    vector=query_embedding,
    top_k=10,
    filter={
        "category": {"$eq": "engineering"},
        "word_count": {"$gte": 500},
    },
    include_metadata=True,
    namespace="blog-posts",
)
```
Strengths
Pinecone is genuinely zero-ops. No capacity planning, no index tuning, no infrastructure management. Their serverless offering scales automatically and you pay per query. For teams without dedicated infrastructure engineers, this is significant.
Namespaces are elegant for multi-tenancy — each customer's vectors are isolated without managing separate indexes.
Where It Struggles
Cost scales faster than alternatives. At 10 million vectors with moderate query volume, we were paying roughly 4x what equivalent self-hosted Qdrant cost us. Network latency is unavoidable — every query is an HTTP round-trip. And vendor lock-in is real; there is no standard vector database migration format.
Qdrant: The Performance Champion
Strengths
Qdrant consistently delivers the best raw performance in our benchmarks. Its Rust implementation and HNSW optimizations produce remarkable throughput. The filtering system is particularly strong — it applies filters during the HNSW traversal rather than post-query, which means filtered queries are nearly as fast as unfiltered ones.
The API is well-designed, with first-class support for batch operations, named vectors (multiple vector types per point), and payload indexing.
Where It Struggles
Self-hosting means you own the operational burden: backups, monitoring, scaling, upgrades. Their managed cloud offering mitigates this but is still newer and less battle-tested than Pinecone's. Documentation, while improving, has gaps in advanced clustering configurations.
Cost Analysis at Scale
Cost is often the deciding factor. Here is what we have seen at the 5 million vector scale with moderate query loads (roughly 100 QPS average).
| Cost Component | pgvector | Pinecone Serverless | Qdrant (self-hosted) | Qdrant Cloud |
|---|---|---|---|---|
| Compute/hosting | $0 (existing DB)* | Included | ~$200/mo (dedicated) | ~$150/mo |
| Storage | ~$10/mo | ~$75/mo | ~$15/mo | ~$40/mo |
| Query costs | $0 | ~$120/mo at 100 QPS | $0 | Included |
| Operational overhead | Low | None | Medium-High | Low |
| Estimated monthly total | ~$10 | ~$195 | ~$215 | ~$190 |
*pgvector's compute cost is effectively zero if your Postgres instance has headroom. If you need to upgrade your instance to handle vector workloads, factor in that cost.
The Hidden Cost: Engineering Time
These numbers do not capture engineering time. Setting up monitoring, writing migration scripts, debugging index performance — these hours add up. Pinecone's higher dollar cost often pays for itself in reduced engineering overhead, especially for smaller teams.
Decision Framework
After deploying all three, we have settled on a simple framework:
Choose pgvector when:
- You have fewer than 5 million vectors
- Your queries always combine vector search with relational filters
- You want zero additional infrastructure
- Latency under 50ms is acceptable
Choose Pinecone when:
- Your team has no dedicated infrastructure capacity
- You need multi-tenant isolation (namespaces)
- You are willing to pay more for operational simplicity
- You need to scale past 10 million vectors without planning
Choose Qdrant when:
- Raw query performance is critical
- You have the engineering capacity for self-hosting (or use their cloud)
- You need advanced features like named vectors or complex filtering
- Cost at scale is a primary concern
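The bullets above can be condensed into a rough decision sketch. The thresholds and inputs come straight from the framework; they are starting points, not hard rules:

```python
def pick_vector_db(n_vectors: int, has_infra_team: bool,
                   needs_relational_joins: bool, perf_critical: bool) -> str:
    """Illustrative encoding of the decision framework above."""
    if n_vectors < 5_000_000 and needs_relational_joins:
        return "pgvector"          # small scale + relational filters
    if not has_infra_team:
        return "pinecone"          # pay for operational simplicity
    if perf_critical or n_vectors >= 10_000_000:
        return "qdrant"            # raw performance / cost at scale
    return "pgvector"              # default: start simple

print(pick_vector_db(1_000_000, True, True, False))     # pgvector
print(pick_vector_db(20_000_000, False, False, False))  # pinecone
print(pick_vector_db(20_000_000, True, False, True))    # qdrant
```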
Migration Strategies
When you do need to migrate — and we have done this twice — the process is:
- Dual-write first. Write new vectors to both old and new systems. Do not try a big-bang migration.
- Backfill in batches. Re-embed and insert historical data in chunks of 10,000. Rate-limit to avoid overwhelming either system.
- Shadow-read. Query both systems and compare results. Log discrepancies. We ran shadow reads for two weeks before cutting over.
- Gradual cutover. Route 10% of read traffic to the new system, then 50%, then 100%. Monitor recall and latency at each stage.
- Decommission after a soak period. Keep the old system running for 30 days after full cutover, just in case.
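The dual-write, shadow-read, and gradual-cutover steps can be sketched as a thin wrapper over two stores. The store interface here is a hypothetical stand-in for whichever SDKs you are migrating between:

```python
import random

class ToyStore:
    """Minimal in-memory stand-in using exact dot-product ranking."""
    def __init__(self):
        self.docs = {}
    def upsert(self, doc_id, vector, metadata):
        self.docs[doc_id] = vector
    def query(self, vector, top_k=10):
        score = lambda v: sum(a * b for a, b in zip(vector, v))
        ranked = sorted(self.docs, key=lambda d: score(self.docs[d]), reverse=True)
        return ranked[:top_k]

class MigratingVectorStore:
    """Dual-writes to both stores; shadow-reads and compares; gradual cutover."""
    def __init__(self, old_store, new_store, read_fraction_new=0.0):
        self.old, self.new = old_store, new_store
        self.read_fraction_new = read_fraction_new  # raise 0.10 -> 0.50 -> 1.0
        self.discrepancies = []

    def upsert(self, doc_id, vector, metadata):
        # Step 1: dual-write every new vector to both systems
        self.old.upsert(doc_id, vector, metadata)
        self.new.upsert(doc_id, vector, metadata)

    def query(self, vector, top_k=10):
        # Step 3: shadow-read — query both, log disagreements for review
        old_ids = self.old.query(vector, top_k)
        new_ids = self.new.query(vector, top_k)
        if old_ids != new_ids:
            self.discrepancies.append((old_ids, new_ids))
        # Step 4: gradual cutover by routing a fraction of reads
        if random.random() < self.read_fraction_new:
            return new_ids
        return old_ids

store = MigratingVectorStore(ToyStore(), ToyStore(), read_fraction_new=0.1)
store.upsert("a", [1.0, 0.0], {})
store.upsert("b", [0.0, 1.0], {})
print(store.query([1.0, 0.0], top_k=1))  # ['a']
```

Backfill (step 2) is just `upsert` in a rate-limited loop over historical data; the discrepancy log is what you watch during the shadow-read weeks.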
What We Use Today
For most projects, we start with pgvector. It eliminates an entire category of operational complexity, and for datasets under a few million vectors with moderate query loads, it performs admirably. When a project outgrows it — and we have a clear signal that it has, not a premature optimization hunch — we migrate to Qdrant.
We reserve Pinecone for client projects where the client's team will own operations after handoff and does not have infrastructure expertise. The managed experience is worth the premium in those cases.
The best vector database is the one that matches your current scale and team capacity. pgvector is not glamorous, but it has saved us from managing additional infrastructure on a dozen projects. When you genuinely need more — and benchmarks on your actual workload prove it — migrate deliberately with dual-writes and shadow reads. The vector database landscape is evolving fast; avoid locking in prematurely.