
Vector Databases Compared: pgvector vs Pinecone vs Qdrant

A practical benchmark of three vector databases across latency, recall, cost, and operational complexity — tested with real production workloads, not synthetic benchmarks.

July 22, 2025 · 11 min read
Vector DB · pgvector · Pinecone · Benchmarks

Why This Comparison Exists

Choosing a vector database is one of those decisions that feels reversible but is not. Migration costs are real — re-indexing millions of embeddings, rewriting query logic, updating infrastructure. At TwilightCore, we have deployed all three of these solutions across different projects, and the right choice depends entirely on your constraints.

This is not a feature checklist. It is an honest account of what we have experienced running these systems in production with real workloads.

The Contenders

pgvector is a PostgreSQL extension that adds vector similarity search to your existing Postgres database. It is the "no new infrastructure" option.

Pinecone is a fully managed vector database service. You get an API, a dashboard, and zero operational burden.

Qdrant is an open-source vector search engine written in Rust. You can self-host it or use their managed cloud offering.

Benchmark Results

We ran benchmarks on a standardized workload: 1 million vectors at 1536 dimensions (OpenAI text-embedding-3-small output size), measured on comparable hardware.

| Metric | pgvector (HNSW) | Pinecone (s1) | Qdrant (self-hosted) |
| --- | --- | --- | --- |
| Index build time | 47 min | N/A (managed) | 12 min |
| Query latency P50 | 8 ms | 22 ms | 4 ms |
| Query latency P99 | 45 ms | 89 ms | 18 ms |
| Recall@10 (ef=128) | 0.97 | 0.98 | 0.99 |
| Memory usage | 4.2 GB | N/A | 3.1 GB |
| Throughput (QPS) | 850 | 400 | 2,200 |
| Filtered query P50 | 15 ms | 28 ms | 6 ms |

Benchmarks Are Context-Dependent

These numbers reflect our specific workload, hardware, and tuning. Pinecone's latency includes network round-trip since it is a remote service — for a fair comparison of raw engine performance, subtract ~15ms. Your results will vary based on dimensionality, index parameters, and query patterns.

pgvector: When Your Database Is Enough

Setup

pgvector_setup.sql
-- Enable the extension
CREATE EXTENSION IF NOT EXISTS vector;
 
-- Create a table with a vector column
CREATE TABLE documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content TEXT NOT NULL,
    embedding vector(1536),
    metadata JSONB DEFAULT '{}',
    created_at TIMESTAMPTZ DEFAULT now()
);
 
-- HNSW index — this is where the magic happens
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 200);
 
-- Composite index for filtered queries
CREATE INDEX ON documents (created_at);
CREATE INDEX ON documents USING gin (metadata jsonb_path_ops);
 
-- Query with metadata filtering
SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE metadata @> '{"category": "engineering"}'
  AND created_at > now() - INTERVAL '30 days'
ORDER BY embedding <=> $1::vector
LIMIT 10;

Strengths

pgvector's killer advantage is no new infrastructure. If you already run Postgres — and you almost certainly do — you can add vector search without introducing a new database, new backups, new monitoring, or new failure modes. Your vectors live alongside your relational data, which means joins are trivial and transactional consistency is free.

Where It Struggles

pgvector starts to strain above 5-10 million vectors on a single instance. HNSW index builds are slow and memory-intensive. There is no built-in sharding — if you outgrow a single machine, you are on your own with Citus or manual partitioning.

Pinecone: Maximum Convenience

Setup

pinecone_setup.py
import os

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])  # avoid hardcoding keys
 
# Create a serverless index
pc.create_index(
    name="documents",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
 
index = pc.Index("documents")
 
# Upsert vectors with metadata; "embedding" is a 1536-dim
# list produced by your embedding model
index.upsert(
    vectors=[
        {
            "id": "doc-001",
            "values": embedding,
            "metadata": {
                "category": "engineering",
                "author": "twilightcore-team",
                "word_count": 1500,
            },
        }
    ],
    namespace="blog-posts",
)
 
# Query with metadata filtering ("query_embedding" comes from the same model)
results = index.query(
    vector=query_embedding,
    top_k=10,
    filter={
        "category": {"$eq": "engineering"},
        "word_count": {"$gte": 500},
    },
    include_metadata=True,
    namespace="blog-posts",
)

Strengths

Pinecone is genuinely zero-ops. No capacity planning, no index tuning, no infrastructure management. Their serverless offering scales automatically and you pay per query. For teams without dedicated infrastructure engineers, this is significant.

Namespaces are elegant for multi-tenancy — each customer's vectors are isolated without managing separate indexes.

Where It Struggles

Cost scales faster than alternatives. At 10 million vectors with moderate query volume, we were paying roughly 4x what equivalent self-hosted Qdrant cost us. Network latency is unavoidable — every query is an HTTP round-trip. And vendor lock-in is real; there is no standard vector database migration format.

Qdrant: The Performance Champion

Strengths

Qdrant consistently delivers the best raw performance in our benchmarks. Its Rust implementation and HNSW optimizations produce remarkable throughput. The filtering system is particularly strong — it applies filters during the HNSW traversal rather than post-query, which means filtered queries are nearly as fast as unfiltered ones.

The API is well-designed, with first-class support for batch operations, named vectors (multiple vector types per point), and payload indexing.

Where It Struggles

Self-hosting means you own the operational burden: backups, monitoring, scaling, upgrades. Their managed cloud offering mitigates this but is still newer and less battle-tested than Pinecone's. Documentation, while improving, has gaps in advanced clustering configurations.

Cost Analysis at Scale

Cost is often the deciding factor. Here is what we have seen at the 5 million vector scale with moderate query loads (roughly 100 QPS average).

| Cost Component | pgvector | Pinecone Serverless | Qdrant (self-hosted) | Qdrant Cloud |
| --- | --- | --- | --- | --- |
| Compute/hosting | $0 (existing DB)* | Included | ~$200/mo (dedicated) | ~$150/mo |
| Storage | ~$10/mo | ~$75/mo | ~$15/mo | ~$40/mo |
| Query costs | $0 | ~$120/mo at 100 QPS | $0 | Included |
| Operational overhead | Low | None | Medium-High | Low |
| Estimated monthly total | ~$10 | ~$195 | ~$215 | ~$190 |

*pgvector's compute cost is effectively zero if your Postgres instance has headroom. If you need to upgrade your instance to handle vector workloads, factor in that cost.
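
The line items above sum to the estimated totals; a trivial sketch makes the arithmetic explicit (the figures are this article's rough estimates, not price-list values):

```python
# Monthly cost estimates (USD/mo) at the 5M-vector, ~100 QPS scale,
# taken from the table above. These are the article's rough numbers.
costs = {
    "pgvector": {"compute": 0, "storage": 10, "queries": 0},
    "pinecone_serverless": {"compute": 0, "storage": 75, "queries": 120},
    "qdrant_self_hosted": {"compute": 200, "storage": 15, "queries": 0},
    "qdrant_cloud": {"compute": 150, "storage": 40, "queries": 0},
}

# Sum each option's line items into a monthly total
totals = {name: sum(items.values()) for name, items in costs.items()}
```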

The Hidden Cost: Engineering Time

These numbers do not capture engineering time. Setting up monitoring, writing migration scripts, debugging index performance — these hours add up. Pinecone's higher dollar cost often pays for itself in reduced engineering overhead, especially for smaller teams.

Decision Framework

After deploying all three, we have settled on a simple framework:

Choose pgvector when:

  • You have fewer than 5 million vectors
  • Your queries always combine vector search with relational filters
  • You want zero additional infrastructure
  • Latency under 50ms is acceptable

Choose Pinecone when:

  • Your team has no dedicated infrastructure capacity
  • You need multi-tenant isolation (namespaces)
  • You are willing to pay more for operational simplicity
  • You need to scale past 10 million vectors without planning

Choose Qdrant when:

  • Raw query performance is critical
  • You have the engineering capacity for self-hosting (or use their cloud)
  • You need advanced features like named vectors or complex filtering
  • Cost at scale is a primary concern
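
The bullets above can be condensed into a first-pass heuristic. This is a deliberately crude sketch of the framework, not a substitute for benchmarking your own workload; the thresholds come straight from the lists above.

```python
def pick_vector_db(
    n_vectors: int,
    needs_relational_joins: bool,
    has_infra_capacity: bool,
    latency_budget_ms: float,
) -> str:
    """First-pass heuristic mirroring the decision bullets above."""
    # pgvector: under ~5M vectors, and either the queries lean on
    # relational joins or a <50ms latency budget is not required
    if n_vectors < 5_000_000 and (needs_relational_joins or latency_budget_ms >= 50):
        return "pgvector"
    # Pinecone: no infrastructure capacity, pay for operational simplicity
    if not has_infra_capacity:
        return "pinecone"
    # Qdrant: raw performance and cost at scale, with ops capacity to match
    return "qdrant"
```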

Migration Strategies

When you do need to migrate — and we have done this twice — the process is:

  1. Dual-write first. Write new vectors to both old and new systems. Do not try a big-bang migration.
  2. Backfill in batches. Re-embed and insert historical data in chunks of 10,000. Rate-limit to avoid overwhelming either system.
  3. Shadow-read. Query both systems and compare results. Log discrepancies. We ran shadow reads for two weeks before cutting over.
  4. Gradual cutover. Route 10% of read traffic to the new system, then 50%, then 100%. Monitor recall and latency at each stage.
  5. Decommission after a soak period. Keep the old system running for 30 days after full cutover, just in case.
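
The dual-write, shadow-read, and gradual-cutover steps can be sketched as a thin wrapper around two store clients. The `InMemoryStore` stand-in and the top-k overlap metric here are illustrative, not any specific library's API:

```python
import random


class InMemoryStore:
    """Stand-in for a real vector store client (pgvector, Pinecone, Qdrant)."""

    def __init__(self):
        self.vectors = {}

    def upsert(self, doc_id, vector):
        self.vectors[doc_id] = vector

    def query(self, vector, top_k=10):
        # Brute-force dot-product ranking, purely for illustration
        def score(doc_id):
            return sum(a * b for a, b in zip(vector, self.vectors[doc_id]))

        return sorted(self.vectors, key=score, reverse=True)[:top_k]


class MigratingStore:
    """Wraps an old and a new store during a migration."""

    def __init__(self, old, new, read_from_new_pct=0.0):
        self.old, self.new = old, new
        self.read_from_new_pct = read_from_new_pct
        self.overlaps = []  # shadow-read overlap log

    def upsert(self, doc_id, vector):
        # Step 1: dual-write every new vector to both systems
        self.old.upsert(doc_id, vector)
        self.new.upsert(doc_id, vector)

    def query(self, vector, top_k=10):
        # Step 3: shadow-read both systems and log top-k overlap
        old_ids = self.old.query(vector, top_k)
        new_ids = self.new.query(vector, top_k)
        self.overlaps.append(len(set(old_ids) & set(new_ids)) / top_k)
        # Step 4: route a growing fraction of reads to the new system
        if random.random() < self.read_from_new_pct:
            return new_ids
        return old_ids
```

Raising `read_from_new_pct` from 0.1 to 0.5 to 1.0 implements the gradual cutover, while the overlap log gives you the discrepancy signal to watch at each stage.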

What We Use Today

For most projects, we start with pgvector. It eliminates an entire category of operational complexity, and for datasets under a few million vectors with moderate query loads, it performs admirably. When a project outgrows it — and we have a clear signal that it has, not a premature optimization hunch — we migrate to Qdrant.

We reserve Pinecone for client projects where the client's team will own operations after handoff and does not have infrastructure expertise. The managed experience is worth the premium in those cases.

Start Boring, Scale Intentionally

The best vector database is the one that matches your current scale and team capacity. pgvector is not glamorous, but it has saved us from managing additional infrastructure on a dozen projects. When you genuinely need more — and benchmarks on your actual workload prove it — migrate deliberately with dual-writes and shadow reads. The vector database landscape is evolving fast; avoid locking in prematurely.


TwilightCore Team

AI & Digital Studio

We build production AI systems and full-stack applications. Writing about the technical decisions, architecture patterns, and engineering practices behind real-world projects.