Why You Need a Vector Database
A vector database is a specialized database that stores text, images, and other data as numerical vectors (embeddings) and searches based on semantic similarity. It’s like finding books in a library not by keyword but by the meaning of their content.
In a RAG (Retrieval-Augmented Generation) pipeline, the vector DB is critical infrastructure. It stores documents as embeddings and quickly retrieves documents similar to user queries to pass as context to the LLM.
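To make the retrieval step concrete, here is a toy in-memory "vector store" in plain Python. The 3-dimensional vectors are hand-made stand-ins for real embeddings; an actual pipeline would use an embedding model and one of the databases below:

```python
# Toy vector store: document ids mapped to (approximately unit-length) embeddings.
# For unit vectors, the dot product equals cosine similarity.
store = {
    "doc1": [0.98, 0.20, 0.00],
    "doc2": [0.00, 0.98, 0.20],
    "doc3": [0.20, 0.00, 0.98],
}

def top_k(query: list[float], k: int = 2) -> list[tuple[str, float]]:
    """Return the k most similar document ids with their similarity scores."""
    scored = [
        (doc_id, sum(q * v for q, v in zip(query, vec)))
        for doc_id, vec in store.items()
    ]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]

# "Embed" the user query (here just a hand-made vector close to doc1)
query_vec = [1.0, 0.1, 0.0]
context = top_k(query_vec, k=2)
print(context[0][0])  # doc1 is the closest match
```

A real vector DB does exactly this, but with approximate nearest-neighbor indexes (such as HNSW) so it stays fast at millions of vectors.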
Major Vector DBs at a Glance
| Feature | Chroma | Pinecone | Weaviate | pgvector |
|---|---|---|---|---|
| Type | Open source | Managed SaaS | Open source + Cloud | PostgreSQL extension |
| Language | Python | - (API-based) | Go | C (PG extension) |
| Hosting | Local/Self-hosted | Cloud only | Local/Cloud | Add to existing PG |
| Index algorithm | HNSW | Proprietary | HNSW, Flat | IVFFlat, HNSW |
| Metadata filtering | Supported | Supported | Supported (GraphQL) | SQL WHERE clause |
| Max vector dimensions | No fixed limit | 20,000 | 65,535 | 2,000 (indexed; vector type supports up to 16,000) |
| Free tier | Completely free | 1 index | Self-hosted free | Free (PG extension) |
Chroma: Best Choice for Local Development
Chroma is the easiest vector DB to get started with. Installation is simple and it integrates naturally with the Python ecosystem.
```python
# pip install chromadb
import chromadb

# Create client with persistent storage
client = chromadb.PersistentClient(path="./chroma_db")

# Create collection (similar concept to a table)
collection = client.get_or_create_collection(
    name="documents",
    metadata={"hnsw:space": "cosine"}  # Use cosine similarity
)

# Add documents (embeddings are auto-generated)
collection.add(
    documents=[
        "Python is a widely used language for data analysis.",
        "JavaScript is the standard for web frontend development.",
        "Rust is a systems language that guarantees memory safety."
    ],
    ids=["doc1", "doc2", "doc3"],
    metadatas=[
        {"category": "data"},
        {"category": "web"},
        {"category": "system"}
    ]
)

# Similarity search
results = collection.query(
    query_texts=["A good programming language for data analysis"],
    n_results=2,  # Top 2 results
    where={"category": "data"}  # Metadata filtering
)

print(results["documents"])
# Output: [['Python is a widely used language for data analysis.']]
```
Chroma pros and cons:
- Pros: Ready to use immediately after install, built-in embedding model, excellent LangChain/LlamaIndex integration
- Cons: Lacking performance validation at large scale, no distributed processing support
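In practice you rarely add whole documents as single entries: long texts embed poorly as one vector, so they are chunked first. A minimal sliding-window chunker (the sizes and overlap are illustrative, not recommendations), whose output could feed `collection.add` with per-chunk ids like `"doc1-0"`, `"doc1-1"`:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "word " * 100  # 500-character stand-in for a real document
chunks = chunk_text(doc, chunk_size=200, overlap=50)
print(len(chunks), len(chunks[0]))  # 4 chunks; the first is 200 characters
```

Production pipelines usually chunk on sentence or token boundaries instead of raw characters, but the windowing idea is the same.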
Pinecone: The Convenience of a Managed Service
Pinecone is a fully managed service that provides vector search without infrastructure management. It automatically handles scaling, backups, and monitoring.
```python
# pip install pinecone
from pinecone import Pinecone, ServerlessSpec

# Initialize client
pc = Pinecone(api_key="your-api-key")

# Create index (serverless mode)
pc.create_index(
    name="my-documents",
    dimension=1536,  # OpenAI embedding dimensions
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)

index = pc.Index("my-documents")

# Upsert vectors (embeddings must be generated separately)
index.upsert(
    vectors=[
        {
            "id": "doc1",
            "values": [0.1, 0.2, ...],  # 1536-dimensional embedding vector
            "metadata": {"category": "ai", "source": "blog"}
        },
        {
            "id": "doc2",
            "values": [0.3, 0.4, ...],
            "metadata": {"category": "web", "source": "docs"}
        }
    ],
    namespace="articles"  # Separate data by namespace
)

# Similarity search
results = index.query(
    vector=[0.1, 0.2, ...],  # Query embedding vector
    top_k=5,
    include_metadata=True,
    filter={"category": {"$eq": "ai"}}  # Metadata filter
)

for match in results["matches"]:
    print(f"ID: {match['id']}, Similarity: {match['score']:.4f}")
# Output: ID: doc1, Similarity: 0.9521
```
Pinecone pros and cons:
- Pros: No infrastructure management, auto-scaling, high availability
- Cons: Vendor lock-in, data stored externally, costs scale with usage
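Because every upsert is a network call, large datasets are typically pushed in batches (around 100 records per request is a common rule of thumb). A small batching helper, independent of the Pinecone client itself; the records here are placeholder payloads shaped like the upsert format above:

```python
from typing import Iterator

def batched(items: list[dict], batch_size: int = 100) -> Iterator[list[dict]]:
    """Yield successive fixed-size batches from a list of vector records."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# 250 fake records shaped like Pinecone upsert payloads
records = [
    {"id": f"doc{i}", "values": [0.0] * 1536, "metadata": {"n": i}}
    for i in range(250)
]

batches = list(batched(records, batch_size=100))
print([len(b) for b in batches])  # [100, 100, 50]
# Real usage: for batch in batched(records): index.upsert(vectors=batch)
```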
Weaviate: The Hybrid Search Powerhouse
Weaviate natively supports hybrid search combining vector search and keyword search. It provides a GraphQL API for complex queries.
```python
# pip install weaviate-client
import weaviate
import weaviate.classes as wvc

# Connect to local Weaviate (running via Docker)
client = weaviate.connect_to_local()

# Create collection
articles = client.collections.create(
    name="Article",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(),
    properties=[
        wvc.config.Property(name="title", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="content", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="category", data_type=wvc.config.DataType.TEXT),
    ]
)

# Insert data (vectorization handled automatically)
articles.data.insert_many([
    {"title": "RAG Guide", "content": "RAG stands for Retrieval-Augmented Generation.", "category": "ai"},
    {"title": "Docker Intro", "content": "Docker is a container technology.", "category": "devops"},
])

# Hybrid search (vector + keyword combined)
response = articles.query.hybrid(
    query="AI search technology",
    alpha=0.7,  # 0 = keyword only, 1 = vector only, 0.7 = vector-weighted
    limit=3,
    return_metadata=wvc.query.MetadataQuery(score=True)
)

for obj in response.objects:
    print(f"{obj.properties['title']} (Score: {obj.metadata.score:.4f})")
# Output: RAG Guide (Score: 0.8934)

client.close()
```
Weaviate pros and cons:
- Pros: Native hybrid search, GraphQL API, modular vectorizers
- Cons: Docker required for local execution, learning curve exists
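To build intuition for the `alpha` parameter, here is a simplified toy illustration of relative-score fusion (one of Weaviate's fusion strategies): each score set is normalized to [0, 1], then blended with `alpha` weighting the vector side. This is a sketch of the idea, not Weaviate's exact implementation:

```python
def normalize(scores: dict[str, float]) -> dict[str, float]:
    """Min-max normalize scores to [0, 1]; a constant score set maps to 0."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {k: (v - lo) / span for k, v in scores.items()}

def hybrid_fuse(vector_scores, keyword_scores, alpha=0.7):
    """alpha=1 -> pure vector search, alpha=0 -> pure keyword search."""
    v, k = normalize(vector_scores), normalize(keyword_scores)
    docs = set(v) | set(k)
    return {d: alpha * v.get(d, 0.0) + (1 - alpha) * k.get(d, 0.0) for d in docs}

vector_scores = {"RAG Guide": 0.93, "Docker Intro": 0.40}   # cosine similarities
keyword_scores = {"RAG Guide": 2.1, "Docker Intro": 3.5}    # e.g. BM25 scores

fused = hybrid_fuse(vector_scores, keyword_scores, alpha=0.7)
print(max(fused, key=fused.get))  # RAG Guide: the vector side dominates at 0.7
```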
pgvector: Adding Vectors to Existing PostgreSQL
If you’re already using PostgreSQL, you can add the pgvector extension for vector search without a separate vector DB.
```sql
-- Install pgvector extension
CREATE EXTENSION vector;

-- Create table with vector column
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT NOT NULL,
    category TEXT,
    embedding vector(1536),  -- 1536-dimensional vector
    created_at TIMESTAMP DEFAULT NOW()
);

-- Create HNSW index (fast approximate nearest neighbor search)
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 200);

-- Insert vector
INSERT INTO documents (title, content, embedding)
VALUES ('RAG Guide', 'RAG stands for Retrieval-Augmented Generation.', '[0.1, 0.2, ...]');

-- Cosine similarity search (top 5)
SELECT title, content,
       1 - (embedding <=> '[0.1, 0.2, ...]'::vector) AS similarity
FROM documents
WHERE category = 'ai'
ORDER BY embedding <=> '[0.1, 0.2, ...]'  -- <=> is the cosine distance operator
LIMIT 5;
-- Result: title: RAG Guide, similarity: 0.9234
```
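The `<=>` operator returns cosine *distance* (1 minus cosine similarity), which is why the SELECT subtracts from 1 to report similarity. A Python equivalent, useful for sanity-checking results against the database (toy vectors, not real embeddings):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """What pgvector's <=> operator computes: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms

a = [0.1, 0.2, 0.3]
b = [0.1, 0.2, 0.3]
c = [-0.3, 0.1, 0.0]

print(cosine_distance(a, b))  # ~0.0 for identical vectors (up to float error)
print(cosine_distance(a, b) < cosine_distance(a, c))  # True: a is closer to b
```

Note that `ORDER BY embedding <=> ...` sorts ascending, so the *smallest* distance (highest similarity) comes first.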
pgvector pros and cons:
- Pros: Leverages existing PostgreSQL infrastructure, SQL integration, transaction support
- Cons: Performance limitations vs dedicated vector DBs, slower on large datasets
Recommendations by Use Case
| Scenario | Recommended DB | Reason |
|---|---|---|
| Prototype/local development | Chroma | Ready instantly, start with 3 lines of code |
| Small production (under 100K vectors) | pgvector | No extra infrastructure, SQL integration |
| Medium production | Weaviate | Hybrid search, self-hosting possible |
| Large-scale SaaS | Pinecone | Auto-scaling, minimal management overhead |
| Need keyword + semantic search | Weaviate | Native hybrid search support |
| Already using PostgreSQL | pgvector | No additional infrastructure cost |
Cost Comparison (Monthly, Based on 1M Vectors)
| DB | Estimated Cost | Notes |
|---|---|---|
| Chroma | Free (self-hosted server costs) | Only server costs apply |
| Pinecone | $70–$200 | Serverless pricing, varies by query volume |
| Weaviate Cloud | $25–$100 | Varies by node size |
| pgvector | Free (existing PG costs) | May need additional memory/CPU |
Practical Tips
- Start with Chroma: Quickly validate during the prototype stage, then migrate to a DB that fits your requirements.
- Embedding model choice matters more: The quality of the embedding model has a far greater impact on search accuracy than differences between vector DBs.
- Actively use metadata filtering: Vector similarity alone is often insufficient. Combining metadata filters for date, category, etc. significantly improves search quality.
- Consider hybrid search: Proper nouns, code, and abbreviations are often more accurately found with keyword search. Consider Weaviate’s hybrid search or combining a separate BM25 search.
- Tune your index settings: Adjusting HNSW's `ef_construction` and `ef_search` parameters lets you balance accuracy and speed.
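If you combine a separate BM25 search with vector search yourself, reciprocal rank fusion (RRF) is a simple alternative to score-based fusion: it needs only the two rankings, not comparable scores. A minimal sketch (`k=60` is the constant commonly used from the original RRF paper):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda t: t[1], reverse=True)

keyword_hits = ["doc3", "doc1", "doc7"]   # BM25 ranking
vector_hits  = ["doc1", "doc4", "doc3"]   # embedding ranking

fused = rrf([keyword_hits, vector_hits])
print([doc for doc, _ in fused])  # doc1 wins: it ranked high in both lists
```

Because RRF ignores raw scores entirely, it is robust when the two retrievers score on very different scales.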