
Compressing Vector DB data

Question: If a system has 100 million user-product interactions, and each interaction is mapped to a 768-dimensional vector, how much space will the system need to store all the vectors?
Answer: 100 million interactions = 10^8 interactions.
Each interaction is stored as a 768-dimensional vector, so there are 10^8 x 768 = 7.68 x 10^10 values.
If every value is stored as a 32-bit floating point number, we need:
7.68 x 10^10 x 32 bits
= 7.68 x 10^10 x 4 Bytes
= 3.072 x 10^11 Bytes
= 307.2 GB
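As a quick sanity check, a few lines of Python reproduce the arithmetic (the variable names are just for illustration):

n_vectors = 100_000_000          # 10^8 interactions
dims = 768                       # dimensions per vector
bytes_per_value = 4              # a 32-bit float is 4 bytes

total_bytes = n_vectors * dims * bytes_per_value
print(total_bytes)               # 307200000000
print(total_bytes / 10**9, "GB") # 307.2 GB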
Caching 307 GB of raw vectors in RAM is very challenging, so queries fall back to slower storage.
This makes the system slow to query, hurting user experience.
Why Compress
The goal of compression is not just saving disk space. Compression helps with:
● Fitting more vectors in RAM
● Reducing search latency
● Making SIMD or GPU compute feasible
How It Works

1. Product Quantization (PQ)
  ● Break each vector into sub-vectors (e.g., split a 128-D vector into 8 chunks of 16-D each).
  ● For each chunk, find the nearest centroid in a pre-trained codebook.
  ● Store only the index of the centroid, not the float values.
So instead of storing 128 floats (512 bytes), you store 8 one-byte centroid indices (8 bytes).

That’s a 64x reduction.
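To make the mechanics concrete, here is a minimal PQ encoding sketch in Python/NumPy. It assumes the 128-D / 8-chunk / 256-centroid setup above; the codebooks are built by sampling training chunks rather than running k-means, purely to keep the example short:

import numpy as np

rng = np.random.default_rng(0)
D, M, K = 128, 8, 256            # dimensions, sub-vectors, centroids per sub-space
Dsub = D // M                    # 16 dimensions per chunk

train = rng.standard_normal((10_000, D)).astype(np.float32)

# One codebook per sub-space, shape (M, K, Dsub). A real codebook
# comes from k-means; sampling chunks is enough for this sketch.
codebooks = np.stack([
    train[rng.choice(len(train), K, replace=False), m*Dsub:(m+1)*Dsub]
    for m in range(M)
])

def pq_encode(x):
    """Map one float32 vector of shape (D,) to M uint8 centroid indices."""
    codes = np.empty(M, dtype=np.uint8)
    for m in range(M):
        chunk = x[m*Dsub:(m+1)*Dsub]
        dists = np.linalg.norm(codebooks[m] - chunk, axis=1)
        codes[m] = np.argmin(dists)
    return codes

x = rng.standard_normal(D).astype(np.float32)
codes = pq_encode(x)
print(codes.nbytes, "bytes instead of", x.nbytes)   # 8 instead of 512

At query time, distances are approximated directly from the codes and the codebooks, which is what lets the whole index stay in RAM.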
PQ is used heavily in Facebook’s FAISS, Milvus, and other modern vector DBs.
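In practice you would not hand-roll PQ. As a rough usage sketch, FAISS exposes it through IndexPQ (assuming faiss-cpu is installed; d=128, 8 sub-quantizers and 8 bits per code are illustrative choices matching the example above):

import numpy as np
import faiss  # pip install faiss-cpu

d = 128
xb = np.random.random((10_000, d)).astype('float32')   # database vectors

# IndexPQ(d, M, nbits): M sub-quantizers with 2**nbits centroids each,
# so each stored vector takes M * nbits / 8 = 8 bytes here.
index = faiss.IndexPQ(d, 8, 8)
index.train(xb)     # learn the per-chunk codebooks (k-means)
index.add(xb)       # vectors are stored as compact PQ codes

xq = np.random.random((5, d)).astype('float32')         # query vectors
distances, ids = index.search(xq, 3)                    # approximate top-3 neighbours
print(ids)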
