
Compressing Vector DB data

Question: If a system has 100 million user-product interactions, and each interaction is mapped to a 768-dimensional vector, how much space will the system need to store all the vectors?
Answer: 100 million interactions = 10^8 interactions.
Each interaction is stored as a 768-dimensional vector, so there are 10^8 x 768 = 7.68 x 10^10 values.
If every value is stored as a 32-bit floating point number, we need:
7.68 x 10^10 x 32 bits
= 7.68 x 10^10 x 4 Bytes
= 3.072 x 10^11 Bytes
= 307.2 GB
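As a quick sanity check, a few lines of Python reproduce the arithmetic (the variable names are just for illustration):

n_vectors = 100_000_000          # 10^8 interactions
dims = 768                       # dimensions per vector
bytes_per_value = 4              # a 32-bit float is 4 bytes

total_bytes = n_vectors * dims * bytes_per_value
print(total_bytes)               # 307200000000
print(total_bytes / 10**9, "GB") # 307.2 GB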
Caching 307 GB of raw vectors in RAM is very challenging, so queries fall back to slower storage.
This makes the system slow to query, hurting user experience.
Why Compress
The goal of compression is not just saving disk space. Compression helps with:
● Fitting more vectors in RAM
● Reducing search latency
● Making SIMD or GPU compute feasible
How It Works

1. Product Quantization (PQ)
  ● Break each vector into sub-vectors (e.g., split a 128-D vector into 8 chunks of 16-D each).
  ● For each chunk, find the nearest centroid in a pre-trained codebook.
  ● Store only the index of the centroid, not the float values.
So instead of storing 128 floats (512 bytes), you store 8 one-byte centroid indices (8 bytes).

That’s a 64x reduction.
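To make the mechanics concrete, here is a minimal PQ encoding sketch in Python/NumPy. It assumes the 128-D / 8-chunk / 256-centroid setup above; the codebooks are built by sampling training chunks rather than running k-means, purely to keep the example short:

import numpy as np

rng = np.random.default_rng(0)
D, M, K = 128, 8, 256            # dimensions, sub-vectors, centroids per sub-space
Dsub = D // M                    # 16 dimensions per chunk

train = rng.standard_normal((10_000, D)).astype(np.float32)

# One codebook per sub-space, shape (M, K, Dsub). A real codebook
# comes from k-means; sampling chunks is enough for this sketch.
codebooks = np.stack([
    train[rng.choice(len(train), K, replace=False), m*Dsub:(m+1)*Dsub]
    for m in range(M)
])

def pq_encode(x):
    """Map one float32 vector of shape (D,) to M uint8 centroid indices."""
    codes = np.empty(M, dtype=np.uint8)
    for m in range(M):
        chunk = x[m*Dsub:(m+1)*Dsub]
        dists = np.linalg.norm(codebooks[m] - chunk, axis=1)
        codes[m] = np.argmin(dists)
    return codes

x = rng.standard_normal(D).astype(np.float32)
codes = pq_encode(x)
print(codes.nbytes, "bytes instead of", x.nbytes)   # 8 instead of 512

At query time, distances are approximated directly from the codes and the codebooks, which is what lets the whole index stay in RAM.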
PQ is used heavily in Facebook’s FAISS, Milvus, and other modern vector DBs.
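In practice you would not hand-roll PQ. As a rough usage sketch, FAISS exposes it through IndexPQ (assuming faiss-cpu is installed; d=128, 8 sub-quantizers and 8 bits per code are illustrative choices matching the example above):

import numpy as np
import faiss  # pip install faiss-cpu

d = 128
xb = np.random.random((10_000, d)).astype('float32')   # database vectors

# IndexPQ(d, M, nbits): M sub-quantizers with 2**nbits centroids each,
# so each stored vector takes M * nbits / 8 = 8 bytes here.
index = faiss.IndexPQ(d, 8, 8)
index.train(xb)     # learn the per-chunk codebooks (k-means)
index.add(xb)       # vectors are stored as compact PQ codes

xq = np.random.random((5, d)).astype('float32')         # query vectors
distances, ids = index.search(xq, 3)                    # approximate top-3 neighbours
print(ids)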
