A layman's example of a vector database
Here’s a layman-friendly example of representing a word as a 2D vector (just two numbers) — even though real embeddings are often 384, 768, or 1536 dimensions.
Let’s take the word: “king”
We want to represent this word using just two numbers, like:
```
"king" → [0.8, 0.4]
```
What do these numbers mean?
Imagine you’re placing words on a 2D graph, like an X-Y plane.
Let’s define:
- X-axis → “Power” (from weak to powerful)
- Y-axis → “Gender” (from female to male)
Example placements:
| Word | Vector | Meaning |
|---|---|---|
| "king" | [0.8, 0.4] | High power, more male |
| "queen" | [0.8, -0.4] | High power, more female |
| "man" | [0.2, 0.4] | Less power, male |
| "woman" | [0.2, -0.4] | Less power, female |
| "child" | [0.1, 0.0] | Low power, gender-neutral |
Now, using math on these vectors:
“king” – “man” + “woman” = ?
```
[0.8, 0.4] - [0.2, 0.4] + [0.2, -0.4] = [0.8, -0.4] → "queen"
```
That’s how semantic meaning can be captured with simple vector math — even in just 2 dimensions.
In the real world, we use 768D or 1536D vectors where such patterns become incredibly powerful and accurate.
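The table above is enough to try this arithmetic yourself. Here's a minimal sketch in Python using the toy 2D vectors (the values are the same hypothetical ones from the table, not real embeddings):

```python
import numpy as np

# Toy 2D embeddings from the table above (hypothetical values).
vectors = {
    "king":  np.array([0.8,  0.4]),
    "queen": np.array([0.8, -0.4]),
    "man":   np.array([0.2,  0.4]),
    "woman": np.array([0.2, -0.4]),
    "child": np.array([0.1,  0.0]),
}

# king - man + woman
result = vectors["king"] - vectors["man"] + vectors["woman"]

# Find the closest remaining word by Euclidean distance.
closest = min(
    (w for w in vectors if w not in ("king", "man", "woman")),
    key=lambda w: np.linalg.norm(vectors[w] - result),
)

print(result)   # [ 0.8 -0.4]
print(closest)  # queen
```

That "find the closest vector" step at the end is exactly what a vector database does at scale: it stores millions of vectors and answers nearest-neighbor queries quickly.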
How do we arrive at [0.8, 0.4] for "king"?
Great question — this is the core of embeddings: how do we get from "king" to [0.8, 0.4] (or [0.01, 0.02, ..., 0.99] in real systems)?
Short Answer:
We don't manually assign numbers like [0.8, 0.4]. Instead, we train a machine learning model (like Word2Vec, GloVe, BERT, or OpenAI's embedding models) on massive text corpora, and the model learns these numbers based on how words appear in context.
Layman Analogy:
Imagine you’re trying to understand a new word by looking at how it’s used in thousands of books and articles.
If “king” is often used near “throne,” “queen,” “kingdom,” “rule,” and “crown” — and rarely near “microwave” or “banana” — the model will learn its meaning through patterns.
How it’s done (simplified):
- Massive text is scanned, sentence by sentence: “The king sat on his throne.”
- The model notices that "king" appears near "throne", "queen", "rule", etc.
- It adjusts its internal vector for "king" to match these relationships.
- It does this for every word, adjusting vector positions so that:
  - Similar words have similar vectors
  - Relationships like "king – man + woman ≈ queen" are possible via vector math
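The actual training uses a small neural network, but the first step — scanning text and collecting which words appear near "king" — can be sketched with plain counting. This toy corpus is made up for illustration, and the window of 5 matches Word2Vec's default context-window size:

```python
from collections import Counter

# Tiny made-up corpus; real training uses billions of sentences.
corpus = [
    "the king sat on his throne",
    "the queen ruled the kingdom",
    "the king wore a golden crown",
]

window = 5  # words within 5 positions count as "context"
contexts = Counter()
for sentence in corpus:
    words = sentence.split()
    for i, word in enumerate(words):
        if word == "king":
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    contexts[words[j]] += 1

print(contexts.most_common(3))
```

Words like "throne" and "crown" show up in these counts, while "microwave" and "banana" never would — and that difference is what the model's vectors end up encoding.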
Example with Word2Vec (classic method):
Suppose the model sees these pairs:
- (“king”, “queen”) → very close in meaning
- (“king”, “man”) → similar gender
- (“king”, “ruler”) → similar role
The algorithm adjusts the vector for “king” so it’s:
- Near “queen” in vector space
- Far from unrelated words like “car” or “spoon”
This might converge to something like:
```
king  = [0.8,  0.4]
queen = [0.8, -0.4]
man   = [0.2,  0.4]
```
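"Near" and "far" in vector space are usually measured with cosine similarity. A minimal sketch with the toy vectors above — the "car" vector is made up purely to stand in for an unrelated word:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: 1 = same direction, 0 = unrelated, -1 = opposite."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

king  = np.array([0.8,  0.4])
queen = np.array([0.8, -0.4])
car   = np.array([-0.5, 0.1])  # hypothetical vector for an unrelated word

print(cosine(king, queen))  # high: related words
print(cosine(king, car))    # low: unrelated words
```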
Final Note:
The actual numbers like [0.8, 0.4] are:
- Learned from context
- Optimized using neural networks
- Stored in embedding tables that you can query later (like from OpenAI or HuggingFace)
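Conceptually, that stored embedding table is just a mapping from words to vectors, and "querying it later" means looking up a vector or finding the entry most similar to one. A toy sketch (a real system would use an API like OpenAI's or a model from HuggingFace, and an approximate nearest-neighbor index instead of a loop):

```python
import numpy as np

# A toy "embedding table": real ones map tens of thousands of
# tokens to learned vectors (these values are hypothetical).
embedding_table = {
    "king":  np.array([0.8,  0.4]),
    "queen": np.array([0.8, -0.4]),
    "man":   np.array([0.2,  0.4]),
}

def most_similar(query, table):
    """Return the word whose vector is most cosine-similar to `query`."""
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(table, key=lambda w: cos(table[w], query))

print(most_similar(np.array([0.7, -0.3]), embedding_table))  # queen
```

This lookup-then-compare pattern is the core operation behind every vector database.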