
Scalable Stats Collection in a URL Shortener

A URL shortener may handle millions of redirects per second.
Each redirect is also a click event we want to count.

But there’s a challenge:

Redirects must be ultra fast, while stats storage is write-heavy.

If we update the database synchronously on every click, the database becomes the bottleneck and redirects slow down with it.
So we use an asynchronous event pipeline with aggregation.


⚠️ The Naive Approach (Doesn’t Scale)

Every redirect runs a statement like this:

UPDATE stats SET clicks = clicks + 1 WHERE shortCode = 'abc123';

At scale, this causes:

  • DB overload
  • Lock contention
  • Slow redirects

So analytics must be decoupled from the redirect path.

User Request
↓
Redirect Service
↓ (async event)
Message Queue
↓
Stream Aggregator
↓
Analytics Database


Redirect returns immediately. Stats processing happens later.

🧩 Step 1: Event Generation

For every redirect, we emit a lightweight event:

{
  shortCode: "abc123",
  timestamp: "10:01:05"
}

If a URL gets 10 hits in 1 second, 10 events are pushed to the queue.

This is cheap and non-blocking.
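
A minimal sketch of the emit step, assuming an in-process `queue.Queue` as a stand-in for a real message queue client. The names `event_queue`, `emit_click_event`, and `handle_redirect` are hypothetical, and dropping events when the buffer is full is one possible policy, not the only one:

```python
import queue
import time
from typing import Optional, Tuple

# Hypothetical in-process buffer standing in for a real message queue client.
event_queue: "queue.Queue[dict]" = queue.Queue(maxsize=10_000)


def emit_click_event(short_code: str, now: Optional[float] = None) -> bool:
    """Enqueue a click event without blocking; return False if it was dropped."""
    event = {
        "shortCode": short_code,
        "timestamp": time.time() if now is None else now,
    }
    try:
        event_queue.put_nowait(event)  # never waits for free space
        return True
    except queue.Full:
        # Assumed policy: losing one stat is better than delaying the redirect.
        return False


def handle_redirect(short_code: str, long_url: str) -> Tuple[int, str]:
    """Build the redirect response; emitting the event is a cheap side effect."""
    emit_click_event(short_code)
    return 302, long_url
```

The redirect path only touches memory here. A separate consumer would drain `event_queue` and forward events to the actual queue.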


🔄 Step 2: Grouping (Aggregation)

We do windowed aggregation: events are counted per fixed time window instead of being stored one by one.

Grouping Key

We group events using:

(shortCode, time_bucket)

Where:

time_bucket = timestamp truncated to the minute

Example:

Event Time    Bucket
10:01:05      10:01
10:01:32      10:01
10:01:58      10:01

So the grouping key becomes:

("abc123", "10:01")
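
The truncation can be sketched as follows. The helper names `time_bucket` and `grouping_key` are illustrative, and the `HH:MM` label mirrors the example above; a real system would presumably include the date in the bucket so that different days do not collide:

```python
from datetime import datetime
from typing import Tuple


def time_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its minute."""
    return ts.replace(second=0, microsecond=0)


def grouping_key(short_code: str, ts: datetime) -> Tuple[str, str]:
    """Build the (shortCode, time_bucket) key used for aggregation."""
    # "%H:%M" matches the article's example; the date is omitted for brevity.
    return (short_code, time_bucket(ts).strftime("%H:%M"))


key = grouping_key("abc123", datetime(2024, 1, 1, 10, 1, 58))
```

All three example events (10:01:05, 10:01:32, 10:01:58) map to the same key, so they end up in the same counter.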

⚙️ Stream Processor Logic

It keeps an in-memory table:

Key                   Count
("abc123", "10:01")   10

Each event increments:

count[key] += 1
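
As a rough sketch, the in-memory table can be a `collections.Counter` keyed by `(shortCode, time_bucket)`. The function name `process_event` is hypothetical, and the event is assumed to carry its timestamp as an `HH:MM:SS` string, as in the example event above:

```python
from collections import Counter
from typing import Dict, Tuple

# In-memory table: (shortCode, minute) -> click count.
counts: "Counter[Tuple[str, str]]" = Counter()


def process_event(event: Dict[str, str]) -> None:
    """Increment the counter for the event's (shortCode, minute) key."""
    minute = event["timestamp"][:5]  # "10:01:05" -> "10:01"
    key = (event["shortCode"], minute)
    counts[key] += 1


# Ten clicks on the same short code within the same minute.
for second in range(10):
    process_event({"shortCode": "abc123", "timestamp": f"10:01:{second:02d}"})
```

Memory use grows with the number of active keys in the current window, not with the number of clicks.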

📝 Step 3: Writing to Database

When the 1-minute window closes:

shortCode = abc123
minute = 10:01
clicks = 10

Only one DB write instead of 10.
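
One way to sketch the flush, using SQLite purely as a stand-in for the analytics database. The `stats` table layout, the `flush_window` helper, and the use of an upsert (which needs SQLite 3.24 or newer) are assumptions for illustration:

```python
import sqlite3
from typing import Dict, Tuple


def flush_window(conn: sqlite3.Connection,
                 counts: Dict[Tuple[str, str], int],
                 closed_minute: str) -> int:
    """Write one row per short code for the closed minute, then drop those keys."""
    rows = [(code, minute, n)
            for (code, minute), n in counts.items()
            if minute == closed_minute]
    with conn:  # single transaction for the whole window
        conn.executemany(
            """
            INSERT INTO stats (shortCode, minute, clicks) VALUES (?, ?, ?)
            ON CONFLICT (shortCode, minute)
            DO UPDATE SET clicks = clicks + excluded.clicks
            """,
            rows,
        )
    for code, minute, _ in rows:
        del counts[(code, minute)]  # free memory once the window is persisted
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stats (shortCode TEXT, minute TEXT, clicks INTEGER, "
    "PRIMARY KEY (shortCode, minute))"
)
counts = {("abc123", "10:01"): 10, ("abc123", "10:02"): 3}
written = flush_window(conn, counts, "10:01")
```

The upsert adds to an existing row rather than overwriting it, so a late or repeated flush for the same minute still accumulates clicks.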


📉 Write Reduction

Raw Events    After Aggregation
10            1
1000          1
1M            1

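This massively reduces database load: for a given short code, the number of writes per minute stays at one no matter how many clicks arrive.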
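
The figures in the table assume a single short code within a single minute. A small sketch (the helper `write_counts` is hypothetical) makes the general rule explicit: raw writes equal the number of events, while aggregated writes equal the number of distinct `(shortCode, minute)` keys:

```python
from typing import Dict, List, Tuple


def write_counts(events: List[Dict[str, str]]) -> Tuple[int, int]:
    """Return (writes without aggregation, writes with per-minute aggregation)."""
    raw_writes = len(events)
    # One write per distinct (shortCode, minute) key; timestamps are "HH:MM:SS".
    keys = {(e["shortCode"], e["timestamp"][:5]) for e in events}
    return raw_writes, len(keys)


events = [{"shortCode": "abc123", "timestamp": f"10:01:{i % 60:02d}"}
          for i in range(1000)]
raw, aggregated = write_counts(events)
```

With many active short codes, the write rate is bounded by the number of codes that received at least one click in the window.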
