Caching Strategy in a Planet-Scale URL Shortener
When designing a large-scale URL shortener (like Bitly), the biggest challenge isn’t storing URLs — it’s serving billions of redirects with low latency.
The solution? Multi-layer caching.
This article explains how caching works across layers, how CDNs help, and how we handle hot keys, invalidation, and failures.
📌 Why Caching Is Critical
Typical traffic pattern:
- 100M new URLs/day
- 10B redirects/day
- Read:Write ≈ 100:1
Without caching, the database would collapse.
With caching, 95–99% of traffic never reaches the DB.
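The figures above can be sanity-checked with quick arithmetic (daily averages, not peak load):

```python
# Back-of-envelope QPS from the daily figures above.
redirects_per_day = 10_000_000_000   # 10B redirects/day
writes_per_day = 100_000_000         # 100M new URLs/day
seconds_per_day = 86_400

avg_read_qps = redirects_per_day / seconds_per_day
avg_write_qps = writes_per_day / seconds_per_day

print(round(avg_read_qps))                   # ~115,741 redirects/sec on average
print(round(avg_read_qps / avg_write_qps))   # read:write ratio = 100
```

Peak traffic is usually several times the average, which is exactly why the cache layers below exist.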
🧅 The Multi-Layer Cache Model
```
User
  ↓
CDN Edge Cache
  ↓
Load Balancer
  ↓
App Server (Local LRU Cache)
  ↓
Redis Distributed Cache
  ↓
Database
```
Each layer reduces load on the next.
🌍 1. CDN Cache (Global Edge Layer)
A CDN like Cloudflare or Akamai has edge servers worldwide.
What CDN caches
It caches the HTTP redirect response, not the final website.
Example response:

```
HTTP/1.1 302 Found
Location: https://example.com
Cache-Control: public, max-age=86400
```

The CDN stores the mapping:

```
Request:  short.ly/abc123
Response: 302 + Location header
```
So future users are redirected directly from the edge, without hitting your servers.
Benefits
- Offloads 80–95% of traffic
- Reduces latency
- Protects backend during spikes
- Handles viral URLs
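A minimal, framework-free sketch of the redirect handler whose response the CDN caches; the function name and the `lookup` callable are illustrative:

```python
def redirect_response(short_code, lookup):
    """Build a cacheable 302 redirect for a short code.

    `lookup` maps short codes to long URLs (in production this would be
    backed by the cache layers and the DB described below).
    """
    long_url = lookup(short_code)
    if long_url is None:
        return 404, {}, "Not Found"
    headers = {
        "Location": long_url,
        # public + max-age lets the CDN (and browsers) cache the redirect.
        "Cache-Control": "public, max-age=86400",
    }
    return 302, headers, ""

status, headers, _ = redirect_response("abc123", {"abc123": "https://example.com"}.get)
print(status, headers["Location"])  # 302 https://example.com
```

The key design choice is that the response itself carries the caching policy; the CDN simply honors the `Cache-Control` header.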
💻 2. Browser Cache
If the response headers allow it:

```
Cache-Control: public, max-age=3600
```
The browser remembers the redirect.
Next visit = no network call at all.
⚙️ 3. App Server Local Cache (LRU)
Each app instance keeps a small in-memory cache:

```
abc123 → https://example.com
```
Why?
- Memory lookup = nanoseconds
- Avoids Redis network hop
- Great for repeated hits on same server
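The in-process LRU can be sketched with `OrderedDict`; the capacity value is illustrative:

```python
from collections import OrderedDict

class LRUCache:
    """Small in-memory LRU: evicts the least-recently-used entry when full."""

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)            # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)     # evict the oldest entry

cache = LRUCache(capacity=2)
cache.put("abc123", "https://example.com")
cache.put("xyz789", "https://example.org")
cache.get("abc123")                            # touch abc123
cache.put("new111", "https://example.net")     # evicts xyz789
print(cache.get("xyz789"))  # None
```

A bounded capacity matters here: the local cache must stay small enough to leave memory for the application itself.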
🔴 4. Redis Distributed Cache
Redis is the primary backend cache. It stores a simple mapping:

```
Key:   shortCode
Value: longURL
```
Why Redis?
- In-memory
- ~1 ms lookup
- Can handle massive QPS
- Shared across app servers
🥶 Cache Lookup Flow
Request arrives
1️⃣ Check local LRU
2️⃣ If miss → check Redis
3️⃣ If miss → query DB
4️⃣ Store result in Redis + LRU
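The four steps above as a read-through sketch; the plain dicts stand in for Redis and the database:

```python
local_lru = {}                                  # stand-in for the in-process LRU
redis_cache = {}                                # stand-in for the Redis cluster
database = {"abc123": "https://example.com"}    # stand-in for the DB

def resolve(short_code):
    # 1. Check the local LRU.
    url = local_lru.get(short_code)
    if url is not None:
        return url
    # 2. Miss -> check Redis.
    url = redis_cache.get(short_code)
    if url is None:
        # 3. Miss -> query the DB (the slow path).
        url = database.get(short_code)
        if url is None:
            return None
        # 4. Populate Redis on the way back up.
        redis_cache[short_code] = url
    # 4. Populate the local LRU too.
    local_lru[short_code] = url
    return url

print(resolve("abc123"))  # https://example.com — DB hit, now cached at both layers
```

After the first miss, both cache layers hold the entry, so subsequent requests never reach the DB.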
🔥 The Hot Key Problem
A single URL goes viral:

```
abc123 → 5 million req/sec
```

Every one of these requests hashes to the same key, so a single Redis node becomes overloaded.
Solutions
- CDN absorbs most traffic (primary defense)
- Replicate hot keys across multiple cache nodes
- Local LRU caches reduce Redis pressure
- Use consistent hashing with virtual nodes
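Two of the mitigations above can be sketched together: a consistent-hash ring with virtual nodes, plus hot-key replication by fanning one logical key out to several derived keys. Node names and counts are illustrative:

```python
import bisect
import hashlib

def _hash(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    """Consistent hash ring; each node appears as many virtual nodes,
    which smooths key distribution when nodes are added or removed."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted((_hash(f"{n}#{i}"), n)
                            for n in nodes for i in range(vnodes))
        self._hashes = [h for h, _ in self._ring]

    def node_for(self, key):
        idx = bisect.bisect(self._hashes, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["cache-1", "cache-2", "cache-3"])

def nodes_for_hot_key(key, replicas=3):
    # Derive replica keys so a viral key spreads across several nodes;
    # readers pick one replica at random to balance the load.
    return {ring.node_for(f"{key}:r{i}") for i in range(replicas)}

print(ring.node_for("abc123"))       # one of cache-1..cache-3
print(nodes_for_hot_key("abc123"))   # up to 3 distinct nodes
```

In practice the replica fan-out is applied only to keys detected as hot, since it multiplies write and invalidation traffic.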
🔄 Cache Invalidation (URL Deletion)
When a short URL is deleted:
| Layer | Action |
|---|---|
| DB | Mark as deleted |
| Redis | Delete key |
| App LRU | Clear via pub/sub |
| CDN | Call purge API |
| Browser | TTL expiration only |
Because CDN invalidation isn’t instant, we use soft delete flags in DB to prevent stale redirects.
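The per-layer fan-out above, sketched with stand-ins; `purge_cdn` and the pub/sub bus are hypothetical placeholders for a real CDN purge API and message broker:

```python
database = {"abc123": {"url": "https://example.com", "deleted": False}}
redis_cache = {"abc123": "https://example.com"}
invalidation_bus = []   # stand-in pub/sub channel that app-server LRUs subscribe to

def purge_cdn(path):
    # Hypothetical CDN purge call; real CDNs expose an HTTP purge endpoint.
    return f"purged {path}"

def delete_short_url(short_code):
    database[short_code]["deleted"] = True     # soft delete in the DB
    redis_cache.pop(short_code, None)          # drop the Redis key
    invalidation_bus.append(short_code)        # tell app servers to evict from LRU
    purge_cdn(f"/{short_code}")                # ask the CDN to purge the edge copy
    # Browser copies simply age out via their Cache-Control TTL.

delete_short_url("abc123")
print(database["abc123"]["deleted"], "abc123" in redis_cache)  # True False
```

The soft-delete flag is the safety net: even if a stale edge copy redirects a user to the backend, the DB check sees `deleted` and refuses to serve the URL.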
📦 Handling Large-Scale Purges
If 1M URLs must be deleted:
- Don’t purge one-by-one
- Batch purge requests
- Use async queue + workers
- Apply TTL as safety net
- Use cache versioning to sidestep purging entirely
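Cache versioning is the cheapest option on the list: instead of purging a million keys, bump a version prefix so every old key becomes unreachable and simply ages out via its TTL. A minimal sketch:

```python
cache = {}
cache_version = 1

def versioned_key(short_code):
    # All reads and writes go through a versioned key namespace.
    return f"v{cache_version}:{short_code}"

cache[versioned_key("abc123")] = "https://example.com"

# Instead of issuing 1M purge requests, bump the version: old entries are
# never read again and fall out of the cache when their TTL expires.
cache_version = 2
print(cache.get(versioned_key("abc123")))  # None — the old v1 entry is invisible
```

The trade-off is a burst of cold misses right after the version bump, since the entire keyspace effectively starts empty.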
⚠️ Failure Handling
| Failure | What Happens |
|---|---|
| Redis down | App falls back to DB |
| CDN down | Traffic hits backend (autoscale needed) |
| Cache stale | TTL eventually expires |
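The Redis-down row can be sketched as a try/except around the cache read; the exception type and stand-ins are illustrative:

```python
class CacheDown(Exception):
    pass

database = {"abc123": "https://example.com"}   # stand-in for the DB

def redis_get(key):
    raise CacheDown("redis unavailable")       # simulate a Redis outage

def resolve(short_code):
    try:
        return redis_get(short_code)
    except CacheDown:
        # Fall back to the DB. In production a circuit breaker would
        # skip the Redis call entirely while the outage lasts, instead
        # of paying a timeout on every request.
        return database.get(short_code)

print(resolve("abc123"))  # https://example.com — served from the DB fallback
```

The fallback only works if the DB is provisioned (or autoscaled) to absorb the cache tier's traffic for the duration of the outage.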
🎯 Key Design Principle
This system is cache-first, database-last.
Goal traffic split:
| Layer | Traffic |
|---|---|
| CDN | 95% |
| Redis | 4% |
| DB | 1% |