Caching Strategy in a Planet-Scale URL Shortener
When designing a large-scale URL shortener (like Bitly), the biggest challenge isn’t storing URLs — it’s serving billions of redirects with low latency.
The solution? Multi-layer caching.
This article explains how caching works across layers, how CDNs help, and how we handle hot keys, invalidation, and failures.
📌 Why Caching Is Critical
Typical traffic pattern:
- 100M new URLs/day
- 10B redirects/day
- Read:Write ≈ 100:1
Without caching, the database would collapse.
With caching, 95–99% of traffic never reaches the DB.
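The figures above can be sanity-checked with quick arithmetic (daily averages, not peak load):

```python
# Back-of-envelope QPS from the daily figures above.
redirects_per_day = 10_000_000_000   # 10B redirects/day
writes_per_day = 100_000_000         # 100M new URLs/day
seconds_per_day = 86_400

avg_read_qps = redirects_per_day / seconds_per_day
avg_write_qps = writes_per_day / seconds_per_day

print(round(avg_read_qps))                   # ~115,741 redirects/sec on average
print(round(avg_read_qps / avg_write_qps))   # read:write ratio = 100
```

Peak traffic is usually several times the average, which is exactly why the cache layers below exist.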
🧅 The Multi-Layer Cache Model
```
User
  ↓
CDN Edge Cache
  ↓
Load Balancer
  ↓
App Server (Local LRU Cache)
  ↓
Redis Distributed Cache
  ↓
Database
```
Each layer reduces load on the next.
🌍 1. CDN Cache (Global Edge Layer)
A CDN like Cloudflare or Akamai has edge servers worldwide.
What CDN caches
It caches the HTTP redirect response, not the final website.
Example response:

```
HTTP/1.1 302 Found
Location: https://example.com
Cache-Control: public, max-age=86400
```

The CDN stores the mapping:

```
Request:  short.ly/abc123
Response: 302 + Location header
```
So future users are redirected directly from the edge, without hitting your servers.
Benefits
- Offloads 80–95% of traffic
- Reduces latency
- Protects backend during spikes
- Handles viral URLs
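A minimal, framework-free sketch of the redirect handler whose response the CDN caches; the function name and the `lookup` callable are illustrative:

```python
def redirect_response(short_code, lookup):
    """Build a cacheable 302 redirect for a short code.

    `lookup` maps short codes to long URLs (in production this would be
    backed by the cache layers and the DB described below).
    """
    long_url = lookup(short_code)
    if long_url is None:
        return 404, {}, "Not Found"
    headers = {
        "Location": long_url,
        # public + max-age lets the CDN (and browsers) cache the redirect.
        "Cache-Control": "public, max-age=86400",
    }
    return 302, headers, ""

status, headers, _ = redirect_response("abc123", {"abc123": "https://example.com"}.get)
print(status, headers["Location"])  # 302 https://example.com
```

The key design choice is that the response itself carries the caching policy; the CDN simply honors the `Cache-Control` header.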
💻 2. Browser Cache
If the response headers allow it:

```
Cache-Control: public, max-age=3600
```
The browser remembers the redirect.
Next visit = no network call at all.
⚙️ 3. App Server Local Cache (LRU)
Each app instance keeps a small in-memory cache:

```
abc123 → https://example.com
```
Why?
- Memory lookup = nanoseconds
- Avoids Redis network hop
- Great for repeated hits on same server
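The in-process LRU can be sketched with `OrderedDict`; the capacity value is illustrative:

```python
from collections import OrderedDict

class LRUCache:
    """Small in-memory LRU: evicts the least-recently-used entry when full."""

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)            # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)     # evict the oldest entry

cache = LRUCache(capacity=2)
cache.put("abc123", "https://example.com")
cache.put("xyz789", "https://example.org")
cache.get("abc123")                            # touch abc123
cache.put("new111", "https://example.net")     # evicts xyz789
print(cache.get("xyz789"))  # None
```

A bounded capacity matters here: the local cache must stay small enough to leave memory for the application itself.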
🔴 4. Redis Distributed Cache
Redis is the primary backend cache. It stores a simple mapping:

```
Key:   shortCode
Value: longURL
```
Why Redis?
- In-memory
- ~1 ms lookup
- Can handle massive QPS
- Shared across app servers
🥶 Cache Lookup Flow
Request arrives
1️⃣ Check local LRU
2️⃣ If miss → check Redis
3️⃣ If miss → query DB
4️⃣ Store result in Redis + LRU
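The four steps above as a read-through sketch; the plain dicts stand in for Redis and the database:

```python
local_lru = {}                                  # stand-in for the in-process LRU
redis_cache = {}                                # stand-in for the Redis cluster
database = {"abc123": "https://example.com"}    # stand-in for the DB

def resolve(short_code):
    # 1. Check the local LRU.
    url = local_lru.get(short_code)
    if url is not None:
        return url
    # 2. Miss -> check Redis.
    url = redis_cache.get(short_code)
    if url is None:
        # 3. Miss -> query the DB (the slow path).
        url = database.get(short_code)
        if url is None:
            return None
        # 4. Populate Redis on the way back up.
        redis_cache[short_code] = url
    # 4. Populate the local LRU too.
    local_lru[short_code] = url
    return url

print(resolve("abc123"))  # https://example.com — DB hit, now cached at both layers
```

After the first miss, both cache layers hold the entry, so subsequent requests never reach the DB.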
🔥 The Hot Key Problem
A single URL goes viral:

```
abc123 → 5 million req/sec
```

Every one of these requests hashes to the same key, so a single Redis node becomes overloaded.
Solutions
- CDN absorbs most traffic (primary defense)
- Replicate hot keys across multiple cache nodes
- Local LRU caches reduce Redis pressure
- Use consistent hashing with virtual nodes
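Two of the mitigations above can be sketched together: a consistent-hash ring with virtual nodes, plus hot-key replication by fanning one logical key out to several derived keys. Node names and counts are illustrative:

```python
import bisect
import hashlib

def _hash(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    """Consistent hash ring; each node appears as many virtual nodes,
    which smooths key distribution when nodes are added or removed."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted((_hash(f"{n}#{i}"), n)
                            for n in nodes for i in range(vnodes))
        self._hashes = [h for h, _ in self._ring]

    def node_for(self, key):
        idx = bisect.bisect(self._hashes, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["cache-1", "cache-2", "cache-3"])

def nodes_for_hot_key(key, replicas=3):
    # Derive replica keys so a viral key spreads across several nodes;
    # readers pick one replica at random to balance the load.
    return {ring.node_for(f"{key}:r{i}") for i in range(replicas)}

print(ring.node_for("abc123"))       # one of cache-1..cache-3
print(nodes_for_hot_key("abc123"))   # up to 3 distinct nodes
```

In practice the replica fan-out is applied only to keys detected as hot, since it multiplies write and invalidation traffic.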
🔄 Cache Invalidation (URL Deletion)
When a short URL is deleted:
| Layer | Action |
|---|---|
| DB | Mark as deleted |
| Redis | Delete key |
| App LRU | Clear via pub/sub |
| CDN | Call purge API |
| Browser | TTL expiration only |
Because CDN invalidation isn’t instant, we use soft delete flags in DB to prevent stale redirects.
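The per-layer fan-out above, sketched with stand-ins; `purge_cdn` and the pub/sub bus are hypothetical placeholders for a real CDN purge API and message broker:

```python
database = {"abc123": {"url": "https://example.com", "deleted": False}}
redis_cache = {"abc123": "https://example.com"}
invalidation_bus = []   # stand-in pub/sub channel that app-server LRUs subscribe to

def purge_cdn(path):
    # Hypothetical CDN purge call; real CDNs expose an HTTP purge endpoint.
    return f"purged {path}"

def delete_short_url(short_code):
    database[short_code]["deleted"] = True     # soft delete in the DB
    redis_cache.pop(short_code, None)          # drop the Redis key
    invalidation_bus.append(short_code)        # tell app servers to evict from LRU
    purge_cdn(f"/{short_code}")                # ask the CDN to purge the edge copy
    # Browser copies simply age out via their Cache-Control TTL.

delete_short_url("abc123")
print(database["abc123"]["deleted"], "abc123" in redis_cache)  # True False
```

The soft-delete flag is the safety net: even if a stale edge copy redirects a user to the backend, the DB check sees `deleted` and refuses to serve the URL.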
📦 Handling Large-Scale Purges
If 1M URLs must be deleted:
- Don’t purge one-by-one
- Batch purge requests
- Use async queue + workers
- Apply TTL as safety net
- Use cache versioning to sidestep purging entirely
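Cache versioning is the cheapest option on the list: instead of purging a million keys, bump a version prefix so every old key becomes unreachable and simply ages out via its TTL. A minimal sketch:

```python
cache = {}
cache_version = 1

def versioned_key(short_code):
    # All reads and writes go through a versioned key namespace.
    return f"v{cache_version}:{short_code}"

cache[versioned_key("abc123")] = "https://example.com"

# Instead of issuing 1M purge requests, bump the version: old entries are
# never read again and fall out of the cache when their TTL expires.
cache_version = 2
print(cache.get(versioned_key("abc123")))  # None — the old v1 entry is invisible
```

The trade-off is a burst of cold misses right after the version bump, since the entire keyspace effectively starts empty.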
⚠️ Failure Handling
| Failure | What Happens |
|---|---|
| Redis down | App falls back to DB |
| CDN down | Traffic hits backend (autoscale needed) |
| Cache stale | TTL eventually expires |
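The Redis-down row can be sketched as a try/except around the cache read; the exception type and stand-ins are illustrative:

```python
class CacheDown(Exception):
    pass

database = {"abc123": "https://example.com"}   # stand-in for the DB

def redis_get(key):
    raise CacheDown("redis unavailable")       # simulate a Redis outage

def resolve(short_code):
    try:
        return redis_get(short_code)
    except CacheDown:
        # Fall back to the DB. In production a circuit breaker would
        # skip the Redis call entirely while the outage lasts, instead
        # of paying a timeout on every request.
        return database.get(short_code)

print(resolve("abc123"))  # https://example.com — served from the DB fallback
```

The fallback only works if the DB is provisioned (or autoscaled) to absorb the cache tier's traffic for the duration of the outage.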
🎯 Key Design Principle
This system is cache-first, database-last.
Goal traffic split:
| Layer | Traffic |
|---|---|
| CDN | 95% |
| Redis | 4% |
| DB | 1% |