High-Performance Caching — Redis, CDNs, DNS & The Cache-Aside Pattern (2026)

Day 13: Memory Over Matter — Redis, CDNs & The Speed of Light

  • Series: Logic & Legacy
  • Day 13 / 30
  • Level: Senior Architecture

 We have perfected our database models and normalized our schemas. But relational databases ultimately store data on physical disks. No matter how well you index a PostgreSQL table, disk I/O remains orders of magnitude slower than memory access, and at scale it becomes the bottleneck. To serve millions of users, we must stop reading the disk on every request.

The Speed of Light Limitation

[Infographic: "Memory Over Matter" — the three caching domains (Hardware, Redis, CDN/DNS), the cache-aside pattern, and performance benchmarks comparing RAM (Redis) vs. disk (PostgreSQL).]


If an API requests 50 user records from a database, the DB engine must parse the SQL, consult the index, read the relevant blocks from disk into memory, serialize the rows, and send them back over the network. This takes milliseconds. In computing, a millisecond is an eternity.

Caching is the architectural practice of storing a copy of frequently accessed, expensive-to-compute data in an ultra-fast, temporary storage layer. Today, we architect the cache.

1. The Three Domains of Caching

Caching is not one technique but a pattern that recurs at every layer of computing:

  • Hardware-Based (L1/L2/L3 Cache): Silicon built directly into your CPU. Instead of waiting for data from the computer's main RAM, the CPU keeps a tiny amount of critical instructions physically microns away from the processing cores. It operates in nanoseconds.
  • Software-Based (Redis/Memcached): An application layer cache. Data retrieved from a slow database (Disk) is stored as a Key-Value pair in the server's main memory (RAM). It operates in microseconds.
  • Network-Based (CDN/DNS): The outermost shield. Data is stored on remote proxy servers geographically closer to the end user. It operates in milliseconds, preventing the request from ever reaching your primary servers.

2. Network Caching: CDNs and DNS

Why do we use network caching? Because fiber optic cables are fast, but they still obey the speed of light. If your server is in New York, a user in Tokyo pays on the order of 150ms of round-trip latency just for the light to cross the ocean and back.
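A back-of-the-envelope check makes the physics concrete. The numbers below are assumptions, not measurements: roughly 10,800 km great-circle distance between New York and Tokyo, and light traveling at about two-thirds of c inside fiber:

```python
# Rough physics floor for New York <-> Tokyo latency.
# Assumed figures: great-circle distance ~10,800 km,
# speed of light in fiber ~200,000 km/s (about 2/3 of c in a vacuum).
DISTANCE_KM = 10_800
FIBER_SPEED_KM_S = 200_000

one_way_ms = DISTANCE_KM / FIBER_SPEED_KM_S * 1000
round_trip_ms = 2 * one_way_ms

print(f"One-way minimum:    {one_way_ms:.0f} ms")
print(f"Round-trip minimum: {round_trip_ms:.0f} ms")
# Real routes are longer than the great circle and add router and
# queueing delay, which pushes observed round trips toward 150 ms+.
```

Even the theoretical minimum is ~108ms round-trip; no amount of server tuning can beat geography, which is exactly the problem CDNs solve.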

Similarly, DNS (Domain Name System) caches the translation of www.google.com to its IP address 142.250.190.46. Your computer, your router, and your ISP all cache this translation so they don't have to constantly hit the global root servers, which would instantly buckle under the weight of the entire internet.
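The resolver-caching idea can be sketched in miniature: memoize name-to-IP lookups for a fixed TTL. Everything here is an illustrative stand-in (the fake resolver, the 300-second TTL), not how your OS resolver is actually implemented:

```python
import time

class DnsCache:
    """Toy resolver cache: memoizes name -> IP for a fixed TTL (illustrative only)."""

    def __init__(self, resolve, ttl_seconds=300):
        self.resolve = resolve          # the real upstream lookup
        self.ttl = ttl_seconds
        self._entries = {}              # name -> (ip, expires_at)

    def lookup(self, name):
        entry = self._entries.get(name)
        if entry and entry[1] > time.monotonic():
            return entry[0]             # cache hit: no upstream query
        ip = self.resolve(name)         # cache miss: ask upstream
        self._entries[name] = (ip, time.monotonic() + self.ttl)
        return ip

# Fake upstream resolver so the sketch runs without network access.
calls = []
def fake_resolve(name):
    calls.append(name)
    return "142.250.190.46"

cache = DnsCache(fake_resolve)
cache.lookup("www.google.com")   # miss: hits the upstream resolver
cache.lookup("www.google.com")   # hit: served from the cache
```

The second lookup never touches the upstream, which is precisely why the root servers survive: almost every query in the world is answered by a cache somewhere closer.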

3. The Entire Flow of Caching (The Cache-Aside Pattern)

Returning to the software layer: how do we implement RAM caching in Python? The most robust architecture is the Cache-Aside pattern. The application code is entirely responsible for managing both the cache and the database.

  1. The Application asks Redis (the Cache) for the data using a specific key (e.g., users:top_50).
  2. Cache Hit: If found, Redis returns the data instantly. The API request finishes without ever touching the database.
  3. Cache Miss: If not found in Redis, the Application queries PostgreSQL (the DB).
  4. The Application saves the Postgres result into Redis so the next request is a cache hit.
  5. The Application returns the data to the user.
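The five steps above can be sketched with a plain dict standing in for Redis and a function standing in for the PostgreSQL query (both are stand-ins for illustration; the real Redis implementation in the next section is the production version of the same flow):

```python
# Cache-aside in miniature: a dict plays Redis, a function plays PostgreSQL.
cache = {}

def fetch_users_from_db():
    # Stand-in for the expensive PostgreSQL query.
    return [{"id": i, "name": f"user_{i}"} for i in range(50)]

def get_top_50_users():
    key = "users:top_50"
    data = cache.get(key)          # Step 1: ask the cache first
    if data is not None:
        return data                # Step 2: cache hit, DB never touched
    data = fetch_users_from_db()   # Step 3: cache miss, query the DB
    cache[key] = data              # Step 4: populate the cache
    return data                    # Step 5: return to the caller

users = get_top_50_users()   # first call: miss, fills the cache
users = get_top_50_users()   # second call: served from the dict
```

Note that the application, not the cache, owns this logic: Redis never talks to PostgreSQL, which is what makes the pattern "aside".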

4. Disk vs RAM: Postgres vs Redis Execution Results

Redis is an open-source, in-memory data store. Because it bypasses the physical SSD, it is terrifyingly fast. I have implemented a full asyncpg and redis-py engine in today's GitHub repository. Here is the exact implementation and the resulting execution benchmark when fetching 50 rows.

Python Cache-Aside Implementation
async def get_top_50_users(self):
    # Assumes self.redis_client is a redis.asyncio.Redis instance
    # and orjson is imported at module level.
    cache_key = "users:top_50"

    # 1. Attempt RAM fetch (cache)
    cached_data = await self.redis_client.get(cache_key)
    if cached_data:
        print("[✅ CACHE HIT] Data found in Redis.")
        return orjson.loads(cached_data)

    print("[❌ CACHE MISS] Data not in Redis. Hitting PostgreSQL...")
    # 2. Fall back to disk fetch
    users = await self.fetch_users_from_db()

    # 3. Store in RAM for future requests with a 60-second TTL
    await self.redis_client.setex(cache_key, 60, orjson.dumps(users))
    return users
Live Execution Benchmark
--- FIRST REQUEST (Expecting Miss) ---
[❌ CACHE MISS] Data not in Redis. Hitting PostgreSQL...
--> Postgres Fetch Time: 15.420 ms
[SYSTEM] Data written to Redis with 60s TTL.

--- SECOND REQUEST (Expecting Hit) ---
[✅ CACHE HIT] Data found in Redis.
--> Redis Fetch Time: 0.315 ms

# CONCLUSION: Redis (RAM) is ~49x faster than PostgreSQL (Disk).

5. The YouTube Architecture: Edge Caching via DNS

How does YouTube stream 4K video to millions of users simultaneously without its central databases catching fire? Through a combination of DNS-based routing and network edge caching.

YouTube's central servers do not serve videos directly. Google installs massive physical hardware caches, called Google Global Cache (GGC) nodes, directly inside the datacenters of local ISPs (Internet Service Providers) like Comcast or Verizon. When you press play, DNS resolves the video hostname to the GGC node nearest you, so the bytes travel a few miles instead of crossing a continent.

6. The Laws of Ephemerality: Time To Live (TTL)

RAM is extremely expensive. You cannot cache your entire database. Caching requires strict architectural rules to prevent memory exhaustion and stale data.

Data in the cache must have an expiration date. In the code above, setex("key", 60, data) tells Redis to enforce a TTL (Time To Live). Redis will automatically delete the data after exactly 60 seconds. This ensures that if the database is updated (e.g., a user changes their username), the cache will naturally purge the old record and reflect the new reality within one minute, minimizing the risk of Stale Data.
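The same expiry rule can be sketched locally with a toy in-memory cache that checks the TTL lazily on every read (real Redis also expires keys actively in the background; the 100ms TTL here is just to keep the demo fast):

```python
import time

class TtlCache:
    """Toy in-memory cache with a per-key TTL, checked lazily on read."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]   # stale: purge and report a miss
            return None
        return value

cache = TtlCache()
cache.setex("users:top_50", 0.1, ["alice", "bob"])  # 100 ms TTL for the demo
fresh = cache.get("users:top_50")   # within TTL: hit
time.sleep(0.15)
stale = cache.get("users:top_50")   # past TTL: the old record is gone
```

After expiry the next read is a miss, which forces a fresh database fetch, and that is exactly how cache-aside heals stale data.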

7. Eviction Policies

What happens if your server only has 2GB of RAM, you set a very long TTL, and you try to insert 3GB of cache? Redis falls back on its eviction policy. (Note that the out-of-the-box default, noeviction, simply rejects new writes at the memory limit, so production caches are explicitly configured with a policy such as LRU.)

When Redis reaches its maxmemory limit, it must decide what to delete to make room for new data. The most common enterprise standard is LRU (Least Recently Used). Redis tracks which keys haven't been requested by the API in a while and silently deletes those neglected keys to make room for new, highly-requested data.

Alternatively, LFU (Least Frequently Used) deletes keys with the lowest overall access count, regardless of how recently they were last touched.
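A minimal sketch of the LRU idea, using an OrderedDict to track recency. This is the textbook structure, not Redis's actual implementation (Redis uses an approximated, sampled LRU to save memory):

```python
from collections import OrderedDict

class LruCache:
    """Capacity-bounded cache: evicts the least recently used key when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()   # insertion order doubles as recency order

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # touched: now the most recently used
        return self._data[key]

    def set(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # silently drop the coldest key

cache = LruCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")       # "a" is now warmer than "b"
cache.set("c", 3)    # over capacity: "b" is the LRU key and is evicted
```

The key design point matches the prose above: reads are not free bookkeeping-wise, because every hit must also refresh the key's position in the recency order.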