Database Indexing, B-Trees, and Query Optimization (2026)
Day 9: B-Trees, Cardinality, and the Query Planner's Betrayal
A startup CTO once hired me because their primary dashboard was taking 14 seconds to load. He told me, "I don't understand, I added an index to every single column in the table, and it actually got slower!" This is the danger of textbook development. Junior developers view indexes as magic "go fast" buttons. Senior developers view indexes as physically duplicated B-Tree data structures that manipulate the disk. Today, we rip open the database engine to understand exactly how a query is executed, and why the database optimizer sometimes decides your index is garbage.
1. The Anatomy of a Query
When you send a SQL string like SELECT * FROM users WHERE tenant_id = 5; to Postgres, it doesn't just go to the hard drive. It passes through three architectural layers:
- The Parser: Checks your SQL for syntax errors.
- The Query Planner / Optimizer: The brain of the database. It looks at your table, looks at available indexes, looks at data statistics (how many rows exist), and calculates the absolute cheapest physical path to get your data.
- The Executor: Takes the plan and physically reads the disk/RAM.
If your query is slow, it is because the Query Planner decided the best available physical path was still a terrible path. To fix it, you have to understand the path it wants to take.
2. The B+ Tree: The Engine of the Internet
When we say "Database Indexing," 99% of the time, we are talking about a B-Tree Index (specifically a B+ Tree). You cannot optimize a database if you cannot visualize this tree.
3. Table Scans vs. Index Scans
When Postgres finds the leaf node in the B-Tree, what is actually inside it?
In a Non-Clustered Index (the default in Postgres), the leaf node does NOT contain your user's email or login date. It contains a Pointer (a physical disk address) to the actual row sitting in the "Heap" (the main table data on the hard drive).
Therefore, an Index Scan is a two-step process:
1. Traverse the B-Tree to find the pointer in RAM/Cache.
2. Jump to the physical SSD to grab the actual row data.
A Clustered Index physically rearranges the actual table data on the disk to match the index order. The leaf node is the data. This makes reads violently fast, but makes INSERTS incredibly punishing because the database has to physically move rows around on the disk to maintain the sorted order.
4. Index Cardinality: The Optimizer's Betrayal
Let's say you have an e-commerce database with 10 million orders. You frequently query SELECT * FROM orders WHERE is_delivered = true;. Because it's slow, you add an index to the is_delivered boolean column.
You run the query again. It is exactly the same speed. You check Postgres using EXPLAIN ANALYZE, and the Optimizer states it is doing a Full Table Scan. It completely ignored your index. Why?
Index Cardinality. Cardinality is the uniqueness of data. A UUID column has High Cardinality (every row is unique). A boolean column has Low Cardinality (only 2 possible values). If 90% of your 10 million orders are delivered, the B-Tree leaf node will return 9 million disk pointers.
The Query Optimizer does the math: "If I use this index, I have to traverse the tree, and then perform 9 million random physical jumps to the SSD. That is insanely expensive. It is actually faster for me to ignore the tree entirely and just read the hard drive from top to bottom."
5. Composite Indexes & The Left-Prefix Rule
When you query multiple columns, like WHERE tenant_id = 5 AND last_login > '2026-01-01', you create a Composite Index: CREATE INDEX idx_tenant_login ON users(tenant_id, last_login);
The order you define those columns in the index is a matter of life and death for the query, governed by the Left-Prefix Rule.
Think of a Composite Index like a telephone book sorted by (Last Name, First Name). If you know the Last Name is "Smith", you jump straight to the 'S' section. If you know the Last Name is "Smith" and the First Name is "Adam", you jump to the 'S' section and find Adam immediately.
But what if your query only asks WHERE last_login > '2026-01-01'? In our phonebook analogy, you are searching for everyone with the First Name "Adam", but you don't know their Last Name. The phonebook is useless. You have to read every single page. The database will drop the index and run a Full Table Scan.
6. The Holy Grail: The Covering Index
Remember in Section 3 how an Index Scan requires two steps? Finding the pointer in the tree, and jumping to the physical disk (the Heap) to get the data? What if we could eliminate the jump to the disk?
If your query is SELECT email FROM users WHERE id = 5;, the database uses the B-Tree to find ID 5, then jumps to the disk to retrieve the `email` column.
A Covering Index includes the data you want to `SELECT` directly inside the B-Tree leaf node alongside the pointer. In Postgres, you do this using the `INCLUDE` keyword:
Creating a Covering Index (SQL)
-- The B-Tree is sorted by 'id'.
-- But we attach the 'email' payload directly onto the B-Tree leaf node.
CREATE INDEX idx_users_id_covering ON users(id) INCLUDE (email);
When you run the query now, Postgres traverses the B-Tree, finds ID 5, and sees the email address sitting right there on the leaf node. It performs an Index-Only Scan. It never touches the physical hard drive table. This is how you achieve single-digit millisecond read latencies at massive scale.
🛠️ Day 9 Project: EXPLAIN ANALYZE
Stop guessing why your ORM is slow. Prove it.
- Take any slow query from your current project.
- Run `EXPLAIN ANALYZE SELECT ...` in your Postgres console.
- Look for the words
Seq Scan(Sequential Table Scan). This is your enemy. Add an index to the column in the `WHERE` clause, run it again, and verify the output changes toIndex Scan.
B-Trees are amazing, but they are physically large. If you are logging millions of events per day (like we did in Day 8), a B-Tree index on `created_at` will eventually become so massive it exceeds your server's RAM, causing performance to plummet.
The Solution: Block Range Indexes (BRIN). Because log data is naturally appended sequentially over time, a BRIN index simply stores the min and max timestamp for massive physical blocks of data. It is 99% smaller than a B-Tree, takes milliseconds to update, and allows Postgres to skip millions of rows during a time-series query.
We know how to index data. Now we look at how your framework pulls it out. Tomorrow, we conclude the Database trilogy by exposing the N+1 Query Problem—the silent ORM design flaw that destroys 90% of Django, Rails, and Prisma architectures in production.
📚 Deep Diver Resources
- Use The Index, Luke! - The definitive, free, online textbook for SQL indexing. Required reading for all backend engineers.
- PostgreSQL Official Docs: Using EXPLAIN - Learn how to read the complex output trees generated by the query planner.
- Logic & Legacy: Advanced SQLite & Indices - A lighter introduction to indexing concepts using local SQLite before jumping into Postgres server architecture.
Frequently Asked Questions
Should I put an index on every column just to be safe?
Absolutely not. Every index you create must be updated every time a row is inserted, updated, or deleted. Adding unnecessary indexes will cripple your database's Write Performance (The Write Penalty from Day 8) while consuming massive amounts of RAM and disk space.
What is the difference between `EXPLAIN` and `EXPLAIN ANALYZE`?
`EXPLAIN` only asks the Query Planner for its estimate of how it will execute the query. It does not actually run the query. `EXPLAIN ANALYZE` actually executes the query against the database, records the exact execution time, and compares the actual performance against the planner's estimates.
Can I use the Left-Prefix rule to optimize `LIKE '%search'` queries?
No. A standard B-Tree index reads from left to right. If you search `LIKE 'John%'`, the B-Tree can jump to 'J' and scan. If you search `LIKE '%John'`, the database has no idea what letter the word starts with. The Left-Prefix is broken, and Postgres will fall back to a Full Table Scan. For wildcard text searches, you need specialized indexes like `GIN` or `Trigram` indexes.
Comments
Post a Comment
?: "90px"' frameborder='0' id='comment-editor' name='comment-editor' src='' width='100%'/>