The Textbook Analogy
Imagine a 500-page textbook with no table of contents and no index at the back. To find information about "Binary Search Trees," you'd have to read every single page from page 1 to page 500. That's painfully slow.
Now imagine the same book WITH an index at the back:
Index:
Binary Search Trees ......... page 234
Hash Maps ................... page 156
Linked Lists ................ page 89
You look up "Binary Search Trees" in the index, go directly to page 234. Done in seconds.
Database indexing works exactly the same way. Without an index, the database scans every row (full table scan). With an index, it jumps directly to the matching rows.
The Impact
Users table: 10 million rows
Without index:
SELECT * FROM users WHERE email = 'sahil@email.com'
→ Scans all 10 million rows
→ Time: 30 seconds 🐌
With index on email column:
SELECT * FROM users WHERE email = 'sahil@email.com'
→ Looks up index, jumps to exact row
→ Time: 5 milliseconds ⚡
That's a 6,000x improvement. This is why indexing is the single most impactful database optimization.
How Indexes Work (B-Tree)
Most databases use a B-Tree (Balanced Tree) for indexes. Think of it as a sorted, hierarchical structure:
[M]
/ \
[D, H] [R, W]
/ | \ / | \
[A-C][E-G][I-L][N-Q][S-V][X-Z]
To find "Sahil":
Only 3 steps to search through millions of records. This is O(log n) complexity.
Types of Indexes
Primary Index
Automatically created on the primary key. Every table has one.
Secondary Index
Created manually on columns you frequently search by.
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_id ON orders(user_id);
Composite Index
Index on multiple columns. Order matters!
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
-- This query uses the index ✅
SELECT * FROM orders WHERE user_id = 123 AND created_at > '2026-01-01';
-- This query also uses the index ✅ (leftmost prefix)
SELECT * FROM orders WHERE user_id = 123;
-- This query does NOT use the index ❌
SELECT * FROM orders WHERE created_at > '2026-01-01';
Rule: Composite indexes work left-to-right. If you index (A, B, C), queries on A, (A,B), or (A,B,C) use the index. Queries on just B or C don't.
Unique Index
Ensures no duplicate values in the column.
CREATE UNIQUE INDEX idx_users_email ON users(email);
-- Now two users can't have the same email
The Trade-Off
Indexes aren't free. They come with costs:
Storage: Each index takes disk space (can be 10-30% of table size)
Write speed: Every INSERT, UPDATE, DELETE must also update the index
Maintenance: Too many indexes slow down writes significantly
Reads: With index = FAST ⚡ | Without index = SLOW 🐌
Writes: With index = SLOWER | Without index = FASTER
Rule of thumb: Index columns that you frequently use in WHERE, JOIN, and ORDER BY clauses. Don't index everything.
When NOT to Index
Real-World Example
At Spense, when a user checks their transaction history:
-- Without index: scans ALL transactions for ALL users
SELECT * FROM transactions WHERE user_id = 12345 ORDER BY created_at DESC;
-- Time: 15 seconds with 50 million rows
-- With composite index on (user_id, created_at):
-- Jumps directly to user 12345's transactions, already sorted
-- Time: 8 milliseconds
Key Takeaway
Database indexing = "Build a table of contents for your data." It's the #1 performance optimization for databases. In interviews, whenever someone mentions slow queries, your first question should be: "Do we have the right indexes?" It shows practical, real-world thinking.