← Back to Learn
Beginner2 April 20264 min read

Database Indexing — The Table of Contents for Your Data

Why is your database query taking 30 seconds? Probably missing an index. Let's understand database indexing with a textbook analogy.

databaseindexingperformance
Share:

The Textbook Analogy

Imagine a 500-page textbook with no table of contents and no index at the back. To find information about "Binary Search Trees," you'd have to read every single page from page 1 to page 500. That's painfully slow.

Now imagine the same book WITH an index at the back:

Index:
Binary Search Trees ......... page 234
Hash Maps ................... page 156
Linked Lists ................ page 89

You look up "Binary Search Trees" in the index, go directly to page 234. Done in seconds.

Database indexing works exactly the same way. Without an index, the database scans every row (full table scan). With an index, it jumps directly to the matching rows.

The Impact

Users table: 10 million rows

Without index:
SELECT * FROM users WHERE email = 'sahil@email.com'
→ Scans all 10 million rows
→ Time: 30 seconds 🐌

With index on email column:
SELECT * FROM users WHERE email = 'sahil@email.com'
→ Looks up index, jumps to exact row
→ Time: 5 milliseconds ⚡

That's a 6,000x improvement. This is why indexing is the single most impactful database optimization.

How Indexes Work (B-Tree)

Most databases use a B-Tree (Balanced Tree) for indexes. Think of it as a sorted, hierarchical structure:

                    [M]
                   /   \
              [D, H]   [R, W]
             / | \     / | \
           [A-C][E-G][I-L][N-Q][S-V][X-Z]

To find "Sahil":

1. Start at root: S > M → go right
2. Next level: S is between R and W → go middle
3. Leaf level: Found "Sahil" in [S-V] bucket

Only 3 steps to search through millions of records. This is O(log n) complexity.

Types of Indexes

Primary Index

Automatically created on the primary key. Every table has one.

Secondary Index

Created manually on columns you frequently search by.

CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_id ON orders(user_id);

Composite Index

Index on multiple columns. Order matters!

CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

-- This query uses the index ✅
SELECT * FROM orders WHERE user_id = 123 AND created_at > '2026-01-01';

-- This query also uses the index ✅ (leftmost prefix)
SELECT * FROM orders WHERE user_id = 123;

-- This query does NOT use the index ❌
SELECT * FROM orders WHERE created_at > '2026-01-01';

Rule: Composite indexes work left-to-right. If you index (A, B, C), queries on A, (A,B), or (A,B,C) use the index. Queries on just B or C don't.

Unique Index

Ensures no duplicate values in the column.

CREATE UNIQUE INDEX idx_users_email ON users(email);
-- Now two users can't have the same email

The Trade-Off

Indexes aren't free. They come with costs:

Storage: Each index takes disk space (can be 10-30% of table size)

Write speed: Every INSERT, UPDATE, DELETE must also update the index

Maintenance: Too many indexes slow down writes significantly

Reads:  With index = FAST ⚡  |  Without index = SLOW 🐌
Writes: With index = SLOWER   |  Without index = FASTER

Rule of thumb: Index columns that you frequently use in WHERE, JOIN, and ORDER BY clauses. Don't index everything.

When NOT to Index

• Small tables (< 1000 rows) — full scan is fast enough
• Columns with low cardinality (e.g., gender with only M/F values)
• Tables with heavy writes and few reads
• Columns you rarely search or filter by

Real-World Example

At Spense, when a user checks their transaction history:

-- Without index: scans ALL transactions for ALL users
SELECT * FROM transactions WHERE user_id = 12345 ORDER BY created_at DESC;
-- Time: 15 seconds with 50 million rows

-- With composite index on (user_id, created_at):
-- Jumps directly to user 12345's transactions, already sorted
-- Time: 8 milliseconds

Key Takeaway

Database indexing = "Build a table of contents for your data." It's the #1 performance optimization for databases. In interviews, whenever someone mentions slow queries, your first question should be: "Do we have the right indexes?" It shows practical, real-world thinking.

👨‍💻
Sahil Sudan

Software Engineer at Spense. I write about system design, web development, and fintech — explained simply for students and developers.

📬 Stay Updated

Get a new System Design or fintech insight every week. No spam, unsubscribe anytime.

Share: