Intro to SQL Indexes

In this reading, you’ll learn what SQL indexes are, why they can drastically improve lookup performance, and why they come with tradeoffs for insertion and deletion. By the end, you’ll be able to:

What is a SQL Index?

A SQL index is a special data structure used by SQL databases to speed up searches by reordering or sorting the data in specific columns. If you have a table with tens of thousands—or even millions—of rows, searching without an index becomes slow because the database must scan every row (an O(n) operation). With an index, the database uses a sorted structure (usually a B-tree) to find matching entries in O(log n) time.

For example, a primary key or UNIQUE constraint automatically creates an index on that column. If you query by the primary key, you benefit from the indexed structure.

Bakers and Cookies Schema (placeholder image)
Schema showing how a cookies table might reference a bakers table by a baker_id column.

Suppose you have a cookies table with a baker_id column, and you frequently query:

SELECT * FROM cookies WHERE baker_id = 3;

If baker_id is indexed, the lookup is much faster than if there were no index. This concept holds especially for more complex queries like:

SELECT * FROM cookies
WHERE type = 'sugar' AND chocolate IS TRUE;

If we frequently query by type and chocolate, a multi-column index can optimize these lookups further, as it sorts the table by both columns at once.

How Does SQL Use Indexes?

The power of indexing comes down to time complexity:

Example #1: Lookup with NO Index

Lookup without Index (placeholder)
A table unsorted by baker_id means scanning all rows for matches.

If you’re searching for baker_id = 3, the DB scans every row. That’s O(n) time as it compares each row’s baker_id.

Example #2: Lookup with Index

Once baker_id is indexed, the DB keeps the column values in a B-tree, each entry pointing to the row in the original table. That means:

Lookup with Index (placeholder)
Index on baker_id allows quick O(log n) lookups, since the column is sorted.

With a sorted structure, finding “3” only needs log n comparisons, then the DB retrieves the matching row(s) efficiently.

When Should You Add Indexes?

While indexes make lookups faster, they also make insertions and deletions slower:

This overhead occurs because the database must update the B-tree structure whenever a new row is added or removed.

Example #1: Insertion with NO Index

Insertion without Index (placeholder)
Appending to the end of the table is O(1) with no index to maintain.

Example #2: Insertion with Index

With an index on baker_id, the DB has to insert the new row normally and update the B-tree. That’s a cost of O(log n).

Insertion with Index (placeholder)
The index must place the new baker_id entry in the correct sorted order.

Over large data sets, O(log n) can be close to O(1), but there’s still some overhead. Therefore:

Only use indexes if you read/lookup more often than you insert/delete on those columns.

For example, if your cookies table rarely changes, but you search by type and chocolate constantly, indexing those columns is beneficial. If you frequently insert or delete rows in that table, the index overhead may not be worth it.

Big O Overview

Here’s a rough diagram comparing O(1) and O(log n). Notice that O(log n) grows slowly:

Big-O Chart (placeholder)
As n increases, O(log n) doesn’t explode in size.

Hence, indexing is often a huge net gain for large datasets that see many lookups.

Wrapping Up

Key points:

To learn more, check out this deep-dive on indexing. In the next reading, you’ll discover how to create a SQL index, whether single-column or multi-column, to optimize your queries further.