Sparse & Dense Indexes in DBMS

#database #sql #systemdesign

Database Indexes are most crucial for efficient query performance, as discussed in my previous post,Understanding Database Indexes: A Library Analogy

Continuing with the library analogy, let's explore two types of indexes: dense and sparse, focusing on the book_title column as a unique key.
Here is the table from the previous post.

Book Title	Stream	Section	Rack	Position
Database Internals	Engineering	A	5	2
SQL Performance Tuning	Engineering	A	4	1
NoSQL Databases: A Practical Guide	Engineering	A	6	3
Quantum Circuits	Engineering	A	3	4
Regenerative Medicine	Medical	B	2	5

Dense Indexing

In dense indexing , there's an entry for every record in the table. Imagine a library catalog listing every book with its exact location (e.g., Section A, Rack 5, Position 2). The index table stores the key (book_title) and a record pointer- a reference to the record's location on disk.

Caption: A catalog listing every book in the library, with each entry pointing to its exact location.

Pros

Fast Lookups: Direct access to every record makes queries like SELECT * FROM books where book_title='Database Internals'; extremely quick.
Ideal for Random Access: Perfect for frequent lookups on unique keys, such as product_id in an e-commerce database. ### Cons
High Storage Overhead: As the table grows, the index table grows proportionally, consuming significant disk space.
Slow Write Operations: Insertion, deletions and updates require modifying both the main table and the index table, increasing memory and time costs.

Sparse Indexing

In sparse indexing , the index table stores entries for only a subset of records-typically one per data block (a group of records stored together on disk). Think of a library catalog listing only the first book on each shelf, requiring you to search within the shelf to find the exact book.

Caption: A catalog listing only the first book on each shelf, requiring a search within the shelf to find the exact book.

How does Sparse index work?

Let's look up Regenerative Medicine. The sparse index has entries for Quantum Circuits and SQL Performance Tuning. Since Regenerative Medicine falls alphabetically between these, the database starts at the block pointed to by Quantum Circuits and performs a sequential scan within that block and find a record.

Note: Sparse indexes work effectively when the data is ordered (e.g., in a clustered index or a B-tree), as this ensures records within blocks are predictable.

Pros

Low Storage Overhead: Indexes only a subset of records, making it space-efficient for large database.
Efficient for Range Queries: Ideal for queries SELECT * FROM books WHERE book_title BETWEEN 'Quantum Circuits' AND 'SQL Performance Tuning'; as it scans fewer index entries.
Lower Maintainance: Insertions and deletions have less impact since the index table updates less frequently.

Cons

Slower Lookups for Random Access: Requires scanning within a block to find the exact record, making it slower than dense indexes for specific lookups.
Requires Ordered Data: Sparse indexes are most effective when the data is ordered, which may not always be the case.

Conclusion

Dense and sparse indexes offer distinct trade-offs in DBMS. Dense indexes excel in random lookups but consume more space, while sparse indexes save storage and optimize range queries.

DEV Community

Sparse & Dense Indexes in DBMS

Dense Indexing

Pros

Sparse Indexing

How does Sparse index work?

Pros

Cons

Conclusion

Top comments (0)