Database Indexes are most crucial for efficient query performance, as discussed in my previous post,Understanding Database Indexes: A Library Analogy
Continuing with the library analogy, let's explore two types of indexes: dense and sparse, focusing on the book_title column as a unique key.
Here is the table from the previous post.
Book Title | Stream | Section | Rack | Position |
---|---|---|---|---|
Database Internals | Engineering | A | 5 | 2 |
SQL Performance Tuning | Engineering | A | 4 | 1 |
NoSQL Databases: A Practical Guide | Engineering | A | 6 | 3 |
Quantum Circuits | Engineering | A | 3 | 4 |
Regenerative Medicine | Medical | B | 2 | 5 |
Dense Indexing
In dense indexing , there's an entry for every record in the table. Imagine a library catalog listing every book with its exact location (e.g., Section A, Rack 5, Position 2). The index table stores the key (book_title) and a record pointer- a reference to the record's location on disk.
Caption: A catalog listing every book in the library, with each entry pointing to its exact location.
Pros
-
Fast Lookups: Direct access to every record makes queries like
SELECT * FROM books where book_title='Database Internals';
extremely quick. -
Ideal for Random Access: Perfect for frequent lookups on unique keys, such as
product_id
in an e-commerce database. ### Cons - High Storage Overhead: As the table grows, the index table grows proportionally, consuming significant disk space.
- Slow Write Operations: Insertion, deletions and updates require modifying both the main table and the index table, increasing memory and time costs.
Sparse Indexing
In sparse indexing , the index table stores entries for only a subset of records-typically one per data block (a group of records stored together on disk). Think of a library catalog listing only the first book on each shelf, requiring you to search within the shelf to find the exact book.
Caption: A catalog listing only the first book on each shelf, requiring a search within the shelf to find the exact book.
How does Sparse index work?
Let's look up Regenerative Medicine
. The sparse index has entries for Quantum Circuits
and SQL Performance Tuning
. Since Regenerative Medicine falls alphabetically between these, the database starts at the block pointed to by Quantum Circuits and performs a sequential scan within that block and find a record.
Note: Sparse indexes work effectively when the data is ordered (e.g., in a clustered index or a B-tree), as this ensures records within blocks are predictable.
Pros
- Low Storage Overhead: Indexes only a subset of records, making it space-efficient for large database.
-
Efficient for Range Queries: Ideal for queries
SELECT * FROM books WHERE book_title BETWEEN 'Quantum Circuits' AND 'SQL Performance Tuning';
as it scans fewer index entries. - Lower Maintainance: Insertions and deletions have less impact since the index table updates less frequently.
Cons
- Slower Lookups for Random Access: Requires scanning within a block to find the exact record, making it slower than dense indexes for specific lookups.
- Requires Ordered Data: Sparse indexes are most effective when the data is ordered, which may not always be the case.
Conclusion
Dense and sparse indexes offer distinct trade-offs in DBMS. Dense indexes excel in random lookups but consume more space, while sparse indexes save storage and optimize range queries.
Top comments (0)