10

Hi i was wondering what the performance overheads are like for single column vs multiple column indexes in terms of inserts. So for example if i have 3 single column indexes would that be better for inserts into that table as opposed to having 1 index with multiple columns. When i'm talking about performance i'm interested in raw speed.

4 Answers 4

7

The INSERT overhead of INDEXes is normally negligible regardless of single or multi column.

Unless writes significantly out weigh reads, always include whatever indexes are necessary to improve the performance of your queries.

In some case multiple single column indexes sufficiently boost the performance of many more queries than a single multi-column index. More often, a single multi-column index further boosts performance of a smaller set of queries.

The general rule is to consider the query performance, not the insert performance, unless you can predict or see a particular issue with inserts.

Sign up to request clarification or add additional context in comments.

4 Comments

I've found insert overhead to be significant. What basis do you have for "normally negligible"?
@JNK: Except in rare scenarios tables are written to far less frequently than they are read from. By orders of magnitude. Also, when records are being written, the search to position them in an index is a simple search, but when they are being read the searches are mostly more complex. When doing capacity profiling it is most common to see reads significantly dwarf writes. So, in relative terms, optimising writes at the expense of reads is usually counter productive. (And in the scenarios where that is not the case, the designer is usually experienced enough to not ask this on SO.)
@Dems seems that is pretty subjective to the the system or more specifically a particular table.
@StarShip3000: Yes, it is, which is why I say usually. But even when looking at each table individually, I normally see that in the majority of cases the majority activity is reads. Even in cases where extreme volumes of granular data is being recorded, reads more often out-weight writes. Even in the case of holding tables or data-marts, the separation of design has been as more often to stop writes from blocking reads, rather than reads blocking writes. It's just a normal state of affairs that if you're going to store data, you'll probably read it more often than you write it.
4

if i have 3 single column indexes would that be better for inserts into that table as opposed to having 1 index with multiple columns.

For operations that modify data, a single index with 3 columns should be faster than 3 indexes with single column, for following reasons:

  • Updating an index requires searching through the B-Tree for the correct place to perform the modification. Performing search in one index (even when it is composite) tends to be faster than performing 3 searches in 3 indexes.
  • Each B-Tree leaf contains a row "pointer". 1 index will have 3 times less row pointers than 3 indexes (disregarding the effects of NULL, which is typically not indexed). And smaller is often faster due effects of caching.
  • 1 index may tend to have less splitting and coalescing of the B-Tree nodes compared to 3 indexes.

Consider the following MS SQL Server benchmark:

CREATE TABLE ONE_INDEX (
    ID int PRIMARY KEY NONCLUSTERED,
    F1 uniqueidentifier NOT NULL,
    F2 uniqueidentifier NOT NULL,
    F3 uniqueidentifier NOT NULL
);
CREATE INDEX ONE_INDEX_IE1 ON ONE_INDEX (F1, F2, F3);

CREATE TABLE THREE_INDEXES (
    ID int PRIMARY KEY NONCLUSTERED,
    F1 uniqueidentifier NOT NULL,
    F2 uniqueidentifier NOT NULL,
    F3 uniqueidentifier NOT NULL
);
CREATE INDEX THREE_INDEXES_IE1 ON THREE_INDEXES (F1);
CREATE INDEX THREE_INDEXES_IE2 ON THREE_INDEXES (F2);
CREATE INDEX THREE_INDEXES_IE3 ON THREE_INDEXES (F3);
GO



SET NOCOUNT ON
DECLARE @t DATETIME;
DECLARE @id INT;
DECLARE @count INT;
SET @count = 100000;



PRINT 'ONE_INDEX:'

SET @t = CURRENT_TIMESTAMP
SET @id = 0;
BEGIN TRANSACTION;
WHILE @id < @count BEGIN
    INSERT INTO ONE_INDEX VALUES(@id, NEWID(), NEWID(), NEWID());
    SET @id = @id + 1;
END
COMMIT TRANSACTION;
PRINT '    INSERT ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';

SET @t = CURRENT_TIMESTAMP
SET @id = 0;
BEGIN TRANSACTION;
WHILE @id < @count BEGIN
    UPDATE ONE_INDEX SET F1 = NEWID(), F2 = NEWID(), F3 = NEWID() WHERE ID = @id
    SET @id = @id + 1;
END
COMMIT TRANSACTION;
PRINT '    UPDATE ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';

SET @t = CURRENT_TIMESTAMP
DELETE FROM ONE_INDEX;
PRINT '    DELETE ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';



PRINT 'THREE_INDEXES:'

SET @t = CURRENT_TIMESTAMP
SET @id = 0;
BEGIN TRANSACTION;
WHILE @id < @count BEGIN
    INSERT INTO THREE_INDEXES VALUES(@id, NEWID(), NEWID(), NEWID());
    SET @id = @id + 1;
END
COMMIT TRANSACTION;
PRINT '    INSERT ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';

SET @t = CURRENT_TIMESTAMP
SET @id = 0;
BEGIN TRANSACTION;
WHILE @id < @count BEGIN
    UPDATE THREE_INDEXES SET F1 = NEWID(), F2 = NEWID(), F3 = NEWID() WHERE ID = @id
    SET @id = @id + 1;
END
COMMIT TRANSACTION;
PRINT '    UPDATE ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';

SET @t = CURRENT_TIMESTAMP
DELETE FROM THREE_INDEXES;
PRINT '    DELETE ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';
GO



DROP TABLE ONE_INDEX;
DROP TABLE THREE_INDEXES;
GO

Which (on my MS SQL Server Express 2008 R2 under virtual machine) prints:

ONE_INDEX:
    INSERT 100000 rows: 4173 ms
    UPDATE 100000 rows: 5530 ms
    DELETE 100000 rows: 2706 ms
THREE_INDEXES:
    INSERT 100000 rows: 6640 ms
    UPDATE 100000 rows: 10436 ms
    DELETE 100000 rows: 3516 ms

Increasing the @count to 1000000 results in:

ONE_INDEX:
    INSERT 1000000 rows: 40143 ms
    UPDATE 1000000 rows: 55796 ms
    DELETE 1000000 rows: 95576 ms
THREE_INDEXES:
    INSERT 1000000 rows: 61360 ms
    UPDATE 1000000 rows: 91766 ms
    DELETE 1000000 rows: 99500 ms

Please note that in most workloads reads outnumber writes, so the optimization efforts usually prioritize covering SELECTs with indexes at the expense of INSERTs, UPDATEs and DELETEs. Also, a composite index will influence query performance in a different way compared to 3 separate indexes.

Only you can perform the appropriate tests and reach the right balance between all these concerns.

Comments

3

I think the only way to answer is to test a few scenarios out on your hardware with your systems usage patterns. Typically index overhead is huge compared to not having indexes. So the more columns in a index the more overhead. But the user perceived impact may seem nil, but profile the different tests and look at the numbers. I've worked at places where they just throw indexes on everything. I don't agree with this approach. I think you should put an index on when you can prove it will be of value. They take up space in addition to adding overhead. Again you will need to test your own setup to answer your question though.

2 Comments

I agree with the sentiment of put an index on when you can prove it will be of value. My answer simply expounds the view that the benefit of an index (that is being used) is rarely less than its cost.
I can agree with that. No need to not use indexes because you think they will hurt performance. In most cases the benefit of the index will far out way the performance hit of having the index.
0

Keep in mind that a composite index made up of 3 columns like LastName, FirstName and City is not useful when you need to search on City. So in this case you would need to have an index just for City.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.