Performance of a sequence of SQL queries using SqlDataReader

Question

I need to perform a sequence of SQL queries in C#. The queries look similar to these:

...
select top 1 [time],[value] from [table] where [time]<='T0' and [param]='1' order by [time] desc
select top 1 [time],[value] from [table] where [time]>='T0' and [param]='1' order by [time] asc
select top 1 [time],[value] from [table] where [time]<='T1' and [param]='1' order by [time] desc
select top 1 [time],[value] from [table] where [time]>='T1' and [param]='1' order by [time] asc
select top 1 [time],[value] from [table] where [time]<='T2' and [param]='1' order by [time] desc
select top 1 [time],[value] from [table] where [time]>='T2' and [param]='1' order by [time] asc
select top 1 [time],[value] from [table] where [time]<='T3' and [param]='1' order by [time] desc
select top 1 [time],[value] from [table] where [time]>='T3' and [param]='1' order by [time] asc
...

As you can see, I request pairs of (time, value) tuples for each time T in (T0, T1, T2, ...), where one query of each pair returns the tuple "right before" and the other the tuple "right after" a certain point in time T. Each single request takes less than 1 millisecond (according to SQL Profiler in SSMS 2012 Express).

In my program (C#.NET) I perform a sequence of those queries using SqlDataReader. However, each query takes approximately 12-20 milliseconds, which is far more than I expected and not acceptable for my purposes. It seems to me that the overhead of the SqlDataReader is the problem, isn't it?

The code looks similar to this:

for (int x = 0; x < screen.width; ++x)
{
    // pixel2time maps a pixel column to the time value it represents
    var T = pixel2time(x);

    string cmd = "select top 1 [time],[value] from [table] where [time]<='" + T.ToString() + "' and [param]='1' order by [time] desc";
    SqlCommand scmd = new SqlCommand(cmd, con);

    // The time from here ...
    SqlDataReader reader = scmd.ExecuteReader();
    // ... to here takes about 12-20 milliseconds
    // the same query in SQL Profiler takes
    // "0 milliseconds"

    if (reader.Read())
    {
        ...
    }

    // close the reader so the connection is free for the next iteration
    reader.Close();
}

(I am plotting a time-value sequence and, for each pixel representing a certain time T on the x-axis, I request the time-value tuples "right before" T and "right after" T to determine the y-value of this pixel. Hence, depending on the screen/window width, I may have about 1000 pixels and consequently 2 x 1000 queries, each taking ~12 milliseconds, which adds up to 24 seconds. This is far too much for plotting a graph. Further, I have a sequence of 10,000,000 entries in the database, with indexing etc., which should give an access time of O(log n) per query, so the database with a query time of less than one millisecond is fine. The problem is just the .NET Framework (or maybe networking?) and the fact that I can't find a more efficient solution.)

How can I solve this performance problem?

I tried/considered the following approaches:

(1) Combining the sequence into a single query, executed by a single SqlDataReader, using a "union" statement didn't work. I guess this is because of some kind of incompatibility of "union" with the "order by" clause. Do you know more about this?

Edit: (Update)

select top 1 [time],[value] from [table] where [time]<='T0' and [param]='1' order by [time] desc
union all
select top 1 [time],[value] from [table] where [time]>='T0' and [param]='1' order by [time] asc

Gives the error 'Msg 156, Level 15, State 1, Line 2 Incorrect syntax near the keyword 'union'.'

Each separate query works fine. Do I have a syntax error? Thanks.

(2) I am sure that a stored procedure does not give any benefit, since executing a single query takes less than one millisecond in SSMS.

Use Union All rather than union. Also, how many rows does the table have in total?
– Magnus, Commented Sep 14, 2013 at 8:30
Thanks, I added an update. I have exactly 10,000,000 rows in my table.
– sema, Commented Sep 14, 2013 at 8:45

Guffa · Accepted Answer · 2013-09-14 08:50:35Z

Going with unions is the right way, but you need to use subqueries as there is only one order by for the entire union, and you should use union all instead of union so that it doesn't remove duplicates:

select [time],[value] from (select top 1 [time],[value] from [table] where [time]<='T0' and [param]='1' order by [time] desc) x
union all
select [time],[value] from (select top 1 [time],[value] from [table] where [time]>='T0' and [param]='1' order by [time] asc) x
union all
select [time],[value] from (select top 1 [time],[value] from [table] where [time]<='T1' and [param]='1' order by [time] desc) x
union all
select [time],[value] from (select top 1 [time],[value] from [table] where [time]>='T1' and [param]='1' order by [time] asc) x
union all
select [time],[value] from (select top 1 [time],[value] from [table] where [time]<='T2' and [param]='1' order by [time] desc) x
union all
select [time],[value] from (select top 1 [time],[value] from [table] where [time]>='T2' and [param]='1' order by [time] asc) x
union all
select [time],[value] from (select top 1 [time],[value] from [table] where [time]<='1453511000' and [param]='1' order by [time] desc) x
union all
select [time],[value] from (select top 1 [time],[value] from [table] where [time]>='1453511000' and [param]='1' order by [time] asc) x

What does the 'x' at the end of the queries do? Anyway, with 'x' it works and without it doesn't. Thanks.
@sema: It's the name of the subquery, as each subquery needs a name. It's not specifically used in the query, but you could use it as select x.[time],x.[value] in the outer queries.
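
For 1000 pixels the union above would not be typed by hand. The following is a minimal sketch of how the statement text might be generated from C#, assuming the time values are passed as parameters @t0, @t1, ... and the filter value as @param (these names, and the extra [idx] and [side] columns, are assumptions of this sketch and not part of the original answer):

```csharp
using System;
using System.Text;

public static class BatchedLookup
{
    // Builds a single UNION ALL statement with a "before" and an "after"
    // lookup for each of timeCount time values. The times are referenced as
    // parameters @t0, @t1, ... and the param filter as @param (hypothetical
    // names), so no values are concatenated into the SQL text.
    public static string BuildCommandText(int timeCount)
    {
        if (timeCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(timeCount));

        var sql = new StringBuilder();
        for (int i = 0; i < timeCount; i++)
        {
            if (i > 0) sql.AppendLine("union all");
            sql.AppendLine(Branch(i, "before", "<=", "desc"));
            sql.AppendLine("union all");
            sql.AppendLine(Branch(i, "after", ">=", "asc"));
        }
        return sql.ToString();
    }

    // One branch of the union. The constant [idx] and [side] columns are an
    // addition of this sketch: they tell the caller which time value a row
    // belongs to, even when some other branch returns no row.
    private static string Branch(int i, string side, string op, string dir)
    {
        return "select " + i + " as [idx], '" + side + "' as [side], [time],[value] from "
             + "(select top 1 [time],[value] from [table] where [time]" + op + "@t" + i
             + " and [param]=@param order by [time] " + dir + ") x";
    }
}
```

The resulting text would go into one SqlCommand, with one parameter added per @t placeholder plus @param, so that a single ExecuteReader call returns up to 2 x timeCount rows in one round trip. SQL Server limits the number of parameters per command, so a very wide plot may have to be split into a few batches.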

BRAHIM Kamel · Answer · 2013-09-14 08:42:40Z

What I can suggest here is to create n threads and run all your queries in parallel; I'm sure that this will improve the performance of your queries. It's not a data reader issue: the time depends, for example, on the network packets, on marshalling and unmarshalling the results, etc. I hope this helps.
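
A rough sketch of what running the lookups in parallel could look like, assuming the per-pixel query is wrapped in a delegate (the `lookup` parameter and the `RunAll` helper are hypothetical names). Since a SqlConnection instance should not be shared between threads, each call of the delegate would need to open its own connection, which connection pooling keeps reasonably cheap:

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelLookup
{
    // Runs lookup(x) for every pixel x in [0, pixelCount) on several threads
    // and returns the results in pixel order. Each iteration writes to its
    // own array slot, so no locking is needed for the results array.
    public static TResult[] RunAll<TResult>(int pixelCount, Func<int, TResult> lookup, int maxDegree = 4)
    {
        var results = new TResult[pixelCount];
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };
        Parallel.For(0, pixelCount, options, x =>
        {
            results[x] = lookup(x);
        });
        return results;
    }
}
```

This hides the per-query latency rather than removing it; the total number of round trips stays the same.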

I will try this, but I honestly prefer a solution that may reduce the overhead instead of parallelizing the queries + overhead. If there is no better way of increasing the performance, I will mark your answer as the "correct answer". Thanks.
– sema, Commented over a year ago

Matt · Answer · 2013-09-14 08:56:56Z

The main problem is running the query inside a loop.

(1) Find the largest and smallest time value you will need.
(2) Query all your points in one go: SELECT time,value FROM [table] WHERE time BETWEEN [smallest] AND [largest] ...
(3) Loop through the results and work out which pixel (x) value to plot for each result.

Do you really need to plot a point for every pixel or time value? If you want to stick with the pixel-by-pixel approach, it would be faster to get all the results, as above, and then use a LINQ query to pick the value for the time you are plotting.
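
As a sketch of the in-memory lookup, assuming the rows were fetched once ordered by time and the time values are held as a sorted `long[]` (the element type is a guess based on values such as '1453511000'; `FindNeighbours` is a hypothetical helper). A binary search is used here in place of a LINQ query, since LINQ would scan the array for every pixel:

```csharp
using System;

public static class NeighbourLookup
{
    // times must be sorted ascending. Returns the index of a sample with
    // time <= t that is closest to t (Before) and the index of the first
    // sample with time >= t (After); -1 means there is no such sample.
    public static (int Before, int After) FindNeighbours(long[] times, long t)
    {
        // Binary search for the first index whose time is >= t.
        int lo = 0, hi = times.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (times[mid] < t) lo = mid + 1;
            else hi = mid;
        }

        int after = lo < times.Length ? lo : -1;
        // On an exact hit the same sample serves as "before" and "after",
        // matching what the two top 1 queries would return.
        int before = (lo < times.Length && times[lo] == t) ? lo : lo - 1;
        return (before, after);
    }
}
```

Each lookup costs O(log n) in memory, so 1000 pixels need no further database round trips once the rows for the visible range are loaded. Whether loading that range is affordable depends on how many rows it contains.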

This solution does not scale, since I will receive 10,000,000 rows from this query.
– sema, Commented over a year ago
Only if you are plotting every row in the table. Are your time values evenly spaced? Do you have an identity column? If you do, there are much faster ways of getting a range of points to plot.
– Matt, Commented over a year ago
