3

I have a query that I am writing in my C# code to retrieve results from database. Based on the user input, my query will change.
I have a user input say 'challenge'. And 'challenge' can have three values 0,1 and 2, based on what user selects in UI.
If user selects challenge 0, the query will be different.
If user selects challenge 1, the query will be different.
And if user selects challenge 2, the query will be union of query for challenge 0 and query for challenge 1.

Below is how my code for generating sql query looks at present:

         string sql = @"SELECT
            tbl.given_id, name, message, tbl.emplid, category, created_date
        FROM
            dbo.myTable tbl

        WHERE
           created_date >= @dtFrom
            AND created_date < @dtTo
        ";


    //Include only challenge 0  
         if(challenge == 0)
                        {
                            sql += " AND emplid IN (" + sb.ToString() + ")";  //sb.ToString() is list of emplids selected from UI.
                        }

    //Include only challenge 1    
         if (challenge == 1)
                        {
                            sql += " AND given_Id IN (SELECT DISTINCT rec.given_Id  from dbo.Recipient rec WHERE rec.given_Id = tbl.given_Id AND rec.emplid IN (" + sb.ToString() + ")) ";
                        }  


//Include both challenge 0 and 1

    if (challenge == 2)
                    {
                        sql = String.Format(@"
    SELECT * FROM
    (
        (
            SELECT
                tbl.given_id, name, message, tbl.emplid, category, created_date
            FROM
                dbo.myTable tbl             
            WHERE
                created_date >= @dtFrom
                AND created_date < @dtTo
                AND tbl.emplid IN ({0})
        )
        UNION
        (
           SELECT
                tbl.given_id, name, message, tbl.emplid, category, created_date
            FROM
                dbo.myTable tbl             
            WHERE
                created_date >= @dtFrom
                AND created_date < @dtTo
                AND given_Id IN (SELECT DISTINCT rec.given_Id  from dbo.Recipient rec WHERE rec.given_Id = tbl.given_Id AND rec.emplid IN (" + sb.ToString() + "))

        )

    )
    ", sb.ToString());

                    }   

So, if(challenge == 2), is generating the query by the union of query generated in challenge = 0 and challenge = 1.
The query code in challenge = 2, is just repetitive of both the above queries.
Is there any other way of writing such query, so that I can improve my query and increase the perfomance?

Thanks.

Edit - Writing just the queries generated to simplify my question:

Query 1 -

SELECT
                tbl.given_id, name, message, tbl.emplid, category, created_date
            FROM
                dbo.myTable tbl             
            WHERE
                created_date >= @dtFrom
                AND created_date < @dtTo
                AND tbl.emplid IN (@lst_emplids)  

Query 2

SELECT
                    tbl.given_id, name, message, tbl.emplid, category, created_date
                FROM
                    dbo.myTable tbl             
                WHERE
                    created_date >= @dtFrom
                    AND created_date < @dtTo
                    AND given_Id IN (SELECT DISTINCT rec.given_Id  from dbo.Recipient rec WHERE rec.given_Id = tbl.given_Id AND rec.emplid IN (@lst_emplids)  

Query 3

SELECT * FROM
        (
            (
                SELECT
                    tbl.given_id, name, message, tbl.emplid, category, created_date
                FROM
                    dbo.myTable tbl             
                WHERE
                    created_date >= @dtFrom
                    AND created_date < @dtTo
                    AND tbl.emplid IN (@lst_emplids)
            )
            UNION
            (
               SELECT
                    tbl.given_id, name, message, tbl.emplid, category, created_date
                FROM
                    dbo.myTable tbl             
                WHERE
                    created_date >= @dtFrom
                    AND created_date < @dtTo
                    AND given_Id IN (SELECT DISTINCT rec.given_Id  from dbo.Recipient rec WHERE rec.given_Id = tbl.given_Id AND rec.emplid IN (@lst_emplids)  

            )

        )
11
  • 5
    That's just Epic! Have you heard of SP? Commented Aug 25, 2015 at 14:29
  • Can you simply show the second sql query that is relevant without the C# code which is irrelevant? It also helps if you show us the involved columns and types and mention available indices. Commented Aug 25, 2015 at 14:30
  • 4
    Also, be aware, you have a big Sql Injection problem with this code. Commented Aug 25, 2015 at 14:32
  • 2
    @Christina: what i've meant is: show us the complete generated sql since the C# code is just confusing and not relevant. Commented Aug 25, 2015 at 14:39
  • 2
    Using ToString on a List<string> gives you "System.Collections.Generic.List`1[System.String]", that's not very useful in a SQL query. That's why I asked what type it was. Commented Aug 25, 2015 at 14:48

5 Answers 5

2

There are a number of things you can do. First, it would be worthwhile to have SQL do the heavy lifting. You can pass the parameters to it and let it figure out how to arrange the query based on the arguments.

To take best advantage, you'd wrap your SQL up in a pre-defined procedure or function. In this case, I think I'd go with a function - but either is fine. My example is a function.

In either case, you can have SQL treat your list of employeeIDs as a table. To have that support, you'd create a table type in SQL:

create type dbo.IntIdsType as table ( Id int )

Table types are just declarations of structure - they're not table's per se. You can subsequently declare variables of that type.

Once you have a table type, your function would take all the arguments and figure out what to do:

Edit 2: I provided some sketchy conditional logic earlier - this has been modified in the function below:

create function dbo.GetMyTable
( 
    @from datetime, 
    @to datetime, 
    @challenge int, 
    @employeeIds dbo.IntIdsType readonly 
)
returns table as return
    select
        tbl.given_id, name, message, tbl.emplid, category, created_date
    from
        dbo.myTable tbl             
    where
        created_date >= @dtFrom 
        and
        created_date < @dtTo 
        and
        (
            (
                @challenge = 0 
                and 
                tbl.emplid in ( select Id from @employeeIds )
            )
            or
            (
                @challenge = 1
                and
                given_Id in 
                (
                    select 
                        rec.given_Id  
                    from 
                        dbo.Recipient rec 
                    where 
                        rec.given_Id = tbl.given_Id 
                        and 
                        rec.emplid in ( select Id from @employeeIds )
                )
            )
            or
            (
                @challenge = 2
                and
                (
                    (
                        @challenge = 0 
                        and 
                        tbl.emplid in ( select Id from @employeeIds )
                    )
                    or
                    (
                        given_Id in 
                        (
                            select 
                                rec.given_Id  
                            from 
                                dbo.Recipient rec 
                            where 
                                rec.given_Id = tbl.given_Id 
                                and 
                                rec.emplid in ( select Id from @employeeIds )
                        )
                    )
                )
            )
        )

Note the lack of a union - and how this lets SQL decide to use a predicate based on the value of the arguments. Also - notice how the passed in table-valued parameter is used just like it was a table.

Setting up the call is a little noisy - maybe something like:

async Task GetMyData( DateTime fromDate, DateTime toDate, int challenge, params int[ ] emloyeeIds )
{

  using ( var connection = new SqlConnection( "my connection string" ) )
  {
    connection.Open( );  // forgot this earlier
    using ( var command = connection.CreateCommand( ) )
    {
      //--> set up command basics...
      command.CommandText = @"
        select 
          given_id, name, message, emplid, category, created_date 
        from 
          GetMyTable( @from, @to, @challenge, @employeeIds )";
      command.CommandType = System.Data.CommandType.Text;

      //--> easy parameters...
      command.Parameters.AddWithValue( "@from", fromDate );
      command.Parameters.AddWithValue( "@to", toDate );
      command.Parameters.AddWithValue( "@challenge", challenge );

      //--> table-valued parameter...
      var table = new DataTable( "EmployeeIds" );
      table.Columns.Add( "Id", typeof( int ) );
      foreach ( var Id in emloyeeIds ) table.Rows.Add( Id );
      var employeeIdsParameter = new SqlParameter( "@employeeIds", SqlDbType.Structured );
      employeeIdsParameter.TypeName = "dbo.IntIdsType";
      employeeIdsParameter.Value = table;
      command.Parameters.Add( employeeIdsParameter );

      //--> do the work...
      using ( var reader = await command.ExecuteReaderAsync( ) )
      {
        while ( await reader.ReadAsync( ) )
        {
          //...
        }
      }
    }
  }
}

An advantage of all this - besides the hoped-for performance gains - you avoid weird logic to build out the SQL strings and the SQL injection attack surface that approach brings.

EDIT:

In light of the comments: In the c# code, it's not returning a string to a caller - it's showing the setup of the command object, how to add parameters to it, and then how to execute it. This is probably currently being done in the procedure that is calling your string-building code now. This is a fairly typical setup for things like this - but there are lots of variations. For example, the calling code could hand you the command object, and you fill it out. Lots of different patterns get employed.

Sign up to request clarification or add additional context in comments.

7 Comments

Thanks for the response. I am following your idea here. But I am not getting how the function, dbo.GetMyTable, that we created in sql server will be called from our query that we are generating in C# code in command.CommandText? Sorry if my question sounds like a real beginner, but I am really new to this.
Notice in the c# part (the last snippet) we're calling the function we created in the next-to-last snippet. We're selecting from that SQL function with arguments. So, your SQL all went into the definition of the SQL function - while the c# code only calls the SQL function. Its a somewhat better separation of concerns. If you've never seen user-defined SQL functions, it can be a bit of a mystery - but it's a habit for me - and so it doesn't look so scary :-)
So - again, if you're not familiar with SQL functions - that "create function" snippet gets run once to define the function. You might need to get your dbas to look at it if you and they live in separate worlds. Both the "create type" statement - and the "create function" statements. I will edit my answer to be a little more clear about the plumbing.
Thank you so much for the detailed description and for your response. I just tried to implement it. But looks like somehow the conditions are not working as expected. So, what's happening is, If challenge = 2, only then it's retrieving the records and that too not both (as when challenge = 2, it's supposed to return the records where challenge = 0 and challenge = 1). At present, for challenge = 2, its returning the records where challenge = 0 and not for 2 (when meant to return both). And when challenge = 0 or challenge = 1, its not returning any record.
I am gonna try to modify the conditions, but if you have some idea, that will be really appreciable. Again, thanks a lot :)
|
2

The nested IN business in the second and third queries doesn't look very good. It might be personal preference over anything else but I have seen cases where the optimizer can't create an optimal plan because of nested IN operators. Especially if your DISTINCT gets in the way - I'm not sure whether this will be dutifully ignored or actually executed, when all you really need is a single EXISTS (see the bottom of this answer). Without seeing the generated plan I can't say too much more though.

In fact the way you have written the subquery is redundant - you are using both WHERE given_Id IN and a correlation WHERE rec.given_Id = tbl.given_Id. This is probably the real problem. We only want one of these.

What would I do? First of all, I would pull the correlated subquery stuff out as a join:

select tbl.given_id, tbl.name, tbl.message, tbl.emplid, tbl.category, tbl.created_date
from dbo.myTable tbl
join dbo.Recipient r on tbl.given_Id = r.given_Id
where created_date >= @dtFrom
and created_date < @dtTo
and r.emplid IN ( ... )

Second of all, in the "both" case I would run away from UNION in favor of OR. The duplicate elimination is unnecessary and costly:

select tbl.given_id, tbl.name, tbl.message, tbl.emplid, tbl.category, tbl.created_date
from dbo.myTable tbl
join dbo.Recipient r on tbl.given_Id = r.given_Id
where created_date >= @dtFrom
and created_date < @dtTo
and (r.emplid IN ( ... ) or tbl.emplid IN ( ... ))

Note I am assuming your sb.ToString() is safe, which is probably a bad idea. In my book, dynamic SQL is an ultimate last resort. You could write this into a stored procedure; instead of the IN, you would pass a table-valued parameter then join against it.

I'm also assuming some things about your schema; you may indeed want a single DISTINCT on that last query. Or frankly you can change it to an EXISTS:

select tbl.given_id, tbl.name, tbl.message, tbl.emplid, tbl.category, tbl.created_date
from dbo.myTable tbl
where created_date >= @dtFrom
and created_date < @dtTo
and (tbl.emplid IN ( ... ) or exists (
    select 1
    from dbo.Recipient r
    where r.given_Id = tbl.given_Id
    and r.emplid IN ( ... )))

Comments

-1

Why not write a single query along the lines of:

SELECT <stuff in first query>
WHERE ... 
AND @Challenge in (1,3)
UNION
SELECT <stuff in second query>
WHERE ...
AND @Challenge in (2,3)

And let the query optimizer do the hard work for you.

Comments

-1

I don't like concatenated SQL queries. Prefer to have separate query for each condition or one query which operates this condition. Depending on difficulty.

As there are only two separate conditions and one else combining it - we could use boolean logic.

SELECT
    tbl.given_id, 
    name, 
    [message], 
    tbl.emplid, 
    category, 
    created_date
FROM
    dbo.myTable tbl             
WHERE 
    created_date >= @dtFrom
    AND created_date < @dtTo
    AND ((@challenge <> 2 AND tbl.emplid IN (@lst_emplids)) 
      OR (@challenge <> 1 AND given_Id IN (SELECT DISTINCT rec.given_Id  from dbo.Recipient rec WHERE rec.given_Id = tbl.given_Id AND rec.emplid IN (@lst_emplids)))

If @challenge = 1 or 3 will be filtered all items tbl.emplid IN (@lst_emplids)
If @challenge = 2 or 3 will be filtered all items given_Id IN (...)
So if @challenge is 3 - will be extracted all items, but if 1 or 2 - only one subset.

1 Comment

While the test for @challenge is a good suggestion, this is not very good as an answer. You should consider using words to explain what is going on and why, instead of just posting a block of code.
-2

A more sensible solution would be to call this code from a stored procedure and use dynamic sql in that to build your query , and use parameterised query rather then concatenating values from the UI, it makes your code prone to sql-injection.

I would do it something like......

CREATE PROCEDURE my_SP
   @dtFrom      DATE
 , @dtTo        DATE
 , @Sb          VARCHAR(8000)
 , @challenge   INT
 AS
 BEGIN
   SET NOCOUNT ON;
DECLARE @Sql NVARCHAR(MAX);

SET @Sql = N'  declare @xml xml = N''<root><r>'' + replace(@Sb,'','',''</r><r>'') + ''</r></root>'';
SELECT tbl.given_id, name, message, tbl.emplid, category, created_date
FROM  dbo.myTable tbl
WHERE created_date >= @dtFrom
 AND  created_date < @dtTo'

+ CASE WHEN (@challenge = 0)
         THEN N'AND emplid IN ( select r.value(''.'',''varchar(max)'') as item
                                from @xml.nodes(''//root/r'') as records(r))' 
       WHEN  (@challenge = 1)
         THEN N'AND given_Id IN (SELECT rec.given_Id  
                                from dbo.Recipient rec 
                                WHERE rec.given_Id = tbl.given_Id 
                                AND rec.emplid IN ( select r.value(''.'',''varchar(max)'') as item
                                                    from @xml.nodes(''//root/r'') as records(r))
                                )'
        WHEN (@challenge = 2)
          THEN 'AND 
                    ( emplid IN ( select r.value(''.'',''varchar(max)'') as item
                                  from @xml.nodes(''//root/r'') as records(r))
                      OR 
                      given_Id IN (SELECT rec.given_Id  
                                from dbo.Recipient rec 
                                WHERE rec.given_Id = tbl.given_Id 
                                AND rec.emplid IN ( select r.value(''.'',''varchar(max)'') as item
                                                    from @xml.nodes(''//root/r'') as records(r))
                                )
                     )'
         ELSE N'' END

Exec sp_executesql @Sql 
                  ,N'@dtFrom DATE, @dtTo DATE, @Sb VARCHAR(8000)'
                  ,@dtFrom 
                  ,@dtTo 
                  ,@Sb 
END

1 Comment

Comments are not for extended discussion; this conversation has been moved to chat.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.