1

Edit: I use Microsoft SQL Server Management Studio to run my queries. Microsoft SQL Server Standard Edition (64-bit) I guess.

I've got a complex query to run. I had no idea how to do it, so I went and write a query that's around 500 rows and many joins. Not how things should be dome.

My data structure is:

id        user_id       question_id     answer1     
1         1             1               a           
2         1             2               c           
3         1             3               a           
4         2             1               c           
5         2             2               a           
...       ...           ...             ...         

There are over 700 users. Each user has answered around 60 questions (times 2, there are 2 answers to each question, but this is irrelevant if I can get a nice query for the first answer). There are some questions I have no interest in (and should actually skip).

My current result is the following (good result, but the query is way too heave):

user_id    q1    q2    q3    q4    q8    q9    ...   q60
1          a     b     c     d     b     a     ...   a
2          b     a     c     a     c     b     ...   w
3          y     a     w     ...   ...

So, basically I'd like to create an array, say [1,2,3,4,8,9], saying I am interested in those question id's, and query so I get those columns as the above example. I have no idea how to do this.

My current query is in the likes of:

SELECT C.user_id, Q1, Q2, Q3, Q4, Q8, ...
FROM (
  SELECT A.user_id, Q1, // and here tons of unions
7
  • In that case you have to write Dynamic Sql i.e. using array list generate query and then exec it. In sql server you can run dynamic query by EXEC("Select ...") Commented Oct 20, 2015 at 6:27
  • Your sample data and output dont match, is that intentional? Also, tag your DBMS pls. Commented Oct 20, 2015 at 6:35
  • 1
    Which DBMS are you using? Postgres? Oracle? Commented Oct 20, 2015 at 6:37
  • Oracle. Yes, output doesn't match, didn't really pay attention to that. The content doesn't matter, it's about the table structure. Commented Oct 20, 2015 at 6:52
  • "I use Microsoft SQL Server Management Studio" doesn't fit with "I use Oracle" SMSS can't be used together with Oracle. Commented Oct 20, 2015 at 7:01

2 Answers 2

1

EDIT: simplified using only answer1:

DECLARE @tbl TABLE(id INT,user_id INT,question_id INT,answer1 VARCHAR(1),answer2 VARCHAR(1));
INSERT INTO @tbl VALUES     
 (1,1,1,'a','x')           
,(2,1,2,'c','y')           
,(3,1,3,'a','y')           
,(4,2,1,'c','y')         
,(5,2,2,'a','x');

WITH AllAnswers AS
(
    SELECT p.*
    FROM
    (
        SELECT tbl.user_id
              ,'q' + CAST(tbl.question_id AS VARCHAR(10)) AS columnName
              ,answer1
        FROM @tbl AS tbl
    ) AS x
    PIVOT
    (
        MIN(answer1) FOR columnName IN(q1,q2,q3 /*Add your question numbers here*/)
    ) AS p
)
SELECT aa.user_id 
      ,aa.q1
      ,aa.q2
      ,aa.q3 
      /*Get only the questions you want*/
FROM AllAnswers AS aa

/* Result

user_id q1  q2  q3
1       a   c   a
2       c   a   NULL

*/

Could this be a solution for you:

Just a short explanation: As you write about two answers for each question I put this second answer into the test set. To allow a PIVOT with more than one column I use the trick to concat both answers as a pseudo-XML. This is pivoted and taken apart by its internal index again.

The result is a list of all users with all there answers to all questions. With the last SELECT you can choose which questions / answers you want to fetch.

DECLARE @tbl TABLE(id INT,user_id INT,question_id INT,answer1 VARCHAR(1),answer2 VARCHAR(1));
INSERT INTO @tbl VALUES     
 (1,1,1,'a','x')           
,(2,1,2,'c','y')           
,(3,1,3,'a','y')           
,(4,2,1,'c','y')         
,(5,2,2,'a','x');

WITH AllAnswers AS
(
    SELECT p.user_id
          ,CAST(p.q1 AS XML).value('x[1]','varchar(1)') AS q1_1
          ,CAST(p.q1 AS XML).value('x[2]','varchar(1)') AS q1_2
          ,CAST(p.q2 AS XML).value('x[1]','varchar(1)') AS q2_1
          ,CAST(p.q2 AS XML).value('x[2]','varchar(1)') AS q2_2
          ,CAST(p.q3 AS XML).value('x[1]','varchar(1)') AS q3_1
          ,CAST(p.q3 AS XML).value('x[2]','varchar(1)') AS q3_2
          /*Add all your question numbers here*/
    FROM
    (
        SELECT tbl.user_id
              ,'q' + CAST(tbl.question_id AS VARCHAR(10)) AS columnName
              ,'<x>' + ISNULL(answer1,'') + '</x><x>' + ISNULL(answer2,'') + '</x>' AS BothAnswers
        FROM @tbl AS tbl
    ) AS x
    PIVOT
    (
        MIN(BothAnswers) FOR columnName IN(q1,q2,q3 /*Add your question numbers here*/)
    ) AS p
)
SELECT aa.user_id 
      ,aa.q1_1 
      ,aa.q1_2 
      ,aa.q2_1 
      ,aa.q2_2 
      ,aa.q3_1 
      ,aa.q3_2 
      /*Get only the questions you want*/
FROM AllAnswers AS aa
Sign up to request clarification or add additional context in comments.

2 Comments

I'll get to work on this, thanks for the elaborate example! With the "two answers", I mean that the table has two answer columns (see the 'answer1', there is also an 'answer2'). I do not need to look at the 2nd answer: I want two result tables, one for answer1 and one for answer2. So I could use the query and set it to take the answer from column answer1, and then the same query but from the anwer2 column.
@Tjab, I simplified my example to use only answer1. I do not think, that this can be done much easier... In the final SELECT you have the free choice to select only the columns you want
0
SELECT *
FROM
(
  SELECT [user_id], 
         [answer1],
         row_number() OVER(PARTITION BY [user_id] ORDER BY [question_id]) rn
  FROM [table]
) d
PIVOT
(
  MAX([answer1])
  FOR rn in ([1], [2], [3], [4], [5], [6])
) piv

Works very well, for one bug. user_id = 1 has no answers for question_id = 3, but it has for question_id = 4. My result places the answer on 4 in to 3:

user_id   1    2    3    4    5    6
1         a    b    c         NULL NULL
2         a    b    c    d    NULL NULL

With this data:

user_id    question_id    answer1
1          1              a
1          2              b
1          4              c
1          5                                      (anwer1 = empty string)
1          6              NULL
2          1              a
2          2              b
2          4              c
2          5              d
2          6              NULL

Should have given:

user_id   1    2    3    4    5    6
1         a    b    NULL c         NULL
2         a    b    NULL c    d    NULL

So the bug: if no record is found, the answer to the 'next question' is used.

3 Comments

ROW_NUMBER will come back with a running number. If there is no answer in the table, your "rn" will skip one value. You could try using a Tally-Table, but my approach in my answer is surely more reliable... what's wrong with it?
Hi, could you solve your problem? Did my (edited) answer help you? If yes, it would be very kind to vote it up and - if it solved you problem - mark it as accepted, thx!
Yes, thanks! Sorry I got back so very, very late. Marked as accepted, thanks again!

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.