
I need a query that finds recommended TV shows for a user, based on the TV shows he is following. To do this I have the following tables:

  • the table Progress, which records which shows the user is following and the percentage of episodes seen (to solve this problem we can assume there is only one user in the database)

  • the table Suggested, which contains _id1, _id2 and value (value is the strength of the connection between the show with id=_id1 and the show with id=_id2: the greater the value, the more the shows have in common). Note that the commutative property applies to this table, so the strength of the connection between _id1 and _id2 is the same as between _id2 and _id1. Moreover, there are no two rows such that ROW1._id1 = ROW2._id2 AND ROW1._id2 = ROW2._id1

  • the table ShowCache, which contains the details of a TV show, such as its name, etc.

The following query is what I'm trying to do, but the result is an empty set:

SET @a = 0;   -- In other tests this line seems to be necessary
SELECT `ShowCache`.*,
       (SUM(value) * (Progress.progress)) as priority
FROM `Suggested`,`ShowCache`, Progress 
WHERE 

    ((_id2 = Progress.id AND _id1 NOT IN (SELECT id FROM Progress) AND @a:=_id1) -- Is there a better way to set a variable here?

    OR

    (_id1 = Progress.id AND _id2 NOT IN (SELECT id FROM Progress) AND @a:=_id2))

    AND  `ShowCache`._id = @a   -- I think that the query fails here

GROUP BY `ShowCache`._id 
ORDER BY priority DESC 
LIMIT 0,20

I know the problem is related to the usage of variables, but I can't solve it. Any help is really appreciated.


PS: the main problem is that (because of the commutative property), without variables I need two queries, which take about 3 secs to be executed (the real query is more complex than the one above). I'm really trying to do this task with a single query.

PPS: I also tried an XOR operation, which results in an infinite loop?!?!? Here's the WHERE clause I tried:

((_id2=Progress.id AND @a:=_id1) XOR (_id1=Progress.id AND @a:=_id2)) AND `ShowCache`._id = @a

EDIT: I came up with these WHERE conditions, which don't use any variables:

(_id2 = Progress.id OR _id1 = Progress.id) 
AND `ShowCache`._id = IF(_id2 = Progress.id, _id1,_id2)
AND  `ShowCache`._id NOT IN (SELECT id FROM Progress)

It works, but it is very slow.

  • what's jquery got to do with your question? Commented Apr 23, 2013 at 16:08
  • Oops... wrong tag, I meant query. Commented Apr 23, 2013 at 17:03

2 Answers


Your attempt to use XOR is clever. If you want to get the non-matching value, use the bitwise XOR operator, which is ^ (the XOR keyword is the logical operator):

Progress.id ^ _id1 ^ _id2

3 ^ 2 ^ 3 = 2

2 ^ 2 ^ 3 = 3
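
As a quick sanity check of the two identities above, here is a minimal Python sketch (Python's ^ is also bitwise XOR). The helper name other_id is made up for illustration, and the result is only meaningful when the known id really is one of the two ids in the pair.

```python
def other_id(known: int, id_1: int, id_2: int) -> int:
    """Return whichever of id_1 / id_2 is not `known`.

    Works because x ^ x == 0 and 0 ^ y == y, so the known id cancels out.
    Assumes `known` equals id_1 or id_2; otherwise the result is garbage.
    """
    return known ^ id_1 ^ id_2


print(other_id(3, 2, 3))  # 2
print(other_id(2, 2, 3))  # 3
```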

You can use this trick to set up a join and really simplify your query (eliminating the ORs and most of the NOT INs, and doing it in one query without variables).


select users.name as username, showcache.name as show_name,
  sum(progress * value) as priority
from users
inner join progress on users.id = progress.user_id
inner join suggested on progress.show_id in (suggested.id_1, suggested.id_2)
inner join showcache on showcache.id =
  (suggested.id_1 ^ suggested.id_2 ^ progress.show_id)
where showcache.id not in
  (select show_id from progress where user_id = users.id)
group by users.id, showcache.id  -- users.id keeps results separate per user
order by priority desc;

I also set up a fiddle to demonstrate it: http://sqlfiddle.com/#!2/2dcd8/24

To break it down: I created a users table with a single user (but the solution will work with multiple users).

The select and the join to progress are straightforward. The join to suggested uses IN as an alternative to writing it with OR.

The join to showcache is where the bitwise XOR happens. One of the ids matches progress.show_id and we want to use the other one.

It does include a NOT IN to exclude shows already watched from the results. I could have changed it to NOT EXISTS, but it seems clearer this way.
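
For anyone who wants to try the join without a MySQL server, the following is a rough, self-contained sketch using Python's built-in sqlite3 module. Everything in it is an assumption made for illustration: the table layout mirrors the query above rather than the asker's real schema, the sample rows are invented, and because SQLite has no ^ operator the XOR is exposed through a made-up SQL function called bxor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite lacks a bitwise XOR operator, so register Python's ^ as a
# two-argument SQL function (the name "bxor" is hypothetical).
conn.create_function("bxor", 2, lambda a, b: a ^ b)

# Invented sample data: user 1 follows shows 1 and 2.
conn.executescript("""
CREATE TABLE users     (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE showcache (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE progress  (user_id INTEGER, show_id INTEGER, progress REAL);
CREATE TABLE suggested (id_1 INTEGER, id_2 INTEGER, value REAL);

INSERT INTO users VALUES (1, 'alice');
INSERT INTO showcache VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
INSERT INTO progress VALUES (1, 1, 0.5), (1, 2, 1.0);
INSERT INTO suggested VALUES (1, 3, 10), (2, 3, 4), (1, 2, 7), (4, 2, 2);
""")

QUERY = """
SELECT showcache.name,
       SUM(progress.progress * suggested.value) AS priority
FROM users
JOIN progress  ON users.id = progress.user_id
JOIN suggested ON progress.show_id IN (suggested.id_1, suggested.id_2)
JOIN showcache ON showcache.id =
     bxor(bxor(suggested.id_1, suggested.id_2), progress.show_id)
WHERE showcache.id NOT IN
      (SELECT show_id FROM progress WHERE user_id = users.id)
GROUP BY users.id, showcache.id
ORDER BY priority DESC
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [('C', 9.0), ('D', 2.0)]
```

Show C scores 0.5 * 10 + 1.0 * 4 = 9 because it is linked to both followed shows, show D scores 1.0 * 2 = 2, and the (1, 2) pair contributes nothing since both of its shows are already being followed.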


4 Comments

Thanks a lot! I must ask: in the SELECT, would it be more correct to use 'progress * SUM(value) as priority' instead of 'progress * value', or am I misunderstanding something?
Thanks again! PS: very clever trick with the XOR operation. Is there an equivalent operation such that 3 x 3 x 2 = 3? XAND?
That's just a & b | c, note that the order does matter here and that is the same as a&b | a&c (| is or, & is and)
Thanks again. Anyway, I realized that the select was wrong: the correct form is SUM(progress * value)

You're setting @a's value twice within the where clause, meaning that the query is actually boiling down to:

...
WHERE ... AND `ShowCache`._id = _id2

MySQL evaluates variable assignments in first-encountered order, so you should leave @a constant until the END of the clause, then assign a new value, e.g.

mysql> set @a=0;
mysql> select @a, @a+1, @a*5, @a := @a + 1, @a from t;  -- t: any table with three rows (name assumed)
+------+------+------+--------------+------+
| @a   | @a+1 | @a*5 | @a := @a + 1 | @a   |
+------+------+------+--------------+------+
|    0 |    1 |    0 |            1 |    1 |
|    1 |    2 |    5 |            2 |    2 |
|    2 |    3 |   10 |            3 |    3 |
+------+------+------+--------------+------+

Note that within each row @a's value in the first 3 columns remains constant, UNTIL MySQL reaches the @a := @a + 1, after which @a has a new value.

So perhaps your query should be

set @a = 0;
select @temp := @a, ..., @a := _id2
where
   ((_id2 = Progress.id AND _id1 NOT IN (SELECT id FROM Progress) AND @temp =_id1)
   ...
etc...

1 Comment

Thanks for the clarifications. Anyway, how can I solve the problem (if it is even possible with a single query)?
