3

The SQL code that I have works in HeidiSQL when run over and over again, but when I clone it into PHP and run mysqli_query($db,$sql), it doesn't work.

The following PHP/MySQL code is all valid and works perfectly.

$sql = "select `ID`,`User` from (
            select * from 
                (SELECT 
                    `ID`,
                    `User`,
                    `BI`,
                    (@cnt:= @cnt + (`BI`/(select SUM(`BI`) from `ax`))) as `Entirety`
                from `ax` as `t`
                    CROSS JOIN (SELECT @cnt := 0) AS var
                order by `BI`
            ) d
            where `Entirety`>(@rnd)
            order by `BI`
        ) as `l`
        cross join (select (@rnd := rand()) as `RandomValue`) as var2
        limit 1;";

I then run the $sql through

$result = mysqli_query($db,$sql);
$results = mysqli_fetch_array($result);

where $db is a valid and open connection to the MySQL server. But the return of the object when I do

print_r($result);

comes out as

mysqli_result Object ( 
    [current_field] => 0 
    [field_count] => 2 
    [lengths] => 
    [num_rows] => 0 
    [type] => 0
)

I don't want this and the num_rows should be '1' as that is what shows in HeidiSQL when I run it. Here's an image of HeidiSQL showing the results:

HeidiSQL results

Anyone have any ideas?

The COLUMNS of the table ax are ID, User and BI. In the SQL, it creates two temporary columns called 'Entirety' which is the probability counter anda column called RandomValue which is a rand() from 0 -> 1.

The only row in this table, has values of (1,1,10). Even though it is 10, it has a Entirety of '1' which means it is a 100% guaranteed-to-be-chosen. Despite this, nothing is chosen.

17
  • 1
    Are you absolutely sure that your PHP database connection is pointing at the same database/schema as your HeidiSQL database connection? Commented Nov 28, 2014 at 19:13
  • You can try semicolon from end of query. And try echo mysqli_error($db); after mysqli_query Commented Nov 28, 2014 at 19:14
  • 1
    Does this code always return the same result in HeidiSQL? I notice that you're limiting one of the result sets based on a random value, which presumably could change from one run of the query to the next..? To eliminate this possibility temporarily, try replacing that rand() with a fixed number, like 0.5, say, in both queries, and then see if they get the same result. Commented Nov 28, 2014 at 19:25
  • 1
    Humour me for a second, would you, and just try it with 0.5 hard-coded instead of rand()? I'd like to eliminate that fully before starting to make my head hurt with anything even more improbable ;D Commented Nov 28, 2014 at 19:32
  • 1
    I'd still test it without the random cross join condition. Commented Nov 28, 2014 at 19:32

3 Answers 3

8

If you prepend EXPLAIN to your query in PHP, mysqli tells you the error for one subquery:

Impossible WHERE noticed after reading const tables

and

const row not found

This can mean a lot but in this context my bet is on an invalid value in a column. I'm pretty sure

select (@rnd := rand()) as `RandomValue`

is executed after

where `Entirety`>(@rnd)

and therefore not set.

So, why does it work in your client? Because the client keeps the connection. So it fails the first time you execute the query but at the second time it has the stored value of @rnd from the first execution. You can test that by renaming @rnd in @rnd1 and run the query once.

So, solution is (as Matt pointed out) to set the variable @rnd first and then use it.

Here is a related SO question about user defined variables in subqueries:

User variable in MySQL subquery

From my client:

First time EXPLAIN:

first time execution

Second time EXPLAIN:

second time

Hope this helps.

Sign up to request clarification or add additional context in comments.

4 Comments

Excellent tip with the EXPLAIN. And you're quite right; I'd not noticed that the first time I ran the query in a client it returned no rows. This led me to what I think is a key piece of advice from the MySQL manual: "As a general rule, other than in SET statements, you should never assign a value to a user variable and read the value within the same statement."
One thing I'd like to know: am I right in thinking that the version of the code in my answer, which appears to "work", is only working by accident, i.e. the query engine happens to evaluate the join with the variable initialisation first, probably because I moved it earlier in the statement, but there's no guarantee that it will always evaluate in that order?
Good catch! I assumed that there is a given evaluation order but I'm not sure that this is really the case. So, to be on the safe side KeirDavis should split the sql statement and set rnd and cnt first.
@Thorsten: or move the initialization of @rnd into the deepest inline view, along with @cnt. The MySQL Reference Manual warns that the behavior we observe is not guaranteed, that is, it may change in some future release. But in MySQL 5.1 and 5.5, that deepest inline view query (var) gets evaluated first and the result is materialized as a derived table, before the query that references it run. (You can see that in the EXPLAIN output.)
4

Here's interesting: if you reverse the order of your cross-joined query for the @rnd variable:

select `ID`,`User` from 
    (select (@rnd := rand()) as `RandomValue`) as var2 CROSS JOIN 
    (
        select * from 
            (SELECT 
                `ID`,
                `User`,
                `BI`,
                (@cnt:= @cnt + (`BI`/(select SUM(`BI`) from `ax`))) as `Entirety`
            from `ax` as `t`
                CROSS JOIN (SELECT @cnt := 0) AS var
            order by `BI`
        ) d
        where `Entirety`>(@rnd)
        order by `BI`
    ) as `l`        
    limit 1;

...then I believe that's logically the same query. However, this version returns the row you'd expect when run from both PHP and a standard MySQL client for me, whereas I can reproduce your lack of a result from PHP with your original.

NB: I still can't tell you why this happens, though. Even if this technically answers your question, I'd personally not mark it as an answer, and leave it to attract other answers from people who may understand what's actually going on, for at least a day or two. If it's still open after 48 hours, I'll put a bounty on it, as I'm curious.

4 Comments

This works, I'll consider this answer correct after 48 hours as it works and does it exactly how I want it. It's just re-ordered which makes no difference to me.
I don't think there's any guarantee it'll carry on working, though—I think this may be related to the order that the variables are assigned/evaluated in, which may change from one run of the same query to the next. Personally, to be on the safe side for now, I'd generate your random number in PHP and pass it in to a version of the query that didn't use that cross join, I think.
if @rnd has not been defined yet, then Entirety > @rnd evaluates to NULL, which is not truthy -- condition not met
As Andras points out, in the OP query, @rnd is not initialized when it's evaluated. It's an order of operation thing, touched on in the MySQL Reference manual. The initialization of @rnd could actually be performed in the var inline view, since that gets evaluated first. The extra var2 inline view and CROSS JOIN operation isn't necessary. The query is working Heidi SQL because @rnd has been initialized by a previous query; user-defined variables have persistence within a session, not just a query. With a brand new connection, you'd see the same behavior in HeidiSQL.
2
+200

The crux of the problem is that @rnd is not initialized when it's being evaluated. It's basically an order of operations problem.

The statement appears to be working in HeidiSQL because the statement is using the value that was previously assigned to @rnd, by a previously executed statement. (We know that MySQL user-defined variables retain their values across query executions, for the duration of the session, the MySQL database connection).

To have RAND() evaluated once, early in the statement execution, we can move the assignment down into the lowest level inline view, which will be evaluated first. (This works because of the way that MySQL evaluates the inline view query to materialize a "derived table", before the outer query runs.

If I had a guarantee that the BI column contains positive integer values, then I'd prefer to return the result with a statement like this:

SELECT d.ID
     , d.User
  FROM ( SELECT t.ID
              , t.User
              , t.BI
              , @rt := @rt + t.BI AS rt
           FROM `ax` t
          CROSS
           JOIN ( SELECT @rt := 0
                       , @rnd := FLOOR(1 + RAND() * (r.tot – 1))
                    FROM ( SELECT SUM(q.BI) AS tot FROM `ax` q ) r
                ) s
          ORDER BY t.BI
       ) d
 WHERE d.rt > @rnd
 ORDER BY d.BI
 LIMIT 1

N.B. MySQL Reference Manual states that the behavior we observe with user-defined variables within a statement is not guaranteed and should not be relied upon.

But we do observe consistent behavior, at least in MySQL 5.1 and 5.5. The MySQL documentation is making it clear that future releases of MySQL are free to change this behavior.

(If BI is not integer, or if it can be zero or negative, then the method the statement is using to determine the "running total" (rt) may not work correctly. I believe the OP query may have a similar issue related to non-integer or negative values.)

1 Comment

I've awarded my bounty here for both a clear explanation, and a solution to the problem that seems much more likely to continue working than my wing-and-a-prayer "move the initialisation until it works and hope the query engine carries on evaluating in that order". (I'd love to have split some of this bounty to @Thorsten's answer, as that was also very helpful, but sadly that's not how bounties work...) Thanks!

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.