1

I need the SQL equivalent of this.

I have a table like this

ID MN MX
-- -- --
A  0  3
B  4  6
C  7  9

Given a number, say 5, I want to find the ID of the row where MN and MX contain that number, in this case that would be B.

Obviously,

SELECT ID FROM T WHERE ? BETWEEN MN AND MX

would do, but I have 9 million rows and I want this to run as fast as possible. In particular, I know that there can be only one matching row, I now that the MN-MX ranges cover the space completely, and so on. With all these constraints on the possible answers, there should be some optimizations I can make. Shouldn't there be?

All I have so far is indexing MN and using the following

SELECT ID FROM T WHERE ? BETWEEN MN AND MX ORDER BY MN LIMIT 1

but that is weak.

3
  • Well, you could change ? BETWEEN MN AND MX to MN >= ?, or else drop the ORDER BY MN LIMIT 1 (obviously not both, though). But . . . what's "weak" about your current query? Without knowing what you dislike about it, it's hard to know what alternatives to suggest. Commented Jan 18, 2012 at 1:17
  • What are the current query times? BETWEEN is pretty fast. Did you run an explain on your query to see the breakdown of how MySQL will execute the query? Commented Jan 18, 2012 at 1:18
  • @MikePurcell - I haven't done it yet. I was just wondering if there was collective wisdom on how to do things like that. Commented Jan 18, 2012 at 3:13

2 Answers 2

3

If you have an index spanning MN and MX it should be pretty fast, even with 9M rows.

alter table T add index mn_mx (mn, mx);

Edit

I just tried a test w/ a 1M row table

mysql> select count(*) from T;
+----------+
| count(*) |
+----------+
|  1000001 |
+----------+
1 row in set (0.17 sec)

mysql> show create table T\G
*************************** 1. row ***************************
       Table: T
Create Table: CREATE TABLE `T` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `mn` int(10) DEFAULT NULL,
  `mx` int(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `mn_mx` (`mn`,`mx`)
) ENGINE=InnoDB AUTO_INCREMENT=1048561 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> select * from T order by rand() limit 1;
+--------+-----------+-----------+
| id     | mn        | mx        |
+--------+-----------+-----------+
| 112940 | 948004986 | 948004989 |
+--------+-----------+-----------+
1 row in set (0.65 sec)

mysql> explain select id from T where 948004987 between mn and mx;
+----+-------------+-------+-------+---------------+-------+---------+------+--------+--------------------------+
| id | select_type | table | type  | possible_keys | key   | key_len | ref  | rows   | Extra                    |
+----+-------------+-------+-------+---------------+-------+---------+------+--------+--------------------------+
|  1 | SIMPLE      | T     | range | mn_mx         | mn_mx | 5       | NULL | 239000 | Using where; Using index |
+----+-------------+-------+-------+---------------+-------+---------+------+--------+--------------------------+
1 row in set (0.00 sec)

mysql> select id from T where 948004987 between mn and mx;
+--------+
| id     |
+--------+
| 112938 |
| 112939 |
| 112940 |
| 112941 |
+--------+
4 rows in set (0.03 sec)

In my example I just had an incrementing range of mn values and then set mx to +3 that so that's why I got more than 1, but should apply the same to you.

Edit 2

Reworking your query will definitely be better

mysql> explain select id from T where mn<=947892055 and mx>=947892055;
+----+-------------+-------+-------+---------------+-------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key   | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+-------+---------+------+------+--------------------------+
|  1 | SIMPLE      | T     | range | mn_mx         | mn_mx | 5       | NULL |    9 | Using where; Using index |
+----+-------------+-------+-------+---------------+-------+---------+------+------+--------------------------+

It's worth noting even though the first explain reported many more rows to be scanned I had enough innodb buffer pool set to keep the entire thing in RAM after creating it; so it was still pretty fast.

Sign up to request clarification or add additional context in comments.

1 Comment

Strangely, I did not get that same drop-off in rows scanned when replacing the between. In fact, even the index doesn't seem to actually speed it up much (between 0.06 and 0.15 seconds). The table ended up being much smaller than I was anticipating and I think it's just all in memory.
0

If there are no gaps in your set, a simple gte comparison will work:

SELECT ID FROM T WHERE ? >= MN ORDER BY MN ASC LIMIT 1

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.