I have the following SQL:

select code, distance, location from places;

The output is below:

CODE    DISTANCE            LOCATION
106     386.895834130068    New York, NY
80      2116.6747774121     Washington, DC
80      2117.61925131453    Alexandria, VA
106     2563.46708627407    Charlotte, NC

I want to be able to just get a single code and the closest distance. So I want it to return this:

CODE    DISTANCE            LOCATION
106     386.895834130068    New York, NY
80      2116.6747774121     Washington, DC

I originally had something like this:

SELECT code, min(distance), location
FROM places
GROUP BY code
HAVING distance > 0
ORDER BY distance ASC

The min worked fine as long as I didn't also need the location associated with the least distance. How do I get min(distance) together with the correct location? Depending on the order of the inserts into the table, you can end up with the New York distance but Charlotte as the location.

  • Telling the DBMS up front woulda been nice.... Commented Jul 27, 2012 at 8:24
  • chris, why do you worry about performance so much under each answer? Wouldn't you run the proposed queries once and cache the results to get a simple 1:1 mapping from code to closest location? As far as I can tell, distances between codes and locations do not change very often... Commented Jul 27, 2012 at 8:41

3 Answers

To get the correct associated location, join the table to a subselect that computes the minimum distance per code. The join condition matches on the code and requires the distance in the outer table to equal the minimum distance from the subselect.

SELECT a.code, a.distance, a.location
FROM   places a
INNER JOIN
(
    SELECT   code, MIN(distance) AS mindistance
    FROM     places
    GROUP BY code
) b ON a.code = b.code AND a.distance = b.mindistance
ORDER BY a.distance
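
The join above can be checked against the sample rows from the question. This is only a minimal sketch: it assumes SQLite through Python's sqlite3 module rather than the asker's MySQL, and it selects location as well so the associated place is visible. Note that if two rows for the same code tie on the minimum distance, this join returns both of them.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
# (SQLite is an assumption made for this sketch; the thread is about MySQL.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (code INTEGER, distance REAL, location TEXT)")
conn.executemany(
    "INSERT INTO places (code, distance, location) VALUES (?, ?, ?)",
    [
        (106, 386.895834130068, "New York, NY"),
        (80, 2116.6747774121, "Washington, DC"),
        (80, 2117.61925131453, "Alexandria, VA"),
        (106, 2563.46708627407, "Charlotte, NC"),
    ],
)

# Join every row to its code's minimum distance; only rows whose distance
# equals that minimum survive, so location comes from the closest row.
closest = conn.execute(
    """
    SELECT a.code, a.distance, a.location
    FROM places a
    INNER JOIN (
        SELECT code, MIN(distance) AS mindistance
        FROM places
        GROUP BY code
    ) b ON a.code = b.code AND a.distance = b.mindistance
    ORDER BY a.distance
    """
).fetchall()

for row in closest:
    print(row)
```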

17 Comments

How's the performance of this for hundreds of thousands of locations?
@chris, since you're using MySQL, this is likely the most efficient solution you will find. You'll have to make sure you have the proper indexes set up on code and distance fields.
Yeah, code is a PK so that's okay, but distance is calculated with math (using lat and long and Google's map API).
@chris, if performance becomes suboptimal for your needs, you may want to look into using a spatial database instead of a relational one. Relational databases will only get you so far with these types of queries, but spatial databases are much more geared towards them.
Use a "bounding box" to get a shorter candidate list, then use the distance function. That lets you exclude most of the 100,000.
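
The bounding-box idea from the last comment can be sketched roughly as follows. Everything here is an assumption for illustration: the locations table with lat and lng columns, the approximate city coordinates, the helper names haversine_km and nearby, and the use of SQLite with Python instead of MySQL. The box is a cheap range filter that an index on lat and lng could serve; the exact distance is then computed only for the few rows inside the box. The box is approximate, so pad it a little if missing a borderline row matters.

```python
import math
import sqlite3

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical table; coordinates are approximate and for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locations (name TEXT, lat REAL, lng REAL)")
conn.executemany(
    "INSERT INTO locations (name, lat, lng) VALUES (?, ?, ?)",
    [
        ("New York, NY", 40.7128, -74.0060),
        ("Washington, DC", 38.9072, -77.0369),
        ("Alexandria, VA", 38.8048, -77.0469),
        ("Charlotte, NC", 35.2271, -80.8431),
    ],
)


def nearby(conn, lat, lng, radius_km):
    # One degree of latitude is roughly 111 km; a degree of longitude
    # shrinks by cos(latitude).
    dlat = radius_km / 111.0
    dlng = radius_km / (111.0 * math.cos(math.radians(lat)))
    # Cheap range filter: most rows are excluded without any trigonometry.
    candidates = conn.execute(
        "SELECT name, lat, lng FROM locations "
        "WHERE lat BETWEEN ? AND ? AND lng BETWEEN ? AND ?",
        (lat - dlat, lat + dlat, lng - dlng, lng + dlng),
    ).fetchall()
    # Exact distance only for the candidates that survived the box.
    hits = [(haversine_km(lat, lng, la, ln), name) for name, la, ln in candidates]
    return sorted(h for h in hits if h[0] <= radius_km)


results = nearby(conn, 38.9, -77.0, 50.0)
for dist, name in results:
    print(f"{name}: {dist:.1f} km")
```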

You can try to do a nested lookup between the minimum grouping and the original table.

This seems to do the trick:

SELECT MinPlaces.Code, MinPlaces.Distance, Places.Location 
FROM Places INNER JOIN
(
    SELECT Code, MIN(Distance) AS Distance
    FROM Places
    GROUP BY Code
    HAVING MIN(Distance) > 0 
) AS MinPlaces ON Places.Code = MinPlaces.Code AND Places.Distance = MinPlaces.Distance
ORDER BY MinPlaces.Distance ASC

UPDATE: Tested using the following:

DECLARE @Places TABLE ( Code INT, Distance FLOAT, Location VARCHAR(50) )

INSERT INTO @Places (Code, Distance, Location)
VALUES
(106, 386.895834130068, 'New York, NY'),
(80, 2116.6747774121, 'Washington, DC'),
(80, 2117.61925131453, 'Alexandria, VA'),
(106, 2563.46708627407, 'Charlotte, NC')

SELECT MinPlaces.Code, MinPlaces.Distance, P.Location 
FROM @Places P INNER JOIN
(
    SELECT Code, MIN(Distance) AS Distance
    FROM @Places
    GROUP BY Code
    HAVING MIN(Distance) > 0 
) AS MinPlaces ON P.Code = MinPlaces.Code AND P.Distance = MinPlaces.Distance
ORDER BY MinPlaces.Distance ASC

And this yields:

Code    Distance            Location
106     386.895834130068    New York, NY
80      2116.6747774121     Washington, DC

2 Comments

@ErikE: Updated my answer. I like using CTEs too.
You're still self-joining Places, which will perform worse than a sequence project...

You did not say your DBMS. The following solutions are for SQL Server.

WITH D AS (
   SELECT code, distance, location,
      Row_Number() OVER (PARTITION BY code ORDER BY distance) Seq
   FROM places
)
SELECT *
FROM D
WHERE Seq = 1
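
The same ROW_NUMBER() pattern is not limited to SQL Server. As a hedged sketch, here it is run on SQLite through Python's sqlite3 module (an assumption for the demo; window functions need SQLite 3.25 or later, and MySQL only gained them in 8.0). Unlike the MIN join, this returns exactly one row per code even when distances tie, although which of the tied rows wins is arbitrary unless you add a tiebreaker to the ORDER BY.

```python
import sqlite3

# In-memory SQLite database with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (code INTEGER, distance REAL, location TEXT)")
conn.executemany(
    "INSERT INTO places (code, distance, location) VALUES (?, ?, ?)",
    [
        (106, 386.895834130068, "New York, NY"),
        (80, 2116.6747774121, "Washington, DC"),
        (80, 2117.61925131453, "Alexandria, VA"),
        (106, 2563.46708627407, "Charlotte, NC"),
    ],
)

# Number the rows within each code by ascending distance, then keep row 1.
ranked = conn.execute(
    """
    WITH d AS (
        SELECT code, distance, location,
               ROW_NUMBER() OVER (PARTITION BY code ORDER BY distance) AS seq
        FROM places
    )
    SELECT code, distance, location
    FROM d
    WHERE seq = 1
    ORDER BY distance
    """
).fetchall()

for row in ranked:
    print(row)
```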

If you have a table with unique Codes, and an index in your Places table on [Code, Distance] then a CROSS APPLY solution could be better:

SELECT
   X.*
FROM
   Codes C
   CROSS APPLY (
      SELECT TOP 1 *
      FROM Places P
      WHERE C.Code = P.Code
      ORDER BY P.Distance
   ) X
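
CROSS APPLY is SQL Server syntax. On engines without it, a correlated subquery with LIMIT 1 gives a similar "top 1 per code" lookup. The sketch below is a swapped-in technique, not the query above: it assumes SQLite via Python's sqlite3 module, a hypothetical codes table with one row per code, an index named idx_places_code_distance, and SQLite's implicit rowid to identify the winning row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical lookup table: one row per code.
    CREATE TABLE codes (code INTEGER PRIMARY KEY);
    CREATE TABLE places (code INTEGER, distance REAL, location TEXT);
    -- Index that lets the per-code "closest row" lookup be a short seek.
    CREATE INDEX idx_places_code_distance ON places (code, distance);
    """
)
conn.executemany("INSERT INTO codes (code) VALUES (?)", [(80,), (106,)])
conn.executemany(
    "INSERT INTO places (code, distance, location) VALUES (?, ?, ?)",
    [
        (106, 386.895834130068, "New York, NY"),
        (80, 2116.6747774121, "Washington, DC"),
        (80, 2117.61925131453, "Alexandria, VA"),
        (106, 2563.46708627407, "Charlotte, NC"),
    ],
)

# For each code, the subquery picks the rowid of its closest place;
# the join then fetches that single row.
per_code = conn.execute(
    """
    SELECT p.code, p.distance, p.location
    FROM codes c
    JOIN places p ON p.rowid = (
        SELECT p2.rowid
        FROM places p2
        WHERE p2.code = c.code
        ORDER BY p2.distance
        LIMIT 1
    )
    ORDER BY p.distance
    """
).fetchall()

for row in per_code:
    print(row)
```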

I cannot work on a solution for MySQL until much later.

P.S. You cannot rely on insertion order. Do not try!

5 Comments

What do you mean by unique codes? The sample I gave has duplicate codes.
If you have a separate table listing all the Codes with 1 row per code!
Yes, I know I can't rely on insertion order. Anyway, my code is actually a user.id, which comes from a user table and links to a locations table with distance and location.
Then yes, they are unique, and they have a one-to-many relationship with the locations table.
Is it in MySQL, and is the performance fast, as I'll have 100,000s of locations?
