PostgreSQL: select rows with min value in one column among rows with identical columns

Question

The description in the title may not be exactly what I want. Here is an example. Given a table t1:

src  dest  length  path
a    e     5       a b e
a    d     2       a c d
a    g     6       a c g
a    e     3       a c e
a    e     4       a g e
a    d     2       a b d

For each (src, dest) pair, if there is only one entry, keep it; if there are multiple entries, select the one with the minimum length; if several entries share that minimum length, keep all of them. The output should be:

src  dest  length  path
a    d     2       a c d
a    g     6       a c g
a    e     3       a c e
a    d     2       a b d

How can I approach it using PostgreSQL?

stackoverflow.com/questions/tagged/…

– user330315, Commented Sep 8, 2017 at 8:49

Gordon Linoff · Accepted Answer · 2017-09-08 09:56:48Z

3

I would use window functions:

select t.*
from (select t1.*,
             -- rows tied for the shortest length in a (src, dest) group all get rank 1
             dense_rank() over (partition by src, dest order by length) as seqnum
      from t1
     ) t
where seqnum = 1;


3 Comments

markusk Over a year ago

Any reason to prefer dense_rank over rank here? Both should yield the same result for this query, since you're only checking for rank 1.

Gordon Linoff Over a year ago

@markusk . . . No reason at all. I usually use rank() because it is shorter (and easier to type).

markusk Over a year ago

Your query doesn't quite match the desired output: it also includes the rank as an additional column. This is easily fixed by rewriting the outermost query to select src, dest, length, path from (...) where seqnum = 1.
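
The fix suggested in the last comment could look roughly like the sketch below. It inlines the question's sample rows as a CTE named t1 so that it can be run without creating anything; the subquery alias `ranked` is an illustrative name, and rank() is used in place of dense_rank() since, as noted in the comments, both give the same result when only rank 1 is kept.

```sql
-- Sample rows from the question, inlined as a CTE (an assumption made
-- so the example is self-contained; with a real table, drop the CTE).
with t1 (src, dest, length, path) as (
    values ('a', 'e', 5, 'a b e'),
           ('a', 'd', 2, 'a c d'),
           ('a', 'g', 6, 'a c g'),
           ('a', 'e', 3, 'a c e'),
           ('a', 'e', 4, 'a g e'),
           ('a', 'd', 2, 'a b d')
)
-- Explicit column list: seqnum is computed but not returned.
select src, dest, length, path
from (select t1.*,
             rank() over (partition by src, dest order by length) as seqnum
      from t1
     ) ranked
where seqnum = 1;
```

On the sample data this should return the four rows listed in the question's expected output: both length-2 paths for (a, d), the length-3 path for (a, e), and the single row for (a, g).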

Vao Tsun · Accepted Answer · 2017-09-08 08:51:48Z

0

I don't think you can do it without scanning the table twice:

t=# with g as (select src, dest, min(length) as min_length from t1 group by src, dest)
select t1.* from t1
join g on t1.src = g.src and t1.dest = g.dest and t1.length = g.min_length
;
 src | dest | length | path
-----+------+--------+------
 a   | d    |      2 | acd
 a   | d    |      2 | abd
 a   | e    |      3 | ace
 a   | g    |      6 | acg
(4 rows)


1 Comment

markusk Over a year ago

No need to join or scan twice if you use window functions. See Gordon Linoff's answer.
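
One way to check the scan-count claim on your own data is to compare the plans with EXPLAIN. The sketch below assumes a throwaway temporary table that mirrors the question's t1 (the column types are a guess, and a temporary table shadows any permanent t1 for the session); the alias `ranked` is illustrative. No plan output is shown here because the plan shape depends on the PostgreSQL version, the table size, and the statistics.

```sql
-- Assumption: a temp table standing in for the question's t1.
create temp table t1 (src text, dest text, length int, path text);
insert into t1 (src, dest, length, path) values
    ('a', 'e', 5, 'a b e'),
    ('a', 'd', 2, 'a c d'),
    ('a', 'g', 6, 'a c g'),
    ('a', 'e', 3, 'a c e'),
    ('a', 'e', 4, 'a g e'),
    ('a', 'd', 2, 'a b d');

-- Plan for the window-function query.
explain
select src, dest, length, path
from (select t1.*,
             rank() over (partition by src, dest order by length) as seqnum
      from t1
     ) ranked
where seqnum = 1;

-- Plan for the group-by-and-join query.
explain
with g as (select src, dest, min(length) as min_length
           from t1
           group by src, dest)
select t1.*
from t1
join g on t1.src = g.src and t1.dest = g.dest and t1.length = g.min_length;
```

Counting the scan nodes on t1 in each plan shows how many times the table is read. Fewer scans does not automatically mean faster, since the window version also has to sort or hash by (src, dest), so timing both with EXPLAIN ANALYZE on realistic data is the safer comparison.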
