I want to select rows without duplicate ids. For the duplicated id 5, the result should keep the row (5, d) and not (5, e).

table

id | name
1  | a
2  | b
3  | c
5  | d
5  | e

I tried:

SELECT id, name 
FROM table t
INNER JOIN (SELECT DISTINCT id FROM table) t2 ON t.id = t2.id
  • Duplicate: stackoverflow.com/questions/4662464/… Commented Mar 28, 2019 at 21:13
  • Possible duplicate of How can I remove duplicate rows? Commented Mar 28, 2019 at 21:16
  • Make sure to always specify what rule you want applied. "and keep row 5d and not 5e" only tells us that 'd' wins over 'e', and we don't know why. What if it were not 'd' and 'e' but 'a' and 'z' or some other name? Is 'd' better than 'e' because it comes before 'e' in the alphabet? Or because it is not a vowel? So, in case of duplicates, how do we decide which value to keep? Commented Mar 28, 2019 at 21:23
  • I want to keep the earliest insertion and remove the latest. Commented Mar 28, 2019 at 21:29
  • Define "earliest". By definition a table is an unordered set. That means you have to provide the order with the data. Commented Mar 28, 2019 at 21:33

3 Answers

You can also use ROW_NUMBER():

SELECT id, name
FROM (
    SELECT id, name, ROW_NUMBER() OVER(PARTITION BY id ORDER BY name) rn 
    FROM mytable
) x
WHERE rn = 1

This retains, for each id, the record with the smallest name (so '5d' sorts before '5e' and is kept). With this technique you can also sort on a column other than the one where the duplicates exist, which an aggregate query with MIN() cannot do. Queries using window functions can also perform better than the equivalent aggregate query, although this depends on the database engine and the available indexes.
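To illustrate sorting on a different column, here is a minimal sketch that keeps the earliest-inserted row per id. It assumes a hypothetical `seq` column recording insertion order; nothing like it exists in the question's table, so treat it as a stand-in for whatever column (an identity or a timestamp) actually carries that order in your schema.

```sql
-- Hypothetical schema: seq is an assumed insertion-order column,
-- not part of the table shown in the question.
CREATE TABLE mytable (
    id   INTEGER,
    name VARCHAR(10),
    seq  INTEGER
);

INSERT INTO mytable (id, name, seq) VALUES
    (1, 'a', 1),
    (2, 'b', 2),
    (3, 'c', 3),
    (5, 'd', 4),  -- inserted first for id 5
    (5, 'e', 5);  -- inserted later for id 5

-- Number the rows within each id by insertion order and keep the first one.
SELECT id, name
FROM (
    SELECT id, name,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY seq) AS rn
    FROM mytable
) x
WHERE rn = 1
ORDER BY id;
-- Returns (1, a), (2, b), (3, c), (5, d)
```

Here (5, d) is kept because it has the lower `seq`, not because 'd' sorts before 'e'. Swapping the two `seq` values would keep (5, e) instead.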

For the given example an aggregation using min() would work.

SELECT id,
       MIN(name) AS name
FROM table
GROUP BY id;

If you want to keep the row with the smallest name, you can use NOT EXISTS:

select t.* from tablename t
where not exists (
  select 1 from tablename
  where id = t.id and name < t.name
)
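As a rough check of how the filter behaves, the sketch below runs the same NOT EXISTS query against the sample data from the question. The table name `tablename` follows this answer; the column types are assumptions.

```sql
-- Sample data from the question; column types are assumed.
CREATE TABLE tablename (
    id   INTEGER,
    name VARCHAR(10)
);

INSERT INTO tablename (id, name) VALUES
    (1, 'a'),
    (2, 'b'),
    (3, 'c'),
    (5, 'd'),
    (5, 'e');

-- A row survives only if no other row has the same id and a smaller name.
SELECT t.*
FROM tablename t
WHERE NOT EXISTS (
    SELECT 1
    FROM tablename t2
    WHERE t2.id = t.id
      AND t2.name < t.name
)
ORDER BY t.id;
-- Returns (1, a), (2, b), (3, c), (5, d)
```

The row (5, e) is dropped because (5, d) has the same id and a smaller name. Note that two rows with the same id and the same name would both be returned, since neither is strictly smaller than the other.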

2 Comments

I am a little confused by this snippet. Could you please explain a bit how it works?
@bentz123 This query selects only the rows of the table for which there is no other row with the same id and an alphabetically smaller name. So it returns each distinct id with the smallest name for that id.
