We want to delete duplicate rows from a table in our MySQL database. We have tried many queries, but unfortunately none has succeeded so far. The query below appears in several posts, but it did not work for us either:

DELETE t1 FROM Raw_Validated_backup AS t1 INNER JOIN Raw_Validated_backup AS t2 
    ON t1.time_start=t2.time_start 
    AND t1.time_end=t2.time_end 
    AND t1.first_temp_lpn=t2.first_temp_lpn 
    AND t1.first_WL=t2.first_WL 
    AND t1.first_temp_lpn_validated=t2.first_temp_lpn_validated 
    AND t1.second_temp_lpn=t2.second_temp_lpn 
    AND t1.second_WL=t2.second_WL 
    AND t1.second_temp_lpn_validated=t2.second_temp_lpn_validated 
    AND t1.third_temp_lpn=t2.third_temp_lpn 
    AND t1.third_WL=t2.third_WL 
    AND t1.third_temp_lpn_validated=t2.third_temp_lpn_validated 
    AND t1.first_temp_rising=t2.first_temp_rising 
    AND t1.first_WR=t2.first_WR 
    AND t1.first_temp_rising_validated=t2.first_temp_rising_validated 
    AND t1.second_temp_rising=t2.second_temp_rising 
    AND t1.second_WR=t2.second_WR 
    AND t1.second_temp_rising_validated=t2.second_temp_rising_validated 
    AND t1.third_temp_rising=t2.third_temp_rising 
    AND t1.third_WR=t2.third_WR 
    AND t1.third_temp_rising_validated=t2.third_temp_rising_validated 
    AND t1.id<t2.id;

Message we receive after running the query: no errors, 0 rows affected, taking 40.4 s.

  • Incidentally, if operationally possible, it's often far quicker to create a new table, retaining just the rows you want to keep, and then dropping/archiving the old table and renaming the new one. Commented Jan 6, 2020 at 11:49
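The copy-and-swap approach from the comment above could look roughly like the sketch below. The table names Raw_Validated_dedup and Raw_Validated_old are hypothetical, and the column list assumes that the columns compared in the question's join are all of the table's columns except id, and that id is unique.

```sql
-- Sketch only: Raw_Validated_dedup and Raw_Validated_old are hypothetical names.

-- 1. Create an empty table with the same column and index definitions.
CREATE TABLE Raw_Validated_dedup LIKE Raw_Validated_backup;

-- 2. Copy one row (the one with the highest id) from each group of identical rows.
INSERT INTO Raw_Validated_dedup
SELECT b.*
FROM Raw_Validated_backup AS b
INNER JOIN (
    SELECT MAX(id) AS id
    FROM Raw_Validated_backup
    GROUP BY time_start, time_end,
             first_temp_lpn, first_WL, first_temp_lpn_validated,
             second_temp_lpn, second_WL, second_temp_lpn_validated,
             third_temp_lpn, third_WL, third_temp_lpn_validated,
             first_temp_rising, first_WR, first_temp_rising_validated,
             second_temp_rising, second_WR, second_temp_rising_validated,
             third_temp_rising, third_WR, third_temp_rising_validated
) AS keep ON keep.id = b.id;

-- 3. Swap the two tables in a single statement; the old table is kept as an archive.
RENAME TABLE Raw_Validated_backup TO Raw_Validated_old,
             Raw_Validated_dedup  TO Raw_Validated_backup;
```

Note that CREATE TABLE ... LIKE does not copy foreign key definitions, so this only fits if the table has none or if they are re-created afterwards.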

1 Answer

This query:

select max(id) id
from Raw_Validated_backup
group by <list of all the columns except id>

returns all the ids for the rows that you want to keep.
So delete the rest:

delete from Raw_Validated_backup
where id not in (
  select t.id from (
    select max(id) id
    from Raw_Validated_backup
    group by <list of all the columns except id>
  ) t
)

See the demo.
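For reference, here is a sketch of the GROUP BY approach with the placeholder spelled out. It assumes the columns compared in the question's join are all of the table's columns except id. The first statement only previews the duplicate groups, so it can be run safely before the delete. GROUP BY places NULL values in the same group, so rows that are identical apart from containing NULLs are still treated as duplicates here.

```sql
-- Preview: one line per group of identical rows that has more than one member.
SELECT MAX(id) AS keep_id, COUNT(*) AS copies
FROM Raw_Validated_backup
GROUP BY time_start, time_end,
         first_temp_lpn, first_WL, first_temp_lpn_validated,
         second_temp_lpn, second_WL, second_temp_lpn_validated,
         third_temp_lpn, third_WL, third_temp_lpn_validated,
         first_temp_rising, first_WR, first_temp_rising_validated,
         second_temp_rising, second_WR, second_temp_rising_validated,
         third_temp_rising, third_WR, third_temp_rising_validated
HAVING COUNT(*) > 1;

-- Delete every row whose id is not the highest id of its group.
-- The extra derived table (t) is what lets MySQL read from the table being deleted from.
DELETE FROM Raw_Validated_backup
WHERE id NOT IN (
    SELECT t.id FROM (
        SELECT MAX(id) AS id
        FROM Raw_Validated_backup
        GROUP BY time_start, time_end,
                 first_temp_lpn, first_WL, first_temp_lpn_validated,
                 second_temp_lpn, second_WL, second_temp_lpn_validated,
                 third_temp_lpn, third_WL, third_temp_lpn_validated,
                 first_temp_rising, first_WR, first_temp_rising_validated,
                 second_temp_rising, second_WR, second_temp_rising_validated,
                 third_temp_rising, third_WR, third_temp_rising_validated
    ) AS t
);
```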
Another option with a self join:

delete v1 
from Raw_Validated_backup v1 inner join Raw_Validated_backup v2
on v1.time_start = v2.time_start and v1.time_end = v2.time_end and .......
and v1.id < v2.id;

See a simplified demo.
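One possible reason for a self join like this to report 0 rows affected is NULL values: with the = operator, a comparison involving NULL is never true, so rows that contain NULL in any compared column never match their duplicates. Whether this table actually contains NULLs is not stated in the question, so treat the following as a sketch under that assumption. It uses MySQL's NULL-safe equality operator <=>, which treats two NULLs as equal, and it assumes the columns listed in the question are all of the columns except id.

```sql
-- Sketch: NULL-safe variant of the self join (only needed if the columns can hold NULL).
DELETE v1
FROM Raw_Validated_backup AS v1
INNER JOIN Raw_Validated_backup AS v2
    ON  v1.time_start <=> v2.time_start
    AND v1.time_end <=> v2.time_end
    AND v1.first_temp_lpn <=> v2.first_temp_lpn
    AND v1.first_WL <=> v2.first_WL
    AND v1.first_temp_lpn_validated <=> v2.first_temp_lpn_validated
    AND v1.second_temp_lpn <=> v2.second_temp_lpn
    AND v1.second_WL <=> v2.second_WL
    AND v1.second_temp_lpn_validated <=> v2.second_temp_lpn_validated
    AND v1.third_temp_lpn <=> v2.third_temp_lpn
    AND v1.third_WL <=> v2.third_WL
    AND v1.third_temp_lpn_validated <=> v2.third_temp_lpn_validated
    AND v1.first_temp_rising <=> v2.first_temp_rising
    AND v1.first_WR <=> v2.first_WR
    AND v1.first_temp_rising_validated <=> v2.first_temp_rising_validated
    AND v1.second_temp_rising <=> v2.second_temp_rising
    AND v1.second_WR <=> v2.second_WR
    AND v1.second_temp_rising_validated <=> v2.second_temp_rising_validated
    AND v1.third_temp_rising <=> v2.third_temp_rising
    AND v1.third_WR <=> v2.third_WR
    AND v1.third_temp_rising_validated <=> v2.third_temp_rising_validated
WHERE v1.id < v2.id;  -- keep the row with the highest id in each group
```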

4 Comments

Use USING (field list) instead of ON; it is shorter and clearer.
Just tested and worked as wished. Thank you very much!
@Akina I agree, USING should be handy in this case, but then the last condition v1.id < v2.id should be moved to a WHERE clause.
"The last condition v1.id < v2.id should be moved to a WHERE clause." Of course.
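Putting the comments together, the self join written with USING might look like the sketch below. The column list is taken from the question and is assumed to cover every column except id. USING compares with plain equality, so rows holding NULL in any listed column will not be matched by this form.

```sql
-- Sketch: self join with USING; the id comparison moves to the WHERE clause.
DELETE v1
FROM Raw_Validated_backup AS v1
INNER JOIN Raw_Validated_backup AS v2
    USING (time_start, time_end,
           first_temp_lpn, first_WL, first_temp_lpn_validated,
           second_temp_lpn, second_WL, second_temp_lpn_validated,
           third_temp_lpn, third_WL, third_temp_lpn_validated,
           first_temp_rising, first_WR, first_temp_rising_validated,
           second_temp_rising, second_WR, second_temp_rising_validated,
           third_temp_rising, third_WR, third_temp_rising_validated)
WHERE v1.id < v2.id;  -- delete every row that has an identical row with a higher id
```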
