0

Possible Duplicate:
How to delete Duplicates in MySQL table

I imported data that I had in plain text files into a mysql database. Now I found I had duplicated entries, which I want to delete. The duplicates are identified by a key which is not the primary key. Please remind that I have to keep one of the duplicated items.

the table, T1, with three duplicates, for example:

ID(unique, primary key) REAL_ID(char(11))

1 '01234567890'
2 '01234567891'
3 '01234567891'
4 '01234567891'
...

Now, I use

 SELECT ID AS x, COUNT(*) AS y FROM T1 GROUP BY x HAVING y>1;

to identify the duplicates. The result i

+------+-------------+
| ID   | REAL_ID     |
+------+-------------+
|    1 | 01234567891 |
|    2 | 01234567891 |
|    3 | 01234567891 |
+------+-------------+

I can even construct the list of the Ids I have to delete:

SELECT ID 
FROM T1 
  RIGHT JOIN ( (SELECT ID AS x, COUNT(*) AS y 
                FROM T1 
                GROUP BY x 
                HAVING y>1) AS T2 ) ON (T2.x=T1.REAL_ID) LIMIT 1,100;

The result is

+------+-------------+
| ID   | REAL_ID     |
+------+-------------+
|    2 | 01234567890 |
|    3 | 01234567890 |
+------+-------------+

Now, I need help on how to delete these entries.

Since is not possible to use DELETE in combination with subqueries, I was trying to flag all the duplicated entries in the column REAL_ID, and then use

DELETE FROM T1 WHERE REAL_ID='flag';

but I can't figure out how to flag these entries.

1

1 Answer 1

0

You can do this:

DELETE t
FROM T1 t 
RIGHT JOIN 
( 
   SELECT ID, COUNT(*) AS y 
   FROM T1 
   GROUP BY ID
   HAVING y > 1 
) AS T2 ON T2.ID = t.REAL_ID;

Update: Note that you can't use LIMI or ORDER BY when deleting from multiple table, as quoted from DELETE:

For the multiple-table syntax, DELETE deletes from each tbl_name the rows that satisfy the conditions. In this case, ORDER BY and LIMIT cannot be used

Try this instead:

DELETE t
FROM T1 t 
WHERE REAL_ID IN
( 
  SELECT t1.REAL_ID
  FROM T1 t1
  RIGHT JOIN 
  ( 
     SELECT ID, COUNT(*) AS y 
     FROM T1 
     GROUP BY ID
     HAVING y > 1 
  ) AS T2 ON T2.ID = t.REAL_ID
  WHERE t1.REAL_ID IS NOT NULL
  ORDER BY t2.y DESC
  LIMIT 1, 1000
)
Sign up to request clarification or add additional context in comments.

3 Comments

sorry, it doesn't work: ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'LIMIT 1,100' at line 1.
@chuse How it doesn't work?
@chuse - You can't use ORDER BY not LIMIT when deleting the way I did. See my edit please.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.