1

I've an 'orders' table structure like this which contains 100,000 records:

date           orderid   type      productsales  other
01-Aug-2014    11        order     118           10.12    
01-Aug-2014    11        order     118           10.12
18-Aug-2014    11        order     35            4.21
22-Aug-2014    11        Refund    -35           -4.21
09-Sep-2014    12        order     56            7.29
15-Sep-2014    12        refund    -56           -7.29
23-Oct-2014    13        order     25            2.32
26-Oct-2014    13        refund    -25           -2.32

Now, what I want to achieve is to delete those duplicate row from my table where the orderid, type, productsales and other columns values are same to each other and keep only one row (look at the first two records for the orderid of 11).

But if the 'orderids' are same for the two records of the same 'type' of order, but the 'productsales' and 'other' columns values are different then don't delete those rows. I hope I clarified my point.

I'm looking for a mysql delete query to perform this task.

6
  • What if date is different? Do you want to delete one of them? Which one do you want to keep, the oldest or the newest? Commented Feb 13, 2015 at 12:56
  • I don't think DELETE will work here. Better/easier approach would be to SELECT DISTINCT the respective rows into a temp table, delete the old table and then rename the temp table accordingly. Commented Feb 13, 2015 at 12:56
  • DELETE ... LIMIT 1 should do the trick: dev.mysql.com/doc/refman/5.0/en/delete.html Commented Feb 13, 2015 at 12:57
  • @DavidP date will also be the same for both the rows Commented Feb 13, 2015 at 12:58
  • 2
    If the rows are completely equal, a delete won't work. Either use a cursor, or follow Sirko's temporary table advise. Next time, use primary keys and other constraints to avoid this mess. Commented Feb 13, 2015 at 13:00

5 Answers 5

1

You should add an id column. If you don't want to use a temp table, you could probably do something like this (I have NOT tested this, so...):

ALTER TABLE 'orders'
ADD COLUMN 'id' INT NOT NULL AUTO_INCREMENT FIRST, ADD PRIMARY KEY Id(id)

DELETE 
FROM orders INNER JOIN 
(
   SELECT TOP 1 id
   FROM orders
   WHERE COUNT(DISTINCT date,orderid,type.productsales,other) > 1
) dupes
ON orders.id = dupes.id
Sign up to request clarification or add additional context in comments.

Comments

1

May be its duplicate question to this: MySql: remove table rows depending on column duplicate values?

You can seek for the answer there.

The solution there specify that adding unique index on your possible duplicate columns with IGNORE keyword will remove all duplicates row.

ALTER IGNORE TABLE `table` ADD UNIQUE INDEX `name` (`col1`, `col2`, `col3`);

Here I also want to mention some points:

  1. unique index does not make change in row if any columns(from index, like here 3 columns) have null as value. Ex: null,1,"asdsa" can be stored twice
  2. same way if you have single column in unique index then multiple rows with null values(for that column) will remains in table
  3. IGNORE keywords id depreciated now, it will not work after MySQL 5.6(may be). Now only option is to create new table by a query like this:

CREATE TABLE <table_name> AS SELECT * FROM <your_table> GROUP BY col1,col2,col3;

After that you can delete <your_table> and rename <table_name> to your table.

Here you can change the column list in Group By clause according to your need(from all columns to one column, or few columns which have duplicate values together).

The plus point is, it will work with null values also.

Comments

0

A really easy way to do this is to add a UNIQUE index on the 3 columns. When you write the ALTER statement, include the IGNORE keyword. Like so:

ALTER IGNORE TABLE orders ADD UNIQUE INDEX idx_name (orderid, type, productsales, other);

This will drop all the duplicate rows. As an added benefit, future INSERTs that are duplicates will error out. As always, you may want to take a backup before running something like this...

I hope this can help you.

Comments

0

try this. create temp table such as temp and stored unique data,

SELECT distinct * into temp FROM Orders

then delete records of orders table table as

 DELETE FROM orders

after deleted all records insert records temp into records.

INSER into RECORDS SELECT * FROM TEMP DROP TABLE TEMP 

Comments

0

If you have completely duplicated rows, and you want to do this in SQL, then perhaps the best method is to save the rows you want in a temporary table, truncate the table, and insert the data back in:

create temporary table temp_orders as
    select distinct *
    from orders;

truncate table orders;

alter table orders add orderid int not null primary key auto_increment;

insert into orders;
    select *
    from temp_orders;

Oh, look, I also added an auto-incrementing primary key so you won't have this problem in the future. This would be a simpler process if you have a unique key on each row.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.