I have a SQL Server 2012 table with some redundant data, as follows:

ColumnA(varchar) | ColumnB(varchar)
---------------- | ---------------
name1            | name2
name3            | name4
name2            | name1
name5            | name6

I need to select distinct rows from this table so that the result is either

ColumnA(varchar) | ColumnB(varchar)
---------------- | ---------------
name3            | name4
name2            | name1
name5            | name6

or

ColumnA(varchar) | ColumnB(varchar)
---------------- | ---------------
name1            | name2
name3            | name4
name5            | name6

Basically, the pair name1 & name2 should be treated as the same as name2 & name1, so only one of them is returned (irrespective of which column each value is in).

I have no idea how to filter rows based on the same strings appearing in different columns.

Can someone help me with this?

4 Answers

1

You can delete the redundant rows with logic like this, which removes the row with columnA < columnB from each reversed pair:

delete from t
    where t.columnB > t.columnA and
          exists (select 1
                  from t t2
                  where t2.columnA = t.columnB and t2.columnB = t.columnA
                 );

If you don't want to actually delete the records, but simply want to return a result set without duplicates, you can use a similar query:

select t.columnA, t.columnB
from t
where t.columnA <= t.columnB  -- <= so rows whose two values are equal are not dropped
union all
select t.columnA, t.columnB
from t
where t.columnA > t.columnB and
      not exists (select 1
                  from t t2
                  where t2.columnA = t.columnB and t2.columnB = t.columnA
                 );
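
If the table can also hold exact repeats of a row, a variant along these lines may help. This is only a sketch under that assumption: it swaps `union all` for `union`, which removes identical rows from the combined result, and uses `<=` in the first branch so rows whose two values are equal are kept. The table `t` and its columns are the same as in the queries above.

```sql
-- Sketch: one row per unordered pair, even when exact repeats exist.
select t.columnA, t.columnB
from t
where t.columnA <= t.columnB          -- canonical orientation, including A = B
union                                 -- union (not union all) drops identical rows
select t.columnA, t.columnB
from t
where t.columnA > t.columnB and       -- reversed orientation, kept only if unmatched
      not exists (select 1
                  from t t2
                  where t2.columnA = t.columnB and t2.columnB = t.columnA
                 );
```

The extra duplicate elimination usually costs a sort or hash step, so the `union all` form may be preferable when exact repeats cannot occur.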

2 Comments

One nuance: if the test data also contains exact duplicates, so that (name1, name2) and (name2, name1) each appear twice in the dataset, these statements won't remove one set of those duplicates.
@Matt: It seems pretty clear that the OP's intention is to remove "duplicates", where that means two rows holding the same pair of values in reversed column order.
1
with TabX as(
 select 'name1' as ColumnA, 'name2' as ColumnB
 union all
 select 'name3' as ColumnA, 'name4' as ColumnB
 union all
 select 'name2' as ColumnA, 'name1' as ColumnB
 union all
 select 'name5' as ColumnA, 'name6' as ColumnB
)

select min(ColumnA) as ColumnA,max(ColumnB) as ColumnB
  from tabX
 group by case when ColumnA > ColumnB then ColumnA+ColumnB else ColumnB+ColumnA end
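
One caveat with a concatenated grouping key is that different pairs can produce the same string. For example, ('ab', 'c') and ('b', 'ca') both yield 'cab' under the expression above, so they would land in one group. A hedged alternative is to group on the two ordered values separately. The sketch below reuses the same sample rows; the CTE name `Ordered` and the aliases `LowName` and `HighName` are illustrative names, not part of the original answer.

```sql
with TabX as (
    select 'name1' as ColumnA, 'name2' as ColumnB
    union all
    select 'name3', 'name4'
    union all
    select 'name2', 'name1'
    union all
    select 'name5', 'name6'
),
Ordered as (
    -- Put the smaller value first so reversed pairs become identical rows.
    select case when ColumnA < ColumnB then ColumnA else ColumnB end as LowName,
           case when ColumnA < ColumnB then ColumnB else ColumnA end as HighName
    from TabX
)
select LowName as ColumnA, HighName as ColumnB
from Ordered
group by LowName, HighName   -- two-column key, so no concatenation collisions
order by LowName, HighName;
```

With the sample rows this returns (name1, name2), (name3, name4) and (name5, name6).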

1 Comment

Great Answer Mike!
0
;WITH cte AS (
    SELECT *
       ,ROW_NUMBER() OVER (PARTITION BY
          CASE WHEN ColumnA < ColumnB THEN ColumnA + ColumnB ELSE ColumnB + ColumnA END
          ORDER BY (SELECT 0)) as RowNumber
    FROM
       @Table
)

DELETE FROM cte
WHERE
    RowNumber > 1

If you want to select rather than delete, change the final statement to:

SELECT * FROM cte WHERE RowNumber = 1

Alternatively, you can use a method similar to @mike's and apply plain CASE expressions with DISTINCT to get the unique combinations:

SELECT DISTINCT 
    CASE WHEN ColumnA < ColumnB THEN ColumnA ELSE ColumnB END as ColumnA
    ,CASE WHEN ColumnA < ColumnB THEN ColumnB ELSE ColumnA END as ColumnB
FROM
    @Table

Here is some test data:

DECLARE @Table AS TABLE (ColumnA VARCHAR(10),ColumnB VARCHAR(10))
INSERT INTO @Table VALUES
('name1','name2')
,('name3','name4')
,('name2','name1')
,('name2','name1')
,('name5','name6')
,('name1','name2')
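
A table variable has to be declared before it is used in the same batch, so the test data needs to come first when running the snippets above. The following is a minimal end-to-end sketch, assuming you want the select version. It also partitions on the two ordered values rather than on their concatenation, which is a swapped-in variation (not the original query) to avoid distinct pairs such as ('a', 'bc') and ('ab', 'c') sharing the key 'abc'.

```sql
DECLARE @Table AS TABLE (ColumnA VARCHAR(10), ColumnB VARCHAR(10));

INSERT INTO @Table VALUES
 ('name1','name2')
,('name3','name4')
,('name2','name1')
,('name2','name1')
,('name5','name6')
,('name1','name2');

;WITH cte AS (
    SELECT ColumnA, ColumnB
       ,ROW_NUMBER() OVER (PARTITION BY
            -- smaller value, then larger value: same partition for reversed pairs
            CASE WHEN ColumnA < ColumnB THEN ColumnA ELSE ColumnB END
           ,CASE WHEN ColumnA < ColumnB THEN ColumnB ELSE ColumnA END
          ORDER BY ColumnA) AS RowNumber
    FROM @Table
)
SELECT ColumnA, ColumnB
FROM cte
WHERE RowNumber = 1;   -- keep one row per unordered pair
```

Against this test data the query returns three rows, one per unordered pair. Ordering by `ColumnA` inside the window makes the choice of surviving row deterministic, unlike `ORDER BY (SELECT 0)`.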


0

Here's a simple way to get a totally de-duped set of rows (per your criteria for dupes):

select t.columnA, t.columnB
from (
    select t.columnA, t.columnB, 
    row_number() over (
        partition by 
            case when t.columnA >= t.columnB then t.columnA + t.columnB 
            else t.columnB + t.columnA end 
        order by t.columnA) as rseq 
        /* order of "dupes" decided above, only first one gets rseq = 1 */
    from t
) t
where t.rseq = 1
