Edited for correct expected output

edited May 18, 2019 at 12:31

I have a file tmp.log with fields like

description ID  valueA valueB valueC
xxx         x    1       1     1
yyy         y    3       100    23
zzz         z    0       0      0
aaa         a    4       4      4

I would like to remove the data points that have the same value in all of the 'value' columns

description ID  valueA valueB valueC
yyy         y    3       100    23
aaa         a    4       4      4

I am using

cat tmp.log | tail -n+2 | awk '!a[$3$4$5]++'

But it still prints the redundant values. Why is this wrong, and how can I correct it?
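
For readers comparing the two behaviours, here is a minimal sketch. It assumes the intent is to drop rows whose three value fields are all equal to each other, which is one reading of the description. `!a[$3$4$5]++` only suppresses a row when an earlier row produced the same concatenated key, so it deduplicates across rows and never compares fields within a row. In the sample data the four keys all differ, so every row is printed. The temporary file and the variable names `dedup` and `filtered` are illustrative, not from the original post.

```shell
# Sample data copied from the question, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
description ID  valueA valueB valueC
xxx         x    1       1     1
yyy         y    3       100    23
zzz         z    0       0      0
aaa         a    4       4      4
EOF

# The command from the question: keeps the first row seen for each
# distinct key $3$4$5. The keys here ("111", "310023", "000", "444")
# are all different, so no row is dropped.
dedup=$(tail -n +2 "$tmp" | awk '!a[$3$4$5]++')

# Assumed intent: keep the header (NR == 1) plus every row whose three
# value fields are NOT all equal to each other.
filtered=$(awk 'NR == 1 || !($3 == $4 && $4 == $5)' "$tmp")

printf '%s\n' "$filtered"
rm -f "$tmp"
```

Under that assumption only the header and the yyy row are printed, since the aaa row also has three equal values. Separately, if deduplicating across rows is what is wanted, `a[$3,$4,$5]` is a safer key than `a[$3$4$5]`, because plain concatenation cannot tell fields `1` and `11` from `11` and `1`.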

I have a file tmp.log with fields like

description ID  valueA valueB valueC
xxx         x    1       1     1
yyy         y    3       100    23
zzz         z    0       0      0
aaa         a    4       4      4

I would like to remove the data points that have the same value in all of the 'value' columns

description ID  valueA valueB valueC
yyy         y    3       100    23

I am using

cat tmp.log | tail -n+2 | awk '!a[$3$4$5]++'

But it still prints the redundant values. Why is this wrong, and how can I correct it?

edited tags

edited May 18, 2019 at 10:46

Jeff Schaller ♦

Became Hot Network Question

occurred May 18, 2019 at 7:15

asked May 18, 2019 at 5:21

kris

How to remove duplicate values based on multiple columns

I have a file tmp.log with fields like

description ID  valueA valueB valueC
xxx         x    1       1     1
yyy         y    3       100    23
zzz         z    0       0      0
aaa         a    4       4      4

I would like to remove the data points that have the same value in all of the 'value' columns

description ID  valueA valueB valueC
yyy         y    3       100    23
aaa         a    4       4      4

I am using

cat tmp.log | tail -n+2 | awk '!a[$3$4$5]++'

But it still prints the redundant values. Why is this wrong, and how can I correct it?

bash awk
