Original data (abc.csv):
8|AAAAA_001|0|
8|AAAAA_002|0|
8|AAAAA_003|0|
8|AAAAA_004|0|
8|AAAAA_005|0|AAAAA_005
8|AAAAA_006|0|
9|BBBBB_001|0|
9|BBBBB_002|0|
9|BBBBB_003|0|BBBBB_003
9|BBBBB_004|0|
9|BBBBB_005|0|
9|BBBBB_901|0|
10|CCCCC_001|0|
10|CCCCC_002|0|
10|CCCCC_003|0|
10|CCCCC_004|0|
Expected result:
8|AAAAA|0|AAAAA
9|BBBBB|0|BBBBB
10|CCCCC|0
Any idea? Thanks
This is what I have tried so far, but it still shows duplicated rows when the 4th field ($4) contains data:
cat abc.csv | awk 'BEGIN{FS="|";OFS="|"}
{print $1,substr($2,1,5),$3,substr($4,1,5)}' |
sort -t "|" -k 2 | uniq > abc_final.csv
Hello, ... on them have duplicated 1st and 2nd fields. In fact, only the first and second lines are duplicated.
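The duplicates probably come from uniq comparing whole lines: after the substr step, `8|AAAAA|0|` and `8|AAAAA|0|AAAAA` are different lines, so both survive. A minimal sketch of one way around this, assuming a group is identified by fields 1 to 3 (with field 2 cut to its first 5 characters) and that any non-empty 4th field in the group should be kept. The array names `seen` and `order` are only illustrative, and the heredoc just recreates the sample data from the question:

```shell
# Recreate the sample data from the question.
cat > abc.csv <<'EOF'
8|AAAAA_001|0|
8|AAAAA_002|0|
8|AAAAA_003|0|
8|AAAAA_004|0|
8|AAAAA_005|0|AAAAA_005
8|AAAAA_006|0|
9|BBBBB_001|0|
9|BBBBB_002|0|
9|BBBBB_003|0|BBBBB_003
9|BBBBB_004|0|
9|BBBBB_005|0|
9|BBBBB_901|0|
10|CCCCC_001|0|
10|CCCCC_002|0|
10|CCCCC_003|0|
10|CCCCC_004|0|
EOF

awk 'BEGIN { FS = OFS = "|" }
{
    # Group key: field 1, first 5 characters of field 2, field 3.
    key = $1 OFS substr($2, 1, 5) OFS $3
    # Remember each key once, in order of first appearance.
    if (!(key in seen)) { order[++n] = key; seen[key] = "" }
    # Keep the shortened 4th field if any line of the group has one.
    if ($4 != "") seen[key] = substr($4, 1, 5)
}
END {
    # One output line per group.
    for (i = 1; i <= n; i++) print order[i], seen[order[i]]
}' abc.csv > abc_final.csv

cat abc_final.csv
```

With this approach sort and uniq are not needed, and the groups come out in the order they first appear in the input. Note that a group with no 4th field is printed with a trailing separator (`10|CCCCC|0|`), which differs slightly from the expected result as written in the question.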