Given this file:

92157768877;Sof_deme_Fort_Email_am_%yyyy%%mm%%dd%;EMAIL;20/02/2015;1;0;0
92157768877;Sof_trav_Fort_Email_am_%yyyy%%mm%%dd%;EMAIL;20/02/2015;1;0;0

91231838895;Sof_deme_faible_Email_am;EMAIL;26/01/2015;1 0;0
91231838895;Sof_nais_faible_Email_am;EMAIL;26/01/2015;1 0;0
91231838895;Sof_deme_Faible_Email_Relance_am;EMAIL;28/01/2015;1;0;0
91231838895;Sof_nais_faible_Email_Relance_am;EMAIL;28/01/2015;1;0;0
91231838895;Sof_deme_Faible_Email_Relance_am;EMAIL;30/01/2015;1;0;0

92100709652;Sof_voya_Faible_Email_am_%yyyy%%mm%%dd%;EMAIL;11/02/2015;1;0;0
92100709652 Sof_voya_Fort_Email_am_%yyyy%%mm%%dd%;EMAIL;11/02/2015;1;0;0
92100709652;Export Voya_Fort Postal;EXPORT;13/02/2015;1;0;0

92100709634;Export Voya_Fort Postal;EXPORT;15/02/2015;1;0;0
92100709634;Export Voya_Fort Postal;EXPORT;15/02/2015;1;0;0
92100709635;Deme_Voya_Fort Postal;EXPORT;16/02/2015;1;0;0

I want to get those lines that fulfill the following conditions:

  • 1st field is the same as the 1st field of the next line
  • 4th field is the same as the 4th field of the next line
  • any remaining lines whose 1st field matches the 1st field of the first line are included as well.

So that the output is like this:

92157768877;Sof_deme_Fort_Email_am_%yyyy%%mm%%dd%;EMAIL;20/02/2015;1;0;0
92157768877;Sof_trav_Fort_Email_am_%yyyy%%mm%%dd%;EMAIL;20/02/2015;1;0;0
91231838895;Sof_deme_faible_Email_am;EMAIL;26/01/2015;1 0;0
91231838895;Sof_nais_faible_Email_am;EMAIL;26/01/2015;1 0;0
91231838895;Sof_deme_Faible_Email_Relance_am;EMAIL;28/01/2015;1;0;0
91231838895;Sof_nais_faible_Email_Relance_am;EMAIL;28/01/2015;1;0;0
91231838895;Sof_deme_Faible_Email_Relance_am;EMAIL;30/01/2015;1;0;0
92100709652;Sof_voya_Faible_Email_am_%yyyy%%mm%%dd%;EMAIL;11/02/2015;1;0;0
92100709652 Sof_voya_Fort_Email_am_%yyyy%%mm%%dd%;EMAIL;11/02/2015;1;0;0
92100709652;Export Voya_Fort Postal;EXPORT;13/02/2015;1;0;0

I tried the awk command below, but something is wrong: I cannot add the condition on the fourth field. And how should I select the subsequent lines?

awk -F";" 'FNR==NR{a[$1]++; next} && FNR==NR{a[$4]++; next} a[$1]==2  a[$4]==2' filetestv2.txt filetestv2.txt
  • So you want to compare lines in blocks based on the 1st and 2nd field. If their 4th field also matches, then print both; otherwise, print just the first? Commented Mar 12, 2015 at 13:40
  • Hi @fedorqui, a line is selected if the 1st field of the 1st line matches the 1st field of the 2nd line and the 4th field of the 1st line matches the 4th field of the 2nd line. The remaining lines are selected too if their 1st field matches the 1st field of the 1st line. Is that clear? Commented Mar 12, 2015 at 13:46
  • Mmmm a bit more. But I don't understand the sample input with a line appearing three times. How should this be handled? Commented Mar 12, 2015 at 14:03
  • @fedorqui if the 1st field of line 3 is the same as the 1st field of line 1, it must be flagged as well. If there is a fourth line whose 1st field is the same as the 1st field of line 1, it must be flagged as well, and so forth. Commented Mar 12, 2015 at 14:15
  • You should define what "flagged" means: remove it? Keep it with some extra info? I would recommend going through How do I ask a good question? again. Commented Mar 12, 2015 at 14:21

1 Answer

Based on a discussion we had in chat, what you want is to print all lines whose 1st and 4th fields are the same as the 1st and 4th fields of another line. If so, you can do:

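awk -F';' '
    # Compare each line with the previous one; matching lines must be adjacent.
    function emit(   i) {                # print the buffered run if it has 2+ lines
        if (n > 1) for (i = 1; i <= n; i++) print a[i]
        n = 0
    }
    !NF { next }                         # ignore blank lines
    $1 != l1 || $4 != l4 { emit() }      # key changed: finish the previous run
    { a[++n] = $0; l1 = $1; l4 = $4 }    # buffer the line and remember its key
    END { emit() }
' file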

Or, in Perl:

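perl -F';' -ane 'next unless /\S/;                          # skip blank lines
                 $key = "$F[0];$F[3]";                      # group on fields 1 and 4
                 push @order, $key unless exists $k{$key};  # remember first-seen order
                 $k{$key} .= $_;                            # collect the lines of each group
                 $l{$key}++;                                # count the lines of each group
                 END{
                    foreach $key (@order){
                        print $k{$key} if $l{$key} > 1      # print groups of 2+ lines
                    }
                }' file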
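
The attempt in the question reads the file twice with the FNR==NR idiom. A minimal sketch of that two-pass approach, applied to the rule stated at the top of this answer (same 1st and 4th fields as some other line), might look like the following. The function name filter_dups is a hypothetical helper, not something from the original post, and skipping blank lines is an assumption about the input.

```shell
# Two-pass sketch: the same file is read twice, so matching lines need not be adjacent.
# filter_dups is a hypothetical helper name, not part of the original answer.
filter_dups() {
    awk -F';' '
        !NF { next }                          # ignore blank lines in both passes
        FNR == NR { seen[$1 FS $4]++; next }  # pass 1: count each (field 1, field 4) pair
        seen[$1 FS $4] > 1                    # pass 2: print lines whose pair occurs 2+ times
    ' "$1" "$1"
}
```

Called as filter_dups filetestv2.txt, it prints the selected lines in their original order. Because the counts are collected before anything is printed, this variant does not rely on matching lines being next to each other, at the cost of reading the file twice.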