Return to Revisions

1 of 3

answered Jul 29, 2013 at 0:23

40.5k
8
113
146

Here's a solution in perl. You should save the following code in a file and run it as a script (see below):

#!/usr/bin/perl
$file1 = '/path/to/file1';
$file2 = '/path/to/file2';
open $f1,'<',$file1;
open $f2,'<',$file2;
while(<$f1>){
    ($c1,$c2,$c4,$c5) = (split / /)[0,1,3,4]; #get relevant columns in file 1
    $lines_dictionary{"$c1 $c2 $c4"}="$c5---$_"; #create a hash entry keyed by the relevant columns
}
while(<$f2>){
    ($c1,$c2,$c4,$c5) = (split / /)[0,1,3,4]; #get relevant columns in file 2
    if(exists $lines_dictionary{"$c1 $c2 $c4"}){ #if a line with similar columns was seen in file 1
        ($file1_c5,$file1_line) = split /---/,$lines_dictionary{"$c1 $c2 $c4"}; #parse the hash entry this line in file 1
        if($file1_c5 -ne $c5){ #if column 5 of file 2 doesn't match column 5 of file 1
            print "$file1_line\n$_\n\n";
        }
    }
}
close $f1;
close $f2;

Use any text editor to paste this script into a file, modify the $file1 and $file2 variables to reflect the true locations of your files, then make the script executable by doing:

$ chmod +x /path/to/script

Finally, call the script:

$ /path/to/script

Disclaimer

This code is untested
This code assumes the pattern '---' is unlikely to occur in the 5th column.
This code assumes the lines in file 1 are unique (i.e. that each line has a different combination of "column1 column2 column4"). If there are multiple lines (not necessarily consecutive) containing the same data in the relevant columns, the script will use the last of these lines.

answered Jul 29, 2013 at 0:23

Joseph R.

40.5k
8
113
146