3

I have multiple files (10+) that I want to merge into one output file, joined on the first column. For example:

file 1

2000 0.0202094
2001 0.0225532
2002 0.02553
2003 0.0261099
2004 0.0280311
2005 0.028843

file 2

2000 0.0343179
2001 0.036318
2003 0.0395799
2004 0.0412106
2005 0.041264

file 3

2004 0.0686893
2005 0.0645474

All files have the same two columns and are of unequal length.

The desired output would be:

        file1       file2      file3
2000    0.0202094   0.0343179
2001    0.0225532   0.036318
2002    0.02553
2003    0.0261099   0.0395799
2004    0.0280311   0.0412106   0.0686893
2005    0.028843    0.041264    0.0645474

I have tried the following code, but the values don't align with the first column:

awk '{printf($1); for(i=2;i<=NF;i+=2) printf ("\t%s", $i); printf "\n"}' <(paste file*) > mergedfile.txt
2
  • What have you tried and where are you stuck? Commented Jun 19, 2019 at 1:40
  • I can run this, but it doesn't align the values with the first column: awk '{printf($1); for(i=2;i<=NF;i+=2) printf ("\t%s", $i); printf "\n"}' <(paste file*) > masterfile.txt Commented Jun 19, 2019 at 1:44

2 Answers

1

You can run awk on all the files in one pass, grouping on the first column. The expression ($1 in map)?(map[$1] FS $2):($2) is a ternary: if the key $1 is already in the array map, $2 is appended to the value stored for that key; otherwise $2 is stored as the first value.

awk '{ map[$1] = ($1 in map)?(map[$1] FS $2):($2); } 
     END { for(i in map) print i, map[i] }' file*

To make the output more readable, sort the rows by key (awk's for (i in map) loop visits keys in an unspecified order) and align the columns with column -t:

awk '{ map[$1] = ($1 in map)?(map[$1] FS $2):($2); }
     END { for(i in map) print i, map[i] }' file* | sort -n | column -t > mergedfile.txt
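One thing to watch for when values are appended in arrival order: if a key is missing from a middle file but present in a later one, the later value slides into the earlier file's column. A minimal sketch that keeps one slot per file follows. The helper name merge_columns, the NA placeholder, the "key" header and the a.txt/b.txt/c.txt sample data are assumptions for illustration, not part of the answer above.

```shell
# merge_columns: hypothetical helper. Prints a header, then one row per key
# with one column per input file, using NA where a file has no value.
merge_columns() {
    # Header line built from the file names as given on the command line.
    printf 'key'
    printf '\t%s' "$@"
    printf '\n'
    awk '
        # Remember every key and store each value under (key, file name).
        NF >= 2 { keys[$1] = 1; val[$1, FILENAME] = $2 }
        END {
            for (k in keys) {
                row = k
                # Walk the files in command-line order so columns stay aligned.
                for (f = 1; f < ARGC; f++)
                    row = row "\t" (((k, ARGV[f]) in val) ? val[k, ARGV[f]] : "NA")
                print row
            }
        }
    ' "$@" | sort -n   # for (k in keys) has no defined order, so sort the rows
}

# Made-up sample data, written to a temporary directory in a subshell.
(
    cd "$(mktemp -d)" || exit 1
    printf '2000 0.1\n2001 0.2\n2002 0.3\n' > a.txt
    printf '2000 0.4\n2002 0.5\n' > b.txt
    printf '2001 0.6\n2002 0.7\n' > c.txt
    merge_columns a.txt b.txt c.txt
)
```

With the files from the question this would be called as merge_columns file1 file2 file3 | column -t > mergedfile.txt. In the sample, 2001 is absent from b.txt, and its row keeps NA in the b.txt column instead of letting the c.txt value shift left.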
1
  • map[$1] = map[$1]?... will fail when map[$1] is zero. You should be testing map[$1] = ($1 in map)?... instead. Commented Jun 21, 2019 at 16:53
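The difference this comment describes can be seen with a tiny made-up input in which the first value for a key is 0. The sketch below is only an illustration: the helper names merge_truthy and merge_membership are hypothetical, and the membership version is written as an if/else so that the order of the test and the assignment is explicit.

```shell
# Hypothetical helpers; both read "key value" lines from standard input.

# Truthiness test: a stored value of 0 counts as false.
merge_truthy() {
    awk '{ map[$1] = map[$1] ? (map[$1] FS $2) : ($2) }
         END { for (i in map) print i, map[i] }'
}

# Membership test: only asks whether the key has been seen before.
merge_membership() {
    awk '{ if ($1 in map) map[$1] = map[$1] FS $2; else map[$1] = $2 }
         END { for (i in map) print i, map[i] }'
}

# Made-up input: key 2000 appears twice and its first value is 0.
printf '2000 0\n2000 0.5\n' | merge_truthy       # the stored 0 tests false, so 0.5 replaces it
printf '2000 0\n2000 0.5\n' | merge_membership   # the key exists, so 0.5 is appended after 0
```

The first call loses the 0 and prints only 0.5 for key 2000, while the second keeps both values.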
0

This can be done with the script below, in three steps.

Step 1

awk '{print $1}' file1 file2 file3 | sort -u > pattern_content

Step 2

for i in $(awk '{print $1}' file1 file2 file3 | sort -u); do if grep -q "^$i[[:space:]]" file1; then awk -v k="$i" '$1 == k {print $2}' file1; else echo "                                "; fi; done > file1_o

for i in $(awk '{print $1}' file1 file2 file3 | sort -u); do if grep -q "^$i[[:space:]]" file2; then awk -v k="$i" '$1 == k {print $2}' file2; else echo "                                "; fi; done > file2_o

for i in $(awk '{print $1}' file1 file2 file3 | sort -u); do if grep -q "^$i[[:space:]]" file3; then awk -v k="$i" '$1 == k {print $2}' file3; else echo "                                "; fi; done > file3_o

Step 3

 paste pattern_content file1_o file2_o file3_o|sed '1i                 file1          file2               file3'| sed "s/file1/\t&/g"

Output

        file1       file2      file3
2000    0.0202094   0.0343179
2001    0.0225532   0.036318
2002    0.02553
2003    0.0261099   0.0395799
2004    0.0280311   0.0412106   0.0686893
2005    0.028843    0.041264    0.0645474
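Since the question mentions 10+ files, the same three steps can be wrapped in a loop rather than repeated per file. This is a rough sketch under assumptions, not a tested drop-in: the helper name merge_with_paste and the "-" placeholder for a missing value are hypothetical, no header line is printed, and every input line is assumed to hold a key followed by a value.

```shell
# merge_with_paste: hypothetical helper that repeats steps 1 to 3 for any
# number of files. The body runs in a subshell so its variables stay local.
merge_with_paste() (
    workdir=$(mktemp -d) || exit 1
    # Step 1: every distinct key, sorted, one per line.
    awk '{ print $1 }' "$@" | sort -u > "$workdir/keys"
    cp "$workdir/keys" "$workdir/merged"
    for f in "$@"; do
        # Step 2: for each key, this file's value, or "-" if the key is absent.
        awk -v src="$f" '
            FILENAME == src { val[$1] = $2; next }
            { print (($1 in val) ? val[$1] : "-") }
        ' "$f" "$workdir/keys" > "$workdir/column"
        # Step 3: append the new column to what has been merged so far.
        paste "$workdir/merged" "$workdir/column" > "$workdir/next"
        mv "$workdir/next" "$workdir/merged"
    done
    cat "$workdir/merged"
    rm -r "$workdir"
)
```

It could be called as merge_with_paste file1 file2 file3, and the tab-separated result piped through column -t for alignment. Using a visible placeholder rather than a run of spaces keeps the columns distinguishable after column -t.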
