Revisions to Merging 2 files based on field match

added 83 characters in body

edited Mar 31, 2016 at 18:39

$ awk 'FNR==NR{a[$1]=$2;next} ($1 in a) {print $1,a[$1],$2}' file2 file1
aa 45 32
bb 31 15
cc 50 78

Explanation:

awk implicitly loops through each file, one line at a time. Since we gave it file2 as the first argument, it is read first. file1 is read second.

FNR==NR{a[$1]=$2;next}

NR is the number of lines that awk has read so far and FNR is the number of lines that awk has read so far from the current file. Thus, if FNR==NR, we are still reading the first named file: file2. For every line in file2, we assign a[$1]=$2.
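
To see how the two counters diverge, here is a minimal sketch that prints FILENAME, FNR, and NR for every line. The file contents and the scratch directory are hypothetical sample data chosen for illustration; they are not taken from the original question.

```shell
# Hypothetical sample inputs in a scratch directory (not from the original answer).
workdir=$(mktemp -d)
printf 'aa 45\nbb 31\n' > "$workdir/file2"
printf 'aa 32\nbb 15\ncc 78\n' > "$workdir/file1"

# Print the file name, FNR, and NR for each line, reading file2 first.
counters=$(cd "$workdir" && awk '{print FILENAME, FNR, NR}' file2 file1)
printf '%s\n' "$counters"
# file2 1 1
# file2 2 2
# file1 1 3
# file1 2 4
# file1 3 5

rm -r "$workdir"
```

FNR restarts at 1 when awk moves on to file1 while NR keeps counting, so the two are equal only while the first file is being read.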

Here, a is an associative array and a[$1]=$2 means saving file2's second column, denoted $2, as a value in array a using file2's first column, $1, as the key.
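
As a rough illustration of the associative array on its own, the sketch below fills a from two hypothetical input lines and then queries it in an END block. The key "zz" is deliberately absent so that the in test returns 0.

```shell
# Store column 2 under the key in column 1, then query the array at the end.
# The two input lines are hypothetical sample data.
lookup=$(printf 'aa 45\nbb 31\n' | awk '{a[$1]=$2} END {print a["aa"], a["bb"], ("zz" in a)}')
printf '%s\n' "$lookup"
# 45 31 0
```

Using "zz" in a rather than a["zz"] matters here: the in operator tests for the key without creating an empty entry for it.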

next tells awk to skip the rest of the commands and start over with the next line.

($1 in a) {print $1,a[$1],$2}

If we get here, that means that we are reading the second file: file1. If the first field of the current line also appeared in file2 (that is, it is a key in array a), then we print the key followed by the field 2 values from file2 and file1, in that order.
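
For anyone who wants to try the command end to end, here is a sketch with sample inputs. The aa, bb, and cc rows are inferred from the output shown above; the dd and ee rows are hypothetical additions meant to show that keys present in only one file are dropped.

```shell
# Sample inputs in a scratch directory. Rows aa/bb/cc are inferred from the
# output in the answer; rows dd and ee are hypothetical unmatched keys.
workdir=$(mktemp -d)
printf 'aa 32\nbb 15\ncc 78\ndd 99\n' > "$workdir/file1"
printf 'aa 45\nbb 31\ncc 50\nee 7\n' > "$workdir/file2"

# Same command as in the answer: load file2 into the array, then filter file1.
merged=$(cd "$workdir" && awk 'FNR==NR{a[$1]=$2;next} ($1 in a) {print $1,a[$1],$2}' file2 file1)
printf '%s\n' "$merged"
# aa 45 32
# bb 31 15
# cc 50 78

rm -r "$workdir"
```

The output follows the line order of file1, since that is the file being scanned when the print runs.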

added: a[$1]=$2 means saving second column indexed based on the first column (of file2)

edit approved Mar 31, 2016 at 17:57

Fatemeh

$ awk 'FNR==NR{a[$1]=$2;next} ($1 in a) {print $1,a[$1],$2}' file2 file1
aa 45 32
bb 31 15
cc 50 78

Explanation:

awk implicitly loops through each file, one line at a time. Since we gave it file2 as the first argument, it is read first. file1 is read second.

FNR==NR{a[$1]=$2;next}

NR is the number of lines that awk has read so far and FNR is the number of lines that awk has read so far from the current file. Thus, if FNR==NR, we are still reading the first named file: file2. For every line in file2, we assign a[$1]=$2.
a[$1]=$2 means saving file2's second column in array a, indexed by file2's first column.

next tells awk to skip the rest of the commands and start over with the next line.

($1 in a) {print $1,a[$1],$2}

If we get here, that means that we are reading the second file: file1. If we saw the first field of the line in file2, as determined by the contents of array a, then we print out a line with the values of field 2 from both files.

added 844 characters in body

edited Oct 8, 2014 at 4:48

John1024

$ awk 'FNR==NR{a[$1]=$2;next} ($1 in a) {print $1,a[$1],$2}' file2 file1
aa 45 32
bb 31 15
cc 50 78

Explanation:

awk implicitly loops through each file, one line at a time. Since we gave it file2 as the first argument, it is read first. file1 is read second.

FNR==NR{a[$1]=$2;next}

NR is the number of lines that awk has read so far and FNR is the number of lines that awk has read so far from the current file. Thus, if FNR==NR, we are still reading the first named file: file2. For every line in file2, we assign a[$1]=$2.

next tells awk to skip the rest of the commands and start over with the next line.

($1 in a) {print $1,a[$1],$2}

If we get here, that means that we are reading the second file: file1. If we saw the first field of the line in file2, as determined by the contents of array a, then we print out a line with the values of field 2 from both files.

answered Oct 8, 2014 at 4:41

John1024
