Comparing two files using Awk

Question

I have two text files: one with a list of ids, and another with some ids and their corresponding values.

File 1

abc
abcd
def
cab
kac

File 2

abcd   100
def    200
cab    500
kan    400

I want to compare the two files and fetch the value for each matching id, while keeping every id from File 1 and assigning "NA" to the ids that have no value in File 2.

Desired output

abc     NA
abcd    100
def     200
cab     500
kac     NA

PS: Only Awk script/One-liners

The code I'm using to print matching columns:

awk 'FNR==NR{a[$1]++;next}a[$1]{print $1,"\t",$2}'

So, what did you try?

– James Brown, Commented Sep 23, 2016 at 6:39
I was only able to print the matching values.

– arupgsh, Commented Sep 23, 2016 at 6:40
Please add that code to your question.

– James Brown, Commented Sep 23, 2016 at 6:41
@JamesBrown Added

– arupgsh, Commented Sep 23, 2016 at 6:46
Is it important to keep the order of file1?

– rudimeier, Commented Sep 23, 2016 at 7:11

James Brown · Accepted Answer · 2016-09-23 06:48:40Z

3

$ awk 'NR==FNR{a[$1]=$2;next} {print $1,  ($1 in a? a[$1]: "NA") }' file2 file1
abc NA
abcd 100
def 200
cab 500
kac NA
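
As a rough sketch of how the one-liner above works: it reads file2 first and stores each id's value in an array, then reads file1 and prints every id with either its stored value or "NA". The version below spells that out with comments. The scratch directory, the array name `value`, and the tab separator set through `OFS` are assumptions added for illustration (the question's desired output looks tab-aligned); the sample data is copied from the question.

```shell
# Sample data from the question, written to a scratch directory (hypothetical paths).
dir=$(mktemp -d)
printf '%s\n' abc abcd def cab kac > "$dir/file1"
printf '%s\n' 'abcd   100' 'def    200' 'cab    500' 'kan    400' > "$dir/file2"

result=$(awk '
    # First file on the command line (file2): remember the value for each id.
    NR == FNR { value[$1] = $2; next }
    # Second file (file1): print every id, with its value or "NA" if none was stored.
    { print $1, ($1 in value ? value[$1] : "NA") }
' OFS='\t' "$dir/file2" "$dir/file1")

printf '%s\n' "$result"
rm -rf "$dir"
```

Because file1 drives the output loop, the ids come out in file1's original order, and ids that appear only in file2 (such as kan) are never printed.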


rudimeier · Answer · 2016-09-23 07:39:29Z

Using join and sort (hopefully portable):

export LC_ALL=C
sort -k1 file1 > /tmp/sorted1
sort -k1 file2 > /tmp/sorted2
join -a 1 -e NA -o 0,2.2 /tmp/sorted1 /tmp/sorted2

In bash you can use process substitution to do this in a single line:

LC_ALL=C join -a 1 -e NA -o 0,2.2 <(LC_ALL=C sort -k1 file1) <(LC_ALL=C sort -k1 file2)

Note 1: this gives output sorted by the first column rather than in file1's original order:

abc NA
abcd 100
cab 500
def 200
kac NA

Note 2: the commands may work even without LC_ALL=C. What matters is that all sort and join commands use the same locale.
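
If file1's original order does matter, one possible workaround (not part of the answer above, just a sketch) is to tag each id with its line number before sorting, carry that number through the join, and sort on it afterwards. The scratch directory and the variable name `ordered` are assumptions for illustration; the sample data is copied from the question.

```shell
export LC_ALL=C
# Sample data from the question, written to a scratch directory (hypothetical paths).
dir=$(mktemp -d)
printf '%s\n' abc abcd def cab kac > "$dir/file1"
printf '%s\n' 'abcd   100' 'def    200' 'cab    500' 'kan    400' > "$dir/file2"

# Decorate: append each id's line number so the original order can be restored.
awk '{ print $1, NR }' "$dir/file1" | sort -k1,1 > "$dir/sorted1"
sort -k1,1 "$dir/file2" > "$dir/sorted2"

# Join on the id, then sort by the saved line number and drop it again.
ordered=$(join -a 1 -e NA -o 1.2,0,2.2 "$dir/sorted1" "$dir/sorted2" |
    sort -n -k1,1 | cut -d' ' -f2-)

printf '%s\n' "$ordered"
rm -rf "$dir"
```

Here `-o 1.2,0,2.2` puts the saved line number first, followed by the join field and the value from file2, so the final `sort -n` and `cut` can restore file1's order and strip the helper column.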
