0

I am trying to execute the following command:

for i in test1.txt do awk '$1==$i {sum +=$4}END {print sum}' test2.txt

Where test1.txt looks like:

A
B
C
D
E

But it's not working. What I want to achieve is: for each letter in test1.txt, find all the rows of test2.txt that have the same letter in their first column, and sum the values in the 4th column of those rows.

1
  • Can you give us an example of what test2.txt looks like, and what you want the output to look like? Commented Mar 26, 2018 at 6:56

2 Answers

2

You can use awk alone here.

awk 'NR==FNR{a[$1]++; next} ($1 in a) {sum+=$4} END{print sum}' test1.txt test2.txt
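
Note that the one-liner above prints a single total over all rows whose first column appears in the first file. If a separate total per letter is what's wanted, a variant along these lines might do it. This is only a sketch: the contents of test2.txt were never shown, so the sample rows, the temporary directory, and the array names (seen, order, sum) are all made up for illustration, and it assumes whitespace-separated columns with a number in column 4.

```shell
# Hypothetical sample data; the real test2.txt was not shown in the question.
dir=$(mktemp -d)
printf '%s\n' A B C > "$dir/test1.txt"
cat > "$dir/test2.txt" <<'EOF'
A x y 1
B x y 2
A x y 3
Z x y 100
EOF

# First file: remember each letter, in order of appearance.
# Second file: add column 4 to the running sum of the matching letter.
per_letter=$(awk '
    NR == FNR {
        if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }
        next
    }
    $1 in seen { sum[$1] += $4 }
    END {
        # "+ 0" forces a numeric 0 for letters that matched no rows.
        for (k = 1; k <= n; k++) print order[k], sum[order[k]] + 0
    }
' "$dir/test1.txt" "$dir/test2.txt")

printf '%s\n' "$per_letter"
```

With the sample rows this prints one line per letter (A 4, B 2, C 0); the Z row is ignored because Z is not in the first file.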
2

The reason this does not work as written is that the awk program is inside single quotes, so the shell never expands $i. awk sees $i literally and interprets it as "the i-th field". Since the awk variable i has no value, you will get an error, or, if you are using GNU awk or mawk, $i will be the same as $0, which is the whole line. In that case the program looks for lines whose first column is the same as the whole line.

Instead, to "import" your shell variable into awk:

awk -v i="$i" '$1 == i { sum += $4 } END { print sum }' test2.txt

Also, the value of the shell variable $i will only ever be the name of the file, test1.txt, since that is what you loop over. (The loop as posted is also missing the ; before do and the closing done.)

To loop over the contents of the file:

while IFS= read -r i; do
    awk ...as above...
done <test1.txt
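
Put together, the loop might look like the sketch below. The sample rows and the temporary directory are made up, since test2.txt was not shown. It also prints the letter next to its sum and uses sum + 0 so that a letter with no matching rows gives 0 rather than an empty line; both are small additions of mine, not part of the commands above.

```shell
# Hypothetical sample data; the real test2.txt was not shown in the question.
dir=$(mktemp -d)
printf '%s\n' A B C > "$dir/test1.txt"
cat > "$dir/test2.txt" <<'EOF'
A x y 1
B x y 2
A x y 3
EOF

# Read test1.txt line by line; run awk once per letter,
# passing the letter in with -v.
result=$(
    while IFS= read -r i; do
        awk -v i="$i" '$1 == i { sum += $4 } END { print i, sum + 0 }' "$dir/test2.txt"
    done < "$dir/test1.txt"
)

printf '%s\n' "$result"
```

This runs awk once per line of test1.txt, which is fine for a handful of letters but slower than a single awk pass when either file is large.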

αғsнιη's answer shows how you can do this without using a shell loop.

1
  • Alternatively, one could put the variable into awk's environment and use ENVIRON["i"] Commented Mar 26, 2018 at 6:50
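
A minimal sketch of the ENVIRON approach from the comment above, with made-up sample rows. The assignment i=A in front of the command puts the variable into the environment of that one awk process only. Unlike -v, which interprets backslash escapes in the value, ENVIRON hands the value over as-is.

```shell
# Hypothetical sample rows standing in for test2.txt.
data='A x y 1
B x y 2
A x y 3'

# i=A is exported to this single awk invocation; awk reads it via ENVIRON.
sum_a=$(printf '%s\n' "$data" |
    i=A awk '$1 == ENVIRON["i"] { sum += $4 } END { print sum + 0 }')

echo "$sum_a"
```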
