0

I need to sum all the integer values in a specific column of a file and print the result. Here is part of my file:

<<< The program found 0 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

<<< The program found 0 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

<<< The program found 1 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

<<< The program found 0 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

<<< The program found 0 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

<<< The program found 2 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

<<< The program found 0 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

<<< The program found 0 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

<<< The program found 2 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

<<< The program found 0 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

<<< The program found 1 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

I want to print the sum of the 5th column, that is the total number of rare variants. In this example, it should print 6.

I tried the following command (which did not work):

grep "rare variants for the gene ARIH1" fileName | tail -n+2 | awk -F " " '{sum+=$5} END {print sum}'

This command prints 1, which is wrong.

How can I do this? Thanks!

  • awk '/rare variants for the gene ARIH1/{sum += ($5 + 0)} END{print sum}' filename should do the job. Commented Aug 29, 2019 at 14:44
  • @Davide your script worked when I tried it. Commented Aug 29, 2019 at 14:47
  • Thanks, but they don't work for me... they both print 1 as the result :-/ Commented Aug 29, 2019 at 14:50
  • What is the output of grep "rare variants for the gene ARIH1" fileName | tail -n+2? And why did you add this part to your question? Commented Aug 29, 2019 at 14:58
  • Please show the output of awk '/rare variants for the gene ARIH1/{print $5}' file Commented Aug 29, 2019 at 14:59

2 Answers

0

Try this awk script:

awk 'BEGIN{sum=0} {if ($0 ~ /rare variants for the gene ARIH1/) sum+=$5} END{ print "Sum is ",sum}' fileName

Or, in a shorter form:

awk '/rare variants for the gene ARIH1/{sum+=$5} END{print "Sum is ",sum}' fileName

How it works: for each line, awk checks whether the pattern rare variants for the gene ARIH1 matches (if ($0 ~ /pattern/)). If it does, the value in column 5 is added to sum. After the last line has been read, the END block prints the total.
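
As a minimal, self-contained sketch of the same idea, the snippet below writes three lines from the question into a temporary file and sums column 5 of the matching lines. The variable names sample and total are made up for illustration, and the "+ 0" in the END block is a small addition so that 0 is printed (rather than an empty line) if nothing matches.

```shell
# Create a temporary file holding a small sample of the input.
# ("sample" and "total" are hypothetical names used only in this sketch.)
sample=$(mktemp)
cat > "$sample" <<'EOF'
<<< The program found 0 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

<<< The program found 1 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<

<<< The program found 2 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<
EOF

# Add field 5 of every matching line; blank lines do not match and are skipped.
# "sum + 0" forces a numeric result even when no line matches.
total=$(awk '/rare variants for the gene ARIH1/ {sum += $5} END {print sum + 0}' "$sample")
echo "$total"

# Clean up the temporary file.
rm -f "$sample"
```

With this sample the counts are 0, 1 and 2, so the command should print 3.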

0
awk -F " " '/<<< The/ {sum += $5} END {print sum}' file

The above command should give you the required output. You can omit the -F switch, since awk splits fields on whitespace by default.
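
A quick way to check that claim is to extract field 5 from one line of the question's input both with and without the switch. This is only a sketch; the variable names line, with_flag and without_flag are made up for illustration.

```shell
# One line copied from the question's input (variable names are hypothetical).
line='<<< The program found 2 rare variants for the gene ARIH1 for this HEALTHY_CONTROL <<<'

# Field 5 with an explicit single-space separator...
with_flag=$(printf '%s\n' "$line" | awk -F " " '{print $5}')

# ...and with awk's default field splitting.
without_flag=$(printf '%s\n' "$line" | awk '{print $5}')

echo "with -F: $with_flag, without -F: $without_flag"
```

Both variants should report 2 here, because a single space passed to -F is the same as awk's default field separator.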

8 Comments

Don't be root.
@tripleee, thanks for the comment. Please reach me at [email protected](e-mail) so we can discuss this there.
What's to discuss? The reasons to not shoot yourself in the foot are well-documented and widely understood.
I used an online VM to run that command. That VM had already logged me in as root: "bellard.org/jslinux/vm.html?url=https://bellard.org/jslinux/…".
That's just coincidental, and not something you should include in your answer.