
I'd like to sum multiple columns in a text file similar to this:

GeneA Sample  34  7  8   16
GeneA Sample  17  7  10  91
GeneA Sample  42  9  8   11

I'd like to generate the sum at the bottom of columns 3-5 so it will look like:

GeneA Sample  34   7   8   16
GeneA Sample  17   7  10   91
GeneA Sample  42   9   8   11
              93  23  26 

I can use this for a single column but don't know how to specify a range of columns:

awk -F'\t' '{sum+=$3} END {print sum}' input_file > out

4 Answers


The easiest way is to repeat the sum for each column, i.e.:

awk -F '\t' '{
    s3 += $3
    s4 += $4
    s5 += $5
}
END {
    print s3, s4, s5
}' input_file > out

In awk:

$ awk '
{
    for(i=3;i<=NF;i++)                       # loop wanted fields
        s[i]+=$i }                           # sum to hash, index on field #
END { 
    for(i=3;i<=NF;i++)                       # same old loop
        printf "%s%s",s[i],(i==NF?ORS:OFS) } # output
' file
93 23 26 118

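Currently the for loop goes through every field from the third to the last one (NF); in the END block, NF still holds the field count of the last record read. Change the loop bounds if needed.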
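If the range changes often, the loop bounds can be passed in with -v instead of being edited in the script. This is only a sketch: the function name sum_range and the variable names first and last are made up for illustration, and it assumes whitespace- or tab-separated input with numeric values in the chosen fields.

```shell
# sum_range FIRST LAST FILE   (hypothetical helper, not from the answers above)
# Prints the per-column sums of fields FIRST..LAST on a single line.
sum_range() {
    awk -v first="$1" -v last="$2" '
    {
        for (i = first; i <= last; i++)    # only the requested fields
            s[i] += $i                     # running sum, indexed by field number
    }
    END {
        for (i = first; i <= last; i++)    # same bounds as above
            printf "%s%s", s[i], (i == last ? ORS : OFS)
    }' "$3"
}
```

For the sample data in the question, sum_range 3 5 file would print the sums of columns 3 to 5 separated by spaces; add -v OFS='\t' to the awk call if tab-separated output is wanted.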

1 Comment

Thanks, this worked great. I changed printf "%s%s" to printf "\t" "%s%s" to make the output tab-delimited.

$ awk -v OFS='\t' '{s3+=$3; s4+=$4; s5+=$5; $1=$1} 1; 
              END  {print "","",s3,s4,s5}' file

GeneA   Sample  34      7       8       16
GeneA   Sample  17      7       10      91
GeneA   Sample  42      9       8       11
                93      23      26
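The same idea can be written with a loop, so that the column range is not hard-coded. A rough sketch, assuming tab-separated input as in the question's own command; append_sums, first and last are illustrative names only.

```shell
# append_sums FIRST LAST FILE   (hypothetical helper)
# Echoes every input line unchanged, then adds one totals row in which
# the sums sit under columns FIRST..LAST and the earlier cells are empty.
append_sums() {
    awk -F'\t' -v OFS='\t' -v first="$1" -v last="$2" '
    {
        for (i = first; i <= last; i++)
            s[i] += $i                     # running sum per column
        print                              # pass the record through as-is
    }
    END {
        row = ""
        for (i = 1; i <= last; i++)        # empty cells before "first"
            row = row (i > 1 ? OFS : "") (i >= first ? s[i] : "")
        print row                          # non-integer sums are formatted via CONVFMT
    }' "$3"
}
```

Calling append_sums 3 5 file > out should give the layout asked for in the question: the original rows followed by a line with two empty cells and then the three sums.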

Try this. NF is the number of fields in the current record, and awk field numbering starts at 1, so the loop here covers column 3 through the last column.

awk '{ for(i=3;i<=NF;i++) sum[i] += $i } END { for(i=3;i<=NF;i++) printf( "%d ", sum[i] ); print "" }' input_file

If you want fewer columns, say 3 and 4, then I'd suggest:

awk '{ for(i=3;i<=4 && i<=NF;i++) sum[i] += $i } END { for(i=3;i<=4 && i<=NF;i++) printf( "%d ", sum[i] ); print "" }' input_file

2 Comments

Thanks, I tried this but it only worked for about half of the columns? I tried the third solution below and it worked well.
Hi @MeghanRudd, I'm afraid I confused you by answering more than what was asked. James Brown's solution and my first one are equivalent: both sum every column except the first and second. I provided a second solution to show how you could limit the range further; it sums only the third and fourth columns. Anyway, glad you found help here.
