linux-bash read data from csv and sum up values conditionally

Question

I'm new to shell scripting and have been working on reading data from a csv file. I have a csv file that contains data starting at line 10. The data looks like this:

2016-12-12,22.5,56
2016-12-13,23.1,62.1
2016-12-14,16.3,76.6
2016-12-15,18.8,44.7
2016-12-16,17.6,53.2

I would like to get the average of the 2nd and 3rd columns for weekends only (when the date in the first column is a Saturday or Sunday). In this case I need the averages for 2016-07-2 and 2016-07-3.

I have written a script that uses awk, but it fails at the loop inside the first block:

idealminTemp=15 #set ideal minimum
idealMaxTemp=25 #set ideal maximum
weekends=( "2016-07-2" "2016-07-3" "2016-07-9" "2016-07-10" "2016-07-16" "2016-07-17" "2016-07-23" "2016-07-24" "2016-07-30" "2016-07-31" )

awk -F"," 'NR > 9 {for i in ${weekends[@]} i= $1 ? MinTemp+=$3; MaxTemp+=$4 : MinTemp+=0; MaxTemp+=0 ; } END { ( MinTemp/(NR-9) >= $idealminTemp && MaxTemp/(NR-9) <= $idealMaxTemp ? 
print "Ideal" : print "Not Ideal" }' ./InputFile.csv

What do you think ${weekends[@]} means? I think you are using shell syntax inside awk code. Please read a basic awk guide, because ${weekends[@]} means nothing in awk: weekends is a bash array variable, not an awk array. The same reasoning applies to idealminTemp and idealMaxTemp.
– Jdamian, Commented Aug 26, 2016 at 17:30
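
To make the comment above concrete: shell values have to be handed to awk explicitly, for example with -v. The following is only a minimal sketch, not the asker's final script. It assumes rows shaped like ",date,min,max" (an empty first field, date in field 2), and the names ideal_min, ideal_max, and weekend_verdict are made up for illustration. The bash array is joined into one string and split again inside awk.

```shell
#!/bin/bash
ideal_min=15                            # thresholds live in the shell
ideal_max=25
weekends=( "2016-07-2" "2016-07-3" )    # bash array, invisible to awk by itself

# Hypothetical helper: reads CSV rows shaped like ",date,min,max" on stdin
# (this layout is an assumption) and prints a verdict for the listed dates.
weekend_verdict() {
    awk -F, -v lo="$ideal_min" -v hi="$ideal_max" -v list="${weekends[*]}" '
        BEGIN {
            # rebuild the shell array as an awk lookup table
            n = split(list, arr, " ")
            for (i = 1; i <= n; i++) wanted[arr[i]] = 1
        }
        ($2 in wanted) { min += $3; max += $4; count++ }
        END {
            if (count == 0) { print "No weekend rows"; exit }
            if (min / count >= lo && max / count <= hi) print "Ideal"
            else print "Not Ideal"
        }'
}

# Made-up sample rows, only to exercise the function
printf '%s\n' ',2016-07-1,10,30' ',2016-07-2,16,24' ',2016-07-3,18,22' | weekend_verdict
```

Note that print is a statement in awk, so it cannot sit inside a ?: expression as in the question; an if/else is used instead.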

karakfa · Accepted Answer · 2016-08-26 17:54:15Z


I didn't understand your script, but you can follow this template to restrict the logic to any day of the week:

$ awk -F, '{date=$2;                              # copy of date field
            gsub("-"," ",date);                   # format date for mktime
            if(strftime("%u",mktime(date " 00 00 00"))>5)  # Saturday:6, Sunday:7
                  print $2,($3+$4)/2}' file       # print original date field and average

2016-07-2 13.55
2016-07-3 11.45
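
mktime and strftime are not part of POSIX awk (GNU awk provides them), so the template above may not run on every awk. As a possible fallback, here is a sketch that computes the day of week arithmetically with Sakamoto's method instead. This is a swapped-in technique, not what the answer uses. The function name weekend_rows and the sample rows in the usage line are made up, and the ",date,min,max" layout is assumed.

```shell
#!/bin/bash
# Hypothetical helper: prints "date average" for weekend rows of a
# ",date,min,max" CSV read from stdin (layout is an assumption).
weekend_rows() {
    awk -F, '
        # Sakamoto day-of-week: returns 0 for Sunday ... 6 for Saturday
        function dow(y, m, d) {
            if (m < 3) y--
            return (y + int(y / 4) - int(y / 100) + int(y / 400) + t[m] + d) % 7
        }
        BEGIN { split("0 3 2 5 0 3 5 1 4 6 2 4", t, " ") }   # per-month offsets
        split($2, p, "-") == 3 {                             # only rows with a y-m-d date
            w = dow(p[1] + 0, p[2] + 0, p[3] + 0)
            if (w == 0 || w == 6) print $2, ($3 + $4) / 2
        }'
}

# Made-up sample rows: a Friday, a Saturday, and a Sunday
printf '%s\n' ',2016-07-1,10,30' ',2016-07-2,16,24' ',2016-07-3,10,13' | weekend_rows
```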


David C. Rankin · Accepted Answer · 2016-08-26 21:12:45Z

If I understand correctly, you need to read the csv file, check which dates fall on a Saturday or Sunday, and for those days average the values that follow the date. You can do that in bash itself, calling date to check the day and bc to compute the average. There is absolutely nothing wrong with using awk for the task, but be aware that bash can also do it for you. For example:

#!/bin/bash

while IFS=, read -r a b c d; do 
    day=$(date -d "$b" +%w)
    if [[ $day -eq 0 ]]; then
        avg=$(echo "scale=2; ($c + $d)/2" | bc)
        echo "$b -- Sunday, average: $avg"
    fi

    if [[ $day -eq 6 ]]; then
        avg=$(echo "scale=2; ($c + $d)/2" | bc)
        echo "$b -- Saturday, average: $avg"
    fi
done <"$1"

The script reads each line (discarding the empty field at the beginning of each row) and uses date -d "$b" +%w (with b holding the 2nd field, the date) to check whether the date is a Saturday or Sunday (day 6 and 0, respectively). If it is a weekend day, the values in the 3rd and 4th fields (c and d) are passed to bc to average.

Example

Using your data, the script would provide:

$ bash dayavgweekend.sh dayscsv
2016-07-2 -- Saturday, average: 13.55
2016-07-3 -- Sunday, average: 11.45
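
Since the question says the data only starts at line 10, the header lines probably need to be skipped before the loop sees them. One way this might look is sketched below. It assumes GNU date (for -d) and the same ",date,min,max" layout, and the function name weekend_dates is made up. Rows with an empty or unparsable date are skipped rather than classified.

```shell
#!/bin/bash
# Hypothetical helper: prints the weekend dates found after the first
# 9 lines of the CSV file named in $1. Assumes GNU date for the -d option.
weekend_dates() {
    tail -n +10 "$1" | while IFS=, read -r _ day_str _; do
        [[ -n $day_str ]] || continue                            # no date field
        dow=$(date -d "$day_str" +%w 2>/dev/null) || continue    # unparsable date
        case $dow in
            0) echo "$day_str Sunday" ;;
            6) echo "$day_str Saturday" ;;
        esac
    done
}

# Usage (file name taken from the question):
#   weekend_dates ./InputFile.csv
```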
