Store awk result in variables in bash script

Question

I want to extract the row of a CSV file where column 4 contains a certain number.

The CSV file's rows look like this:

Markus;Haltmeyer;ID;SomeIdentifier

I want to store the first column and the second column in separate variables if SomeIdentifier is found.

In the bash script I only have the first characters of SomeIdentifier, stored in a variable firstPartOfID. Nevertheless, the following command finds the correct row:

result=$(awk -v pat="${firstPartOfID}" -F ";" '$0~pat{print $1, $2 }' MyFile.csv)
echo ${result}

Unfortunately result contains both columns. I could try to split $result afterwards, but I want to do it with awk directly.

hek2mgl · Accepted Answer · 2018-08-21 17:59:49Z

3

You can use read together with process substitution:

read -r var1 var2 < <(awk -v regexp="${firstPartOfID}" -F ";" '$0~regexp{print $1, $2 }' MyFile.csv)

I assume that the output does not contain whitespace (except for the delimiter). Otherwise you need to use a different output delimiter in awk and tell read to split on the same delimiter:

IFS=";" read -r var1 var2 < <(awk -v regexp="${firstPartOfID}" 'BEGIN{FS=OFS=";"}$0~regexp{print $1, $2 }' MyFile.csv)

I'm using ; as the output delimiter in the above example. It makes sense because it is also the input delimiter, so it is guaranteed not to appear inside the fields.
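As a small self-contained check of the ;-delimited form, the sketch below writes a hypothetical sample file and reads two columns back. The temporary file, the name "Anna Maria", and the identifier values are made up for illustration. Note that read consumes only the first line awk prints, so if several rows match, only the first one ends up in the variables.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; the first column contains a space on purpose.
csv=$(mktemp)
printf '%s\n' \
  'Anna Maria;Schmidt;ID;12345-A' \
  'Markus;Haltmeyer;ID;98765-B' > "$csv"

firstPartOfID='123'

# awk emits "first;last"; read splits on ";" so the space inside
# "Anna Maria" survives. -r keeps backslashes literal.
IFS=';' read -r var1 var2 < <(
  awk -v regexp="$firstPartOfID" 'BEGIN { FS = OFS = ";" } $0 ~ regexp { print $1, $2 }' "$csv"
)

printf 'first=%s last=%s\n' "$var1" "$var2"
rm -f "$csv"
```

With the default whitespace splitting, var1 would have received only "Anna" and var2 the rest of the line.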

Btw, instead of using a regular expression you may use the index() function in awk. That would be more efficient.

awk -v id_prefix="${firstPartOfID}" -F ";" 'index($4, id_prefix){print $1, $2 }' MyFile.csv
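Besides speed, index() differs from a regex match in that it treats its argument as a literal string. The following sketch tries to show that difference on hypothetical data whose identifier prefix contains the regex metacharacter ".". It assumes, as the sample row in the question suggests, that the identifier sits in column 4, and it compares against 1 so that only a match at the start of the field counts. The file contents and the names regex_hit and literal_hit are invented for the example.

```shell
#!/usr/bin/env bash
csv=$(mktemp)
printf '%s\n' \
  'Erika;Muster;ID;AX5-777' \
  'Markus;Haltmeyer;ID;A.5-123' > "$csv"

firstPartOfID='A.5'

# Regex match: "." matches any character, so the "AX5-777" row matches first.
regex_hit=$(awk -F ';' -v regexp="$firstPartOfID" \
  '$4 ~ ("^" regexp) { print $1; exit }' "$csv")

# Literal match: index() returns 1 only if $4 starts with the exact string.
literal_hit=$(awk -F ';' -v id_prefix="$firstPartOfID" \
  'index($4, id_prefix) == 1 { print $1; exit }' "$csv")

printf 'regex=%s literal=%s\n' "$regex_hit" "$literal_hit"
rm -f "$csv"
```

If the identifier prefix can contain characters such as ".", "*" or "[", the literal variant is the safer choice.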

edited Aug 21, 2018 at 17:59

answered Aug 21, 2018 at 13:00

hek2mgl


4 Comments

PesaThe Over a year ago

I think ; would be a more suitable delimiter, since it is the original delimiter. The columns (as unlikely as it is) may contain #.

Sadık Over a year ago

I think it doesn't matter. This works great. Thank you.

Ed Morton Over a year ago

index() takes 2 args, not 1. Also, I know you're just trying to show the OP how to save the awk output, but look at what she said she's trying to do: "I want to extract the row of a CSV file where column 4 contains a certain number." That's not what her awk script did, presumably due to a bug, and so it's not what yours does either. You could fix that for her with index($4,pat)==1{...} or just $4~("^"pat){...} if pat doesn't contain any RE metachars. (And please rename pat to regexp or string, whichever it really is, since "patterns" are for quilts and knitting, not software!)

hek2mgl Over a year ago

pat changed to regexp. I should know better ;) Use of index() fixed. Thanks! About the remaining part: not sure if I'm missing something, but the question says "I want to store the first column and second column in different variables each." To me it looks like the filter in awk basically worked but the result still had to be split using the shell.

Eric Renouf · Accepted Answer · 2018-08-21 13:10:05Z

2

If you want multiple values, you can also skip awk and let bash do the pattern matching:

# Skip the third column ("ID"); the identifier is in the fourth.
while IFS=';' read -r first last _ idfield _; do
    if [[ $idfield =~ $firstPartOfID ]]; then
        first_name=$first
        last_name=$last
        break
    fi
done < MyFile.csv

Or, depending on what you want to do with those values afterwards, you might be able to do that within awk.
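As one possible illustration of staying inside awk, suppose the two columns are only needed to build a single output string. The greeting text, the sample row, and the variable name greeting below are assumptions made up for this sketch; the identifier is assumed to be in column 4, as in the question's sample row.

```shell
#!/usr/bin/env bash
csv=$(mktemp)
printf '%s\n' 'Markus;Haltmeyer;ID;98765-B' > "$csv"

firstPartOfID='987'

# Format the result inside awk, so no splitting is needed in the shell.
greeting=$(awk -F ';' -v id_prefix="$firstPartOfID" \
  'index($4, id_prefix) == 1 { printf "Hello %s %s\n", $1, $2; exit }' "$csv")

printf '%s\n' "$greeting"
rm -f "$csv"
```

This only helps when the shell does not need the two values separately later on; otherwise the read-based approaches are the better fit.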

answered Aug 21, 2018 at 13:10

Eric Renouf
