I am creating output files inside an awk script, and I want them to be compressed.

Input file - Marks.txt

Student1:AP:Maths:30:Science:43
Student2:AP:Maths:23:Science:35
Student3:Non_AP:Maths:17:Science:33

My code looks like this:

BEGIN{
   FS = ":"
}

$2 == "AP"{
 print $3, $4 > "maths_AP.txt"
}

$2 == "Non_AP"{
 print $3, $4 > "maths_non_AP.txt"
}

{...} #some other processing not relevant to question

I want to create both maths_AP.txt and maths_non_AP.txt as zipped (compressed) files. Some forums suggest using gunzip, but I can't work out where to put it in the script.

  • Why not compress the files after they've already been created, outside of the awk command? Commented Jan 27, 2022 at 13:34

2 Answers

Awk is a tool for manipulating text. A shell is a tool for manipulating (creating/destroying) files and processes and for sequencing calls to other tools. So in general you shouldn't sequence calls to other tools from inside awk, as that's the shell's job. Instead, manipulate the text with awk and let the shell call any other tools, e.g. (untested):

mkdir out &&
sort -t':' -k3,3 -k2,2 Marks.txt |
awk '
    BEGIN { FS=OFS=":" }
    { key = "out/" $3 "_" $2 ".txt" }
    key != out {
        close(out)
        out = key
    }
    { print > out }
' &&
for file in out/*.txt; do
    zip "$file.zip" "$file" &&   # zip takes the archive name first, then the file to add
    rm -f "$file"         # assuming you want to discard the .txt file
done

The above will work with any version of these tools. Any awk solution that doesn't call close() will fail in most awk versions once it exceeds the limit on simultaneously open files, which I've seen be less than 20.
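Since the question mentions gunzip, gzip may be closer to what is wanted than zip. Below is a minimal sketch of the same shell-driven approach under that assumption. The scratch directory created with mktemp and the inlined sample data are there only to make the sketch self-contained.

```shell
#!/bin/sh
# Sketch only: split with awk, then let the shell compress with gzip.
# The scratch directory and the inlined sample data are assumptions for illustration.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > Marks.txt <<'EOF'
Student1:AP:Maths:30:Science:43
Student2:AP:Maths:23:Science:35
Student3:Non_AP:Maths:17:Science:33
EOF

mkdir -p out &&
sort -t':' -k3,3 -k2,2 Marks.txt |
awk '
    BEGIN { FS = OFS = ":" }
    { key = "out/" $3 "_" $2 ".txt" }
    key != out {
        if (out != "") close(out)   # release the previous file before opening the next
        out = key
    }
    { print > out }
' &&
gzip out/*.txt    # replaces each out/NAME.txt with out/NAME.txt.gz
```

gzip removes the original file once the .gz has been written, so no separate rm step is needed here. The files can be read back with gzip -dc or gunzip.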

Compression can be done after awk has run, or while it is running.

Try:

$2 == "AP"{
 print $3, $4 > "maths_AP.txt" ;
 print $3, $4 | "gzip > maths_AP.gz" ;
}

$2 == "Non_AP"{
 print $3, $4 > "maths_non_AP.txt" ;
 print $3, $4 | "gzip > maths_non_AP.gz" ;
}
  • You may run out of file descriptors if there are too many files to write.
  • I ended awk statements with ; for ages before noticing it was optional.
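The pipe approach can also be sketched on its own, without the uncompressed copies, closing each pipe explicitly in END. Keeping the command strings in variables (ap_cmd and non_ap_cmd, names chosen here for illustration) helps ensure close() receives exactly the string that opened the pipe. The scratch directory and the inlined sample data are likewise assumptions.

```shell
#!/bin/sh
# Sketch only: compress while awk runs by piping print output into gzip.
# The scratch directory and the inlined sample data are assumptions for illustration.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > Marks.txt <<'EOF'
Student1:AP:Maths:30:Science:43
Student2:AP:Maths:23:Science:35
Student3:Non_AP:Maths:17:Science:33
EOF

awk -F: '
    BEGIN {
        # Each distinct command string is one pipe, opened on first use.
        ap_cmd     = "gzip > maths_AP.gz"
        non_ap_cmd = "gzip > maths_non_AP.gz"
    }
    $2 == "AP"     { print $3, $4 | ap_cmd }
    $2 == "Non_AP" { print $3, $4 | non_ap_cmd }
    END {
        # close() flushes the pipe and waits for gzip to finish.
        close(ap_cmd)
        close(non_ap_cmd)
    }
' Marks.txt
```

With only two outputs the explicit close() calls are optional, since awk closes its pipes on exit. They matter once the number of distinct outputs grows, for the file-descriptor reason given in the first bullet.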
