awk/bash append headers in many csv files

Question

I would like to transform the header of many csv files automatically using awk and bash scripts.

Currently, I am using the following code-block, which is working fine:

for FILE in *.csv
do
    awk 'FNR>1{print $0}' "$FILE" | awk 'NR == 1{print "aaa,bbb,ccc,ddd,eee,fff,ggg,hhh,iii,jjj,kkk,lll,mmm,nnn,...,zzz"}1' > "OUT_$FILE"
done

These commands first remove the old header from $FILE, then prepend a new (very long) comma-separated header aaa,bbb,ccc,ddd,eee,fff,ggg,hhh,iii,jjj,kkk,lll,mmm,nnn,...,zzz to the remaining lines, and finally save the output to OUT_$FILE.

Currently, I am copying the part aaa,bbb,ccc,ddd,eee,fff,ggg,hhh,iii,jjj,kkk,lll,mmm,nnn,...,zzz manually from another csv file and pasting into this field to replace the header from $FILE. While it is working, it is getting tedious, repetitive and time-consuming for many csv files.

Instead of copying the header manually, I am trying to extract the header from another csv file new_headers.csv and save to a new variable $NEWHEAD.

NEWHEAD=$(awk 'NR==1{print $0}' new_headers.csv)

While I can view the extracted header $NEWHEAD, I am not sure how to merge this command into the previous workflow so that the new header is prepended to the contents of each $FILE.

I will certainly appreciate any suggestions to resolve this problem. Thank you :)

Aside: To "append" something is to add it to the end; if you're putting a header at the beginning, you're prepending rather than appending.
– Charles Duffy, Commented Jan 20, 2022 at 17:41
Yes, you are right! Thank you for this suggestion, I have changed the word from 'append' to 'prepend'.
– doraemon, Commented Jan 24, 2022 at 9:03

Ed Morton · Accepted Answer · 2022-01-20 20:47:37Z

1

With GNU awk for "inplace" editing:

awk -i inplace 'NR==1{hdr=$0} {print (FNR>1 ? $0 : hdr)}' new_headers.csv *.csv
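Where GNU awk is not available, a similar in-place effect can be approximated with a temporary file per CSV. The following is only a sketch under assumptions: the scratch directory and the sample files data1.csv and data2.csv are made up for the demo, and the header file is skipped explicitly because *.csv also matches it.

```shell
#!/usr/bin/env bash
# Sketch: replace the header of every CSV in place without "awk -i inplace".
# The scratch directory and sample files below are hypothetical demo data.
set -eu

workdir=$(mktemp -d)
cd "$workdir"
printf 'aaa,bbb,ccc\n' > new_headers.csv
printf 'x,y,z\n1,2,3\n4,5,6\n' > data1.csv
printf 'p,q,r\n7,8,9\n' > data2.csv

for file in *.csv; do
    # The header file itself needs no rewriting.
    if [ "$file" = new_headers.csv ]; then
        continue
    fi
    tmp=$(mktemp "$file.XXXXXX")
    # First input (header file): remember its first line, print nothing.
    # Second input (data file): print the saved header instead of line 1.
    awk 'NR==1 {hdr=$0} NR==FNR {next} {print (FNR==1 ? hdr : $0)}' \
        new_headers.csv "$file" > "$tmp"
    mv "$tmp" "$file"
done

cat data1.csv
```

Writing to a temporary file and renaming it afterwards avoids reading and truncating the same file at once, which is what a plain `> "$file"` redirection would do.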

answered Jan 20, 2022 at 20:47


1 Comment

doraemon Over a year ago

This is a neat solution! It reduces many lines into a single line to replace the header from *.csv using the input from new_headers.csv. Thank you!

Fravadona · Accepted Answer · 2022-01-20 17:55:28Z

0

newheader=$(head -n 1 new_headers.csv)

for file in *.csv
do
    {
        printf '%s\n' "$newheader"
        tail -n +2 "$file" 
    } > OUT_"$file"
done

notes:

head -n 1 outputs the first line of a file
tail -n +2 outputs all the lines but the first
{ } is to group commands, so that you redirect their output as a whole
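One thing to watch with this loop: the glob *.csv also matches new_headers.csv and any OUT_*.csv files left by an earlier run. A possible variant that skips both is sketched below; the scratch directory and the sample file a.csv are assumptions made for the demo, not part of the original answer.

```shell
#!/usr/bin/env bash
# Sketch: same head/tail approach, but skipping files that are not input data.
# The scratch directory and sample files below are hypothetical demo data.
set -eu

workdir=$(mktemp -d)
cd "$workdir"
printf 'aaa,bbb\nignored,row\n' > new_headers.csv
printf 'old1,old2\n1,2\n' > a.csv
printf 'stale\n' > OUT_a.csv          # pretend this is left over from an earlier run

newheader=$(head -n 1 new_headers.csv)

for file in *.csv; do
    case $file in
        new_headers.csv|OUT_*) continue ;;   # header source or earlier output
    esac
    {
        printf '%s\n' "$newheader"   # new header line
        tail -n +2 "$file"           # everything except the old header
    } > "OUT_$file"
done

cat OUT_a.csv
```

Without the case guard, a second run would also produce files such as OUT_OUT_a.csv and OUT_new_headers.csv.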

edited Jan 20, 2022 at 17:55

answered Jan 20, 2022 at 17:39


1 Comment

doraemon Over a year ago

Thank you for this suggestion! I tested the solution just now. It was easy to follow and worked!

Diego Torres Milano · Accepted Answer · 2022-01-20 17:49:05Z

0

You can read the header inside the awk script, like this:

awk '
  BEGIN{
    do {
      h = (h) ? (h "\n" line) : line
    } while ((getline line <"new_header.csv") > 0)
}

...
'

and h contains the new header.
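As one possible way to complete this idea (not necessarily what the answer's author had in mind), the sketch below reads only the first line of the header file with a single getline and then swaps it in for the old header. The variable name hdrfile, the sample file data.csv, the output name OUT_data.csv, and the scratch directory are all assumptions made for the demo.

```shell
#!/usr/bin/env bash
# Sketch: read the new header with getline in BEGIN, then replace line 1.
# hdrfile, data.csv and OUT_data.csv are hypothetical names for this demo.
set -eu

workdir=$(mktemp -d)
cd "$workdir"
printf 'aaa,bbb\n' > new_header.csv
printf 'old1,old2\n1,2\n3,4\n' > data.csv

awk -v hdrfile=new_header.csv '
  BEGIN {
    # Read only the first line of the header file into h.
    if ((getline h < hdrfile) <= 0) {
      print "cannot read " hdrfile > "/dev/stderr"
      exit 1
    }
    close(hdrfile)
  }
  FNR == 1 { print h; next }   # replace the old header line
  { print }                    # pass data lines through unchanged
' data.csv > OUT_data.csv

cat OUT_data.csv
```

Checking the return value of getline matters here: it is 1 on success, 0 at end of file, and -1 on error, so an unreadable or empty header file is reported instead of silently producing an empty header.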

answered Jan 20, 2022 at 17:49


2 Comments

doraemon Over a year ago

This solution doesn't seem to work; it doesn't produce the expected output...

Diego Torres Milano Over a year ago

This does not produce any output; it only shows how to read a file in awk. You then need to replace the ... with whatever you want to do with the header h.

karakfa · Accepted Answer · 2022-01-20 19:12:05Z

0

$ awk 'NR==FNR {header=$0; next} 
               {print (FNR==1?header:$0) > (FILENAME".updated")}' new_header.csv other files...

This captures the header from the header file (assumed to contain only the header line) and replaces the first line of each of the remaining files; the updated files get the suffix ".updated".

Caveat emptor: not tested.
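If the header file may contain more than one line, a variant along the following lines keeps only its first line, and also closes each output file before moving on so that many inputs do not exhaust open file handles. This is a sketch under assumptions: the sample files one.csv and two.csv and the scratch directory are made up, and the awk variable out is introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: single awk pass over the header file plus several data files.
# one.csv, two.csv and the scratch directory are hypothetical demo data.
set -eu

workdir=$(mktemp -d)
cd "$workdir"
printf 'aaa,bbb\nextra,line\n' > new_header.csv
printf 'old1,old2\n1,2\n' > one.csv
printf 'old3,old4\n3,4\n' > two.csv

awk '
  NR == FNR { if (FNR == 1) header = $0; next }   # header file: keep line 1 only
  FNR == 1  {                                     # first line of each data file
    if (out != "") close(out)                     # release the previous output file
    out = FILENAME ".updated"
    print header > out                            # new header replaces the old one
    next
  }
  { print > out }                                 # copy data lines unchanged
' new_header.csv one.csv two.csv

cat one.csv.updated
```

Storing the output name in a variable also sidesteps the need to parenthesize the concatenation after `>`, which some awk implementations parse differently.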

answered Jan 20, 2022 at 19:12


2 Comments

karakfa Over a year ago

yes, right. I assumed the header file has just the header but might be better to make it extract the first line.

doraemon Over a year ago

This solution also works. I made one minimal change to your suggestion, turning 'updated' from a suffix into a prefix, because the suffix changes the file extension from .csv to .csv.updated: awk 'NR==FNR {header=$0; next}{print (FNR==1?header:$0) > ("updated_"FILENAME)}' new_header.csv *.csv Thank you for your suggestion!
