
I have over 1000 .csv files in a folder on Debian Squeeze. I am trying to create a new .csv file containing ONLY the first rows of each of these 1000 .csv files.

I tried:

read -r firstline < sourcefile_1.csv > headers.csv

But that only created a blank file. (And even if it did work, I would have copied only the first row of just one file.)

How do I write a command that copies the first row of each of the 1000 files in the folder and appends them to a new .csv file?

Thanks in advance!

2 Answers


head -q -n 1 *.csv > output.csv
-q suppresses the file-name headers that head normally prints when given multiple files, and -n 1 prints only the first line of each file.
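For example, with two small sample files (the file names here are just for illustration):

```shell
# Create two tiny sample CSV files
printf 'a,b\n1,2\n3,4\n' > file1.csv
printf 'a,b\n5,6\n'      > file2.csv

# -q: no "==> file <==" headers; -n 1: first line of each file
head -q -n 1 file1.csv file2.csv > output.csv

cat output.csv
# a,b
# a,b
```

Note that if every input file starts with the same header row, the output will contain that header repeated once per file, as shown above.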


Assuming the CSV files all contain headers and may contain fields with embedded newlines, we can't just run head on each file: taking a fixed number of lines may truncate a multi-line record, and it would also copy the header row from every file.

Instead of head, using a CSV-aware tool, such as Miller (mlr), would be a better option:

mlr --csv put -q 'FNR == 1 { emit $* }' *.csv

This outputs the first data record of each file matching the *.csv filename globbing pattern.

Example:

$ cat file1
a,b
1,2
3,4
$ cat file2
a,b
5,6
7,8
$ cat file3
a,b
field one,"last
field
here"
$ mlr --csv put -q 'FNR == 1 { emit $* }' file[123]
a,b
1,2
5,6
field one,"last
field
here"

If the CSV files are header-less, use mlr with its -N option.
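A quick sketch of the header-less case, with made-up sample data (this requires Miller to be installed; -N is shorthand for --implicit-csv-header --headerless-csv-output, so the first line of each file is treated as data rather than as a header):

```shell
# Header-less input: every line is a data record
printf '1,2\n3,4\n' > nohdr1.csv
printf '5,6\n'      > nohdr2.csv

# Emit only the first record of each file, without printing a header
mlr --csv -N put -q 'FNR == 1 { emit $* }' nohdr1.csv nohdr2.csv
# 1,2
# 5,6
```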
