43 votes
Accepted

How do I keep the first 200 lines of all the csv files in a directory using bash?

Assuming that the current directory contains all CSV files and that they all have a .csv filename suffix: for file in ./*.csv; do head -n 200 "$file" >"$file.200"; done This outputs the first ...
Kusalananda
  • 356k
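The loop from the excerpt above can be tried safely on throwaway data. This is a minimal sketch; the scratch directory and the file names a.csv and b.csv are made up for illustration, and `seq` is used only to generate stand-in lines.

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
seq 1 500 > "$workdir/a.csv"   # 500-line stand-in for a CSV file
seq 1 50  > "$workdir/b.csv"   # shorter than 200 lines

for file in "$workdir"/*.csv; do
    # Write the first 200 lines next to the original; the original is untouched.
    head -n 200 "$file" > "$file.200"
done

# Arithmetic expansion strips any padding that wc may add.
a_lines=$(( $(wc -l < "$workdir/a.csv.200") ))
b_lines=$(( $(wc -l < "$workdir/b.csv.200") ))
echo "a.csv.200: $a_lines lines, b.csv.200: $b_lines lines"
```

A file shorter than 200 lines is simply copied whole, so the loop needs no special case for it.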
34 votes

Merging contents of multiple .csv files into single .csv file

awk '(NR == 1) || (FNR > 1)' *.csv > 1000Plus5years_companies_data.csv
user387763
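To see why that awk condition keeps exactly one header, here is a small sketch with two made-up input files (part1.csv and part2.csv are illustrative names, not from the answer).

```shell
# Hypothetical sample files that share the same header line.
workdir=$(mktemp -d)
printf 'id,name\n1,alpha\n2,beta\n' > "$workdir/part1.csv"
printf 'id,name\n3,gamma\n'         > "$workdir/part2.csv"

# NR counts lines across all inputs; FNR restarts at 1 for each file.
# NR == 1 keeps the first file's header; FNR > 1 keeps every non-header line.
merged=$(awk '(NR == 1) || (FNR > 1)' "$workdir"/part*.csv)
printf '%s\n' "$merged"
```

When the result is written to a file, putting it somewhere the `*.csv` glob cannot match avoids the merged file being picked up as an input on a rerun.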
25 votes

Is there a robust command line tool for processing CSV files?

Miller is another nice tool for manipulating name-based data, including CSV (with headers). To extract the first column of a CSV file, without caring about its name, you’d do something like printf '&...
Stephen Kitt
24 votes

How do I keep the first 200 lines of all the csv files in a directory using bash?

Previous answers copy data and overwrite files. This technique should keep the same inodes, do no copying, and run a whole lot faster. For each file : (a) Find the length of each file by reading the ...
Paul_Pedant
  • 9,394
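The excerpt is cut off before the actual commands, so the following is only one way the idea could look, not necessarily the answer's method. It assumes the `truncate` utility from GNU coreutils is available and measures the byte length of the first 200 lines with `head` and `wc`; the file name big.csv is invented.

```shell
# Hypothetical test file; truncate(1) is assumed to be installed.
workdir=$(mktemp -d)
f="$workdir/big.csv"
seq 1 500 > "$f"
inode_before=$(ls -i "$f" | awk '{print $1}')

# Byte offset at which line 200 ends (assumption: measured with head and wc).
bytes=$(( $(head -n 200 "$f" | wc -c) ))

# Shrink the file in place; no replacement file is created.
truncate -s "$bytes" "$f"

inode_after=$(ls -i "$f" | awk '{print $1}')
lines_after=$(( $(wc -l < "$f") ))
echo "inode $inode_before -> $inode_after, $lines_after lines kept"
```

Because the file is shortened rather than rewritten, hard links and open file descriptors keep pointing at the same file.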
22 votes

AWK - print range of columns

The utility cut has a compact notation: cut -d, -f2-7 <input-file> producing: column2,column3,column4,column5,column6,column7 Answering the comment by @PlasmaBinturong: my intent was address ...
drl
  • 848
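A quick way to check the `cut` range notation is to feed it a single made-up row. Note that `cut` splits on every comma, so this sketch assumes no field contains a quoted comma.

```shell
# Hypothetical eight-column row; real data with quoted commas needs a CSV-aware tool.
row='column1,column2,column3,column4,column5,column6,column7,column8'

# -d, sets the delimiter; -f2-7 selects fields 2 through 7 inclusive.
picked=$(printf '%s\n' "$row" | cut -d, -f2-7)
printf '%s\n' "$picked"
```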
21 votes

Merging contents of multiple .csv files into single .csv file

Use paste: paste -d ',' file1.csv file2.csv ... fileN.csv
Chris
  • 1,077
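Worth noting what `paste` actually does here: it joins the files side by side, line by line, rather than stacking them. A small sketch with two invented files:

```shell
# Hypothetical inputs with the same number of lines.
workdir=$(mktemp -d)
printf 'a,b\n1,2\n' > "$workdir/file1.csv"
printf 'c\n3\n'     > "$workdir/file2.csv"

# Line N of each file is joined with a comma, giving wider rows.
joined=$(paste -d ',' "$workdir/file1.csv" "$workdir/file2.csv")
printf '%s\n' "$joined"
```

So this suits files that hold different columns for the same rows; for files with the same columns and different rows, a header-aware concatenation is the better fit.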
21 votes

Replace the comma with vertical bar |, except when inside double quotes, and remove double quotes

Using csvkit: $ csvformat -D '|' file.csv 12584|Capital of America, Inc.||HORIZONCAPITAL|USA|......etc 25841|Capital of America, Inc.||HORIZONCAPITAL|USA|......etc 87455|Capital of America, Inc.||...
Kusalananda
  • 356k
19 votes
Accepted

Using AWK to select rows with specific value in specific column

The file that you run the script on has DOS line-endings. It may be that it was created on a Windows machine. Use dos2unix to convert it to a Unix text file. Alternatively, run it through tr: tr -...
Kusalananda
  • 356k
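The excerpt is truncated before the `tr` invocation. One common form, which may or may not be what the answer uses, is `tr -d '\r'`. The sketch below shows the symptom and the effect on a made-up two-row file.

```shell
# Hypothetical file written with DOS (CRLF) line endings.
workdir=$(mktemp -d)
printf 'id,status\r\n1,ok\r\n2,fail\r\n' > "$workdir/dos.csv"

# With CRLF endings the last field is "ok" plus a carriage return,
# so an exact string comparison does not match.
before=$(( $(awk -F, '$2 == "ok"' "$workdir/dos.csv" | wc -l) ))

# Assumption: strip every carriage return with tr.
tr -d '\r' < "$workdir/dos.csv" > "$workdir/unix.csv"
after=$(( $(awk -F, '$2 == "ok"' "$workdir/unix.csv" | wc -l) ))

echo "matches before: $before, after: $after"
```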
18 votes

Merging contents of multiple .csv files into single .csv file

Use csvstack from csvkit: csvstack *.csv > out.csv
Dan F
  • 289
18 votes
Accepted

Convert JSON array into CSV

The issue is not really that the JSON that you show is an array, but that each element of the array (of which you only have one) is a fairly complex structure. It is straightforward to extract the ...
Kusalananda
  • 356k
17 votes

Is there a robust command line tool for processing CSV files?

I'd recommend xsv, a "fast CSV command line toolkit written in Rust". Written by Ripgrep's author. Featured in How we made our CSV processing 142x faster (Reddit thread).
Nicolas Girard
16 votes
Accepted

How to use Unix Shell to show only the first n columns and last n columns?

awk -F, '{print $1, $2, $(NF-1), $NF}' < input More generally (per the Question's title), to print the first and last n columns of the input -- without checking to see whether that means printing ...
Jeff Schaller
  • 68.8k
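For the fixed case of two leading and two trailing columns, the command above can be checked against a single invented row. The second call adds `-v OFS=,`, which is an addition for illustration, to keep commas in the output.

```shell
# Hypothetical six-column row.
row='c1,c2,c3,c4,c5,c6'

# print with commas between arguments joins them with OFS, a space by default.
out_space=$(printf '%s\n' "$row" | awk -F, '{print $1, $2, $(NF-1), $NF}')

# Setting OFS keeps the output comma-separated.
out_comma=$(printf '%s\n' "$row" | awk -F, -v OFS=, '{print $1, $2, $(NF-1), $NF}')

printf '%s\n%s\n' "$out_space" "$out_comma"
```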
15 votes
Accepted

Is there a command line utility to transpose a csv-file?

ruby -rcsv -e 'puts CSV.parse(STDIN).transpose.map &:to_csv' < in.csv > out.csv
luikore
  • 266
15 votes
Accepted

Compact way to get tab-separated fields into variables

Yes:
while read -r width height size thedate thetime; do
    # use variables here
done <file
This will read from standard input and split the data on blanks (spaces or tabs). The last variable ...
Kusalananda
  • 356k
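A runnable version of that `read` loop, using an invented two-row tab-separated file (data.tsv and its values are illustrative only):

```shell
# Hypothetical tab-separated input with five fields per line.
workdir=$(mktemp -d)
printf '640\t480\t1200\t2020-01-02\t10:30\n'  > "$workdir/data.tsv"
printf '800\t600\t2400\t2020-01-03\t11:45\n' >> "$workdir/data.tsv"

summary=""
# read splits each line on blanks (per IFS) and assigns one field per variable;
# size and thetime are read but not used in this sketch.
while read -r width height size thedate thetime; do
    summary="${summary}${width}x${height} on ${thedate};"
done < "$workdir/data.tsv"

printf '%s\n' "$summary"
```

Redirecting the file into the loop, rather than piping into it, keeps the loop in the current shell so `summary` is still set afterwards.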
15 votes

How do I keep the first 200 lines of all the csv files in a directory using bash?

Using sed with shell globbing: sed -ni '1,200p' *.csv Using globbing/sed/parallel: printf '%s\n' *.csv | parallel -- sed -ni '1,200p' {} This will find all .csv files in the current directory and ...
jesse_b
  • 41.5k
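The plain `sed` form can be tried on scratch files first, since it edits in place. This sketch assumes GNU sed, where `-i` takes an optional suffix and treats each file separately; the file names are made up.

```shell
# Hypothetical scratch files; GNU sed is assumed for the -i behaviour used here.
workdir=$(mktemp -d)
seq 1 500 > "$workdir/a.csv"
seq 1 300 > "$workdir/b.csv"

# -n suppresses default output, '1,200p' prints lines 1-200 of each file,
# and -i writes the result back over the original file.
sed -ni '1,200p' "$workdir"/*.csv

a_kept=$(( $(wc -l < "$workdir/a.csv") ))
b_kept=$(( $(wc -l < "$workdir/b.csv") ))
echo "a.csv: $a_kept lines, b.csv: $b_kept lines"
```

The option order matters with GNU sed: `-ni` means `-n -i`, whereas `-in` would be read as `-i` with the backup suffix `n`.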
15 votes

How can SQLite command ".import" read directly from standard input?

I found another solution that still uses sqlite3, but that doesn't read /dev/stdin or a temporary named pipe. Instead, it uses .import with the pipe operator to invoke cat - to read directly from ...
Derek Mahar
15 votes

Rounding many values in a csv to 3 decimals (printf ?)

There is a GNU utility called numfmt, part of the GNU coreutils collection of tools, that looks as if it could be useful here. It allows you to format numerical values, and the following command ...
Kusalananda
  • 356k
14 votes

Substitute every comma outside of double quotes for a pipe

Using csvkit: $ csvformat -D '|' file.csv John|Tonny|345.3435,23|56th Street The tools in csvkit know how to handle the intricacies of CSV files, and here we're using csvformat to replace the ...
Kusalananda
  • 356k
14 votes

awk script to prepare csv file

awk ' BEGIN { split("Monday Tuesday Wednesday Thursday Friday Saturday Sunday",days) FS=OFS="," } NR > 1 { gsub(/"/,"") ...
Ed Morton
  • 35.8k
14 votes

Find and add quotes in between particular string

Using csvformat from csvkit, and assuming that the end result should be a CSV file with comma as delimiter (as described in the text of the question): $ csvformat -d '|' file 1,"a,b",4 1,&...
Kusalananda
  • 356k
13 votes

Any good csv bash utility?

Yes: CSVkit. http://csvkit.readthedocs.io/ CSV is not a standard that has anything to do with Unix, hence there is no "standard" (as in POSIX) utility for working with CSV files. To vertically ...
Kusalananda
  • 356k
13 votes

How to cat all the log files within a range of dates

You can nest brace expansion. Short and sweet: cat localhost_log_file.2017-{09-{03..30},10-{01..08}}.txt > totallog.csv Note that some systems such as macOS use an older version of Bash where ...
Wildcard
  • 37.5k
13 votes
Accepted

Adding suffix to filename during for loop in bash

for file in TLC*.csv; do cut -d, -f2- "${file}" > "${file%.*}_prepped.csv"; done ${file%.*} removes everything after the last dot of $file. If you want to remove everything ...
Quasímodo
  • 19.4k
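The suffix-stripping expansion is easy to check on a plain string, without touching any files. The file name below is invented, and the `%%` variant is shown only as the standard counterpart of `%`; the truncated excerpt may continue differently.

```shell
# Hypothetical file name containing two dots.
file='TLC_2020.data.csv'

# % removes the shortest suffix matching the pattern: only ".csv" goes.
short_strip="${file%.*}_prepped.csv"

# %% removes the longest matching suffix: everything from the first dot goes.
long_strip="${file%%.*}_prepped.csv"

printf '%s\n%s\n' "$short_strip" "$long_strip"
```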
13 votes

Grep and Cut command in linux

Although you state awk is not a possibility - for the sake of completeness: awk -F',' '$9>=1' input.csv This will instruct awk to consider , as field separator and print only lines where field 9 ...
AdminBee
  • 23.6k
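A small sketch of that filter on made-up rows, assuming the file has no header line (a non-numeric header in field 9 would be compared as a string, which can give surprising results):

```shell
# Hypothetical nine-column rows without a header.
workdir=$(mktemp -d)
printf 'a,b,c,d,e,f,g,h,0\n'   > "$workdir/input.csv"
printf 'a,b,c,d,e,f,g,h,1\n'  >> "$workdir/input.csv"
printf 'a,b,c,d,e,f,g,h,2.5\n' >> "$workdir/input.csv"

# A bare condition with no action prints every line for which it is true.
kept=$(awk -F',' '$9>=1' "$workdir/input.csv")
kept_count=$(( $(printf '%s\n' "$kept" | wc -l) ))
printf '%s\n' "$kept"
```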
12 votes

Is there a robust command line tool for processing CSV files?

If you want a visual / interactive tool in the terminal, I wholeheartedly recommend VisiData. It has frequency tables (shown above), pivot, melting, scatterplots, filtering / computation using Python, ...
DameDebugger
12 votes
Accepted

Swap first and second columns in a CSV file

Here you go: awk -F, '{ print $2 "," $1 }' sampleData.csv
Joe M
  • 914
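One thing to be aware of with that one-liner: it prints only the two swapped fields. The sketch below, on invented three-column data, shows that behaviour next to a variant that swaps in place and keeps the remaining columns; the variant is an addition for illustration, not part of the answer.

```shell
# Hypothetical three-column rows.
workdir=$(mktemp -d)
printf 'x1,y1,z1\nx2,y2,z2\n' > "$workdir/sampleData.csv"

# As in the answer: prints field 2, a comma, then field 1; later fields are dropped.
two_only=$(awk -F, '{ print $2 "," $1 }' "$workdir/sampleData.csv")

# Variant (assumption): swap via a temporary and reprint the whole record.
all_cols=$(awk -F, -v OFS=, '{ t = $1; $1 = $2; $2 = t; print }' "$workdir/sampleData.csv")

printf '%s\n--\n%s\n' "$two_only" "$all_cols"
```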
12 votes
Accepted

awk unexpectedly removes dot from string

You seem to have got the quotes wrong. You need to do as below awk -F"," 'BEGIN { OFS = "," } {$2="\"2.4.0\""; print}' test.csv > output.csv This is explained in the GNU awk man page - 3.2 Escape ...
Inian
  • 13.1k
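The quoting in that command can be verified on a single made-up row. Inside the single-quoted awk program, `"\"2.4.0\""` is an awk string literal whose value includes the double quotes.

```shell
# Hypothetical three-field row standing in for test.csv.
row='a,x,c'

# Assigning to $2 rebuilds the record using OFS, so OFS is set to a comma.
fixed=$(printf '%s\n' "$row" |
    awk -F"," 'BEGIN { OFS = "," } {$2="\"2.4.0\""; print}')

printf '%s\n' "$fixed"
```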
12 votes

sort the whole .csv based on the value in a certain column

Using sort: cat input.csv | (sed -u 1q; sort -t, -r -n -k5) The sed -u 1q is required to make the sort ignore the header. It basically means, process the 1st line and quit, then pass the remaining ...
annahri
  • 2,118
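The same header-then-sort idea can be sketched without `sed -u`, which is a GNU extension. The variant below swaps in the shell's `read` builtin to consume the header line; this is a substitution for portability, not the answer's command, and the sample data is invented.

```shell
# Hypothetical input with a header and a numeric fifth column.
workdir=$(mktemp -d)
cat > "$workdir/input.csv" <<'EOF'
id,name,region,units,revenue
1,a,north,10,300
2,b,south,20,1200
3,c,east,30,75
EOF

sorted=$(
    {
        # read consumes only the first line, leaving the rest for sort.
        IFS= read -r header
        printf '%s\n' "$header"
        # Sort on field 5 only, numerically, in descending order.
        sort -t, -k5,5nr
    } < "$workdir/input.csv"
)
printf '%s\n' "$sorted"
```

This relies on the input being a regular file (or on the shell reading unbuffered), so that `read` does not swallow lines meant for `sort`.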
12 votes
Accepted

sort the whole .csv based on the value in a certain column

So, you want to sort (stably) on revenue in numerically descending order, which sounds like it should be easy in Miller except that its rules for null handling say: Records with one or more empty ...
steeldriver
  • 83.8k
