43
votes
Accepted
How do I keep the first 200 lines of all the csv files in a directory using bash?
Assuming that the current directory contains all CSV files and that they all have a .csv filename suffix:
for file in ./*.csv; do
head -n 200 "$file" >"$file.200"
done
This outputs the first ...
34
votes
Merging contents of multiple .csv files into single .csv file
awk '(NR == 1) || (FNR > 1)' *.csv > 1000Plus5years_companies_data.csv
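The condition prints the very first line of the combined input (the header of the first file) and, for every file, each line after that file's own first line. A minimal sketch of the effect, assuming two made-up files `a.csv` and `b.csv` in a scratch directory (neither name is from the original answer):

```shell
# Scratch directory and file names are illustrative only.
workdir=$(mktemp -d)
printf 'id,name\n1,alpha\n2,beta\n' > "$workdir/a.csv"
printf 'id,name\n3,gamma\n' > "$workdir/b.csv"

# NR counts lines across all input files; FNR restarts at 1 for each file.
# The header is printed once (NR == 1); every later header (FNR == 1) is skipped.
merged=$(awk '(NR == 1) || (FNR > 1)' "$workdir/a.csv" "$workdir/b.csv")
printf '%s\n' "$merged"
rm -rf "$workdir"
```

If the output file itself matches `*.csv` and is left over from an earlier run, the glob picks it up as input, so writing the result to another directory or suffix is the safer choice.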
25
votes
Is there a robust command line tool for processing CSV files?
Miller is another nice tool for manipulating name-based data, including CSV (with headers). To extract the first column of a CSV file, without caring about its name, you’d do something like
printf '&...
24
votes
How do I keep the first 200 lines of all the csv files in a directory using bash?
Previous answers copy data and overwrite files. This technique should keep the same inodes, do no copying, and run a whole lot faster. For each file:
(a) Find the length of each file by reading the ...
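The excerpt is cut off, so the following is only a guess at the general idea, not the answer's actual commands: measure the byte length of the first 200 lines, then shorten the file to that length in place. It assumes GNU coreutils `truncate`, and the scratch file `big.csv` is made up for illustration.

```shell
# Assumes GNU coreutils (truncate); the scratch file is illustrative only.
workdir=$(mktemp -d)
seq 1 500 > "$workdir/big.csv"
inode_before=$(ls -i "$workdir/big.csv")

for file in "$workdir"/*.csv; do
    # Size in bytes of the first 200 lines; head stops reading after them.
    bytes=$(head -n 200 "$file" | wc -c)
    # Cut the file off at that offset: nothing is copied, the inode is kept.
    truncate -s "$bytes" "$file"
done

inode_after=$(ls -i "$workdir/big.csv")
lines_left=$(wc -l < "$workdir/big.csv")
last_line=$(tail -n 1 "$workdir/big.csv")
printf '%s lines left, last line: %s\n' "$lines_left" "$last_line"
rm -rf "$workdir"
```

A file with fewer than 200 lines is left unchanged, because the measured length then equals its current size.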
22
votes
AWK - print range of columns
The utility cut has a compact notation:
cut -d, -f2-7 <input-file>
producing:
column2,column3,column4,column5,column6,column7
Answering the comment by @PlasmaBinturong: my intent was to address ...
21
votes
Merging contents of multiple .csv files into single .csv file
Use paste:
paste -d ',' file1.csv file2.csv ... fileN.csv
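Note that paste joins files side by side, line by line, so it adds columns rather than appending rows. A small sketch with two made-up single-column files (the contents are assumptions for illustration):

```shell
# Scratch directory and file contents are illustrative only.
workdir=$(mktemp -d)
printf 'id\n1\n2\n' > "$workdir/file1.csv"
printf 'name\nalpha\nbeta\n' > "$workdir/file2.csv"

# Line N of the output is line N of each input, joined with a comma.
pasted=$(paste -d ',' "$workdir/file1.csv" "$workdir/file2.csv")
printf '%s\n' "$pasted"
rm -rf "$workdir"
```

If the inputs have different numbers of lines, paste pads the shorter ones with empty fields.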
21
votes
Replace the comma with vertical bar |, except when inside double quotes, and remove double quotes
Using csvkit:
$ csvformat -D '|' file.csv
12584|Capital of America, Inc.||HORIZONCAPITAL|USA|......etc
25841|Capital of America, Inc.||HORIZONCAPITAL|USA|......etc
87455|Capital of America, Inc.||...
19
votes
Accepted
Using AWK to select rows with specific value in specific column
The file that you run the script on has DOS line-endings. It may be that it was created on a Windows machine.
Use dos2unix to convert it to a Unix text file.
Alternatively, run it through tr:
tr -...
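The tr command in the excerpt is cut off. One common way to strip carriage returns, not necessarily the one the answer used, is `tr -d '\r'`. The sketch below uses a made-up two-column file to show why the trailing carriage return breaks an exact match on the last field:

```shell
# Scratch directory, file names and contents are illustrative only.
workdir=$(mktemp -d)
printf 'id,status\r\n1,ok\r\n2,fail\r\n' > "$workdir/dos.csv"

# With DOS line endings the last field is "ok" plus a carriage return,
# so an exact comparison against "ok" matches no line at all.
before=$(awk -F, '$2 == "ok"' "$workdir/dos.csv" | wc -l)

# Delete every carriage return, then run the same test again.
tr -d '\r' < "$workdir/dos.csv" > "$workdir/unix.csv"
after=$(awk -F, '$2 == "ok"' "$workdir/unix.csv")

printf 'matches before: %s, match after: %s\n' "$before" "$after"
rm -rf "$workdir"
```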
18
votes
Merging contents of multiple .csv files into single .csv file
Use csvstack from csvkit:
csvstack *.csv > out.csv
18
votes
Accepted
Convert JSON array into CSV
The issue is not really that the JSON that you show is an array, but that each element of the array (of which you only have one) is a fairly complex structure. It is straightforward to extract the ...
17
votes
Is there a robust command line tool for processing CSV files?
I'd recommend xsv, a "fast CSV command line toolkit written in Rust".
Written by Ripgrep's author.
Featured in "How we made our CSV processing 142x faster" (Reddit thread).
16
votes
Accepted
How to use Unix Shell to show only the first n columns and last n columns?
awk -F, '{print $1, $2, $(NF-1), $NF}' < input
More generally (per the Question's title), to print the first and last n columns of the input -- without checking to see whether that means printing ...
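The rest of the answer is cut off. A sketch of one way to generalise the one-liner to the first and last n columns could look like this; the variable `n` and the sample line are assumptions, and, as in the excerpt, nothing checks whether the two ranges overlap:

```shell
# n and the sample line are illustrative only.
n=2
line='c1,c2,c3,c4,c5,c6'

result=$(printf '%s\n' "$line" | awk -F, -v n="$n" '{
    out = ""
    # Collect the first n fields, separated by OFS (a space by default).
    for (i = 1; i <= n; i++) out = out (i > 1 ? OFS : "") $i
    # Append the last n fields; this assumes NF >= n.
    for (i = NF - n + 1; i <= NF; i++) out = out OFS $i
    print out
}')
printf '%s\n' "$result"
```

For the sample line this prints `c1 c2 c5 c6`, which matches what the one-liner above produces when n is 2.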
15
votes
Accepted
Is there a command line utility to transpose a csv-file?
ruby -rcsv -e 'puts CSV.parse(STDIN).transpose.map &:to_csv' < in.csv > out.csv
15
votes
Accepted
Compact way to get tab-separated fields into variables
Yes:
while read -r width height size thedate thetime; do
# use variables here
done <file
This will read from standard input and split the data on blanks (spaces or tabs). The last variable ...
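Because the default split is on any blank, a field that itself contains a space would be spread over two variables. A possible variant, assuming the input really is tab-separated, restricts `IFS` to a tab for the duration of `read`. The sample record below is made up:

```shell
# The record values are illustrative only.
tab=$(printf '\t')
record=$(printf '640\t480\t12 KB\t2020-01-01\t10:00:00')

# IFS is set to a single tab only for read, so "12 KB" stays in one variable.
while IFS="$tab" read -r width height size thedate thetime; do
    summary="$width x $height, $size, $thedate $thetime"
done <<EOF
$record
EOF
printf '%s\n' "$summary"
```

A tab still counts as IFS whitespace, so consecutive tabs collapse and empty fields are not preserved by this approach.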
15
votes
How do I keep the first 200 lines of all the csv files in a directory using bash?
Using sed with shell globbing:
sed -ni '1,200p' *.csv
Using globbing/sed/parallel:
printf '%s\n' *.csv | parallel -- sed -ni '1,200p' {}
This will find all .csv files in the current directory and ...
15
votes
How can SQLite command ".import" read directly from standard input?
I found another solution that still uses sqlite3, but that doesn't read /dev/stdin or a temporary named pipe. Instead, it uses .import with the pipe operator to invoke cat - to read directly from ...
15
votes
Rounding many values in a csv to 3 decimals (printf ?)
There is a GNU utility called numfmt, part of the GNU coreutils collection of tools, that looks as if it could be useful here. It allows you to format numerical values, and the following command ...
14
votes
Substitute every comma outside of double quotes for a pipe
Using csvkit:
$ csvformat -D '|' file.csv
John|Tonny|345.3435,23|56th Street
The tools in csvkit know how to handle the intricacies of CSV files, and here we're using csvformat to replace the ...
14
votes
awk script to prepare csv file
awk '
BEGIN {
split("Monday Tuesday Wednesday Thursday Friday Saturday Sunday",days)
FS=OFS=","
}
NR > 1 {
gsub(/"/,"")
...
14
votes
Find and add quotes in between particular string
Using csvformat from csvkit, and assuming that the end result should be a CSV file with comma as delimiter (as described in the text of the question):
$ csvformat -d '|' file
1,"a,b",4
1,&...
13
votes
Any good csv bash utility?
Yes: CSVkit. http://csvkit.readthedocs.io/
CSV is not a standard that has anything to do with Unix, hence there is no "standard" (as in POSIX) utility for working with CSV files.
To vertically ...
13
votes
How to cat all the log files within a range of dates
You can nest brace expansion.
Short and sweet:
cat localhost_log_file.2017-{09-{03..30},10-{01..08}}.txt > totallog.csv
Note that some systems such as macOS use an older version of Bash where ...
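To see what a nested expansion produces before handing it to cat, it can be passed to echo first. The sketch below uses a shortened, made-up file name pattern and a smaller date range than the answer:

```shell
# Brace expansion is a Bash feature (not POSIX sh), so bash is invoked explicitly.
# Zero-padded ranges such as {01..02} need Bash 4 or later.
expanded=$(bash -c 'echo log.2017-{09-{29..30},10-{01..02}}.txt')
printf '%s\n' "$expanded"
```

With Bash 4 or later this prints the four names `log.2017-09-29.txt log.2017-09-30.txt log.2017-10-01.txt log.2017-10-02.txt`, in that order.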
13
votes
Accepted
Adding suffix to filename during for loop in bash
for file in TLC*.csv; do
cut -d, -f2- "${file}" > "${file%.*}_prepped.csv"
done
${file%.*} removes everything after the last dot of $file. If you want to remove everything ...
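The difference between the single and double percent forms shows up on a name with more than one dot. The file name below is made up for illustration:

```shell
# The file name is illustrative only.
file='TLC.2020.data.csv'

# % removes the shortest suffix matching ".*", i.e. from the last dot.
short="${file%.*}"
# %% removes the longest suffix matching ".*", i.e. from the first dot.
shortest="${file%%.*}"

printf '%s\n' "${short}_prepped.csv" "${shortest}_prepped.csv"
```

Here `short` is `TLC.2020.data` and `shortest` is `TLC`.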
13
votes
Grep and Cut command in linux
Although you state awk is not a possibility - for the sake of completeness:
awk -F',' '$9>=1' input.csv
This will instruct awk to consider , as field separator and print only lines where field 9 ...
12
votes
Is there a robust command line tool for processing CSV files?
If you want a visual / interactive tool in the terminal, I wholeheartedly recommend VisiData.
It has frequency tables (shown above), pivot, melting, scatterplots, filtering / computation using Python, ...
12
votes
Accepted
Swap first and second columns in a CSV file
Here you go:
awk -F, '{ print $2 "," $1 }' sampleData.csv
12
votes
Accepted
awk unexpectedly removes dot from string
You seem to have got the quotes wrong. You need to do it as follows:
awk -F"," 'BEGIN { OFS = "," } {$2="\"2.4.0\""; print}' test.csv > output.csv
This is explained in the GNU awk man page - 3.2 Escape ...
12
votes
sort the whole .csv based on the value in a certain column
Using sort:
cat input.csv | (sed -u 1q; sort -t, -r -n -k5)
The sed -u 1q is required to make the sort ignore the header. It basically means, process the 1st line and quit, then pass the remaining ...
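The `-u` option of sed is not specified by POSIX. A possible variant of the same idea, swapped in here and not taken from the answer, lets the shell's `read` builtin consume the header and leaves the rest of the file for sort. The input file and its contents are made up:

```shell
# Scratch directory and file contents are illustrative only.
workdir=$(mktemp -d)
printf 'a,b,c,d,revenue\nx,x,x,x,5\ny,y,y,y,30\nz,z,z,z,7\n' > "$workdir/input.csv"

sorted=$(
    {
        # read takes exactly one line (the header) from the shared input.
        IFS= read -r header
        printf '%s\n' "$header"
        # sort sees only the remaining lines: field 5, numeric, descending.
        sort -t, -k5,5nr
    } < "$workdir/input.csv"
)
printf '%s\n' "$sorted"
rm -rf "$workdir"
```

Writing the key as `-k5,5` limits the comparison to field 5 alone, whereas `-k5` would extend the key to the end of the line.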
12
votes
Accepted
sort the whole .csv based on the value in a certain column
So, you want to sort (stably) on revenue in numerically descending order, which sounds like it should be easy in Miller except that its rules for null handling say:
Records with one or more empty ...
Only top scored, non community-wiki answers of a minimum length are eligible
Related Tags
csv × 948
awk × 381
text-processing × 373
shell-script × 164
sed × 163
linux × 118
bash × 109
shell × 54
command-line × 47
grep × 41
scripting × 37
text-formatting × 30
files × 28
columns × 27
json × 26
sort × 22
regular-expression × 19
jq × 19
perl × 18
date × 18
array × 15
python × 13
split × 13
join × 12
cut × 11