40
votes
Accepted
Splitting a small file into 512 byte segments changes it, but splitting it in 1k segments doesn't
The order in which find processes files is not deterministic. It may simply be the order returned by the underlying system call, which probably depends on the underlying filesystem structure and can be ...
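If the result has to be reproducible, one option is to sort find's output before using it. A minimal sketch, assuming file names without embedded newlines; the directory and file names are made up for illustration:

```bash
# Scratch directory with a few pieces (hypothetical names).
dir=$(mktemp -d)
printf 'one\n'   > "$dir/part-b"
printf 'two\n'   > "$dir/part-a"
printf 'three\n' > "$dir/part-c"

# Sort the list in the C locale so the order is byte-wise and stable,
# then concatenate the files in that order.
joined=$(find "$dir" -type f | LC_ALL=C sort |
    while IFS= read -r f; do cat "$f"; done)
printf '%s\n' "$joined"
rm -rf "$dir"
```

With the sort in place, the pieces are always read as part-a, part-b, part-c, whatever order the filesystem returns them in.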
35
votes
Accepted
How to split a string by the third .(dot) delimiter
Using sed, for example:
$ echo 'version: 1.8.0.110' | sed 's/\./-/3'
version: 1.8.0-110
Explanation:
sed s/search/replace/x searches for a string and replaces it with another string. x determines which ...
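To see the effect of the trailing number, it can help to compare it with the g flag on the same input. A small sketch reusing the sample string from the answer:

```bash
# Replace only the 3rd dot, as in the answer above.
third=$(echo 'version: 1.8.0.110' | sed 's/\./-/3')
# Replace every dot (g flag), for comparison.
all=$(echo 'version: 1.8.0.110' | sed 's/\./-/g')
printf '%s\n%s\n' "$third" "$all"
```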
31
votes
Split two concatenated files
You'll have to figure out where the gif ends and where the 7z starts.
If you don't know the original size of the gif file, you can try and spot the start of the 7z file which should start with the 7z ...
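A 7z archive begins with the six bytes 37 7A BC AF 27 1C, so one way to find the boundary is to search for that signature. The following is only a sketch: it builds a fake combined file, assumes a grep that supports -a, -b and -o (GNU grep does) and a head that supports -c, and takes the first match, which could be a false positive if the same bytes happen to occur inside the image data.

```bash
work=$(mktemp -d)
# Fake "gif" bytes followed by the 7z signature and a fake payload.
printf 'GIF89a-fake-image-data' > "$work/gif.part"
printf '7z\274\257\047\034payload' > "$work/7z.part"
cat "$work/gif.part" "$work/7z.part" > "$work/combined.bin"

# Byte offset of the first occurrence of the 7z signature.
sig=$(printf '7z\274\257\047\034')
offset=$(LC_ALL=C grep -aboF -- "$sig" "$work/combined.bin" |
    head -n 1 | cut -d: -f1)

# Everything before the offset is the image, the rest is the archive.
head -c "$offset" "$work/combined.bin" > "$work/out.gif"
tail -c +"$((offset + 1))" "$work/combined.bin" > "$work/out.7z"

if cmp -s "$work/out.gif" "$work/gif.part"; then gif_ok=yes; else gif_ok=no; fi
if cmp -s "$work/out.7z" "$work/7z.part"; then sz_ok=yes; else sz_ok=no; fi
echo "offset=$offset gif_ok=$gif_ok 7z_ok=$sz_ok"
rm -rf "$work"
```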
29
votes
What is the state of the art of splitting a binary file by size?
The split command has been part of Unix since the ancient days, and while it was originally a text processing command that split in lines, modern implementations also work with binary files. split -b ...
25
votes
Accepted
Uncompressed file estimation wrong?
This is caused by the size of the field used to store the uncompressed size in gzipped files: it’s only 32 bits, so gzip can only store sizes of files up to 4 GiB. Anything larger is compressed and ...
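The arithmetic behind this can be sketched in the shell: a 32-bit field can only hold the size modulo 2^32. The 5 GiB figure below is a made-up example, and the second half shows one way to get the real size by counting the decompressed bytes (here on a tiny sample, since it requires decompressing the whole stream).

```bash
# A hypothetical 5 GiB input file, in bytes.
actual=$((5 * 1024 * 1024 * 1024))
# What fits in a 32-bit size field: the size modulo 2^32.
stored=$((actual % 4294967296))
echo "actual=$actual stored=$stored"

# Counting the decompressed bytes gives the true size.
sample=$(mktemp)
printf 'hello\n' | gzip -c > "$sample"
true_size=$(gzip -dc "$sample" | wc -c)
echo "true_size=$true_size"
rm -f "$sample"
```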
14
votes
Accepted
Split single line into multiple lines, Newline character missing for all the lines in input file
Using awk
$ awk -v RS='[,\n]' '{a=$0;getline b; getline c; print a,b,c}' OFS=, filename
A,B,C
D,E,F
G,H,I
J,K,L
M,N,O
How it works
-v RS='[,\n]'
This tells awk to use any occurrence of either a ...
10
votes
Splitting string by the first occurrence of a delimiter
Solution in standard bash:
text='id;some text here with possible ; inside'
text2=${text#*;}
text1=${text%"$text2"}
echo "$text1"
#=> id;
echo "$text2"
#=> some text ...
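If the delimiter itself is not wanted in the first part, a variation with two parameter expansions may be enough. A sketch using the same sample string; the variable names first and rest are chosen here for illustration:

```bash
text='id;some text here with possible ; inside'
first=${text%%;*}   # remove the longest suffix that starts at a ';'
rest=${text#*;}     # remove the shortest prefix that ends at a ';'
printf '%s\n' "$first" "$rest"
```

Both expansions are POSIX, so this should also work in sh.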
10
votes
How to split an image vertically using the command line?
Using the "tiles" functionality:
convert image.png -crop 1x5@ out-%d.png
https://www.imagemagick.org/Usage/crop/#crop_tile
9
votes
Cloning/Splitting a serial port (COM) port in Ubuntu
While a previous answer said it cannot be shared, this is partly wrong.
A Linux TTY port can be opened by several applications (if they don't use or check for locks), however data will be ...
9
votes
Split large files into a number of files in unix
You have to do the integer division to get the value for the -l parameter of split. The shell is just fine for integer divisions:
lines_number=$(wc -l < file)
split -l $((lines_number / 10)) file
...
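One detail worth checking with this approach is rounding: shell division rounds down, so when the line count is not a multiple of 10 the leftover lines end up in extra pieces. A sketch comparing rounding down with rounding up, using a made-up 25-line file:

```bash
work=$(mktemp -d)
seq 1 25 > "$work/file"
lines_number=$(wc -l < "$work/file")
mkdir "$work/floor" "$work/ceil"

# Rounding down: 25 / 10 = 2 lines per piece, so more than 10 pieces.
split -l $((lines_number / 10)) "$work/file" "$work/floor/x"
# Rounding up: (25 + 9) / 10 = 3 lines per piece, so at most 10 pieces.
split -l $(( (lines_number + 9) / 10 )) "$work/file" "$work/ceil/x"

set -- "$work"/floor/*; floor_count=$#
set -- "$work"/ceil/*;  ceil_count=$#
echo "floor: $floor_count pieces, ceil: $ceil_count pieces"
rm -rf "$work"
```

Rounding up also avoids passing 0 to split -l when the file has fewer than 10 lines.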
9
votes
Accepted
Save current vim session state and restore it later (e.g. buffers, splits etc.)
You can use vim sessions for that. Just run:
:mksession mysession.vim
and a file will be created in the current directory (called 'mysession.vim'). When you next open vim, you can do:
:source ...
8
votes
What's the best way to join files again after splitting them?
File Splitting
Split By Size
If you want to split a big file into smaller files and choose the name and size of the output files, this is the way.
split -b 500M videos\BigVideoFile.avi SmallFile.
In this ...
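For the joining step the question asks about, cat with a glob is usually sufficient, because the shell expands the glob in sorted order and split's suffixes (aa, ab, ...) sort in the order they were written. A round-trip sketch on made-up sample data; the piece. prefix is arbitrary:

```bash
work=$(mktemp -d)
# 2500 bytes of sample data standing in for a large file.
head -c 2500 /dev/zero | tr '\0' 'x' > "$work/big.bin"

# Split into 1000-byte pieces named piece.aa, piece.ab, ...
split -b 1000 "$work/big.bin" "$work/piece."

# Join the pieces again in suffix order.
cat "$work"/piece.* > "$work/rejoined.bin"

if cmp -s "$work/big.bin" "$work/rejoined.bin"; then same=yes; else same=no; fi
set -- "$work"/piece.*
pieces=$#
echo "pieces=$pieces identical=$same"
rm -rf "$work"
```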
8
votes
How to split a large folder into smaller folders of equal size
Without trying to solve the bin packing problem, you could use a script like this:
#!/bin/bash
directory=${1:-testdir} ...
8
votes
Accepted
rename the filename
First of all, avoid parsing the output of ls. Next, even if you have a good reason to parse ls output (you don't, here), there is no reason to pass it through grep: ls *gz will list only the file and ...
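A glob in a for loop is the usual replacement for parsing ls, since each matching name arrives as one word even if it contains spaces. A sketch with made-up file names:

```bash
work=$(mktemp -d)
: > "$work/plain.gz"
: > "$work/with space.gz"
: > "$work/other.txt"

count=0
for f in "$work"/*gz; do
    [ -e "$f" ] || continue   # the pattern stays literal if nothing matches
    count=$((count + 1))
    printf 'found: %s\n' "${f##*/}"
done
echo "count=$count"
rm -rf "$work"
```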
7
votes
Bash: split multi line input into array
Your attempt is close to the actual solution. The relevant flag can be found in the read help:
$ help read
...
-d delim continue until the first character of DELIM is read, rather
...
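One way the -d flag can be combined with -a to fill an array is sketched below. This is bash-specific, the sample input is made up, and it is not necessarily the exact command the answer goes on to give. Because newline is whitespace in IFS, empty lines would be dropped.

```bash
input='first line
second line
third line'

# -d '' makes read consume everything up to a NUL (here: end of input);
# IFS set to a newline splits that text into one array element per line.
# read returns non-zero when it hits end of input, hence the "|| true".
IFS='
' read -r -d '' -a lines <<< "$input" || true

echo "${#lines[@]}"
echo "${lines[1]}"
```

In bash 4 and later, mapfile -t lines <<< "$input" is an alternative that keeps empty lines.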
6
votes
Accepted
Split and concatenate (create command line arguments from input file)
This depends on what you want to do with the string that you create. It looks like a set of command line options, so I'm going to assume that you want to use it as such together with some utility ...
6
votes
Split file in different records using a loop and give the files new names
You can split the file with the split command. If you want 20k records per file, the command is:
split -l 20000 file1
If you want a specific prefix for the resulting files, use a command like:
split -l ...
6
votes
Accepted
awk: split file by column name and add header row to each file
The solution is to store the header in a separate variable and print it on the first occurrence of a new $1 value (= file name):
awk -F'|' 'FNR==1{hdr=$0;next} {if (!seen[$1]++) print hdr>$1; ...
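Since the command above is cut off, here is a self-contained sketch of the same idea. It is not the answer's exact code: the sample data, the .txt suffix and the dir variable are assumptions made for illustration.

```bash
work=$(mktemp -d)
printf '%s\n' 'name|value' 'alpha|1' 'beta|2' 'alpha|3' > "$work/input.txt"

awk -F'|' -v dir="$work" '
    FNR == 1 { hdr = $0; next }          # remember the header line
    {
        out = dir "/" $1 ".txt"           # one output file per $1 value
        if (!seen[$1]++) print hdr > out  # header first, once per file
        print > out
    }
' "$work/input.txt"

alpha=$(cat "$work/alpha.txt")
beta=$(cat "$work/beta.txt")
printf '%s\n---\n%s\n' "$alpha" "$beta"
rm -rf "$work"
```

With many distinct values in the first column, awk may run out of open files; closing each file after writing and appending with >> is the usual workaround.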
6
votes
I don't really understand expansion and quoting on the shell (zsh/bash)
for f in "$(tmsu files)"; do echo "${f}"; done
That's more like the wrong thing to do than a zsh-ism. The double-quotes tell the shell to keep the result of the expansion as a ...
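The difference is easy to observe by counting loop iterations. In this sketch list_files is a hypothetical stand-in for tmsu files that prints one name per line:

```bash
# Hypothetical stand-in for a command that prints one file name per line.
list_files() { printf '%s\n' 'a file.txt' 'b.txt'; }

# Quoted substitution: the whole output is a single word, so one iteration.
quoted=0
for f in "$(list_files)"; do quoted=$((quoted + 1)); done

# Reading line by line: one iteration per line, spaces preserved.
per_line=0
while IFS= read -r f; do
    per_line=$((per_line + 1))
done <<EOF
$(list_files)
EOF

echo "quoted=$quoted per_line=$per_line"
```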
6
votes
How do I extract some pages of a PDF into another PDF file?
Use the qpdf utility. Example:
qpdf --empty --pages foo.pdf 1-4,6 -- bar.pdf
This will open the input file foo.pdf, take five pages (pages 1 through 4 and also page 6) from that input file and put ...
5
votes
Accepted
How to split a single file into multiple files based on a column in linux?
Awk solution:
awk 'NR==1{ h=$0 }NR>1{ print (!a[$2]++? h ORS $0 : $0) > $2".txt" }' file
NR==1{ h=$0 } - capture the 1st line/record as header line (NR points to a record number, $0 - contains ...
5
votes
Split single line into multiple lines, Newline character missing for all the lines in input file
sed 's/\(\([^,]\+,\)\{3\}\)/\1\n/g;s/,\n/\n/g' filename
I know that you asked for an awk solution, and I'll try to submit one as an edit to this answer, but for me a sed solution was simpler ...
5
votes
Accepted
Split file in different records using a loop and give the files new names
Romeo Ninov already gave you The Right Answer™: use split. But to answer the general case about sed, you could do the same thing with:
i=1;
filelen=$(wc -l < file1)
while [[ $i -le $filelen ]]; do ...
5
votes
How to split a string by the third .(dot) delimiter
Use bash
You do not need an external program to do this, especially since it appears you want to replace the last period with a dash. Bash can handle string manipulations itself.
Presuming you have
...
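As a sketch of what such a pure-shell replacement of the last period could look like (the variable name v and its value are assumptions, since the excerpt is cut off here):

```bash
v='1.8.0.110'
# ${v%.*}  drops the shortest suffix starting at a dot -> 1.8.0
# ${v##*.} drops the longest prefix ending at a dot    -> 110
result="${v%.*}-${v##*.}"
echo "$result"
```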
5
votes
Using split() with awk
This might be what you want:
n=split($7,a," "); print a[n-3], a[n-4], a[n-5]
If not then edit your question to provide concise, testable sample input and expected output that demonstrates ...
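The key point is that split() returns the number of elements, so indices can be counted back from the end. A small sketch on a made-up input line (using $0 instead of $7 to keep it short):

```bash
out=$(echo 'one two three four five six' |
    awk '{ n = split($0, a, " "); print n, a[n-3], a[n-4], a[n-5] }')
echo "$out"
```

Here n is 6, so a[n-3], a[n-4] and a[n-5] are the third, second and first words.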
5
votes
Accepted
Use output of cat with split command and specified output directory
Use - as the input filename. e.g.
cat file.csv | tail -n +2 | split -l 500 - /mnt/outdir
but there's no need for cat here.
tail -n +2 file.csv | split -l 500 - /mnt/outdir
Alternatively, use /dev/...
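Note that the last argument to split is a file name prefix, not a directory, so a path ending in a directory name followed by a slash and a stem places the pieces inside that directory. A sketch with made-up data; the part_ stem is an arbitrary choice:

```bash
work=$(mktemp -d)
mkdir "$work/outdir"
printf '%s\n' 'header' 'r1' 'r2' 'r3' 'r4' 'r5' > "$work/file.csv"

# Skip the header, then write 2-line pieces named part_aa, part_ab, ...
# inside outdir.
tail -n +2 "$work/file.csv" | split -l 2 - "$work/outdir/part_"

set -- "$work"/outdir/part_*
pieces=$#
first_line=$(head -n 1 "$work/outdir/part_aa")
echo "pieces=$pieces first_line=$first_line"
rm -rf "$work"
```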
5
votes
Split a file into multiple files, using the first X positions of a line, using sed or awk
$ cat tst.awk
/ PAGE 1$/ {
close(out)
split(prev,p)
out = p[1]
}
NR > 1 { print prev > out }
{ prev = $0 }
END { print prev > out }
$ awk -f tst.awk file
$ head T*
==> T5271-...
5
votes
Accepted
What is the max size limit for using split and cat combination?
The cat command has practically no limit, and is only bounded by your system capabilities in terms of disk space and RAM.
The error "file size too large" you are seeing comes from the ...
4
votes
Accepted
Removing a field from a comma delimited text with accented chars
Since I'm not sure you specifically need Perl code, here is similar awk code:
awk -F';' -v OFS=';' '{ $NF=""; print }' data.csv
=> This code empties the last field of each line ($NF=""). Input fields (...
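Emptying the last field keeps the separator in front of it. If the separator should go as well, a sed substitution is one alternative. A sketch comparing the two on a made-up line with accented characters:

```bash
line='café;thé;extra'
# Emptying the last field keeps its separator.
emptied=$(printf '%s\n' "$line" | awk -F';' -v OFS=';' '{ $NF=""; print }')
# Deleting from the last ';' to the end removes field and separator.
removed=$(printf '%s\n' "$line" | sed 's/;[^;]*$//')
printf '%s\n%s\n' "$emptied" "$removed"
```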
4
votes
split file into N pieces with same name but different target directories
#!/bin/bash
# assuming the file is in the same folder as the script
INPUT=large_file.txt
# assuming the folder called "output" is in the same folder
# as the script and there are folders that have ...