I have a big file and need to split it into two files. Suppose the first 1000 lines should be moved into another file and then deleted from the first file.
I tried using split, but it creates multiple chunks.
The easiest way is probably to use head and tail:
$ head -n 1000 input-file > output1
$ tail -n +1001 input-file > output2
That will put the first 1000 lines of input-file into output1, and all lines from line 1001 to the end into output2.
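The question also asks for the lines to be removed from the original file, which the two commands above do not do by themselves. A minimal sketch of one way to get there, assuming it is acceptable to write the remainder to a temporary file and rename it over the original (rest.tmp is a made-up name, and seq is used here only to generate sample data):

```shell
# Sample input for the sketch: 2500 numbered lines.
seq 1 2500 > input-file

# Copy lines 1..1000, write lines 1001..end to a temporary file, and only
# then replace the original, so a failed step leaves input-file untouched.
head -n 1000 input-file > output1 &&
tail -n +1001 input-file > rest.tmp &&
mv rest.tmp input-file

wc -l output1 input-file
```

After this runs, output1 holds the first 1000 lines and input-file holds only what came after them.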
I think that split is your best approach.
Try using the -l xxxx option, where xxxx is the number of lines you want in each file (default is 1000).
You can use the -n yy option if you care more about the number of files created. Using -n 2 will split your file into only 2 parts, no matter how many lines end up in each file. Note that -n is a GNU extension and plain -n 2 divides the file by size, so a line can be cut in the middle; -n l/2 keeps lines whole.
You can count the number of lines in your file with wc -l filename. This is the 'word count' command with the lines option.
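The two ideas can be combined: count the lines with wc -l, then hand split -l a chunk size that yields exactly two pieces. This is only a sketch under assumptions: the prefix half. and the sample data from seq are made up for illustration, and the split point is the middle of the file rather than line 1000.

```shell
# Sample input: 2501 numbered lines.
seq 1 2501 > bigfile

# Round up so that an odd line count still gives two pieces, not three.
lines=$(wc -l < bigfile)
per_piece=$(( (lines + 1) / 2 ))

# Produces half.aa and half.ab
split -l "$per_piece" bigfile half.

wc -l half.*
```

Only -l and the output prefix are used, so this should not depend on GNU-specific options.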
man split
man wc
split -l 1000 bigfile && mv xaa piece1 && cat x?? > piece2 && rm x??
split is what I was looking for
This is a job for csplit:
csplit -s infile 1001
will silently split infile into two pieces: xx00, containing everything up to but not including line 1001, and xx01, containing the remaining lines.
You can play with the options if you need different output file names e.g. using -f and specifying a prefix:
csplit -sf piece. infile 1001
produces two files named piece.00 and piece.01.
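A quick way to convince yourself that csplit cut at the right place is to check that the two pieces concatenate back into the original. A small sketch, assuming a sample infile generated with seq:

```shell
# Sample input: 2500 numbered lines.
seq 1 2500 > infile

# -s: do not print piece sizes; -f piece.: use "piece." as the name prefix
csplit -s -f piece. infile 1001

# The pieces should reassemble byte for byte into the original.
cat piece.00 piece.01 | cmp - infile && echo "pieces reassemble to infile"

wc -l piece.00 piece.01
```

If the original should end up without its first 1000 lines, mv piece.01 infile afterwards would do that.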
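With a smart head, i.e. one that leaves the input positioned right after the last line it prints (GNU head does this when the input is a regular file), you could also do something like: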
{ head -n 1000 > 1st.out; cat > 2nd.out; } < infile
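Whether the grouped command above is safe depends on the head implementation, so it may be worth testing it once on your system before trusting it with real data. A rough check, assuming a sample infile generated with seq:

```shell
# Sample input: 2500 numbered lines.
seq 1 2500 > infile

{ head -n 1000 > 1st.out; cat > 2nd.out; } < infile

# If head read past line 1000 without seeking back, 2nd.out is missing lines
# and the two pieces no longer reassemble into the original.
if cat 1st.out 2nd.out | cmp -s - infile; then
    echo "pieces match the original"
else
    echo "lines were lost: this head reads ahead" >&2
fi
```

The same command fed from a pipe instead of a file cannot seek back, so it is more likely to lose lines there.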
csplit. Very nice. (I'm just reading through the list of POSIX commands and had enormous trouble wrapping my head around the csplit command's purpose at first. Turns out it's really really simple.) :)
A simple way to do what the question asks for, in one command:
awk '{ if (NR <= 1000) print > "piece1"; else print > "piece2"; }' bigfile
or, for those of you who really hate to type long, intuitively comprehensible commands,
awk '{ print > ((NR <= 1000) ? "piece1" : "piece2"); }' bigfile
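If the split point changes often, it can be passed in as an awk variable instead of being hard-coded. A small variation on the command above; the variable name n and the seq-generated sample file are just for illustration:

```shell
# Sample input: 2500 numbered lines.
seq 1 2500 > bigfile

# n is the number of lines that go into piece1; the rest go into piece2.
awk -v n=1000 '{ print > (NR <= n ? "piece1" : "piece2") }' bigfile

wc -l piece1 piece2
```

The parentheses around the conditional expression are kept on purpose, since an unparenthesized expression after > is not handled the same way by every awk.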
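#!/bin/bash
# Split file $1 at byte offset $2: the first $2 bytes go to $3, the rest to $4.
# Note: head -c and tail -c count bytes, not lines; use -n to split by lines.
# The function is named split_at so it does not shadow the split command.
split_at() {
    local n=$2
    local m=$(( n + 1 ))
    head -c "$n" -- "$1" > "$3"
    tail -c "+$m" -- "$1" > "$4"
}
# Split a.txt after its first 10000 bytes
split_at a.txt 10000 b.txt c.txt
# Test: a.txt should equal b.txt followed by c.txt (diff prints nothing if so)
set -x
cat b.txt c.txt > test.txt
diff a.txt test.txt
rm test.txt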
split --help?