Split a file into two

Question

I have a big file that I need to split into two files. The first 1000 lines should be moved into another file and deleted from the original file.

I tried using split, but it creates multiple chunks.

Yes, I have checked it, but it creates multiple files, which I don't need.
– Aravind, Commented Oct 21, 2014 at 16:02

Michael Mrozek · Accepted Answer · 2014-10-21 16:11:24Z

70

The easiest way is probably to use head and tail:

$ head -n 1000 input-file > output1
$ tail -n +1001 input-file > output2

That will put the first 1000 lines from input-file into output1, and all lines from 1001 to the end into output2.
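The question also asks for those 1000 lines to be removed from the original file. A minimal sketch of one way to do that with the same two commands, assuming it is acceptable to write the remainder to a temporary file and rename it over the original (the names first-1000 and input-file.tmp, and the seq-generated sample data, are illustrative and not from the question):

```shell
# Sample data for illustration only: a 2500-line file.
seq 1 2500 > input-file

# Copy the first 1000 lines into a new file.
head -n 1000 input-file > first-1000

# Write the remaining lines to a temporary file, then replace the original
# only if tail succeeded, so a failure does not destroy input-file.
tail -n +1001 input-file > input-file.tmp && mv input-file.tmp input-file

# Show the resulting line counts.
wc -l first-1000 input-file
```

Redirecting tail's output straight back into input-file would not work, because the shell truncates the file before tail gets to read it.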

answered Oct 21, 2014 at 16:11

Michael Mrozek


Funny thing that it works pretty well with 2GB+ files too!

– Gergely Lukacsy, Commented Oct 21, 2020 at 18:12

I think the OP asked for a solution whereby only one new file is created, while the first file is truncated.

– Leslie Krause, Commented Apr 17, 2024 at 19:48

slm · Accepted Answer · 2014-10-21 16:56:27Z

31

I think that split is your best approach.

Try using the -l xxxx option, where xxxx is the number of lines you want in each file (default is 1000).

You can use the -n yy option if you are more concerned about the number of files created. Using -n 2 will split your file into only 2 parts, no matter how many lines end up in each file.

You can count the number of lines in your file with wc -l filename. This is the word count command with the lines option.
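To illustrate why plain split -l produces more than two pieces on a larger file, here is a small sketch on generated sample data (the file name bigfile-demo and the prefix chunk. are made up for this example):

```shell
# Sample data for illustration only: a 2500-line file.
seq 1 2500 > bigfile-demo

# Count the lines first.
wc -l < bigfile-demo

# Split into pieces of at most 1000 lines each: chunk.aa, chunk.ab, ...
split -l 1000 bigfile-demo chunk.

# 2500 lines at 1000 per piece gives three files, not two.
wc -l chunk.*
```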

References

man split
man wc

edited Oct 21, 2014 at 16:56

slm♦


answered Oct 21, 2014 at 16:44

Lucien Raven


This is how to split into a bunch of files with a fixed number of lines, or how to split evenly into a fixed number of files. Is there a way to split into one 1000-line file and one file with everything else? That's what he was asking for; I couldn't find it in the man page

– Michael Mrozek, Commented Oct 21, 2014 at 17:05

You're correct, Michael. I think I took a simplistic view of the question. Your solution is the best one in this case. Another way would be to use the sed command: sed -n '1,1000p' originalfile > first_1000_lines, followed by sed '1,1000d' originalfile > remaining_lines.

– Lucien Raven, Commented Oct 21, 2014 at 17:17

Of course you could do split -l 1000 bigfile && mv xaa piece1 && cat x?? > piece2 && rm x??.

– G-Man Says 'Reinstate Monica', Commented Oct 21, 2014 at 23:40

split is what I was looking for

– Daniel, Commented Apr 8, 2020 at 20:53

split with both the -l and -n options doesn't run ('split: cannot split in more than one way'). The question wanted the file split into 2 parts, but at a specific line: split is the wrong tool for this job. csplit is the correct tool.

– RGD2, Commented Jun 28, 2021 at 23:40

don_crissti · Accepted Answer · 2018-08-26 19:37:46Z

20

This is a job for csplit:

csplit -s infile 1001

will silently split infile into two pieces: xx00, containing everything up to but not including line 1001, and xx01, containing the remaining lines.
You can play with the options if you need different output file names, e.g. using -f to specify a prefix:

csplit -sf piece. infile 1001

produces two files named piece.00 and piece.01
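As a quick check of the line arithmetic, here is a sketch on generated sample data (infile-demo and the part. prefix are illustrative names, not from the answer):

```shell
# Sample data for illustration only: a 2500-line file.
seq 1 2500 > infile-demo

# Split before line 1001: part.00 gets lines 1-1000, part.01 gets the rest.
csplit -s -f part. infile-demo 1001

# Show the line counts and the first line of the second piece.
wc -l part.00 part.01
head -n 1 part.01
```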

With a smart head (one that stops consuming its input right after the last line it prints) you could also do something like:

{ head -n 1000 > 1st.out; cat > 2nd.out; } < infile
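Whether this works depends on head leaving the shared input positioned right after line 1000 rather than reading ahead and discarding the extra data. A hedged sketch that tries the one-pass form, verifies it, and falls back to tail if the pieces do not add up (all file names here are illustrative):

```shell
# Sample data for illustration only: a 2500-line file.
seq 1 2500 > src-demo

# One pass: head takes the first 1000 lines, cat takes whatever is left
# on the same open file.
{ head -n 1000 > 1st-demo.out; cat > 2nd-demo.out; } < src-demo

# Verify that the two pieces concatenate back to the original.
if cat 1st-demo.out 2nd-demo.out | cmp -s - src-demo; then
    echo "one-pass split OK"
else
    # This head consumed input past line 1000; rebuild the second piece.
    echo "head read ahead; falling back to tail" >&2
    tail -n +1001 src-demo > 2nd-demo.out
fi
```

The check costs one extra read of the data, which may be worth it when the behaviour of the local head is unknown.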

edited Aug 26, 2018 at 19:37

answered May 10, 2015 at 22:54

don_crissti


Wow, it really is a job for csplit. Very nice. (I'm just reading through the list of POSIX commands and had enormous trouble wrapping my head around the csplit command's purpose at first. Turns out it's really really simple.) :)

– Wildcard, Commented Nov 2, 2016 at 5:38

G-Man Says 'Reinstate Monica' · Accepted Answer · 2014-10-21 21:59:34Z

6

A simple way to do what the question asks for, in one command:

awk '{ if (NR <= 1000) print > "piece1"; else print > "piece2"; }' bigfile

or, for those of you who really hate to type long, intuitively comprehensible commands,

awk '{ print > ((NR <= 1000) ? "piece1" : "piece2"); }' bigfile
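The split point and output names can also be passed in as variables rather than hard-coded. A minimal sketch, where n, a and b are variable names chosen for this example and the input is generated sample data:

```shell
# Sample data for illustration only: a 2500-line file.
seq 1 2500 > awk-demo-in

# Lines 1..n go to file a, everything after that goes to file b.
awk -v n=1000 -v a=awk-piece1 -v b=awk-piece2 \
    '{ print > (NR <= n ? a : b) }' awk-demo-in

wc -l awk-piece1 awk-piece2
```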

edited Oct 21, 2014 at 21:59

answered Oct 21, 2014 at 21:11

G-Man Says 'Reinstate Monica'


arcadius · Accepted Answer · 2024-09-13 15:44:57Z

0

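#!/bin/bash

# Split file $1 at byte offset $2: the first $2 bytes go to $3, the rest to $4.
# Note that -c counts bytes, not lines; use -n in both commands to split by lines.
# Named split_bytes so it does not shadow the standard split command.
split_bytes() {
    local n=$2
    local m=$(( n + 1 ))
    head -c "$n" -- "$1" > "$3"
    tail -c "+$m" -- "$1" > "$4"
}

# Split a.txt after its first 10000 bytes
split_bytes a.txt 10000 b.txt c.txt

# Test a = b + c (diff prints nothing if they match)
set -x
cat b.txt c.txt > test.txt
diff a.txt test.txt
rm test.txt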

answered Sep 13, 2024 at 15:44

arcadius
