Move top 1000 lines from text file to a new file using Unix shell commands


Unix Problem Overview

I want to copy the first 1000 lines of a text file containing more than 50 million entries to a new file, and also delete those lines from the original file.

Is there a way to do this with a single shell command in Unix?

Unix Solutions

Solution 1 - Unix

head -1000 input > output && sed -i '1,+999d' input

For example:

$ cat input 
$ head -3 input > output && sed -i '1,+2d' input
$ cat input 
$ cat output 
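
The same two steps can be wrapped in a small function. This is only a minimal sketch: it assumes GNU sed (for the -i option), and the name move_head is a hypothetical helper, not part of the original answer.

```shell
# move_head N SRC DST: copy the first N lines of SRC to DST,
# then delete those lines from SRC.
# Assumes GNU sed, whose -i option edits the file in place.
move_head() {
    n=$1 src=$2 dst=$3
    # Only run sed if head succeeded, so SRC is not trimmed after a failed copy.
    head -n "$n" "$src" > "$dst" && sed -i "1,${n}d" "$src"
}
```

For the case in the question this would be called as: move_head 1000 input output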

Solution 2 - Unix

head -1000 file.txt > first1000lines.txt
tail --lines=+1001 file.txt > restoffile.txt

Solution 3 - Unix

Out of curiosity, I found a box with a GNU version of sed (v4.1.5) and tested the (uncached) performance of two approaches suggested so far, using an 11M line text file:

$ wc -l input
11771722 input

$ time head -1000 input > output; time tail -n +1000 input > input.tmp; time cp input.tmp input; time rm input.tmp

real	0m1.165s
user	0m0.030s
sys     0m1.130s

real	0m1.256s
user	0m0.062s
sys     0m1.162s

real	0m4.433s
user	0m0.033s
sys     0m1.282s

real	0m6.897s
user	0m0.000s
sys     0m0.159s

$ time head -1000 input > output && time sed -i '1,+999d' input

real	0m0.121s
user	0m0.000s
sys 	0m0.121s

real	0m26.944s
user	0m0.227s
sys 	0m26.624s

This is the Linux I was working with:

$ uname -a
Linux hostname 2.6.18-128.1.1.el5 #1 SMP Mon Jan 26 13:58:24 EST 2009 x86_64 x86_64 x86_64 GNU/Linux

For this test, at least, it looks like sed is slower than the tail approach (27 sec vs ~14 sec).
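
One detail of the timed tail command is worth checking on a small file: tail -n +K starts output at line K, so tail -n +1000 still keeps line 1000, and dropping exactly the first 1000 lines takes tail -n +1001. A minimal sketch, using a hypothetical five-line sample file generated with seq:

```shell
# Build a five-line sample file (the numbers 1 to 5) in a scratch directory.
workdir=$(mktemp -d)
seq 1 5 > "$workdir/sample"

# tail -n +K prints from line K onward, so K = 3 still includes line 3.
from_three=$(tail -n +3 "$workdir/sample" | paste -sd ' ' -)

# To drop exactly the first 3 lines, start at line 4 (N + 1).
from_four=$(tail -n +4 "$workdir/sample" | paste -sd ' ' -)

echo "tail -n +3: $from_three"   # tail -n +3: 3 4 5
echo "tail -n +4: $from_four"    # tail -n +4: 4 5
rm -r "$workdir"
```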

Solution 4 - Unix

This is a one-liner, but it runs four separate commands:

head -1000 file.txt > newfile.txt; tail -n +1001 file.txt > file.txt.tmp; cp file.txt.tmp file.txt; rm file.txt.tmp
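
Because the commands above are joined with ;, each one runs even if the previous one failed, so a failed tail would still be copied over file.txt. A minimal sketch that chains the same four steps with && instead. The function name move_top and the .tmp suffix are illustrative assumptions, not part of the original answer.

```shell
# move_top N SRC DST: move the first N lines of SRC into DST.
# Each step runs only if the previous one succeeded.
move_top() {
    n=$1 src=$2 dst=$3
    tmp="$src.tmp"
    head -n "$n" "$src" > "$dst" &&
        tail -n +"$((n + 1))" "$src" > "$tmp" &&
        cp "$tmp" "$src" &&
        rm "$tmp"
}
```

For the case in the question this would be called as: move_top 1000 file.txt newfile.txt. Using cp rather than mv overwrites file.txt in place, so it keeps its existing inode and permissions, at the cost of writing the data a second time.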

Solution 5 - Unix

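Perl approach (the first 1000 lines go to out via stdout; the remaining lines go via stderr to a temporary file, named in.tmp here as an assumption, which then replaces in):

perl -ne 'if($i<1000) { print; } else { print STDERR; } $i++;' in 1> out 2> in.tmp && mv in.tmp in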
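
The same single-pass idea can also be sketched with awk, which avoids using stderr as a data channel. This is a sketch under assumptions: the function name move_top_awk and the .tmp suffix are hypothetical, and the destination path is assumed to contain no backslashes (awk -v interprets escape sequences).

```shell
# move_top_awk N SRC DST: read SRC once; lines 1..N go to DST, the rest
# go to a temporary file that then replaces SRC.
move_top_awk() {
    n=$1 src=$2 dst=$3
    tmp="$src.tmp"
    : > "$dst"   # make sure DST exists (and is empty) even if SRC is empty
    awk -v n="$n" -v out="$dst" \
        'NR <= n { print > out; next } { print }' "$src" > "$tmp" &&
        mv "$tmp" "$src"
}
```

For the case in the question this would be called as: move_top_awk 1000 in out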

Solution 6 - Unix

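Using a pipe (this only prints the first 10 lines of the file to standard output; it does not write them to a new file or remove them from the original):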

cat en-tl.100.en | head -10


All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type        Original Author      Original Content on Stackoverflow
Question            gagneet              View Question on Stackoverflow
Solution 1 - Unix   moinudin             View Answer on Stackoverflow
Solution 2 - Unix   cletus               View Answer on Stackoverflow
Solution 3 - Unix   Alex Reynolds        View Answer on Stackoverflow
Solution 4 - Unix   Alex Reynolds        View Answer on Stackoverflow
Solution 5 - Unix   piotr                View Answer on Stackoverflow
Solution 6 - Unix   Javid Dadashkarimi   View Answer on Stackoverflow