How to crop (cut) text files based on starting and ending line numbers in Cygwin?

Linux, Command Line, Cygwin

Linux Problem Overview


I have a few log files of around 100 MB each. Personally, I find it cumbersome to deal with such big files. I know that the log lines that interest me span only about 200 to 400 lines.

What would be a good way to extract the relevant log lines from these files, i.e. I just want to pipe a range of line numbers to another file?

For example, the inputs are:

filename: MyHugeLogFile.log
Starting line number: 38438
Ending line number:   39276

Is there a command that I can run in Cygwin to cat out only that range in that file? I know that if I can somehow display that range on stdout, then I can also pipe it to an output file.

Note: Adding the Linux tag for more visibility, but I need a solution that works in Cygwin. (Usually Linux commands do work in Cygwin.)

Linux Solutions


Solution 1 - Linux

Sounds like a job for sed:

sed -n '8,12p' yourfile

...will send lines 8 through 12 of yourfile to standard out.

If you want to prepend the line number, you may wish to use cat -n first:

cat -n yourfile | sed -n '8,12p'
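
Applied to the file and range from the question, a minimal sketch could look like this (extracted.log is just an example output name; the extra 39276q command makes sed quit right after printing the last wanted line, so it doesn't scan the rest of the 100 MB file):

sed -n '38438,39276p;39276q' MyHugeLogFile.log > extracted.log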

Solution 2 - Linux

You can use wc -l to figure out the total # of lines.

You can then combine head and tail to get at the range you want. Let's assume the log is 40,000 lines and you want lines 38438 through 39276: that range is within the last 40000 - 38438 + 1 = 1563 lines, and of those you want the first 39276 - 38438 + 1 = 839. So:

tail -n 1563 MyHugeLogFile.log | head -n 839 | ....

Or there's probably an easier way using sed or awk.
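
Indeed, awk handles this in one step. A sketch using the question's numbers (extracted.log is just an example output name; exiting after the end line stops awk from reading the rest of the file):

awk 'NR > 39276 { exit } NR >= 38438' MyHugeLogFile.log > extracted.log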

Solution 3 - Linux

I saw this thread when I was trying to split a file into files of 100,000 lines each. A better solution than sed for that is:

split -l 100000 database.sql database-

It will give files like:

database-aa
database-ab
database-ac
...
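
If you prefer numbered chunks, GNU split also accepts -d for numeric suffixes and -a to set the suffix length, for example:

split -l 100000 -d -a 3 database.sql database-

which produces database-000, database-001, database-002, and so on.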

Solution 4 - Linux

And if you simply want to cut part of a file, say from line 26 to line 142, and send it to a new file:

cat file-to-cut.txt | sed -n '26,142p' >> new-file.txt

(Note that >> appends to new-file.txt if it already exists; use > to overwrite it.)

Solution 5 - Linux

How about this:

$ seq 1 100000 | tail -n +10000 | head -n 10
10000
10001
10002
10003
10004
10005
10006
10007
10008
10009

It uses tail to output everything from the 10,000th line onwards, and then head to keep only 10 lines.

The same (almost) result with sed:

$ seq 1 100000 | sed -n '10000,10010p'
10000
10001
10002
10003
10004
10005
10006
10007
10008
10009
10010

This one has the advantage of allowing you to input the line range directly.
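
Applying the same idea to the range from the question, and letting the shell do the line-count arithmetic (a sketch; extracted.log is just an example output name):

tail -n +38438 MyHugeLogFile.log | head -n $((39276 - 38438 + 1)) > extracted.log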

Solution 6 - Linux

If you are interested only in the last X lines, you can use the "tail" command like this:

$ tail -n XXXXX yourlogfile.log >> mycroppedfile.txt

This will save the last XXXXX lines of your log file to a new file called "mycroppedfile.txt". (Note that >> appends if the file already exists; use > to overwrite it.)

Solution 7 - Linux

This is an old thread but I was surprised nobody mentioned grep. The -A option allows specifying a number of lines to print after a search match and the -B option includes lines before a match. The following command would output 10 lines before and 10 lines after occurrences of "my search string" in the file "mylogfile.log":

grep -A 10 -B 10 "my search string" mylogfile.log

If there are multiple matches within a large file the output can rapidly get unwieldy. Two helpful options are -n which tells grep to include line numbers and --color which highlights the matched text in the output.

If there is more than one file to be searched, grep allows multiple files to be listed, separated by spaces. Wildcards can also be used. Putting it all together:

grep -A 10 -B 10 -n --color "my search string" *.log someOtherFile.txt
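
This also combines nicely with the other answers: use grep -n to locate the line number of an interesting event, then cut an exact range with sed. A rough sketch (the search string is made up for illustration, and extracted.log is just an example output name):

grep -n "interesting event" MyHugeLogFile.log
# suppose this reports a match at line 38438; then extract the range around it
sed -n '38438,39276p' MyHugeLogFile.log > extracted.log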

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type         | Original Author              | Original Content on Stackoverflow
Question             | bits                         | View Question on Stackoverflow
Solution 1 - Linux   | johnsyweb                    | View Answer on Stackoverflow
Solution 2 - Linux   | David                        | View Answer on Stackoverflow
Solution 3 - Linux   | Dorian                       | View Answer on Stackoverflow
Solution 4 - Linux   | Marc Perrin-Pelletier        | View Answer on Stackoverflow
Solution 5 - Linux   | thkala                       | View Answer on Stackoverflow
Solution 6 - Linux   | Jose Antonio Escobar Garcia  | View Answer on Stackoverflow
Solution 7 - Linux   | hbolingbroke                 | View Answer on Stackoverflow