Is there a way to ignore header lines in a UNIX sort?

Unix Problem Overview

I have a fixed-width-field file which I'm trying to sort using the UNIX (Cygwin, in my case) sort utility.

The problem is there is a two-line header at the top of the file which is being sorted to the bottom of the file (as each header line begins with a colon).

Is there a way to tell sort either "pass the first two lines across unsorted" or to specify an ordering which sorts the colon lines to the top? The remaining lines always start with a 6-digit number (which is actually the key I'm sorting on), if that helps.

Example:

:0:12345
:1:6:2:3:8:4:2
010005TSTDOG_FOOD01
500123TSTMY_RADAR00
222334NOTALINEOUT01
477821USASHUTTLES21
325611LVEANOTHERS00

should sort to:

:0:12345
:1:6:2:3:8:4:2
010005TSTDOG_FOOD01
222334NOTALINEOUT01
325611LVEANOTHERS00
477821USASHUTTLES21
500123TSTMY_RADAR00

Unix Solutions

Solution 1 - Unix

(head -n 2 <file> && tail -n +3 <file> | sort) > newfile

The parentheses create a subshell, wrapping up the stdout so you can pipe it or redirect it as if it had come from a single command.
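
As a rough sketch of how this generalizes, the same head/tail pairing can be wrapped in a small function. The name sort_keep_header and its argument order are assumptions made for illustration, not part of the original answer. It needs a regular file, because the file is read twice.

```shell
# Hypothetical helper (name and argument order are illustrative):
#   sort_keep_header N FILE [sort options...]
# Prints the first N lines of FILE unchanged, then the remaining lines sorted.
# The body runs in a subshell (parentheses) so n and file stay local.
sort_keep_header() (
    n=$1
    file=$2
    shift 2
    head -n "$n" "$file" &&
        tail -n "+$((n + 1))" "$file" | sort "$@"
)

# Demo with the data from the question.
demo=$(mktemp)
cat > "$demo" <<'EOF'
:0:12345
:1:6:2:3:8:4:2
010005TSTDOG_FOOD01
500123TSTMY_RADAR00
222334NOTALINEOUT01
477821USASHUTTLES21
325611LVEANOTHERS00
EOF
sort_keep_header 2 "$demo"
rm -f "$demo"
```

Any extra arguments are handed to sort, so sort_keep_header 2 data.txt -r would reverse the body while leaving the header in place.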

Solution 2 - Unix

If you don't mind using awk, you can take advantage of awk's built-in ability to pipe output to a command, e.g.:

extract_data | awk 'NR<3{print $0;next}{print $0| "sort -r"}'

This prints the first two lines verbatim and pipes the rest through sort -r (reverse order; drop the -r for an ascending sort).

Note that this has the very specific advantage of being able to selectively sort parts of a piped input. The head/tail approaches only work on plain files that can be read more than once; this works on any input stream.
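
A minimal sketch of the same idea with the header length passed in as a parameter. The function name awk_keep_header and the awk variable n are assumptions for illustration. The fflush() call is a precaution so the header is written out before sort emits anything.

```shell
# Hypothetical wrapper: some_command | awk_keep_header N
# Prints the first N input lines unchanged and sorts the rest (ascending).
awk_keep_header() {
    awk -v n="$1" '
        NR <= n { print; fflush(); next }   # header: print immediately and flush
                { print | "sort" }          # body: send every other line to one sort process
    '
}

printf '%s\n' ':0:12345' ':1:6:2:3:8:4:2' '500123TSTMY_RADAR00' '010005TSTDOG_FOOD01' |
    awk_keep_header 2
```

awk closes the pipe to sort when it exits, which is the point at which sort writes its result.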

Solution 3 - Unix

In simple cases, sed can do the job elegantly:

    your_script | (sed -u 1q; sort)

or equivalently,

    cat your_data | (sed -u 1q; sort)

The key is in the 1q -- print first line (header) and quit (leaving the rest of the input to sort).

For the example given, 2q will do the trick.

The -u switch (unbuffered) is required for those seds (notably, GNU's) that would otherwise read the input in chunks, thereby consuming data that you want to go through sort instead.

Solution 4 - Unix

Here is a version that works on piped data:

(read -r; printf "%s\n" "$REPLY"; sort)

If your header has multiple lines:

(for i in $(seq $HEADER_ROWS); do read -r; printf "%s\n" "$REPLY"; done; sort)

This solution is from here
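
The read loop above can also be sketched as a reusable function that takes the header length and the command to run on the rest of the input. The name keep_header and its calling convention are assumptions for illustration only.

```shell
# Hypothetical helper: keep_header N COMMAND [ARGS...]
# Copies the first N lines of stdin straight to stdout, then runs COMMAND
# on whatever input is left.
keep_header() {
    n=$1
    shift
    # IFS= and -r keep leading whitespace and backslashes intact.
    while [ "$n" -gt 0 ] && IFS= read -r line; do
        printf '%s\n' "$line"
        n=$((n - 1))
    done
    "$@"
}

printf '%s\n' ':0:12345' ':1:6:2:3:8:4:2' '500123TSTMY_RADAR00' '010005TSTDOG_FOOD01' |
    keep_header 2 sort
```

Because the command is passed as arguments, the same helper works for keep_header 2 sort -r or for any other filter that reads standard input.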

Solution 5 - Unix

You can use tail -n +3 <file> | sort ... (tail outputs the file contents from the 3rd line onward). Note that this emits only the sorted body; the two header lines are not included in the output.

Solution 6 - Unix

head -2 <your_file> && nawk 'NR>2' <your_file> | sort

example:

> cat temp
10
8
1
2
3
4
5
> head -2 temp && nawk 'NR>2' temp | sort -r
10
8
5
4
3
2
1

Solution 7 - Unix

It only takes 2 lines of code...

head -1 test.txt > a.tmp; 
tail -n+2 test.txt | sort -n >> a.tmp;

For numeric data, -n is required. For an alphabetical sort, omit it.

Example file:
$ cat test.txt

>header
8
5
100
1
-1

Result:
$ cat a.tmp

>header
-1
1
5
8
100
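
To illustrate the note on -n, here is a small sketch that runs the same values through sort with and without the flag. Setting LC_ALL=C for the text sort is an addition made here so the ordering does not depend on the locale.

```shell
# Same values as the example file, without the header line.
values=$(printf '%s\n' 8 5 100 1 -1)

# Byte-wise text comparison; LC_ALL=C pins the collation order.
printf '%s\n' "$values" | LC_ALL=C sort

# Numeric comparison.
printf '%s\n' "$values" | sort -n
```

The text sort should list the values as -1, 1, 100, 5, 8 (100 lands before 5 because the comparison is character by character), while sort -n gives -1, 1, 5, 8, 100.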

Solution 8 - Unix

Here's a bash function that takes exactly the same arguments as sort and supports both files and pipes.

function skip_header_sort() {
    local file=
    if [[ $# -gt 0 ]] && [[ -f ${@: -1} ]]; then
        file=${@: -1}
        set -- "${@:1:$(($#-1))}"
    fi
    # The header line is printed as-is; everything else goes to sort with the given options.
    awk -v sargs="$*" 'NR<2{print; next}{print | "sort "sargs}' ${file:+"$file"}
}

How it works. This line checks that there is at least one argument and that the last argument is a file.

    if [[ $# -gt 0 ]] && [[ -f ${@: -1} ]]; then

This saves the file name in a separate variable, since we're about to remove the last argument.

        file=${@: -1}

Here we remove the last argument, since we don't want to pass it to sort as an option.

        set -- "${@:1:$(($#-1))}"

Finally, we do the awk part, passing the arguments (minus the last one if it was the file) to sort inside awk. This was originally suggested by Dave and modified to take sort arguments. When input is piped, file is empty, so ${file:+"$file"} expands to nothing and awk reads standard input; when a file is given, the quoting keeps a name containing spaces intact.

    awk -v sargs="$*" 'NR<2{print; next}{print | "sort "sargs}' ${file:+"$file"}

Example usage with a comma separated file.

$ cat /tmp/test
A,B,C
0,1,2
1,2,0
2,0,1

# SORT NUMERICALLY SECOND COLUMN
$ skip_header_sort -t, -nk2 /tmp/test
A,B,C
2,0,1
0,1,2
1,2,0

# SORT REVERSE NUMERICALLY THIRD COLUMN
$ cat /tmp/test | skip_header_sort -t, -nrk3
A,B,C
0,1,2
2,0,1
1,2,0

Solution 9 - Unix

With Python:

import sys

HEADER_ROWS = 2

# Copy the header lines through untouched.
for _ in range(HEADER_ROWS):
    sys.stdout.write(next(sys.stdin))
# Sort the remaining lines as text and print them.
for row in sorted(sys.stdin):
    sys.stdout.write(row)

Solution 10 - Unix

Here's a bash shell function derived from the other answers. It handles both files and pipes. The first argument is the file name or '-' for stdin. The remaining arguments are passed to sort. A couple of examples:

$ hsort myfile.txt
$ head -n 100 myfile.txt | hsort -
$ hsort myfile.txt -k 2,2 | head -n 20 | hsort - -r

The shell function:

hsort ()
{
   if [ "$1" == "-h" ]; then
       echo "Sort a file or standard input, treating the first line as a header.";
       echo "The first argument is the file or '-' for standard input. Additional";
       echo "arguments to sort follow the first argument, including other files.";
       echo "File syntax : $ hsort file [sort-options] [file...]";
       echo "STDIN syntax: $ hsort - [sort-options] [file...]";
       return 0;
   elif [ -f "$1" ]; then
       local file=$1;
       shift;
       (head -n 1 "$file" && tail -n +2 "$file" | sort "$@");
   elif [ "$1" == "-" ]; then
       shift;
       (read -r; printf "%s\n" "$REPLY"; sort "$@");
   else
       >&2 echo "Error. File not found: $1";
       >&2 echo "Use either 'hsort <file> [sort-options]' or 'hsort - [sort-options]'";
       return 1 ;
   fi
}

Solution 11 - Unix

This is the same approach as Ian Sherbin's answer, but my implementation is:

cut -d'|' -f3,4,7 $arg1 | uniq > filetmp.tc
head -1 filetmp.tc > file.tc;
tail -n+2 filetmp.tc | sort -t"|" -k2,2 >> file.tc;

Solution 12 - Unix

cat file_name.txt | sed 1d | sort

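This sorts the body but discards the header: sed 1d deletes the first line before sort sees it. For the two-line header in the question, use sed 1,2d instead. Use this only when the header is not needed in the output.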

Content Type	Original Author	Original Content on Stackoverflow
Question	Rob Gilliam	View Question on Stackoverflow
Solution 1 - Unix	BobS	View Answer on Stackoverflow
Solution 2 - Unix	Dave	View Answer on Stackoverflow
Solution 3 - Unix	Andrea	View Answer on Stackoverflow
Solution 4 - Unix	freeseek	View Answer on Stackoverflow
Solution 5 - Unix	Anton Kovalenko	View Answer on Stackoverflow
Solution 6 - Unix	Vijay	View Answer on Stackoverflow
Solution 7 - Unix	Ian Sherbin	View Answer on Stackoverflow
Solution 8 - Unix	flu	View Answer on Stackoverflow
Solution 9 - Unix	crusaderky	View Answer on Stackoverflow
Solution 10 - Unix	JonDeg	View Answer on Stackoverflow
Solution 11 - Unix	Bik	View Answer on Stackoverflow
Solution 12 - Unix	Sathish G	View Answer on Stackoverflow