Swap two columns - awk, sed, python, perl

Sed, Awk

Sed Problem Overview


I've got data in a large file (280 columns wide, 7 million lines long!) and I need to swap the first two columns. I think I could do this with some kind of awk for loop that prints $2, $1, then a range up to the last column, but I don't know how to do the range part, and I can't write out print $2, $1, $3...$280! Most of the column-swap answers I've seen here are specific to small files with a manageable number of columns, so I need something that doesn't depend on specifying every column number.

The file is tab delimited:

Affy-id chr 0 pos NA06984 NA06985 NA06986 NA06989

Sed Solutions


Solution 1 - Sed

You can do this by swapping values of the first two fields:

awk ' { t = $1; $1 = $2; $2 = t; print; } ' input_file

Solution 2 - Sed

I tried perreal's answer with Cygwin on a Windows system with a tab-separated file. It didn't work, because awk splits fields on whitespace by default and joins the output fields with a single space.

If you encounter the same problem, try this instead:

awk -F $'\t' ' { t = $1; $1 = $2; $2 = t; print; } ' OFS=$'\t' input_file

The input separator is set with -F $'\t' and the output separator with OFS=$'\t'. To write the result to a file, redirect the output:

awk -F $'\t' ' { t = $1; $1 = $2; $2 = t; print; } ' OFS=$'\t' input_file > output_file
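To see why both separators matter, here is a minimal Python sketch of the same swap. Python is used only for illustration; the function name swap_first_two and the sample row are made up, and sep=None only roughly imitates awk's default whitespace splitting.

```python
def swap_first_two(line, sep=None, out_sep=" "):
    """Swap the first two fields of one line.

    sep=None splits on runs of whitespace (roughly awk's default FS);
    out_sep=" " mirrors awk's default OFS.
    """
    fields = line.rstrip("\n").split(sep)
    fields[0], fields[1] = fields[1], fields[0]
    return out_sep.join(fields)


# Hypothetical row whose third field contains a space.
row = "Affy-id\tchr\tsome value\tpos\n"

# Whitespace split, space join: tabs are lost and "some value" becomes two fields.
print(repr(swap_first_two(row)))
# Tab in, tab out: the row keeps its four tab-separated fields.
print(repr(swap_first_two(row, sep="\t", out_sep="\t")))
```

With the defaults the tabs are replaced by spaces, which is the failure described above; passing the tab for both separators keeps the file tab-delimited.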

Solution 3 - Sed

Try this if you only need the first two columns, swapped (the remaining columns are not printed):

awk '{printf("%s\t%s\n", $2, $1)}' inputfile

Solution 4 - Sed

This might work for you (GNU sed):

sed -i 's/^\([^\t]*\t\)\([^\t]*\t\)/\2\1/' file
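The capture groups in that expression can be sketched in Python with the re module. This is only an illustration of the pattern, not a replacement for the sed command; the names FIRST_TWO and swap_with_regex are made up.

```python
import re

# Group 1: first field plus its trailing tab.
# Group 2: second field plus its trailing tab.
FIRST_TWO = re.compile(r"^([^\t]*\t)([^\t]*\t)")


def swap_with_regex(line):
    # Emit the two groups in reverse order; the rest of the line is untouched.
    return FIRST_TWO.sub(r"\2\1", line, count=1)


print(repr(swap_with_regex("Affy-id\tchr\t0\tpos\n")))
```

Because the second group must end with a tab, a line with only two columns does not match and is left unchanged.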

Solution 5 - Sed

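Have you tried using the cut command? It selects character ranges with -c (or delimited fields with -f), e.g.

cut -c10-20,1-9,21- myhugefile > myrearrangedhugefile

Be aware that cut writes the selected ranges in the order they occur in the input, regardless of the order given in the list, so on its own it cannot swap columns.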

Solution 6 - Sed

This is also easy in perl:

perl -pe 's/^(\S+)\t(\S+)/$2\t$1/;' file > outputfile

Solution 7 - Sed

You could do this in Perl:

perl -F\\t -nlae 'print join("\t", @F[1,0,2..$#F])' inputfile

The -F switch specifies the delimiter. In most shells you need to precede a backslash with another to escape it. Since Perl 5.20, -F implies -a and -a implies -n, so those two switches can be dropped there.

For your problem you wouldn't need -l, because the last column still appears last in the output. In a different situation, where the last column has to appear between other columns, its newline character must be removed first. The -l switch takes care of this.

The "\t" in join can be changed to anything else to produce a different delimiter in the output.

2..$#F specifies a range from 2 until the last column. As you might have guessed, inside the square brackets, you can put any single column or range of columns in the desired order.
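For comparison, the slice and the newline handling can be sketched in Python. The function names reorder and last_column_first are made up for this illustration.

```python
def reorder(line, sep="\t"):
    # Strip the newline first (the job perl's -l does), then split.
    fields = line.rstrip("\n").split(sep)
    # Counterpart of @F[1,0,2..$#F]: second, first, then everything else.
    return sep.join([fields[1], fields[0], *fields[2:]])


def last_column_first(line, sep="\t"):
    # Moving the last column is the case where a leftover newline would
    # otherwise end up in the middle of the row.
    fields = line.rstrip("\n").split(sep)
    return sep.join([fields[-1], *fields[:-1]])


print(repr(reorder("a\tb\tc\td\n")))
print(repr(last_column_first("a\tb\tc\n")))
```

Both functions return the row without a trailing newline, so the caller adds one when writing, much as -l does on output.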

Solution 8 - Sed

No need to call anything else but your shell:

bash> while IFS=$'\t' read -r col1 col2 rest; do
        printf '%s\t%s\t%s\n' "$col2" "$col1" "$rest"
      done <input_file

Test:

bash> printf 'first\tsecond\ta\tb\tc\n' |
      while IFS=$'\t' read -r col1 col2 rest; do
        printf '%s\t%s\t%s\n' "$col2" "$col1" "$rest"
      done
second	first	a	b	c

Solution 9 - Sed

You could also use "inlined" Python, i.e. a Python script embedded in a shell script, but only if you want to do some more scripting with Bash beforehand or afterwards. Otherwise it is unnecessarily complex.

Content of script file process.sh:

#!/bin/bash

# inline Python script
read -r -d '' PYSCR << 'EOSCR'
from __future__ import print_function
import codecs
import sys

encoding = "utf-8"
fn_in = sys.argv[1]
fn_out = sys.argv[2]

# print("Input:", fn_in)
# print("Output:", fn_out)

with codecs.open(fn_in, "r", encoding) as fp_in, \
        codecs.open(fn_out, "w", encoding) as fp_out:
    for line in fp_in:
        # split into two columns and rest
        col1, col2, rest = line.split("\t", 2)
        # swap columns in output
        fp_out.write("{}\t{}\t{}".format(col2, col1, rest))
EOSCR

# ---------------------
# do setup work?
# e. g. list files for processing

# call python script with params
python3 -c "$PYSCR" "$inputfile" "$outputfile"

# do some more processing
# e. g. rename outputfile to inputfile, ...

If you only need to swap the columns for a single file, then you can also just create a single Python script and statically define the filenames. Or just use an answer above.
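A standalone version might look like the following sketch. The function name swap_stream and the two-argument command line are assumptions made for this example, not part of the answer above. It processes the file line by line, so memory use should stay small even for 7 million lines.

```python
import sys


def swap_stream(src, dst):
    """Copy src to dst, swapping the first two tab-separated columns."""
    for line in src:
        # Split off only the first two columns; the rest stays one string.
        parts = line.rstrip("\n").split("\t", 2)
        if len(parts) >= 2:
            parts[0], parts[1] = parts[1], parts[0]
        # Lines with fewer than two columns are written back unchanged.
        dst.write("\t".join(parts) + "\n")


# Assumed usage: python3 swap_columns.py input_file output_file
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], encoding="utf-8") as fp_in, \
            open(sys.argv[2], "w", encoding="utf-8") as fp_out:
        swap_stream(fp_in, fp_out)
```

Note that this sketch always ends each output line with a newline, so a final line that lacked one in the input gains it.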

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type      Original Author
Question          Charley Farley
Solution 1 - Sed  perreal
Solution 2 - Sed  emi-le
Solution 3 - Sed  Pradyumna Sagar
Solution 4 - Sed  potong
Solution 5 - Sed  Robbie Dee
Solution 6 - Sed  Aaron Lawson
Solution 7 - Sed  Loax
Solution 8 - Sed  Franz Meyer
Solution 9 - Sed  E. Körner