Trim leading and trailing spaces from a string in awk


Shell Problem Overview


I'm trying to remove the leading and trailing spaces in the 2nd column of the input.txt below:

Name, Order  
Trim, working
cat,cat1

I used the awk command below to remove the leading and trailing spaces in the 2nd column, but it is not working. What am I missing?

awk -F, '{$2=$2};1' input.txt

This gives the output as:

Name, Order  
Trim, working
cat,cat1

Leading and trailing spaces are not removed.

Shell Solutions


Solution 1 - Shell

If you want to trim all spaces, only in lines that have a comma, and use awk, then the following will work for you:

awk -F, '/,/{gsub(/ /, "", $0); print} ' input.txt

If you only want to remove spaces in the second column, change the expression to

awk -F, '/,/{gsub(/ /, "", $2); print$1","$2} ' input.txt

Note that gsub replaces every match of the regular expression between the slashes with the second argument, inside the variable given as the third argument, and does so in place. In other words, when it's done, $0 (or $2) has been modified.

Full explanation:

-F,            use comma as field separator 
               (so the thing before the first comma is $1, etc)
/,/            operate only on lines with a comma 
               (this means empty lines are skipped)
gsub(a,b,c)    match the regular expression a, replace it with b, 
               and do all this with the contents of c
print$1","$2   print the contents of field 1, a comma, then field 2
input.txt      use input.txt as the source of lines to process

EDIT: I want to point out that @BMW's solution is better, as it trims only leading and trailing spaces, using two successive gsub calls. To give credit, here is an explanation of how it works:

gsub(/^[ \t]+/,"",$2);    - starting at the beginning (^) replace all (+ = one or more, greedy)
                             consecutive tabs and spaces with an empty string
gsub(/[ \t]+$/,"",$2)}    - do the same, but now for all spaces and tabs up to the end of the string ($)
1                         - ="true". Shorthand for "use default action", which is print $0
                          - that is, print the entire (modified) line
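
To make the difference between the two approaches concrete, here is a minimal sketch. The function names strip_all and trim_ends and the sample line are assumptions made up for illustration; the awk programs themselves are the ones discussed above.

```shell
# Hypothetical sample line: the second column has leading, inner and trailing spaces.
line='id,  two words  '

# First approach: removes every space in $2, including the one between the words.
strip_all() {
  awk -F, '{ gsub(/ /, "", $2); print $1 "," $2 }'
}

# Anchored approach: removes only leading and trailing spaces/tabs in $2.
trim_ends() {
  awk 'BEGIN { FS = OFS = "," } { gsub(/^[ \t]+/, "", $2); gsub(/[ \t]+$/, "", $2) } 1'
}

printf '%s\n' "$line" | strip_all   # prints: id,twowords
printf '%s\n' "$line" | trim_ends   # prints: id,two words
```

The anchored version keeps the space inside "two words", which is usually what "trim" is expected to mean.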

Solution 2 - Shell

Remove leading and trailing whitespace in the 2nd column:

awk 'BEGIN{FS=OFS=","}{gsub(/^[ \t]+/,"",$2);gsub(/[ \t]+$/,"",$2)}1' input.txt

Another way, with a single gsub:

awk 'BEGIN{FS=OFS=","} {gsub(/^[ \t]+|[ \t]+$/, "", $2)}1' input.txt
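
If every column should be trimmed rather than just the second, the single-gsub form can be wrapped in a user-defined awk function and applied in a loop. This is only a sketch: trim_all_fields and trim are hypothetical names, and the sample input is invented.

```shell
# trim_all_fields is a hypothetical helper name, not taken from the answers.
trim_all_fields() {
  awk '
    # Strip leading and trailing spaces/tabs from s and return the result.
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    BEGIN { FS = OFS = "," }
    # Assigning to each field also rebuilds $0 with OFS between fields.
    { for (i = 1; i <= NF; i++) $i = trim($i) }
    1
  '
}

printf ' Name , Order  \nTrim, working\n' | trim_all_fields
# prints:
# Name,Order
# Trim,working
```

Spaces inside a field are left alone, because both alternatives in the regex are anchored to one end of the string.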

Solution 3 - Shell

Warning by @Geoff: see my note below; only one of the suggestions in this answer works (though on both columns).

I would use sed:

sed 's/, /,/' input.txt

This will remove one leading space after the first , on each line. Output:

Name,Order
Trim,working
cat,cat1

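More general might be the following, which handles a tab as well as a space after each ,. Note that \? matches at most one such character; use * in its place to remove a run of several (\? is a GNU sed extension):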

sed 's/,[ \t]\?/,/g' input.txt

It will also work with more than two columns because of the global modifier /g


@Floris asked in discussion for a solution that removes leading and trailing whitespace in each column (even the first and last) while not removing whitespace in the middle of a column:

sed 's/[ \t]\?,[ \t]\?/,/g; s/^[ \t]\+//g; s/[ \t]\+$//g' input.txt

EDIT by @Geoff: I've appended the input file name to this one, and it now removes only the leading and trailing spaces (though from both columns). The other suggestions within this answer don't work. But try: " Multiple spaces , and 2 spaces before here "


IMO sed is the optimal tool for this job. However, here comes a solution with awk because you've asked for that:

awk -F', ' '{printf "%s,%s\n", $1, $2}' input.txt

Another simple solution that comes to mind, if you want to remove all spaces, is tr -d:

cat input.txt | tr -d ' '

Solution 4 - Shell

I just came across this. The correct answer is:

awk 'BEGIN{FS=OFS=","} {gsub(/^[[:space:]]+|[[:space:]]+$/,"",$2)} 1'
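
One practical difference from the [ \t] versions, assuming the file might come from Windows with CRLF line endings: [[:space:]] also matches the carriage return, so a trailing \r is trimmed along with the spaces, whereas [ \t]+$ would leave "Order \r" in place. The sketch below wraps the command in a hypothetical function named trim_col2; it assumes an awk that supports POSIX character classes (very old mawk releases do not).

```shell
# trim_col2 is a hypothetical wrapper name around the command above.
trim_col2() {
  awk 'BEGIN{FS=OFS=","} {gsub(/^[[:space:]]+|[[:space:]]+$/,"",$2)} 1' "$@"
}

# Hypothetical CRLF-terminated line; od -c makes any leftover \r visible.
printf 'Name, Order \r\n' | trim_col2 | od -c
```

With no file argument the function reads standard input; passing input.txt as an argument processes the file instead.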

Solution 5 - Shell

Just use a regex as the field separator:

', *' - for leading spaces

' *,' - for trailing spaces

for both leading and trailing:

awk -F' *,? *' '{print $1","$2}' input.txt

Solution 6 - Shell

The simplest solution is probably to use tr, though it deletes every blank, including any inside a field:

$ cat -A input
^I    Name, ^IOrder  $
  Trim, working  $
cat,cat1^I  

$ tr -d '[:blank:]' < input | cat -A
Name,Order$
Trim,working$
cat,cat1

Solution 7 - Shell

The following seems to work:

awk -F',[[:blank:]]*' '{$2=$2}1' OFS="," input.txt
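
A caveat worth checking before relying on this: the separator regex swallows blanks only after the comma, so trailing blanks at the end of the second column survive. A small sketch, where lead_only is a hypothetical wrapper name and the sample line is the header from the question:

```shell
# lead_only is a hypothetical wrapper name around the command above.
lead_only() {
  awk -F',[[:blank:]]*' '{$2=$2}1' OFS="," "$@"
}

# The sed step appends a | so the surviving trailing spaces are visible.
printf 'Name, Order  \n' | lead_only | sed 's/$/|/'   # prints: Name,Order  |
```

Adding a gsub(/[[:blank:]]+$/, "", $2) before the 1 would be one way to remove the trailing blanks as well.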

Solution 8 - Shell

If it is safe to assume only one set of spaces in column two (as in the original example):

awk '{print $1$2}' /tmp/input.txt

Adding another field, e.g. awk '{print $1$2$3}' /tmp/input.txt will catch two sets of spaces (up to three words in column two), and won't break if there are fewer.

If you have an indeterminate (large) number of space delimited words, I'd use one of the previous suggestions, otherwise this solution is the easiest you'll find using awk.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type         Original Author    Original Content on Stackoverflow
Question             Marjer             View Question on Stackoverflow
Solution 1 - Shell   Floris             View Answer on Stackoverflow
Solution 2 - Shell   BMW                View Answer on Stackoverflow
Solution 3 - Shell   hek2mgl            View Answer on Stackoverflow
Solution 4 - Shell   Ed Morton          View Answer on Stackoverflow
Solution 5 - Shell   Ilya Kharlamov     View Answer on Stackoverflow
Solution 6 - Shell   Fredrik Pihl       View Answer on Stackoverflow
Solution 7 - Shell   Håkon Hægland      View Answer on Stackoverflow
Solution 8 - Shell   Andrew             View Answer on Stackoverflow