How to remove trailing whitespace of all files recursively?
Problem Overview
How can you remove all trailing whitespace from an entire project, starting at a root directory and covering all files in all folders?
Also, I want to be able to modify the files directly, and not just print everything to stdout.
Bash Solutions
Solution 1 - Bash
Here is a solution for OS X >= 10.6 (Snow Leopard).
It ignores .git and .svn folders and their contents. It also won't leave a backup file.
export LC_CTYPE=C
export LANG=C
find . -not \( -name .svn -prune -o -name .git -prune \) -type f -print0 | perl -0ne 'print if -T' | xargs -0 sed -Ei 's/[[:blank:]]+$//'
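Before running a command that rewrites files in place, it can help to preview which files would be touched. The following is a minimal sketch, not part of the original answer: the function name list_trailing_ws is hypothetical, and it assumes a grep that supports -I (GNU and BSD grep do) to skip files that look binary.

```shell
# Hypothetical helper: list (but do not modify) text files under a directory
# that contain trailing spaces or tabs, skipping .git and .svn directories.
list_trailing_ws() {
  local root="${1:-.}"
  # -prune stops find from descending into version-control directories.
  # grep -I ignores files that look binary; -l prints only matching file names.
  # Note: find may exit non-zero when a grep batch finds no match.
  find "$root" \( -name .git -o -name .svn \) -prune -o -type f \
    -exec grep -lI '[[:blank:]]$' {} +
}

# Usage: list_trailing_ws path/to/project
```

Once the list looks right, the same find expression can feed the sed command shown above.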
Solution 2 - Bash
Use:
find . -type f -print0 | xargs -0 perl -pi.bak -e 's/ +$//'
If you don't want the ".bak" files generated:
find . -type f -print0 | xargs -0 perl -pi -e 's/ +$//'
As a zsh user, you can omit the call to find and instead use:
perl -pi -e 's/ +$//' **/*
Note: To prevent destroying the .git directory, try adding: -not -iwholename '*.git*'.
Solution 3 - Bash
Two alternative approaches which also work with DOS newlines (CR/LF) and do a pretty good job at avoiding binary files:
Generic solution which checks that the MIME type starts with text/:
while IFS= read -r -d '' -u 9
do
    if [[ "$(file -bs --mime-type -- "$REPLY")" = text/* ]]
    then
        sed -i 's/[ \t]\+\(\r\?\)$/\1/' -- "$REPLY"
    else
        echo "Skipping $REPLY" >&2
    fi
done 9< <(find . -type f -print0)
Git repository-specific solution by Mat which uses the -I option of git grep to skip files which Git considers to be binary:
git grep -I --name-only -z -e '' | xargs -0 sed -i 's/[ \t]\+\(\r\?\)$/\1/'
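To see why the expression used in both commands is safe for DOS line endings, it can be tried on a single line. This is a small sketch assuming GNU sed (the \+, \?, \t and \r escapes are GNU extensions); the function name strip_keep_cr is hypothetical.

```shell
# Hypothetical filter: remove trailing spaces/tabs from each line on stdin,
# but keep a carriage return at the end of the line if one is present.
# Assumes GNU sed.
strip_keep_cr() {
  # [ \t]\+   one or more trailing blanks
  # \(\r\?\)  an optional CR, captured and written back as \1
  sed 's/[ \t]\+\(\r\?\)$/\1/'
}

# Usage: printf 'DOS line \t\r\n' | strip_keep_cr | od -c
```

The captured group is what lets CR/LF files keep their line endings while the blanks before the CR are dropped.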
Solution 4 - Bash
In Bash:
find dir -type f -exec sed -i 's/ *$//' '{}' ';'
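Note: If you're working inside a Git repository, try adding -not -iwholename '*/.git/*' so that the contents of the .git directory are skipped. (The pattern must match the whole path, so a bare '.git' would not exclude anything below it.)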
Solution 5 - Bash
This worked for me in OSX 10.5 Leopard, which does not use GNU sed or xargs.
find dir -type f -print0 | xargs -0 sed -i.bak -E "s/[[:space:]]*$//"
Just be careful with this if you have files that need to be excluded (I did)!
You can use -prune or a -not -path test to ignore certain directories or files. For Python files in a git repository, you could use something like:
find dir -not -path '*/.git/*' -iname '*.py'
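The -prune variant mentioned above could be sketched as follows. The function name list_python_files is hypothetical; unlike a -not -path test, -prune prevents find from descending into .git at all.

```shell
# Hypothetical helper: print Python files under a directory without ever
# descending into a .git directory.
list_python_files() {
  local root="${1:-.}"
  # "-name .git -prune" matches the directory and cuts off the traversal;
  # everything else falls through to the tests after -o.
  find "$root" -name .git -prune -o -type f -iname '*.py' -print
}

# Usage: list_python_files dir
```

Replacing -print with -print0 makes the output suitable for the xargs -0 sed pipeline shown earlier.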
Solution 6 - Bash
Ack was made for this kind of task.
It works just like grep, but knows not to descend into places like .svn, .git, .cvs, etc.
ack --print0 -l '[ \t]+$' | xargs -0 -n1 perl -pi -e 's/[ \t]+$//'
Much easier than jumping through hoops with find/grep.
Ack is available via most package managers (as either ack or ack-grep).
It's just a Perl program, so it's also available in a single-file version that you can just download and run. See: Ack Install
Solution 7 - Bash
ex
Try using Ex editor (part of Vim):
$ ex +'bufdo!%s/\s\+$//e' -cxa **/*.*
Note: For recursion (bash4 & zsh), we use a new globbing option (**/*.*). In bash, enable it with shopt -s globstar.
You may add the following function to your .bash_profile:
# Strip trailing whitespaces.
# Usage: trim *.*
# See: https://stackoverflow.com/q/10711051/55075
trim() {
  # "$@" keeps file names containing spaces intact (unquoted $* would split them).
  ex +'bufdo!%s/\s\+$//e' -cxa "$@"
}
sed
For using sed, check: How to remove trailing whitespaces with sed?
find
Use the following script (e.g. remove_trail_spaces.sh) to remove trailing whitespace from the files:
#!/bin/bash
# Script to remove trailing whitespace of all files recursively
# See: https://stackoverflow.com/questions/149057/how-to-remove-trailing-whitespace-of-all-files-recursively
# Note: $OSTYPE is set by bash, hence the bash shebang.
case "$OSTYPE" in
  darwin*) # OSX 10.5 Leopard, which does not use GNU sed or xargs.
    find . -type f -not -iwholename '*.git*' -print0 | xargs -0 sed -i .bak -E "s/[[:space:]]*$//"
    find . -type f -name \*.bak -print0 | xargs -0 rm -v
    ;;
  *)
    find . -type f -not -iwholename '*.git*' -print0 | xargs -0 perl -pi -e 's/ +$//'
    ;;
esac
Run this script from the directory you want to scan. On OSX, it finishes by removing all files ending with .bak.
Or just:
find . -type f -name "*.java" -exec perl -p -i -e "s/[ \t]$//g" {} \;
which is the way recommended by the Spring Framework Code Style.
Solution 8 - Bash
I ended up not using find and not creating backup files.
sed -i '' 's/[[:space:]]*$//g' **/*.*
Depending on the depth of the file tree, this (shorter version) may be sufficient for your needs.
Note that this also processes binary files, for instance.
Solution 9 - Bash
Instead of excluding files, here is a variation of the above that explicitly whitelists the files you want to strip, based on file extension. Feel free to season to taste (the patterns are quoted so that the shell does not expand them before find sees them):
find . \( -name '*.rb' -or -name '*.html' -or -name '*.js' -or -name '*.coffee' -or \
-name '*.css' -or -name '*.scss' -or -name '*.erb' -or -name '*.yml' -or -name '*.ru' \) \
-print0 | xargs -0 sed -i '' -E "s/[[:space:]]*$//"
Solution 10 - Bash
I ended up running this, which is a mix between pojo and adams version.
It will clean both trailing whitespace, and also another form of trailing whitespace, the carriage return:
find . -not \( -name .svn -prune -o -name .git -prune \) -type f \
-exec sed -i -E 's/[[:space:]]+$//' \{} \; \
-exec sed -i 's/\r$//' \{} \;
It won't touch the .git folder if there is one.
Edit: Made it a bit safer after the comment, so that it no longer touches files with ".git" or ".svn" in their path. But beware, it will touch binary files if you've got some. Use -iname "*.py" -or -iname "*.php" after -type f if you only want it to touch e.g. .py and .php files.
Update 2: It now replaces all kinds of spaces at end of line (which means tabs as well).
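A detail worth knowing when stripping carriage returns: sed applies its expression to one line at a time with the newline already removed, so a CR at the end of a line is matched by anchoring it to $, and a pattern that contains \n cannot match within a single line. A minimal portable sketch (the function name crlf_to_lf is hypothetical, and the CR is produced with printf so no GNU-specific \r escape is needed):

```shell
# Hypothetical filter: convert CRLF line endings on stdin to LF.
crlf_to_lf() {
  local cr
  cr=$(printf '\r')    # a literal carriage return character
  sed "s/${cr}\$//"    # drop a CR that sits directly before end of line
}

# Usage: crlf_to_lf < dos.txt > unix.txt
```

Lines that already end in a plain LF pass through unchanged.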
Solution 11 - Bash
This works well. Add or remove --include options for specific file types (quote the pattern so the shell does not expand it):
egrep -rl ' $' --include '*.c' * | xargs sed -i 's/\s\+$//g'
Solution 12 - Bash
Ruby:
irb
Dir['lib/**/*.rb'].each{|f| x = File.read(f); File.write(f, x.gsub(/[ \t]+$/,"")) }
Solution 13 - Bash
- Many other answers use -E. I am not sure why, as that's an undocumented BSD compatibility option. -r should be used instead.
- Other answers use -i ''. That should be just -i (or -i'' if preferred), because -i has the suffix right after.
- Git specific solution:
git config --global alias.check-whitespace \
  '!git diff-tree --check $(git hash-object -t tree /dev/null) HEAD'
git check-whitespace | grep trailing | cut -d: -f1 | sort -u | tr '\n' '\0' | xargs -0 sed --in-place -r -e 's/[ \t]+$//'
The first one registers a git alias check-whitespace which lists the files with trailing whitespaces.
The second one runs sed on them.
I only use \t rather than [:space:] as I don't typically see vertical tabs, form feeds and non-breaking spaces. Your mileage may vary.
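Since -i is the option on which GNU sed (optional suffix attached to -i) and BSD sed (suffix as a separate, required argument) disagree, one way to sidestep the question is to avoid -i entirely. The sketch below is an assumption-laden alternative, not something from the answers here: strip_in_place is a hypothetical name, and it relies only on plain sed plus mktemp.

```shell
# Hypothetical helper: strip trailing spaces and tabs from the given files
# without using sed -i, so it behaves the same with GNU and BSD sed.
strip_in_place() {
  local f tmp
  for f in "$@"; do
    tmp=$(mktemp) || return 1
    # [[:blank:]] matches space and tab only, mirroring the [ \t] choice above.
    if sed 's/[[:blank:]]*$//' "$f" > "$tmp"; then
      cat "$tmp" > "$f"   # rewrite in place, keeping permissions and links
    fi
    rm -f "$tmp"
  done
}

# Usage: strip_in_place file1.txt file2.txt
```

Writing the result back with cat rather than mv keeps the original file's mode and hard links, at the cost of not being atomic.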
Solution 14 - Bash
I use regular expressions. 4 steps:
- Open the root folder in your editor (I use Visual Studio Code).
- Tap the Search icon on the left, and enable the regular expression mode.
- Enter " +\n" in the Search bar and "\n" in the Replace bar.
- Click "Replace All".
This removes all trailing spaces at the end of each line in all files. You can also exclude files that should not be changed.
Solution 15 - Bash
This is what works for me (Mac OS X 10.8, GNU sed installed by Homebrew):
find . -path ./vendor -prune -o \
\( -name '*.java' -o -name '*.xml' -o -name '*.css' \) \
-exec gsed -i -E 's/\t/ /g' \{} \; \
-exec gsed -i -E 's/[[:space:]]*$//' \{} \; \
-exec gsed -i -E 's/\r$//' \{} \;
This removes trailing spaces, replaces tabs with spaces, and replaces Windows CRLF with Unix \n.
What's interesting is that without the g flag on the tab expression I had to run this 3-4 times before all files got fixed, most likely because s/\t/ / replaces only the first tab on each line per run.