146

How can you remove all trailing whitespace from an entire project, starting at a root directory and processing all files in all folders?

Also, I want to be able to modify the files directly, not just print everything to stdout.

2
  • Oh, are you looking for a "portable" solution, or a more OS-specific one? What OS are you using? Commented Sep 29, 2008 at 23:17
  • 3
    I'd love to see a version of this that would work on OS X Snow Leopard, and would ignore .git and .svn folders. Commented Feb 12, 2010 at 19:54

15 Answers

90

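Here is a solution for OS X 10.6 Snow Leopard and later.

It ignores .git and .svn folders and their contents, and it doesn't leave backup files behind.

(export LANG=C LC_CTYPE=C
find . -type d \( -name .svn -o -name .git \) -prune -o -type f -print0 | perl -0ne 'print if -T' | xargs -0 -P "$(sysctl -n hw.ncpu)" sed -E -i '' 's/[[:blank:]]+$//'
)

The enclosing parentheses run the commands in a subshell, so the LANG and LC_CTYPE settings of the current shell are left unchanged.

perl -0ne 'print if -T' keeps only the files that Perl considers text, and -P "$(sysctl -n hw.ncpu)" runs the sed processes in parallel, up to one per CPU core. With GNU sed and coreutils (for example on Linux), use sed -E -i and $(nproc) instead.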


14 Comments

You can make it faster by using \+ instead of * in the pattern. Otherwise the substitution matches on every single line.
You could use [[:blank:]] to remove both tabs and spaces.
In Mountain Lion this returns sed: RE error: illegal byte sequence for me.
For those of you having issues with "illegal byte sequence": Enter export LANG=C and try again
In OS X 10.9 I also needed export LC_CTYPE=C as found here: stackoverflow.com/questions/19242275/…
39

Use:

find . -type f -print0 | xargs -0 perl -pi.bak -e 's/ +$//'

If you don't want the ".bak" files generated:

find . -type f -print0 | xargs -0 perl -pi -e 's/ +$//'

As a zsh user, you can omit the call to find and instead use:

perl -pi -e 's/ +$//' **/*

Note: To avoid corrupting the .git directory, try adding -not -iwholename '*.git*' to the find command.

7 Comments

Don't try this in a git repo, as it can corrupt git's internal storage.
To clarify, it's alright to run this inside a subfolder of a git repo, just not inside any folders that contain git repo(s) as descendants, i.e. not inside any folders that have .git directories, no matter how deeply nested.
Combining this answer with @deepwell's to avoid git/svn issues find . -not \( -name .svn -prune -o -name .git -prune \) -type f -print0 | xargs -0 perl -pi -e 's/ +$//'
There's probably a better way, but I recovered from mangling a git repo with this by cloning out the repo in a separate folder and then doing rsync -rv --exclude=.git repo/ repo2/ after which the local changes in repo were also in the (undamaged) repo2.
I accidentally ran this but it just messed with my .git/index which you can normally fix using stackoverflow.com/a/47109640/1507124
38

Two alternative approaches which also work with DOS newlines (CR/LF) and do a pretty good job at avoiding binary files:

Generic solution which checks that the MIME type starts with text/:

while IFS= read -r -d '' -u 9
do
    if [[ "$(file -bs --mime-type -- "$REPLY")" = text/* ]]
    then
        sed -i 's/[ \t]\+\(\r\?\)$/\1/' -- "$REPLY"
    else
        echo "Skipping $REPLY" >&2
    fi
done 9< <(find . -type f -print0)

Git repository-specific solution by Mat which uses the -I option of git grep to skip files which Git considers to be binary:

git grep -I --name-only -z -e '' | xargs -0 sed -i 's/[ \t]\+\(\r\?\)$/\1/'

3 Comments

So I really like this git solution. It should really be on the top. I don't want to save carriage returns though. But I prefer this to the one I combined in 2010.
My git complains that the -e expression is empty, but it works great using -e '.*'
@okor In GNU sed the suffix option to -i is optional, but in BSD sed it's not. It's strictly speaking not necessary here anyway, so I'll just remove it.
29

In Bash:

find dir -type f -exec sed -i 's/ *$//' '{}' ';'

Note: If you're working in a Git repository, try adding: -not -iwholename '*.git*'.

6 Comments

This generates errors like this for every file found. sed: 1: "dir/file.txt": command a expects \ followed by text
Replacing ';' with \; should work. (Also quotes around {} are not strictly needed).
To remove all whitespace, not just spaces, replace the space character with [[:space:]] in your sed regular expression.
Another side note: This only works with sed versions >= 4; earlier versions do not support in-place editing.
This is a faster and safer variant: find dir -type f -print0 | xargs -r0 sed -i 's/ *$//'
16

Ack was made for this kind of task.

It works just like grep, but knows not to descend into places like .svn, .git, .cvs, etc.

ack --print0 -l '[ \t]+$' | xargs -0 -n1 perl -pi -e 's/[ \t]+$//'

Much easier than jumping through hoops with find/grep.

Ack is available via most package managers (as either ack or ack-grep).

It's just a Perl program, so it's also available in a single-file version that you can just download and run. See: Ack Install

14

This worked for me in OSX 10.5 Leopard, which does not use GNU sed or xargs.

find dir -type f -print0 | xargs -0 sed -i.bak -E "s/[[:space:]]*$//"

Just be careful with this if you have files that need to be excluded (I did)!

You can exclude certain directories or files with -prune or -not -path. For Python files in a Git repository, you could use something like:

find dir -not -path '*/.git/*' -iname '*.py'
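
Because a mistake in the exclusion pattern can damage files, it may help to list what would be touched before editing anything. The following dry-run sketch is an assumption-laden illustration, not part of the original answer: the function name list_trailing_ws is hypothetical, and the -I option (skip binary files) is a GNU/BSD grep extension.

```shell
# Hypothetical dry-run helper: print the names of text files below a
# directory that contain trailing spaces or tabs, without changing them.
list_trailing_ws() {
    # -l prints file names only; -I skips files grep considers binary;
    # -prune keeps find out of the .git directory.
    find "${1:-.}" -type d -name .git -prune -o \
        -type f -exec grep -lIE '[[:blank:]]+$' {} +
}
```

Running list_trailing_ws dir prints one file name per line. Once the list looks right, the same find expression can drive the sed command above.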

4 Comments

Any chance you could clarify this? I'd like a command that will remove trailing whitespace from all files in a directory recursively, while ignoring the ".git" directory. I can't quite follow your example...
If you're using tcsh you'll need to change the double quotes to single quotes. Otherwise, you'll get an "Illegal variable name." error.
GNU sed is similar but you do -i.bak or --in-place=.bak, ending up with a full command of find dir -not -path '*/.git/*' -iname '*.py' -print0 | xargs -0 sed --in-place=.bak 's/[[:space:]]*$//'. Replace dir with the directory in question as the top-level to recurse from.
sed -i .bak ? Shouldn't it be sed -i.bak (without the space)?
9

ex

Try using the Ex editor (part of Vim):

$ ex +'bufdo!%s/\s\+$//e' -cxa **/*.*

Note: For recursion, we use the ** globbing pattern (**/*.*), which is supported by Bash 4+ and zsh. In Bash, enable it with shopt -s globstar.

You may add the following function into your .bash_profile:

# Strip trailing whitespace.
# Usage: trim *.*
# See: https://stackoverflow.com/q/10711051/55075
trim() {
  # "$@" keeps file names containing spaces intact.
  ex +'bufdo!%s/\s\+$//e' -cxa "$@"
}

sed

For using sed, check: How to remove trailing whitespaces with sed?

find

Use the following script (e.g. remove_trail_spaces.sh) to remove trailing whitespace from files:

#!/bin/bash
# Script to remove trailing whitespace of all files recursively
# See: https://stackoverflow.com/questions/149057/how-to-remove-trailing-whitespace-of-all-files-recursively

case "$OSTYPE" in
  darwin*) # OSX 10.5 Leopard, which does not use GNU sed or xargs.
    find . -type f -not -iwholename '*.git*' -print0  | xargs -0 sed -i .bak -E "s/[[:space:]]*$//"
    find . -type f -name \*.bak -print0 | xargs -0 rm -v
    ;;
  *)
    find . -type f -not -iwholename '*.git*' -print0 | xargs -0 perl -pi -e 's/ +$//'
esac

Run this script from the directory that you want to scan. On OS X, it finishes by removing all files ending with .bak.

Or just:

find . -type f -name "*.java" -exec perl -p -i -e "s/[ \t]$//g" {} \;

which is the way recommended by the Spring Framework Code Style.

1 Comment

find . -type f -name "*.java" -exec perl -p -i -e "s/[ \t]$//g" {} \; only removes one trailing space instead of all.
6

I ended up not using find and not creating backup files.

sed -i '' 's/[[:space:]]*$//g' **/*.*

Depending on the depth of the file tree, this (shorter version) may be sufficient for your needs.

Note: this also processes binary files.

2 Comments

For specific files: find . -name '*.rb' | xargs -I{} sed -i '' 's/[[:space:]]*$//g' {}
You don't need the '' parameter for sed; or I might be missing something. I tried it on all files in a given directory, like this: sed -i 's/[[:space:]]*$//g' util/*.m
6

Instead of excluding files, here is a variation of the above that explicitly whitelists the files you want to strip, based on file extension. Feel free to season to taste:

find . \( -name '*.rb' -or -name '*.html' -or -name '*.js' -or -name '*.coffee' -or \
-name '*.css' -or -name '*.scss' -or -name '*.erb' -or -name '*.yml' -or -name '*.ru' \) \
-print0 | xargs -0 sed -i '' -E "s/[[:space:]]*$//"

2 Comments

For this to work for me I needed to add quotes : -name "*.rb*"
on bash / macOS 12.5 i also needed quotes (e.g. -name "*.swift") as @haroldcarr said, to traverse recursively.
5

I ended up running this, which is a mix of pojo's and adams' versions.

It will clean both trailing whitespace, and also another form of trailing whitespace, the carriage return:

find . -not \( -name .svn -prune -o -name .git -prune \) -type f \
  -exec sed -i -E 's/[[:space:]]+$//' \{} \;  \
  -exec sed -i 's/\r$//' \{} \;

It won't touch the .git folder if there is one.

Edit: Made it a bit safer after the comment, not allowing to take files with ".git" or ".svn" in it. But beware, it will touch binary files if you've got some. Use -iname "*.py" -or -iname "*.php" after -type f if you only want it to touch e.g. .py and .php-files.

Update 2: It now replaces all kinds of spaces at end of line (which means tabs as well)
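
A note on the carriage-return step: sed reads its input line by line and removes the newline before applying the script, so a pattern that contains \n at the end of a line has nothing to match. Matching the carriage return at the end of the line is enough. The sketch below illustrates this as a stdin filter. The function name strip_cr is hypothetical, and the carriage return is produced with printf because \r inside a sed pattern is a GNU extension.

```shell
# Hypothetical filter: convert CRLF line endings to LF on stdin.
strip_cr() {
    # Command substitution keeps the CR (only trailing newlines are dropped).
    cr=$(printf '\r')
    # sed never sees the newline itself, so anchoring the CR with $ suffices.
    sed "s/${cr}\$//"
}
```

Used as strip_cr < dosfile > unixfile, it leaves lines that already end in a plain LF unchanged.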

2 Comments

I don't know what's going on, but this totally fubared my git repo and messed with my images. PEOPLE, BE MORE CAREFUL THAN I WAS!
Yes, it will ruin binary files. However, it shouldn't touch your git repo at all, because it skips whatever resides inside a .git-folder. But maybe only if you're in the same folder.
5

1) Many other answers use -E. I am not sure why, as in GNU sed that's an undocumented BSD compatibility option. -r should be used instead.

2) Other answers use -i ''. With GNU sed that should be just -i (or -i'' if preferred), because GNU sed expects the suffix directly after -i, with no space.

3) Git specific solution:

git config --global alias.check-whitespace \
'!git diff-tree --check $(git hash-object -t tree /dev/null) HEAD'

git check-whitespace | grep trailing | cut -d: -f1 | sort -u | tr '\n' '\0' | xargs -0 sed --in-place -r -e 's/[ \t]+$//'

The first one registers a git alias check-whitespace which lists the files with trailing whitespaces. The second one runs sed on them.

I only use [ \t] rather than [[:space:]], as I don't typically see vertical tabs, form feeds and non-breaking spaces. Your mileage may vary.

2 Comments

For the first I get expansion of alias 'check-whitespace' failed; 'git' is not a git command sed: no input files, and then if I remove the "extra git", I get fatal: ambiguous argument '$(git': unknown revision or path not in the working tree.
re: Other answers use -i '' -- This is because sed on macOS needs it in this format.
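
The disagreement about -i and -i '' reflects the two sed families: GNU sed takes an optional suffix attached directly to -i, while BSD sed (as shipped with macOS) requires a separate suffix argument, which may be empty. One way to write a command that works with both is a small wrapper. The sketch below is an assumption, not something from the answers: the function name sed_inplace is hypothetical, and it detects GNU sed by checking whether sed --version succeeds.

```shell
# Hypothetical wrapper: in-place edit with extended regular expressions,
# no backup file, for either GNU or BSD sed.
sed_inplace() {
    if sed --version >/dev/null 2>&1; then
        # GNU sed: suffix is optional and attaches directly to -i.
        sed -i -E "$@"
    else
        # BSD sed: suffix argument is mandatory; '' means no backup.
        sed -i '' -E "$@"
    fi
}
```

For example, sed_inplace 's/[[:space:]]+$//' file1 file2 strips trailing whitespace from both files on either platform.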
5

I use regular expressions. 4 steps:

  1. Open the root folder in your editor (I use Visual Studio Code).
  2. Tap the Search icon on the left, and enable the regular expression mode.
  3. Enter " +\n" in the Search bar and "\n" in the Replace bar.
  4. Click "Replace All".

This removes all trailing spaces at the end of each line in all files. You can also exclude files that should not be changed.

4

This works well. Add or remove --include options for specific file types:

grep -rlE ' $' --include='*.c' . | xargs sed -i 's/\s\+$//'

4

Ruby:

irb
Dir['lib/**/*.rb'].each{|f| x = File.read(f); File.write(f, x.gsub(/[ \t]+$/,"")) }

1

This is what works for me (Mac OS X 10.8, GNU sed installed by Homebrew):

find . -path ./vendor -prune -o \
  \( -name '*.java' -o -name '*.xml' -o -name '*.css' \) \
  -exec gsed -i -E 's/\t/    /g' \{} \; \
  -exec gsed -i -E 's/[[:space:]]*$//' \{} \; \
  -exec gsed -i -E 's/\r$//' \{} \;

This removes trailing whitespace, replaces tabs with spaces, and converts Windows CRLF line endings to Unix \n.

Note the g flag on the tab substitution: without it, sed replaces only the first tab on each line, and I had to run the command 3-4 times before all files were fixed.
