Regex expressions in Java, \\s vs. \\s+

JavaRegexStringQuantifiers

Java Problem Overview


What's the difference between the following two expressions?

x = x.replaceAll("\\s", "");
x = x.replaceAll("\\s+", "");

Java Solutions


Solution 1 - Java

The first one matches a single whitespace, whereas the second one matches one or many whitespaces. They're the so-called regular expression quantifiers, and they perform matches like this (taken from the documentation):

Greedy quantifiers
X?	X, once or not at all
X*	X, zero or more times
X+	X, one or more times
X{n}	X, exactly n times
X{n,}	X, at least n times
X{n,m}	X, at least n but not more than m times
 
Reluctant quantifiers
X??	X, once or not at all
X*?	X, zero or more times
X+?	X, one or more times
X{n}?	X, exactly n times
X{n,}?	X, at least n times
X{n,m}?	X, at least n but not more than m times
 
Possessive quantifiers
X?+	X, once or not at all
X*+	X, zero or more times
X++	X, one or more times
X{n}+	X, exactly n times
X{n,}+	X, at least n times
X{n,m}+	X, at least n but not more than m times

Solution 2 - Java

Those two replaceAll calls will always produce the same result, regardless of what x is. However, it is important to note that the two regular expressions are not the same:

  • \\s - matches single whitespace character
  • \\s+ - matches sequence of one or more whitespace characters.

In this case, it makes no difference, since you are replacing everything with an empty string (although it would be better to use \\s+ from an efficiency point of view). If you were replacing with a non-empty string, the two would behave differently.

Solution 3 - Java

First of all you need to understand that final output of both the statements will be same i.e. to remove all the spaces from given string.

However x.replaceAll("\\s+", ""); will be more efficient way of trimming spaces (if string can have multiple contiguous spaces) because of potentially less no of replacements due the to fact that regex \\s+ matches 1 or more spaces at once and replaces them with empty string.

So even though you get the same output from both it is better to use:

x.replaceAll("\\s+", "");

Solution 4 - Java

The first regex will match one whitespace character. The second regex will reluctantly match one or more whitespace characters. For most purposes, these two regexes are very similar, except in the second case, the regex can match more of the string, if it prevents the regex match from failing.  from http://www.coderanch.com/t/570917/java/java/regex-difference

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionmpluseView Question on Stackoverflow
Solution 1 - JavaÓscar LópezView Answer on Stackoverflow
Solution 2 - JavaarshajiiView Answer on Stackoverflow
Solution 3 - JavaanubhavaView Answer on Stackoverflow
Solution 4 - JavaevgenylView Answer on Stackoverflow