What regex can match sequences of the same character?
Tags: Regex, Perl
Regex Problem Overview
A friend asked me this and I was stumped: Is there a way to craft a regular expression that matches a sequence of the same character? E.g., match on 'aaa', 'bbb', but not 'abc'?
m|\w{2,3}|
Wouldn't do the trick as it would match 'abc'.
m|a{2,3}|
Wouldn't do the trick as it wouldn't match 'bbb', 'ccc', etc.
Regex Solutions
Solution 1 - Regex
Sure thing! Grouping and references are your friends:
(.)\1+
Will match two or more consecutive occurrences of the same character. For word-constituent characters only, use \w instead of ., i.e.:
(\w)\1+
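To see the difference between the two patterns on the question's own examples, here is a minimal sketch in Python, whose re module accepts the same (.)\1+ syntax. The helper name has_run and the extra sample '@@@' are my own additions for illustration.

```python
import re

# (.)\1+ : any character followed by one or more copies of itself
ANY_RUN = re.compile(r"(.)\1+")
# (\w)\1+ : same idea, restricted to word characters
WORD_RUN = re.compile(r"(\w)\1+")


def has_run(pattern, text):
    """Return True if text contains two or more identical characters in a row."""
    return pattern.search(text) is not None


for sample in ["aaa", "bbb", "abc", "@@@"]:
    # Prints the sample, then the result for the "." and "\w" variants.
    print(sample, has_run(ANY_RUN, sample), has_run(WORD_RUN, sample))
```

Only the . variant reports a run for '@@@', since @ is not a word character.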
Solution 2 - Regex
Note that in Perl 5.10 we have alternative notations for backreferences as well.
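use 5.010;    # enables say (Perl 5.10+)
foreach (qw(aaa bbb abc)) {
    say;                                         # print the word under test
    say ' original' if /(\w)\1+/;                # classic numbered backreference
    say ' new way'  if /(\w)\g{1}+/;             # absolute \g{N} form
    say ' relative' if /(\w)\g{-1}+/;            # group just before this point
    say ' named'    if /(?'char'\w)\g{char}+/;   # named group, \g{name}
    say ' named'    if /(?<char>\w)\k<char>+/;   # named group, \k<name>
}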
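For comparison, other engines spell named backreferences differently. A rough Python sketch of the numbered and named forms follows; Python writes them as (?P<char>...) and (?P=char), and as far as I know its re module has no counterpart to the relative \g{-1} form inside a pattern. The function name classify is just an illustrative helper.

```python
import re

# Numbered backreference, same as /(\w)\1+/ in Perl
numbered = re.compile(r"(\w)\1+")
# Named group plus named backreference, Python spelling
named = re.compile(r"(?P<char>\w)(?P=char)+")


def classify(word):
    """Return the labels of the patterns that find a repeated character in word."""
    hits = []
    if numbered.search(word):
        hits.append("numbered")
    if named.search(word):
        hits.append("named")
    return hits


for word in ["aaa", "bbb", "abc"]:
    print(word, classify(word))
```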
Solution 3 - Regex
This will match more than \w would, like @@@:
/(.)\1+/
Solution 4 - Regex
Answering my own question, but got it:
m|(\w)\1+|
Solution 5 - Regex
This is what backreferences are for.
m/(\w)\1\1/
will do the trick for three identical word characters in a row.
Solution 6 - Regex
This is also possible using pure regular expressions (i.e. those that describe regular languages -- not Perl regexps). Unfortunately, it means a regexp whose length is proportional to the size of the alphabet, e.g.:
(a* + b* + ... + z*)
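Where a...z are the symbols in the finite alphabet and + denotes alternation. (As written, each branch also matches the empty string and a single symbol; write aaa* instead of a* to require two or more.)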
So Perl regexps, although a superset of pure regular expressions, definitely have their advantages even when you just want to use them for pure regular expressions!
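The "one branch per symbol" idea can be generated rather than typed out. A small sketch in Python, assuming a lowercase ASCII alphabet and using a hypothetical helper build_run_pattern; each branch here requires at least two copies of its symbol, and no backreference appears anywhere in the result.

```python
import re
import string


def build_run_pattern(alphabet):
    """Build an alternation with one branch per symbol, each matching two or more copies."""
    branches = []
    for symbol in alphabet:
        escaped = re.escape(symbol)  # protect symbols that are regex metacharacters
        branches.append(escaped + escaped + "+")  # e.g. "aa+" means two or more a's
    return "|".join(branches)


# Assumed alphabet: the 26 lowercase ASCII letters
pattern_text = build_run_pattern(string.ascii_lowercase)
run_without_backref = re.compile(pattern_text)

print(len(pattern_text))  # grows linearly with the alphabet size
print(bool(run_without_backref.search("abbbc")), bool(run_without_backref.search("abc")))
```

The pattern length is what you pay for staying within regular languages: double the alphabet and the pattern doubles too.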
Solution 7 - Regex
For the same character 3 times in a row:
/(.)\1\1/
/(.)\1{2}/
For 2 characters:
/(.)\1/
For a run of unknown length (use + instead of * to require at least two):
/(.)\1*/
PS: I use JavaScript
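The backreference patterns behave the same way outside JavaScript. As an illustration of the "unknown number" case, here is a Python sketch (the names runs and long_runs are mine) that uses (.)\1* to split a string into its maximal runs and then keeps only the longer ones.

```python
import re


def runs(text):
    """Split text into maximal runs of identical characters."""
    # (.)\1* matches one character plus as many copies of it as follow.
    return [match.group(0) for match in re.finditer(r"(.)\1*", text)]


def long_runs(text, minimum=2):
    """Keep only the runs that are at least `minimum` characters long."""
    return [run for run in runs(text) if len(run) >= minimum]


print(runs("aaabccdd"))       # ['aaa', 'b', 'cc', 'dd']
print(long_runs("aaabccdd"))  # ['aaa', 'cc', 'dd']
```

Note that . does not match a newline by default, so newlines are skipped rather than reported as runs.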
Solution 8 - Regex
".*(.)\\1{2,}.*"
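Works for any symbol repeated in the string. Note that \\1{2,} requires two or more repeats after the captured symbol, i.e. three or more in a row; use \\1+ for two or more.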
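The counting is easy to get wrong, so here is a quick check in Python. This is a sketch under two assumptions: the pattern is meant for a whole-string match (hence the leading and trailing .*), and in a Python raw string the backslash is written once. Counting the captured character, \1{2,} needs three in a row while \1+ needs two.

```python
import re

# Captured character plus two or more repeats: at least three in a row
THREE_PLUS = re.compile(r".*(.)\1{2,}.*")
# Captured character plus one or more repeats: at least two in a row
TWO_PLUS = re.compile(r".*(.)\1+.*")


def whole_string_has_run(pattern, text):
    """Return True if the pattern matches the entire text."""
    return pattern.fullmatch(text) is not None


for sample in ["xaaay", "xaay", "xyz"]:
    print(sample,
          whole_string_has_run(THREE_PLUS, sample),
          whole_string_has_run(TWO_PLUS, sample))
```

With a search-style call instead of a full match, the surrounding .* could be dropped.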
Solution 9 - Regex
If you are using Java and want to find consecutive duplicate letters in a given string, here is the code:
public class Test {
    public static void main(String[] args) {
        String s = "abbc";
        // matches() tests the whole string, hence the leading and trailing .*
        if (s.matches(".*([a-zA-Z])\\1+.*")) {
            System.out.println("Duplicate found!");
        } else {
            System.out.println("Duplicate not found!");
        }
    }
}