Regex not operator
Problem Overview
Is there a NOT operator in regexes?
For example, in this string: "(2001) (asdf) (dasd1123_asd 21.01.2011 zqge)(dzqge) name (20019)"
I want to delete every match of \([0-9a-zA-z _\.\-:]*\) except the one that is a year: (2001).
So what the regex should return is: (2001) name
NOTE: something like \((?![\d]){4}[0-9a-zA-z _\.\-:]*\) does not work for me (the (20019) somehow also matches...)
Regex Solutions
Solution 1 - Regex
Not quite, although you can usually use a workaround in one of these forms:
- a negated character class: [^abc], which is, character by character, not a or b or c
- a negative lookahead: a(?!b), which is a not followed by b
- a negative lookbehind: (?<!a)b, which is b not preceded by a
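To make the three forms concrete, here is a small Java sketch that tries each one on short inputs. The class name NegationForms and the helper found are made up for this illustration; only java.util.regex.Pattern comes from the standard library.

```java
import java.util.regex.Pattern;

class NegationForms {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // [^abc]: a single character that is not a, b, or c
        System.out.println(found("[^abc]", "abca"));  // false
        System.out.println(found("[^abc]", "abxc"));  // true, x qualifies

        // a(?!b): an a that is not followed by b
        System.out.println(found("a(?!b)", "ab"));    // false
        System.out.println(found("a(?!b)", "ac"));    // true

        // (?<!a)b: a b that is not preceded by a
        System.out.println(found("(?<!a)b", "ab"));   // false
        System.out.println(found("(?<!a)b", "cb"));   // true
    }
}
```

Note that the lookarounds only test the neighbouring text; the match itself is still just the single a or b.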
Solution 2 - Regex
No, there's no direct not operator. At least not the way you hope for.
You can use a zero-width negative lookahead, however:
\((?!2001)[0-9a-zA-z _\.\-:]*\)
The (?!...) part means "only match if the text following this position (hence: lookahead) doesn't (hence: negative) match this". It doesn't actually consume the characters it inspects (hence: zero-width).
There are actually 4 combinations of lookarounds with 2 axes:
- lookbehind / lookahead : specifies if the characters before or after the point are considered
- positive / negative : specifies if the characters must match or must not match.
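As a sketch of how the negative lookahead could be applied to the string from the question, the Java snippet below generalizes the idea under one assumption: a "year" is exactly four digits between parentheses. The lookahead is therefore written as (?!\d{4}\)) rather than (?!2001), since (?!2001) would also skip (20019), which starts with the same four digits. The class and method names are hypothetical, the character class uses A-Z instead of the question's A-z, and the whitespace cleanup at the end is an addition that the question does not spell out.

```java
import java.util.regex.Pattern;

class YearKeepingCleaner {
    // Assumption: a year is exactly four digits inside parentheses.
    // (?!\d{4}\)) rejects a group only when four digits are followed
    // directly by the closing parenthesis, so (20019) is still removed.
    private static final Pattern NON_YEAR_GROUP =
            Pattern.compile("\\((?!\\d{4}\\))[0-9a-zA-Z _.\\-:]*\\)");

    static String removeNonYearGroups(String input) {
        String stripped = NON_YEAR_GROUP.matcher(input).replaceAll("");
        // Collapse the whitespace left behind by the removed groups.
        return stripped.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String subject =
                "(2001) (asdf) (dasd1123_asd 21.01.2011 zqge)(dzqge) name (20019)";
        System.out.println(removeNonYearGroups(subject)); // (2001) name
    }
}
```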
Solution 3 - Regex
You could capture the (2001) part and replace the rest with nothing.
public static String extractYearString(String input) {
    // Backslashes must be doubled inside a Java string literal.
    // $1 is the four-digit group captured between the parentheses.
    return input.replaceAll(".*\\(([0-9]{4})\\).*", "$1");
}
var subject = "(2001) (asdf) (dasd1123_asd 21.01.2011 zqge)(dzqge) name (20019)";
var result = extractYearString(subject);
System.out.println(result); // <-- "2001"
.*\(([0-9]{4})\).* means:
- .* match anything
- \( match a ( character
- ( begin capture
- [0-9]{4} any single digit, four times
- ) end capture
- \) match a ) character
- .* anything (rest of string)
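One caveat with the replaceAll approach is that it returns the input unchanged when nothing matches. As a possible alternative, the sketch below reads the capture group through a Matcher, so the "no year found" case is explicit. The class YearExtractor and the method firstYear are hypothetical names for this example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class YearExtractor {
    // Same capture as above, without the surrounding .* parts.
    private static final Pattern YEAR = Pattern.compile("\\(([0-9]{4})\\)");

    // Returns the first four-digit group found in parentheses, if any.
    static Optional<String> firstYear(String input) {
        Matcher m = YEAR.matcher(input);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String subject =
                "(2001) (asdf) (dasd1123_asd 21.01.2011 zqge)(dzqge) name (20019)";
        System.out.println(firstYear(subject));            // Optional[2001]
        System.out.println(firstYear("name (20019)"));     // Optional.empty
    }
}
```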
Solution 4 - Regex
Here is an alternative:
(\(\d{4}\))((?:\s*\([0-9a-zA-z _\.\-:]*\))*)([^()]*)(( ?\([0-9a-zA-z _\.\-:]*\))*)
Repeated patterns are wrapped in a single capturing group whose inner group is non-capturing: ((?:pattern)*). This keeps control over the numbers of the groups of interest.
You then get what you want by replacing the match with: \1\3
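For reference, this is roughly how the replacement could look in Java, where backreferences in the replacement string are written $1 and $3 instead of \1 and \3. The pattern is copied from this answer as-is (including A-z); the class and method names are hypothetical, and the final trim() is an addition, because group 3 keeps the space that follows "name".

```java
class GroupReplacement {
    // Pattern from the answer, with backslashes doubled for a Java literal.
    static final String PATTERN =
            "(\\(\\d{4}\\))"                                   // group 1: the year
          + "((?:\\s*\\([0-9a-zA-z _\\.\\-:]*\\))*)"           // group 2: groups to drop
          + "([^()]*)"                                         // group 3: plain text
          + "(( ?\\([0-9a-zA-z _\\.\\-:]*\\))*)";              // groups 4 and 5: trailing groups

    static String keepYearAndName(String input) {
        // Keep only group 1 (the year) and group 3 (the text outside parentheses).
        return input.replaceAll(PATTERN, "$1$3").trim();
    }

    public static void main(String[] args) {
        String subject =
                "(2001) (asdf) (dasd1123_asd 21.01.2011 zqge)(dzqge) name (20019)";
        System.out.println(keepYearAndName(subject)); // (2001) name
    }
}
```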