Negative regex for Perl string pattern match

RegexPerl

Regex Problem Overview


I have this regex:

if($string =~ m/^(Clinton|[^Bush]|Reagan)/i)
  {print "$string\n"};

I want to match with Clinton and Reagan, but not Bush.

It's not working.

Regex Solutions


Solution 1 - Regex

Your regex does not work because [] defines a character class, but what you want is a lookahead:

(?=) - Positive look ahead assertion foo(?=bar) matches foo when followed by bar
(?!) - Negative look ahead assertion foo(?!bar) matches foo when not followed by bar
(?<=) - Positive look behind assertion (?<=foo)bar matches bar when preceded by foo
(?<!) - Negative look behind assertion (?<!foo)bar matches bar when NOT preceded by foo
(?>) - Once-only subpatterns (?>\d+)bar Performance enhancing when bar not present
(?(x)) - Conditional subpatterns
(?(3)foo|fu)bar - Matches foo if 3rd subpattern has matched, fu if not
(?#) - Comment (?# Pattern does x y or z)

So try: (?!bush)

Solution 2 - Regex

Sample text:

> Clinton said
> Bush used crayons
> Reagan forgot

Just omitting a Bush match:

$ perl -ne 'print if /^(Clinton|Reagan)/' textfile
Clinton said
Reagan forgot

Or if you really want to specify:

$ perl -ne 'print if /^(?!Bush)(Clinton|Reagan)/' textfile
Clinton said
Reagan forgot

Solution 3 - Regex

Your regex says the following:

/^         - if the line starts with
(          - start a capture group
Clinton|   - "Clinton" 
|          - or
[^Bush]    - Any single character except "B", "u", "s" or "h"
|          - or
Reagan)   - "Reagan". End capture group.
/i         - Make matches case-insensitive 

So, in other words, your middle part of the regex is screwing you up. As it is a "catch-all" kind of group, it will allow any line that does not begin with any of the upper or lower case letters in "Bush". For example, these lines would match your regex:

Our president, George Bush
In the news today, pigs can fly
012-3123 33

You either make a negative look-ahead, as suggested earlier, or you simply make two regexes:

if( ($string =~ m/^(Clinton|Reagan)/i) and
    ($string !~ m/^Bush/i) ) {
   print "$string\n";
}

As mirod has pointed out in the comments, the second check is quite unnecessary when using the caret (^) to match only beginning of lines, as lines that begin with "Clinton" or "Reagan" could never begin with "Bush".

However, it would be valid without the carets.

Solution 4 - Regex

What's wrong with using two regexs (or three)? This makes your intentions more clear and may even improve your performance:

if ($string =~ /^(Clinton|Reagan)/i && $string !~ /Bush/i) { ... }

if (($string =~ /^Clinton/i || $string =~ /^Reagan/i)
        && $string !~ /Bush/i) {
    print "$string\n"
}

Solution 5 - Regex

If my understanding is correct then you want to match any line which has Clinton and Reagan, in any order, but not Bush. As suggested by Stuck, here is a version with lookahead assertions:

#!/usr/bin/perl

use strict;
use warnings;

my $regex = qr/
	(?=.*clinton)  
	(?!.*bush) 
	.*reagan       
	/ix;

while (<DATA>) {
	chomp;
	next unless (/$regex/);
	print $_, "\n";
}


__DATA__
shouldn't match - reagan came first, then clinton, finally bush
first match - first two: reagan and clinton
second match - first two reverse: clinton and reagan
shouldn't match - last two: clinton and bush
shouldn't match - reverse: bush and clinton
shouldn't match - and then came obama, along comes mary
shouldn't match - to clinton with perl

Results

first match - first two: reagan and clinton
second match - first two reverse: clinton and reagan

as desired it matches any line which has Reagan and Clinton in any order.

You may want to try reading how lookahead assertions work with examples at http://www252.pair.com/comdog/mastering_perl/Chapters/02.advanced_regular_expressions.html

they are very tasty :)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionjoeView Question on Stackoverflow
Solution 1 - RegexStuckView Answer on Stackoverflow
Solution 2 - RegexDemosthenexView Answer on Stackoverflow
Solution 3 - RegexTLPView Answer on Stackoverflow
Solution 4 - RegexmobView Answer on Stackoverflow
Solution 5 - RegexAshish KumarView Answer on Stackoverflow