Regex to detect one of several strings

Regex Problem Overview


I've got a list of email addresses belonging to several domains. I'd like a regex that will match addresses belonging to three specific domains (for this example: foo, bar, and baz).

So these would match:

  1. a@foo
  2. a@bar
  3. b@baz

This would not:

  1. a@fnord

Ideally, these would not match either (though it's not critical for this particular problem):

  1. a@foobar
  2. b@foofoo

Abstracting the problem a bit: I want to match a string that contains at least one of a given list of substrings.

Regex Solutions


Solution 1 - Regex

Use the pipe symbol to indicate "or":

/a@(foo|bar|baz)\b/

If you don't want the capture-group, use the non-capturing grouping symbol:

/a@(?:foo|bar|baz)\b/

(Of course I'm assuming "a" is OK for the front of the email address! You should replace that with a suitable regex.)
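As a minimal sketch in Python (the language used in Solution 6), the non-capturing form can be tried against the sample addresses from the question. The name `matches_domain` and the `[^@\s]+` local-part pattern are illustrative assumptions, not part of the original answer.

```python
import re

# Hypothetical local part: one or more characters that are neither "@" nor whitespace.
DOMAIN_RE = re.compile(r"[^@\s]+@(?:foo|bar|baz)\b")

def matches_domain(address):
    """Return True if address contains <local>@foo, @bar or @baz ending at a word boundary."""
    return DOMAIN_RE.search(address) is not None

# Sample addresses taken from the question.
for address in ["a@foo", "a@bar", "b@baz", "a@fnord", "a@foobar", "b@foofoo"]:
    print(address, matches_domain(address))
```

Because \b only asks for a word boundary after the domain, an address such as a@foo.com would still match; whether that is acceptable depends on the data.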

Solution 2 - Regex

^(a|b)@(foo|bar|baz)$

This works if you have a strictly defined list. The start (^) and end ($) anchors force the whole string to match, so only those local parts and those three domains are accepted.
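In Python, one way to get the same whole-string behaviour is `re.fullmatch`, which makes the explicit anchors unnecessary. This is only a sketch; `is_allowed` is a hypothetical helper name.

```python
import re

# Same pattern as above without ^ and $: fullmatch already requires the whole string to match.
STRICT_RE = re.compile(r"(a|b)@(foo|bar|baz)")

def is_allowed(address):
    """Return True only if the entire string is a or b, then "@", then foo, bar or baz."""
    return STRICT_RE.fullmatch(address) is not None
```

Unlike `$`, which in Python also matches just before a trailing newline, `fullmatch` rejects "a@foo\n", so input read from a file would need stripping first.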

Solution 3 - Regex

Use:

/@(foo|bar|baz)\.?$/i

Note the differences from other answers:

  • \.? - matching 0 or 1 dots, in case the domains in the e-mail address are "fully qualified"
  • $ - to indicate that the string must end with this sequence,
  • /i - to make the test case insensitive.

Note, this assumes that each e-mail address is on a line on its own.

If the address could appear anywhere in a longer string, drop the $ and replace it with \s+ (which matches one or more whitespace characters). Note that \s+ alone will not match an address at the very end of the string; (?:\s+|$) covers both cases.
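A minimal Python sketch of the end-anchored, case-insensitive pattern follows. The helper name `domain_of` and the call to `strip()` (to drop a trailing newline or spaces) are assumptions added for illustration.

```python
import re

# re.IGNORECASE plays the role of the /i flag; \.? allows an optional trailing dot.
END_RE = re.compile(r"@(foo|bar|baz)\.?$", re.IGNORECASE)

def domain_of(address):
    """Return the matched domain in lower case, or None if the address does not end in one."""
    match = END_RE.search(address.strip())
    return match.group(1).lower() if match else None
```

Returning the captured group rather than a boolean makes it easy to see which of the three domains was found.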

Solution 4 - Regex

This should be more generic: the "a" shouldn't be part of the pattern, although the @ should.

/@(foo|bar|baz)(?:\W|$)/

Here is a good reference on regex.

Edit: changed the ending to accept either a non-word character or the end of the string. This now assumes foo/bar/baz are full domain names.
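To illustrate how this pattern behaves on free text, here is a small Python sketch. `find_domains` and the sample sentence are made up for the example.

```python
import re

# Group 1 is the domain; (?:\W|$) requires a non-word character or end of input after it.
FREE_TEXT_RE = re.compile(r"@(foo|bar|baz)(?:\W|$)")

def find_domains(text):
    """Return the listed domains found after an "@" anywhere in text, in order."""
    # With a single capturing group, findall returns just that group's text.
    return FREE_TEXT_RE.findall(text)

sample = "mail a@foo, b@baz and c@foobar today"
print(find_domains(sample))  # ['foo', 'baz']
```

The address c@foobar is skipped because "foo" is followed by a word character rather than a non-word character or the end of the string.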

Solution 5 - Regex

If the previous (and logical) answers about '|' don't suit you, have a look at

http://metacpan.org/pod/Regex::PreSuf

Module description: create regular expressions from word lists.

Solution 6 - Regex

You don't need a regex to find whether a string contains at least one of a given list of substrings. In Python:

def contain(string_, substrings):
    return any(s in string_ for s in substrings)

The above is slow for a large string_ and many substrings. GNU fgrep can efficiently search for multiple patterns at the same time.

Using regex:
import re

def contain(string_, substrings):
    regex = '|'.join("(?:%s)" % re.escape(s) for s in substrings)
    return re.search(regex, string_) is not None
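When the same substring list is checked against many strings, the pattern can be compiled once and reused. The following is a sketch that assumes a non-empty list (an empty list would produce an empty pattern that matches everything); `make_matcher` is a hypothetical helper name, not something from the original answer.

```python
import re

def make_matcher(substrings):
    """Compile the alternation once and return a function that tests strings against it."""
    # re.escape keeps characters such as "." or "+" literal.
    pattern = re.compile("|".join(re.escape(s) for s in substrings))
    return lambda string_: pattern.search(string_) is not None

contains_domain = make_matcher(["@foo", "@bar", "@baz"])
```

This keeps plain substring semantics, so contains_domain("a@foobar") is still true; add a boundary such as \b or $ to the pattern if that is not wanted.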

Solution 7 - Regex

OK, I know you asked for a regex answer. But have you considered just splitting the string on the '@' character, taking the second array value (the domain), and doing a simple equality test?

// The domain can equal only one value at a time, so the tests are joined with || (or), not &&.
if (splitString[1] == "foo" || splitString[1] == "bar" || splitString[1] == "baz")
{
   //Do Something!
}

Seems to me that RegEx is overkill. Of course my assumption is that your case is really as simple as you have listed.
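The same idea can be sketched in Python without any regex. `ALLOWED_DOMAINS` and `has_allowed_domain` are illustrative names, and the sketch assumes the domain is everything after the last "@".

```python
ALLOWED_DOMAINS = {"foo", "bar", "baz"}

def has_allowed_domain(address):
    """Compare everything after the last "@" against a fixed set of domains."""
    # rpartition returns ("", "", address) when there is no "@", so sep is then empty.
    local, sep, domain = address.rpartition("@")
    return sep == "@" and domain in ALLOWED_DOMAINS
```

Because the comparison is exact set membership, a@foobar and b@foofoo are rejected without any boundary handling.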

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type        Original Author    Original Content on Stackoverflow
Question            Craig Walker       View Question on Stackoverflow
Solution 1 - Regex  Jason Cohen        View Answer on Stackoverflow
Solution 2 - Regex  Gregory A Beamer   View Answer on Stackoverflow
Solution 3 - Regex  Alnitak            View Answer on Stackoverflow
Solution 4 - Regex  sfossen            View Answer on Stackoverflow
Solution 5 - Regex  siukurnin          View Answer on Stackoverflow
Solution 6 - Regex  jfs                View Answer on Stackoverflow
Solution 7 - Regex  Andrew Harry       View Answer on Stackoverflow