Reference - What does this regex mean?

Regex Problem Overview


What is this?

This is a collection of common Q&A. This is also a Community Wiki, so everyone is invited to participate in maintaining it.

Why is this?

[tag:regex] is suffering from "give me ze code"-type questions and poor answers with no explanation. This reference is meant to provide links to quality Q&A.

What's the scope?

This reference is meant for the following languages: [tag:php], [tag:perl], [tag:javascript], [tag:python], [tag:ruby], [tag:java], [tag:.net].

This might be too broad, but these languages share largely the same syntax. Flavor-specific features are marked with the tag of the relevant language, for example:

  • What are regular expression Balancing Groups? [tag:.net]

Regex Solutions


Solution 1 - Regex

The Stack Overflow Regular Expressions FAQ

See also a lot of general hints and useful links at the [tag:regex] tag details page.


Online tutorials

Quantifiers

  • Zero-or-more: [*:greedy][b1], *?:reluctant, [*+:possessive][b2]
  • One-or-more: [+:greedy][b1], +?:reluctant, [++:possessive][b3]
  • [?:optional (zero-or-one)][b4]
  • Min/max ranges (all inclusive): [{n,m}:between n & m][b5], [{n,}:n-or-more][b6], [{n}:exactly n][b7]
  • Differences between greedy, reluctant (a.k.a. "lazy", "ungreedy") and possessive quantifiers:
    • [Greedy vs. Reluctant vs. Possessive Quantifiers][b8]
    • [In-depth discussion on the differences between greedy versus non-greedy][b9]
    • [What's the difference between {n} and {n}?][b10]
    • [Can someone explain Possessive Quantifiers to me?][b11] [tag:php], [tag:perl], [tag:java], [tag:ruby]
    • [Emulating possessive quantifiers][b12] [tag:.net]
    • Non-Stack Overflow references: From [Oracle][b13], [regular-expressions.info][b14]
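
As a rough illustration of the greedy/reluctant distinction listed above, here is a minimal Python sketch. The sample strings are made up for this example. Possessive quantifiers are left out because Python's `re` module only accepts them from version 3.11 onward.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* takes as much as possible, then backtracks to the last '>'.
greedy = re.search(r"<.*>", text).group()

# Reluctant (lazy): .*? takes as little as possible, stopping at the first '>'.
reluctant = re.search(r"<.*?>", text).group()

# Ranges are inclusive: {2,3} matches two or three digits, preferring three.
ranged = re.findall(r"\d{2,3}", "1 22 333 4444")

print(greedy)     # <b>bold</b> and <i>italic</i>
print(reluctant)  # <b>
print(ranged)     # ['22', '333', '444']
```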

Character Classes

Escape Sequences

  • Horizontal whitespace: [\h:space-or-tab][d1], [\t:tab][d2]
  • Newlines:
    • [\r, \n:carriage return and line feed][d3]
    • [\R:generic newline][d4] [tag:php] [tag:java-8]
  • Negated whitespace sequences: [\H:Non horizontal whitespace character, \V:Non vertical whitespace character, \N:Non line feed character][d5] [tag:pcre] [tag:php5] [tag:java-8]
  • Other: [\v:vertical tab][d6], [\e:the escape character][d7]

Anchors

| Anchor | Matches | Flavors |
|--------|---------|---------|
| ^ | Start of string | Common* |
| ^ | Start of line | Common (m) |
| $ | End of line | Common (m) |
| $ | End of text | Common* |
| $ | The very end of string | [tag:php] (D), [tag:javascript] |
| \A | Start of string | Common except [tag:js] |
| \Z | End of text | Common except [tag:js] [tag:python] |
| \Z | The very end of string | [tag:python] |
| \z | The very end of string | Common except [tag:js] [tag:python] |
| \b | Word boundary | Common |
| \B | Not a word boundary | Common |
| \G | End of previous match | Common except [tag:js], [tag:python] re |

| Term | Definition |
|------|------------|
| [Start of string][a1] | At the very start of the string. |
| [Start of line][a2] | At the very start of the string, and after a non-terminal line terminator. |
| [End of string][a3] | At the very end of the string. |
| [End of text][a4] | At the very end of the string, and at a terminal line terminator. |
| [End of line][a5] | At the very end of the string, and at a line terminator. |
| [Word boundary][a6] | At a word character not preceded by a word character, and at a non-word character not preceded by a non-word character. |
| [End of previous match][a7] | At a previously set position, usually where a previous match ended. At the very start of the string if no position was set. |

"Common" refers to the following: [tag:icu] [tag:java] [tag:js] [tag:.net] [tag:objective-c] [tag:pcre] [tag:perl] [tag:php] [tag:python] [tag:swift] [tag:ruby]

Legend: * default; m multi-line mode; D dollar-end-only mode.
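
To make the "start of line" versus "start of string" and "end of text" versus "very end of string" rows more concrete, here is a small sketch using Python's `re` module. The sample text is invented, and other flavors behave differently, as the table notes.

```python
import re

text = "first line\nsecond line\n"

# Without flags, ^ matches only at the very start of the string.
starts_default = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches right after each line terminator.
starts_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# Default $ is "end of text": it also matches just before a trailing newline.
dollar_matches = re.search(r"line$", text) is not None

# In Python, \Z is "the very end of string", so the trailing newline blocks it.
very_end_matches = re.search(r"line\Z", text) is not None

print(starts_default)    # ['first']
print(starts_multiline)  # ['first', 'second']
print(dollar_matches)    # True
print(very_end_matches)  # False
```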

Groups

  • [(...):capture group][f1], [(?:):non-capture group][f2]
    • [Why is my repeating capturing group only capturing the last match?][f3]
  • [\1:backreference and capture-group reference, $1:capture group reference][f1]
    • [What's the meaning of a number after a backslash in a regular expression?][f14]
    • [\g<1>123:How to follow a numbered capture group, such as \1, with a number?][f4] [tag:python]
  • [What does a subpattern (?i:regex) mean?][f5]
  • [What does the 'P' in (?P<group_name>regexp) mean?][f5a]
  • [(?>):atomic group][f6] or independent group, [(?|):branch reset][f7]
    • [Equivalent of branch reset in .NET/C#][f8] [tag:.net]
  • Named capture groups:
    • General named capturing group reference at regular-expressions.info
    • [tag:java]: (?<groupname>regex): [Overview][f9] and [naming rules][f10] (Non-Stack Overflow links)
    • Other languages: [(?P<groupname>regex)][f11] [tag:python], [(?<groupname>regex)][f12] [tag:.net], [(?<groupname>regex)][f13] [tag:perl], (?P<groupname>regex) and (?<groupname>regex) [tag:php]
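
The group constructs above can be sketched in Python as follows. This is only an illustration with made-up input strings. It uses Python's `(?P<name>...)` spelling for named groups, and other flavors use the syntaxes listed above.

```python
import re

# Numbered capture groups; the (?:...) group in the middle is not numbered.
m = re.search(r"(\d{4})-(?:\d{2})-(\d{2})", "Released 2014-04-10")
year, day = m.group(1), m.group(2)

# Named capture groups, Python style.
named = re.search(r"(?P<user>\w+)@(?P<host>[\w.]+)", "mail bob@example.com now")

# Backreference: \1 must repeat exactly what group 1 matched.
doubled = re.findall(r"\b(\w+) \1\b", "it was was a very very long day")

# In a replacement string, \g<1> lets a literal digit follow the reference;
# r"\10" would be read as a reference to group 10 instead.
padded = re.sub(r"(\d)", r"\g<1>0", "a1b2")

# A repeated capturing group only keeps its last iteration.
last = re.match(r"(?:(\d),?)+", "1,2,3").group(1)

print(year, day)                             # 2014 10
print(named.group("user"), named["host"])    # bob example.com
print(doubled)                               # ['was', 'very']
print(padded)                                # a10b20
print(last)                                  # 3
```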

Lookarounds

  • Lookaheads: [(?=...):positive][g1], [(?!...):negative][g2]
  • Lookbehinds: [(?<=...):positive][g3], [(?<!...):negative][g3]
  • Lookbehind limits in:
    • [Lookbehinds need to be constant-length][g4] [tag:php], [tag:perl], [tag:python], [tag:ruby]
    • [Lookarounds of limited length {0,n}][g5] [tag:java]
    • [Variable length lookbehinds are allowed][g5] [tag:.net]
  • Lookbehind alternatives:
    • [Using \K][g6] [tag:php], [tag:perl] ([Flavors that support \K][g7])
    • [Alternative regex module for Python][g8] [tag:python]
      • [The hacky way][g9]
      • [JavaScript negative lookbehind equivalents][g10] [External link][g11]
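
A minimal sketch of the four lookaround forms, again in Python with invented sample data. The last part shows the constant-length lookbehind restriction that applies to Python's built-in `re` module.

```python
import re

prices = "cost: $42, tax: $7, items: 3"

# Positive lookbehind: digits preceded by '$' (the '$' is not part of the match).
dollars = re.findall(r"(?<=\$)\d+", prices)

# Negative lookbehind: digits not preceded by '$' or by another digit.
others = re.findall(r"(?<![$\d])\d+", prices)

# Positive lookahead: a word that is immediately followed by ':'.
labels = re.findall(r"\w+(?=:)", prices)

# Negative lookahead: a whole word that is not followed by ':'.
unlabelled = re.findall(r"\b\w+\b(?!:)", prices)

# Python's re module rejects variable-length lookbehinds at compile time.
try:
    re.compile(r"(?<=a+)b")
    variable_lookbehind_ok = True
except re.error:
    variable_lookbehind_ok = False

print(dollars)     # ['42', '7']
print(others)      # ['3']
print(labels)      # ['cost', 'tax', 'items']
print(unlabelled)  # ['42', '7', '3']
print(variable_lookbehind_ok)  # False
```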

Modifiers

| Flag | Modifier | Flavors |
|------|----------|---------|
| a | [ASCII][h17] | [tag:python] |
| c | [current position][h6] | [tag:perl] |
| e | [expression][h7] | [tag:php] [tag:perl] |
| g | [global][h1] | most |
| i | [case-insensitive][h2] | most |
| m | [multiline][h10] | [tag:php] [tag:perl] [tag:python] [tag:javascript] [tag:.net] [tag:java] |
| m | [(non)multiline][h9] | [tag:ruby] |
| o | [once][h8] | [tag:perl] [tag:ruby] |
| S | [study][h11] | [tag:php] |
| s | [single line][h12] | [tag:ruby] |
| U | [ungreedy][h4] | [tag:php] [tag:r] |
| u | [unicode][h3] | most |
| x | [whitespace-extended][h5] | most |
| y | [sticky ↪][h16] | [tag:javascript] |
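
For orientation, here is how a few of these modifiers look in Python, where they are passed as `re` constants rather than written after a closing delimiter. This is a sketch with made-up inputs. Python has no `g` flag, since `findall` and `sub` already process every match.

```python
import re

text = "Hello\nWORLD"

# i: case-insensitive matching.
ci = re.findall(r"world", text, re.IGNORECASE)

# s: single line (dotall), so '.' also matches a newline.
dot_default = re.search(r"Hello.WORLD", text) is not None
dot_all = re.search(r"Hello.WORLD", text, re.DOTALL) is not None

# x: whitespace-extended, so pattern whitespace and # comments are ignored.
phone = re.compile(r"""
    (\d{3})   # prefix
    -         # separator
    (\d{4})   # line number
""", re.VERBOSE)
parts = phone.match("555-1234").groups()

# Flags can also be set inline at the start of the pattern.
inline = re.search(r"(?i)hello", text).group()

# a: ASCII, so \w no longer matches non-ASCII letters such as U+00EF.
ascii_words = re.findall(r"\w+", "na\u00efve", re.ASCII)

print(ci)           # ['WORLD']
print(dot_default)  # False
print(dot_all)      # True
print(parts)        # ('555', '1234')
print(inline)       # Hello
print(ascii_words)  # ['na', 've']
```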

Other:

  • [|:alternation (OR) operator][i1], [.:any character][h12], [.]:literal dot character
  • [What special characters must be escaped?][i2]
  • Control verbs ([tag:php] and [tag:perl]): [(*PRUNE)][i3], [(*SKIP)][i3], [(*FAIL) and (*F)][i3]
    • [tag:php] only: [(*BSR_ANYCRLF)][i4]
  • Recursion ([tag:php] and [tag:perl]): [(?R)][i5], [(?0) and (?1)][i6], [(?-1)][i7], [(?&groupname)][i8]
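
A short Python sketch of alternation and of the difference between `.` and a literal dot. The inputs are invented. Control verbs and recursion are not shown because Python's built-in `re` module does not support them.

```python
import re

# '|' has the lowest precedence: ^cat|dog$ means "^cat" OR "dog$".
loose = re.search(r"^cat|dog$", "catfish") is not None
# Group the alternatives to anchor both of them at both ends.
strict = re.search(r"^(?:cat|dog)$", "catfish") is not None

# '.' matches any character except a newline; '[.]' (or '\.') is a literal dot.
any_char = re.findall(r"a.c", "abc a.c a-c")
literal_dot = re.findall(r"a[.]c", "abc a.c a-c")

# re.escape() escapes special characters so the text is matched literally.
escaped = re.escape("1+1=2?")
matches_literally = re.fullmatch(escaped, "1+1=2?") is not None

print(loose, strict)       # True False
print(any_char)            # ['abc', 'a.c', 'a-c']
print(literal_dot)         # ['a.c']
print(matches_literally)   # True
```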

Common Tasks

  • [Get a string between two curly braces: {...}][j1]
  • [Match (or replace) a pattern except in situations s1, s2, s3...][j2]
  • https://stackoverflow.com/q/5830387
  • Validation:
    • Internet: [email addresses][j3], [URLs][j4] (host/port: [regex][j5] and [non-regex][j6] alternatives), [passwords][j7]
    • Numeric: [a number][j8], [min-max ranges (such as 1-31)][j9], [phone numbers][j10], [date][j11]
    • Parsing HTML with regex: See "General Information > When not to use Regex"
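
Two of the tasks above, sketched in Python under simple assumptions: the braces are not nested, and a day of the month is written without a leading zero. The linked answers cover the general cases. The names `inside`, `DAY` and `valid_days` are made up for this example.

```python
import re

# Text between curly braces: capture a run of characters that are not braces.
inside = re.findall(r"\{([^{}]*)\}", "Hello {name}, you are {age}")

# Numeric range 1-31: enumerate the digit shapes instead of comparing numbers.
#   [1-9]   -> 1..9
#   [12]\d  -> 10..29
#   3[01]   -> 30..31
DAY = re.compile(r"[1-9]|[12]\d|3[01]")
candidates = ["0", "1", "09", "17", "31", "32"]
valid_days = [s for s in candidates if DAY.fullmatch(s)]

print(inside)      # ['name', 'age']
print(valid_days)  # ['1', '17', '31']
```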

Advanced Regex-Fu

Flavor-Specific Information

(Except for those marked with *, this section contains non-Stack Overflow links.)

  • Java
    • Official documentation: [Pattern Javadoc ↪][l1], [Oracle's regular expressions tutorial ↪][l2]
    • The differences between functions in [java.util.regex.Matcher][l3]:
      • [matches()][l4]: The match must be anchored to both input-start and -end
      • [find()][l5]: A match may be anywhere in the input string (substrings)
      • [lookingAt()][l6]: The match must be anchored to input-start only
      • (For anchors in general, see the section "Anchors")
    • The only [java.lang.String][l7] functions that accept regular expressions: [matches(s)][l8], [replaceAll(s,s)][l9], [replaceFirst(s,s)][l10], [split(s)][l11], [split(s,i)][l13]
    • *[An (opinionated and) detailed discussion of the disadvantages of and missing features in java.util.regex][l14]
  • .NET
    • [How to read a .NET regex with look-ahead, look-behind, capturing groups and back-references mixed together?][l31]
  • Official documentation:
    • Boost regex engine: [General syntax][l14], [Perl syntax][l15] (used by TextPad, Sublime Text, UltraEdit, and possibly others)
    • JavaScript [general info][l16] and [RegExp object][l17]
    • [.NET][l18] ![][|] [MySQL][l19] ![][|] [Oracle][l20] ![][|] [Perl5 version 18.2][l21]
    • PHP: [pattern syntax][l22], [preg_match][l23]
    • Python: [Regular expression operations][l24], [search vs match][l25], [how-to][l26]
    • Rust: [crate regex][132], [struct regex::Regex][133]
    • Splunk: [regex terminology and syntax][l27] and [regex command][l28]
    • Tcl: [regex syntax][tclsyntax], [manpage][l29], [regexp command][l30]
    • Visual Studio Find and Replace
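
The anchoring differences described for Java's `matches()`, `find()` and `lookingAt()` have rough counterparts in Python's `fullmatch()`, `search()` and `match()`. The sketch below only shows the Python side and uses invented sample strings. The mapping to the Java methods is an approximation, not an exact equivalence.

```python
import re

digits = re.compile(r"\d+")

# fullmatch(): the whole string must match (compare Java's matches()).
full = digits.fullmatch("42 apples")

# match(): anchored at the start only (compare Java's lookingAt()).
at_start = digits.match("42 apples")
not_at_start = digits.match("apples 42")

# search(): a match may begin anywhere (compare Java's find()).
anywhere = digits.search("apples 42")

print(full)              # None
print(at_start.group())  # 42
print(not_at_start)      # None
print(anywhere.group())  # 42
```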

General information

(Links marked with * are non-Stack Overflow links.)

  • Other general documentation resources: https://stackoverflow.com/q/4736, *[Regular-expressions.info][m1], *[Wikipedia entry][m2], *[RexEgg][m3], [Open-Directory Project][m4]
  • [DFA versus NFA][m5]
  • Generating Strings matching regex
  • Books: Jeffrey Friedl's [Mastering Regular Expressions][m6]
  • When not to use regular expressions:
    • [Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.][m7] (blog post written by [Stack Overflow][so]'s founder)*
    • Do not use regex to parse HTML:
      • [Don't][m8]. ![][|] [Please, just don't][m9]
      • [Well, maybe...if you're really determined][m10] (other answers in this question are also good)

Examples of regex that can cause regex engine to fail

Tools: Testers and Explainers

(This section contains non-Stack Overflow links.)

  • Online (* includes replacement tester, + includes split tester):

    • [Debuggex][n1] (Also has a repository of useful regexes) [tag:javascript], [tag:python], [tag:pcre]
    • *[Regular Expressions 101][n2] [tag:php], [tag:pcre], [tag:python], [tag:javascript], [tag:java]
    • [Regex Pal][n3], [regular-expressions.info][n4] [tag:javascript]
    • [Rubular][n5] [tag:ruby] ![][|] [RegExr][n6] ![][|] [Regex Hero][n7] [tag:dotnet]
    • *+ [regexstorm.net][n8] [tag:.net]
    • *RegexPlanet: [Java][n9] [tag:java], [Go][n10] [tag:go], [Haskell][n11] [tag:haskell], [JavaScript][n12] [tag:javascript], [.NET][n13] [tag:dotnet], [Perl][n14] [tag:perl] [tag:php] [PCRE][n15] [tag:php], [Python][n16] [tag:python], [Ruby][n17] [tag:ruby], [XRegExp][n18] [tag:xregexp]
    • [freeformatter.com][n19] [tag:xregexp]
    • *+[regex.larsolavtorvik.com][n20] [tag:php] PCRE and POSIX, [tag:javascript]
  • Offline:

    • Microsoft Windows: [RegexBuddy][o1] (analysis), [RegexMagic][o2] (creation), [Expresso][o3] (analysis, creation, free)

[|]: https://i.stack.imgur.com/D41QM.png [so]: https://stackoverflow.com/ [a1]: https://stackoverflow.com/a/6908745 [a2]: https://stackoverflow.com/a/6908745 [a3]: https://stackoverflow.com/a/48832215 [a4]: https://stackoverflow.com/a/4020821 [a5]: https://stackoverflow.com/a/6908745 [a6]: https://stackoverflow.com/a/6664167 [a7]: https://stackoverflow.com/q/21971701 [b1]: https://stackoverflow.com/a/10764399 [b2]: https://stackoverflow.com/a/17064242 [b3]: https://stackoverflow.com/q/4489551 [b4]: https://stackoverflow.com/a/17400486 [b5]: https://stackoverflow.com/a/17032985 [b6]: https://stackoverflow.com/a/17120435 [b7]: https://stackoverflow.com/a/17829727 [b8]: https://stackoverflow.com/q/5319840 [b9]: https://stackoverflow.com/a/3075532 [b10]:https://stackoverflow.com/q/18006093 [b11]:https://stackoverflow.com/q/1117467 [b12]:https://stackoverflow.com/q/5537513 [b13]:https://docs.oracle.com/javase/tutorial/essential/regex/quant.html [b14]:https://www.regular-expressions.info/possessive.html [c1]: https://stackoverflow.com/q/9801630 [c2]: https://stackoverflow.com/a/16621778 [c3]: https://stackoverflow.com/a/19011185 [c4]: https://stackoverflow.com/a/11874899 [c5]: https://stackoverflow.com/a/21067350 [d1]: https://stackoverflow.com/a/4910093 [d2]: https://stackoverflow.com/a/17950891 [d3]: https://stackoverflow.com/a/3451192 [d4]: https://stackoverflow.com/a/18992691 [d5]: https://stackoverflow.com/q/26972688 [d6]: https://stackoverflow.com/q/12290224 [d7]: https://stackoverflow.com/a/4275788 [f1]: https://stackoverflow.com/q/21880127 [f2]: https://stackoverflow.com/q/3512471 [f3]: https://stackoverflow.com/a/23062553 [f4]: https://stackoverflow.com/q/5984633 [f5]: https://stackoverflow.com/a/3812728 [f5a]: https://stackoverflow.com/questions/10059673/named-regular-expression-group-pgroup-nameregexp-what-does-p-stand-for [f6]: https://stackoverflow.com/q/14411818 [f7]: https://stackoverflow.com/a/5333645 [f8]: https://stackoverflow.com/a/5378077 [f9]: 
https://blogs.oracle.com/xuemingshen/entry/named_capturing_group_in_jdk7 [f10]:https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html#groupname [f11]:https://stackoverflow.com/q/10059673 [f12]:https://stackoverflow.com/a/20355718 [f13]:https://stackoverflow.com/a/288989 [f14]:https://stackoverflow.com/q/8624345 [g1]: https://stackoverflow.com/a/1570916 [g2]: https://stackoverflow.com/a/12210820 [g3]: https://stackoverflow.com/a/11197672 [g4]: https://stackoverflow.com/a/22821726 [g5]: https://stackoverflow.com/a/20994257 [g6]: https://stackoverflow.com/a/11640500 [g7]: https://stackoverflow.com/a/13543042 [g8]: https://stackoverflow.com/a/11641102 [g9]: https://stackoverflow.com/a/11640862 [g10]:https://stackoverflow.com/a/35143111 [g11]:http://blog.stevenlevithan.com/archives/mimic-lookbehind-javascript [h1]: https://stackoverflow.com/a/9622110 [h2]: https://stackoverflow.com/a/12411066 [h3]: https://stackoverflow.com/a/2553239 [h4]: https://stackoverflow.com/a/5978385 [h5]: https://stackoverflow.com/a/2710390 [h6]: https://stackoverflow.com/a/11395687 [h7]: https://stackoverflow.com/a/2468483 [h8]: https://stackoverflow.com/a/13334823 [h9]: https://stackoverflow.com/a/4257912 [h10]:https://stackoverflow.com/a/22438123 [h11]:https://stackoverflow.com/a/210027 [h12]:https://stackoverflow.com/a/13594017 [h13]:https://stackoverflow.com/a/1068308 [h14]:https://stackoverflow.com/q/16367404 [h15]:https://stackoverflow.com/a/43636 [h16]:https://javascript.info/regexp-sticky [h17]:https://stackoverflow.com/a/61203075 [i1]: https://stackoverflow.com/a/22187948 [i2]: https://stackoverflow.com/q/399078 [i3]: https://stackoverflow.com/a/20008790 [i4]: https://stackoverflow.com/a/7374702 [i5]: https://stackoverflow.com/q/8440911 [i6]: https://stackoverflow.com/a/20569361 [i7]: https://stackoverflow.com/a/17845034 [i8]: https://stackoverflow.com/a/18151617 [j1]: https://stackoverflow.com/a/413077 [j2]: https://stackoverflow.com/q/23589174 [j3]: 
https://stackoverflow.com/q/201323 [j4]: https://stackoverflow.com/a/190405 [j5]: https://stackoverflow.com/a/22697740 [j6]: https://stackoverflow.com/a/24399003 [j7]: https://stackoverflow.com/a/3802238 [j8]: https://stackoverflow.com/a/4247184 [j9]: https://stackoverflow.com/a/22131040 [j10]:https://stackoverflow.com/q/123559 [j11]:https://stackoverflow.com/q/15491894 [k1]: https://codegolf.stackexchange.com/q/19262 [k2]: https://stackoverflow.com/a/17845034 [k3]: https://stackoverflow.com/a/17004406 [k4]: https://codegolf.stackexchange.com/questions/tagged/regular-expression?sort=votes&pageSize=50 [k5]: https://stackoverflow.com/q/1723182 [k6]: https://stackoverflow.com/q/23589174 [k7]: https://stackoverflow.com/a/17845034 [k8]: https://stackoverflow.com/a/17004406 [k9]: https://stackoverflow.com/a/47162099 [l1]: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html [l2]: https://docs.oracle.com/javase/tutorial/essential/regex/index.html [l3]: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html [l4]: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#matches-- [l5]: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#find-- [l6]: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#lookingAt-- [l7]: https://docs.oracle.com/javase/8/docs/api/java/lang/String.html [l8]: https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#matches-java.lang.String- [l9]: http://docs.oracle.com/javase/8/docs/api/java/lang/String.html#replaceAll-java.lang.String-java.lang.String- [l10]:https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#replaceFirst-java.lang.String-java.lang.String- [l11]:https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#split-java.lang.String- [l13]:https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#split-java.lang.String-int- [l14]:https://stackoverflow.com/a/5771326 
[l15]:https://www.boost.org/doc/libs/1_55_0/libs/regex/doc/html/boost_regex/syntax.html [l16]:https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions [l17]:https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp [l18]:https://msdn.microsoft.com/en-us/library/hs600312.aspx [l19]:https://dev.mysql.com/doc/refman/5.1/en/regexp.html [l20]:https://docs.oracle.com/cd/B19306_01/appdev.102/b14251/adfns_regexp.htm [l21]:https://perldoc.perl.org/perlre.html [l22]:https://www.php.net/manual/en/reference.pcre.pattern.syntax.php [l23]:https://us2.php.net/preg_match [l24]:https://docs.python.org/3/library/re.html [l25]:https://docs.python.org/3/library/re.html#search-vs-match [l26]:https://docs.python.org/3/howto/regex.html [l27]:https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/AboutSplunkregularexpressions#Terminology_and_syntax [l28]:https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Regex [l29]:https://www.tcl.tk/man/tcl8.6/TclCmd/regexp.htm [tclsyntax]:https://www.tcl.tk/man/tcl8.6/TclCmd/re_syntax.htm [l30]:http://wiki.tcl.tk/986 [l31]:https://stackoverflow.com/q/36047988 [132]:https://docs.rs/regex/latest [133]:https://docs.rs/regex/latest/regex/struct.Regex.html [m1]: https://www.regular-expressions.info [m2]: https://en.wikipedia.org/wiki/Regular_expression [m3]: http://www.rexegg.com/ [m4]: http://www.dmoz.org/Computers/Programming/Languages/Regular_Expressions [m5]: https://stackoverflow.com/q/3978438 [m6]: http://regex.info/book.html [m7]: https://blog.codinghorror.com/regular-expressions-now-you-have-two-problems/ [m8]: https://stackoverflow.com/q/590747 [m9]: https://stackoverflow.com/a/1732454 [m10]:https://stackoverflow.com/a/4234491 [n1]: https://debuggex.com [n2]: https://regex101.com [n3]: https://regexpal.com [n4]: http://www.regular-expressions.info/javascriptexample.html [n5]: http://rubular.com/ [n6]: http://www.regexr.com/ [n7]: http://regexhero.net/tester [n8]: 
http://regexstorm.net/tester [n9]: http://www.regexplanet.com/advanced/java/index.html [n10]:http://www.regexplanet.com/advanced/golang/index.html [n11]:http://www.regexplanet.com/advanced/haskell/index.html [n12]:http://www.regexplanet.com/advanced/javascript/index.html [n13]:http://www.regexplanet.com/advanced/dotnet/index.html [n14]:http://www.regexplanet.com/advanced/perl/index.html [n15]:http://www.regexplanet.com/advanced/php/index.html [n16]:http://www.regexplanet.com/advanced/python/index.html [n17]:http://www.regexplanet.com/advanced/ruby/index.html [n18]:http://www.regexplanet.com/advanced/xregexp/index.html [n19]:http://www.freeformatter.com/regex-tester.html [n20]:http://regex.larsolavtorvik.com/ [o1]: http://regexbuddy.com [o2]: http://regexmagic.com [o3]: http://www.ultrapico.com/expresso.htm

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

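| Content Type | Original Author | Original Content on Stack Overflow |
|--------------|-----------------|------------------------------------|
| Question | HamZa | View Question on Stack Overflow |
| Solution 1 - Regex | aliteralmind | View Answer on Stack Overflow |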