Reference - What does this regex mean?
What is this?
This is a collection of common Q&A. This is also a Community Wiki, so everyone is invited to participate in maintaining it.
Why is this?
[tag:regex] is suffering from "give me ze code" type questions and poor answers with no explanation. This reference is meant to provide links to quality Q&A.
What's the scope?
This reference is meant for the following languages: [tag:php], [tag:perl], [tag:javascript], [tag:python], [tag:ruby], [tag:java], [tag:.net].
This might be too broad, but these languages largely share the same syntax. Language-specific features are marked with the tag of the language after them, for example:
- What are regular expression Balancing Groups? [tag:.net]
The Stack Overflow Regular Expressions FAQ
See also a lot of general hints and useful links at the [tag:regex] tag details page.
Online tutorials
Quantifiers
- Zero-or-more: [`*`:greedy][b1], `*?`:reluctant, [`*+`:possessive][b2]
- One-or-more: [`+`:greedy][b1], `+?`:reluctant, [`++`:possessive][b3]
- [`?`:optional (zero-or-one)][b4]
- Min/max ranges (all inclusive): [`{n,m}`:between n & m][b5], [`{n,}`:n-or-more][b6], [`{n}`:exactly n][b7]
- Differences between greedy, reluctant (a.k.a. "lazy", "ungreedy") and possessive quantifiers:
  - [Greedy vs. Reluctant vs. Possessive Quantifiers][b8]
  - [In-depth discussion on the differences between greedy versus non-greedy][b9]
  - [What's the difference between `{n}` and `{n}?`][b10]
  - [Can someone explain Possessive Quantifiers to me?][b11] [tag:php], [tag:perl], [tag:java], [tag:ruby]
  - [Emulating possessive quantifiers][b12] [tag:.net]
  - Non-Stack Overflow references: From [Oracle][b13], [regular-expressions.info][b14]
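To make the greedy/reluctant distinction concrete, here is a minimal sketch using Python's standard `re` module. The sample strings and variable names are made up for illustration. Possessive quantifiers are left out because `re` only supports them from Python 3.11 on.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ takes as much as it can, then backtracks to the LAST '>'.
greedy = re.findall(r"<.+>", text)

# Reluctant (lazy): .+? takes as little as it can, stopping at the FIRST '>'.
reluctant = re.findall(r"<.+?>", text)

# {n,m} is inclusive on both ends and greedy by default.
bounded = re.findall(r"a{2,3}", "a aa aaa aaaa")

print(greedy)     # ['<b>bold</b> and <i>italic</i>']
print(reluctant)  # ['<b>', '</b>', '<i>', '</i>']
print(bounded)    # ['aa', 'aaa', 'aaa']
```

Note how `a{2,3}` takes three characters out of `aaaa` and then cannot match the single `a` that is left.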
Character Classes
- [What is the difference between square brackets and parentheses?][c1]
- `[...]`: any one character, `[^...]`: negated/any character but
- `[^]` matches any one character including newlines [tag:javascript]
- `[\w-[\d]]` / `[a-z-[qz]]`: set subtraction [tag:.net], [tag:xml-schema], [tag:xpath], JGSoft
- `[\w&&[^\d]]`: set intersection [tag:java], [tag:ruby] 1.9+
- `[[:alpha:]]`: POSIX character classes
- `[[:<:]]` and `[[:>:]]`: word boundaries
- Why do `[^\\D2]`, `[^[^0-9]2]`, `[^2[^0-9]]` get different results in Java? [tag:java]
- Shorthand:
  - Digit: [`\d`:digit][c2], [`\D`:non-digit][c3]
  - Word character (Letter, digit, underscore): [`\w`:word character][c4], [`\W`:non-word character][c3]
  - Whitespace: [`\s`:whitespace][c5], [`\S`:non-whitespace][c3]
- Unicode categories (`\p{L}`, `\P{L}`, etc.)
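The shorthand and negated classes above can be tried out with a short Python sketch. The sample strings are invented for this example. Python's `re` has no set subtraction or intersection syntax, so the last line uses a double-negation workaround (`[^\W\d]`, "not a non-word character and not a digit"), which is an illustration rather than a feature named in the list.

```python
import re

sample = "Room 42, floor_7!"

digits = re.findall(r"\d", sample)           # single digits
words = re.findall(r"\w+", sample)           # runs of letters, digits, underscore
non_space = re.findall(r"\S+", sample)       # runs of non-whitespace

# Negated class: any one character that is neither a vowel nor whitespace.
not_vowels = re.findall(r"[^aeiou\s]", "a b c e")

# Workaround for "\w minus \d" in flavors without set subtraction.
word_no_digit = re.findall(r"[^\W\d]+", "abc123_def")

print(digits)         # ['4', '2', '7']
print(words)          # ['Room', '42', 'floor_7']
print(non_space)      # ['Room', '42,', 'floor_7!']
print(not_vowels)     # ['b', 'c']
print(word_no_digit)  # ['abc', '_def']
```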
Escape Sequences
- Horizontal whitespace: [`\h`:space-or-tab][d1], [`\t`:tab][d2]
- Newlines:
  - [`\r`, `\n`:carriage return and line feed][d3]
  - [`\R`:generic newline][d4] [tag:php] [tag:java-8]
- Negated whitespace sequences: [`\H`:Non horizontal whitespace character, `\V`:Non vertical whitespace character, `\N`:Non line feed character][d5] [tag:pcre] [tag:php5] [tag:java-8]
- Other: [`\v`:vertical tab][d6], [`\e`:the escape character][d7]
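Not every flavor supports every escape listed here. As a small sketch, Python's `re` module does not offer `\h` or `\R`, so the example below spells out rough equivalents by hand (`[ \t]` for space-or-tab, and an alternation for the common newline sequences). The input strings are invented, and the alternation covers only `\r\n`, `\r` and `\n`, not every line terminator that `\R` matches.

```python
import re

# Split on any of the three common newline conventions.
# "\r\n" must come first so it is not split into two separate breaks.
mixed = "one\r\ntwo\nthree\rfour"
lines = re.split(r"\r\n|\r|\n", mixed)

# Collapse runs of horizontal whitespace (space or tab) into one space.
collapsed = re.sub(r"[ \t]+", " ", "a \t  b\tc")

print(lines)      # ['one', 'two', 'three', 'four']
print(collapsed)  # a b c
```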
Anchors
anchor | matches | flavors |
---|---|---|
`^` | Start of string | Common* |
`^` | Start of line | Common (`m`) |
`$` | End of line | Common (`m`) |
`$` | End of text | Common* |
`$` | The very end of string | [tag:php] (`D`), [tag:javascript] |
`\A` | Start of string | Common except [tag:js] |
`\Z` | End of text | Common except [tag:js] [tag:python] |
`\Z` | The very end of string | [tag:python] |
`\z` | The very end of string | Common except [tag:js] [tag:python] |
`\b` | Word boundary | Common |
`\B` | Not a word boundary | Common |
`\G` | End of previous match | Common except [tag:js], [tag:python] re |
Term | Definition |
---|---|
[Start of string][a1] | At the very start of the string. |
[Start of line][a2] | At the very start of the string, and after a non-terminal line terminator. |
[End of string][a3] | At the very end of the string. |
[End of text][a4] | At the very end of the string, and at a terminal line terminator. |
[End of line][a5] | At the very end of the string, and at a line terminator. |
[Word boundary][a6] | At a position with a word character on exactly one side. The start and end of the string count as non-word. |
[End of previous match][a7] | At a previously set position, usually where a previous match ended. At the very start of the string if no position was set. |
"Common" refers to the following: [tag:icu] [tag:java] [tag:js] [tag:.net] [tag:objective-c] [tag:pcre] [tag:perl] [tag:php] [tag:python] [tag:swift] [tag:ruby]
* Default ![][|] `m` Multi-line mode. ![][|] `D` Dollar end only mode.
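The difference between "start of string" and "start of line", and between "end of text" and "the very end of string", is easiest to see by running it. The following is a minimal Python sketch with a made-up input. It only shows Python's behaviour, so other flavors may differ as the table indicates.

```python
import re

text = "first line\nsecond line\n"

# Default mode: ^ only matches at the very start of the string.
start_default = re.findall(r"^\w+", text)

# Multi-line mode: ^ also matches right after each line break.
start_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# Default $: end of string, or just before a final trailing newline.
end_default = re.findall(r"\w+$", text)
end_multiline = re.findall(r"\w+$", text, re.MULTILINE)

# In Python, \Z is the very end of the string; the trailing "\n" blocks it here.
very_end = re.search(r"line\Z", text)

# \b: a word character on exactly one side of the position.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")

print(start_default)    # ['first']
print(start_multiline)  # ['first', 'second']
print(end_default)      # ['line']
print(end_multiline)    # ['line', 'line']
print(very_end)         # None
print(whole_words)      # ['cat', 'cat']
```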
Groups
- [`(...)`:capture group][f1], [`(?:)`:non-capture group][f2]
  - [Why is my repeating capturing group only capturing the last match?][f3]
- [`\1`:backreference and capture-group reference, `$1`:capture group reference][f1]
  - [What's the meaning of a number after a backslash in a regular expression?][f14]
  - [`\g<1>123`:How to follow a numbered capture group, such as `\1`, with a number?:][f4] [tag:python]
- [What does a subpattern `(?i:regex)` mean?][f5]
- [What does the 'P' in `(?P<group_name>regexp)` mean?][f5a]
- [`(?>)`:atomic group][f6] or independent group, [`(?|)`:branch reset][f7]
  - [Equivalent of branch reset in .NET/C#][f8] [tag:.net]
- Named capture groups:
  - General named capturing group reference at regular-expressions.info
  - [tag:java]: `(?<groupname>regex)`: [Overview][f9] and [naming rules][f10] (Non-Stack Overflow links)
  - Other languages: [`(?P<groupname>regex)`][f11] [tag:python], [`(?<groupname>regex)`][f12] [tag:.net], [`(?<groupname>regex)`][f13] [tag:perl], `(?P<groupname>regex)` and `(?<groupname>regex)` [tag:php]
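A few of the group constructs above, sketched in Python with invented sample strings. Python uses the `(?P<name>...)` spelling for named groups and `\g<1>` in replacement strings, while other flavors use the spellings linked in the list.

```python
import re

# Numbered capture groups.
date = re.search(r"(\d{4})-(\d{2})-(\d{2})", "Released 2024-03-15")
date_parts = date.groups()

# Non-capturing group: groups for repetition without capturing.
repeats = re.findall(r"(?:ab)+", "ababab ab")

# A repeated capturing group keeps only its LAST iteration.
last_only = re.search(r"(\d)+", "123").group(1)

# Backreference: \1 must match the same text group 1 matched.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# Named group.
named = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "2024-03")
year = named.group("year")

# \g<1>0 means "group 1, then a literal 0"; \10 would mean group 10.
followed = re.sub(r"(\d)x", r"\g<1>0", "5x 7x")

print(date_parts)  # ('2024', '03', '15')
print(repeats)     # ['ababab', 'ab']
print(last_only)   # 3
print(doubled)     # is
print(year)        # 2024
print(followed)    # 50 70
```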
Lookarounds
- Lookaheads: [`(?=...)`:positive][g1], [`(?!...)`:negative][g2]
- Lookbehinds: [`(?<=...)`:positive][g3], [`(?<!...)`:negative][g3]
- Lookbehind limits in:
  - [Lookbehinds need to be constant-length][g4] [tag:php], [tag:perl], [tag:python], [tag:ruby]
  - [Lookarounds of limited length `{0,n}`][g5] [tag:java]
  - [Variable length lookbehinds are allowed][g5] [tag:.net]
- Lookbehind alternatives:
  - [Using `\K`][g6] [tag:php], [tag:perl] ([Flavors that support `\K`][g7])
  - [Alternative regex module for Python][g8] [tag:python]
  - [The hacky way][g9]
  - [JavaScript negative lookbehind equivalents][g10] [External link][g11]
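Lookarounds assert something about the surrounding text without consuming it. Here is a minimal Python sketch with made-up price strings. The last part illustrates the constant-length lookbehind limit in Python's standard `re` module.

```python
import re

prices = "10 USD, 20 EUR, 30 USD"
costs = "cost $15, qty 3, fee $7"

# Positive lookahead: digits that are followed by " USD".
usd = re.findall(r"\d+(?= USD)", prices)

# Negative lookahead: whole numbers NOT followed by " USD".
not_usd = re.findall(r"\b\d+\b(?! USD)", prices)

# Positive lookbehind: digits preceded by a dollar sign.
dollars = re.findall(r"(?<=\$)\d+", costs)

# Negative lookbehind: numbers NOT preceded by a dollar sign.
plain = re.findall(r"(?<!\$)\b\d+", costs)

# Python's re rejects variable-length lookbehind at compile time.
try:
    re.compile(r"(?<=a+)b")
    variable_lookbehind_ok = True
except re.error:
    variable_lookbehind_ok = False

print(usd)                     # ['10', '30']
print(not_usd)                 # ['20']
print(dollars)                 # ['15', '7']
print(plain)                   # ['3']
print(variable_lookbehind_ok)  # False
```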
Modifiers
flag | modifier | flavors |
---|---|---|
`a` | [ASCII][h17] | [tag:python] |
`c` | [current position][h6] | [tag:perl] |
`e` | [expression][h7] | [tag:php] [tag:perl] |
`g` | [global][h1] | most |
`i` | [case-insensitive][h2] | most |
`m` | [multiline][h10] | [tag:php] [tag:perl] [tag:python] [tag:javascript] [tag:.net] [tag:java] |
`m` | [(non)multiline][h9] | [tag:ruby] |
`o` | [once][h8] | [tag:perl] [tag:ruby] |
`S` | [study][h11] | [tag:php] |
`s` | [single line][h12] | [tag:ruby] |
`U` | [ungreedy][h4] | [tag:php] [tag:r] |
`u` | [unicode][h3] | most |
`x` | [whitespace-extended][h5] | most |
`y` | [sticky ↪][h16] | [tag:javascript] |
- [How to convert preg_replace e to preg_replace_callback?][h14]
- [What are inline modifiers?][h15]
- What is '?-mix' in a Ruby Regular Expression
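In Python, the flag letters in the table map onto constants of the `re` module, and the same letters can be used as inline modifiers. The sketch below uses invented input and covers only `i`, `m`, `s` and `x`. The phone-number pattern is a toy example, not a validation recipe.

```python
import re

text = "Hello\nhello world"

# i / re.IGNORECASE: letters match regardless of case.
case_insensitive = re.findall(r"hello", text, re.IGNORECASE)

# m / re.MULTILINE: ^ and $ also match at line breaks.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)

# s / re.DOTALL: "." also matches a newline.
without_dotall = re.search(r"Hello.hello", text)
with_dotall = re.search(r"Hello.hello", text, re.DOTALL)

# x / re.VERBOSE: whitespace and # comments in the pattern are ignored.
phone = re.compile(r"""
    (\d{3})   # first three digits
    [-.\s]?   # optional separator
    (\d{4})   # last four digits
""", re.VERBOSE)
phone_parts = phone.search("call 555-1234").groups()

# Inline modifiers: (?i) for the whole pattern, (?i:...) for one part only.
inline_global = re.findall(r"(?i)hello", text)
inline_scoped = re.findall(r"(?i:h)ello", text)

print(case_insensitive)  # ['Hello', 'hello']
print(line_starts)       # ['Hello', 'hello']
print(without_dotall)    # None
print(with_dotall is not None)  # True
print(phone_parts)       # ('555', '1234')
print(inline_global)     # ['Hello', 'hello']
print(inline_scoped)     # ['Hello', 'hello']
```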
Other:
- [`|`:alternation (OR) operator][i1], [`.`:any character][h12], `[.]`:literal dot character
- [What special characters must be escaped?][i2]
- Control verbs ([tag:php] and [tag:perl]): [`(*PRUNE)`][i3], [`(*SKIP)`][i3], [`(*FAIL)` and `(*F)`][i3]
  - [tag:php] only: [`(*BSR_ANYCRLF)`][i4]
- Recursion ([tag:php] and [tag:perl]): [`(?R)`][i5], [`(?0)` and `(?1)`][i6], [`(?-1)`][i7], [`(?&groupname)`][i8]
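Alternation, the dot, and escaping of special characters can be sketched in Python as follows. The inputs are invented. `re.escape` is Python's helper for escaping, and other languages have their own equivalents.

```python
import re

# Alternation: match either side of the pipe.
pets = re.findall(r"cat|dog", "cat, dog, bird")

sample = "abc a.c a-c"

# Unescaped dot: any character (except a newline by default).
any_char = re.findall(r"a.c", sample)

# Literal dot: either inside a character class or backslash-escaped.
dot_class = re.findall(r"a[.]c", sample)
dot_escaped = re.findall(r"a\.c", sample)

# Let the library escape special characters in user-supplied text.
literal = "1+1=2?"
escaped_ok = re.fullmatch(re.escape(literal), literal) is not None
unescaped_ok = re.fullmatch(literal, literal) is not None

print(pets)          # ['cat', 'dog']
print(any_char)      # ['abc', 'a.c', 'a-c']
print(dot_class)     # ['a.c']
print(dot_escaped)   # ['a.c']
print(escaped_ok)    # True
print(unescaped_ok)  # False
```

The unescaped pattern fails because `+` and `?` are read as quantifiers rather than as literal characters.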
Common Tasks
- [Get a string between two curly braces: `{...}`][j1]
- [Match (or replace) a pattern except in situations s1, s2, s3...][j2]
- https://stackoverflow.com/q/5830387
- Validation:
- Internet: [email addresses][j3], [URLs][j4] (host/port: [regex][j5] and [non-regex][j6] alternatives), [passwords][j7]
- Numeric: [a number][j8], [min-max ranges (such as 1-31)][j9], [phone numbers][j10], [date][j11]
- Parsing HTML with regex: See "General Information > When not to use Regex"
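Two of the tasks above, sketched in Python. The patterns are one possible approach and are not taken from the linked answers. The template string is invented, and the brace pattern assumes braces are not nested.

```python
import re

# Text between curly braces (no nesting): capture anything that is not a brace.
template = "Hello {name}, you are {age} years old"
fields = re.findall(r"\{([^{}]*)\}", template)

# Min-max range 1-31 without leading zeros:
# 1-9, or 10-29, or 30-31.
day = re.compile(r"(?:[1-9]|[12]\d|3[01])")
valid_days = [n for n in range(0, 40) if day.fullmatch(str(n))]

print(fields)                              # ['name', 'age']
print(valid_days == list(range(1, 32)))    # True
```

Regex ranges work on digits, not on numeric values, which is why the range has to be split into one alternative per digit pattern.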
Advanced Regex-Fu
- Strings and numbers:
- https://stackoverflow.com/q/406230
- https://stackoverflow.com/q/3746487
- [Match strings whose length is a fourth power][k1]
- https://stackoverflow.com/q/3627681
- https://stackoverflow.com/q/2795065
- How to match the middle character in a string with regex?
- Other:
- https://stackoverflow.com/q/3644266
- Match nested brackets
- [Using a recursive pattern][k7] [tag:php], [tag:perl]
- [Using balancing groups][k8] [tag:.net]
- “Vertical” regex matching in an ASCII “image”
- [List of highly up-voted regex questions on Code Golf][k4]
- How to make two quantifiers repeat the same number of times?
- [An impossible-to-match regular expression: `(?!a)a`][k5]
- [Match/delete/replace `this` except in contexts A, B and C][k6]
- [Match nested brackets with regex without using recursion or balancing groups?][k9]
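The impossible-to-match pattern `(?!a)a` is easy to check: the lookahead demands that the next character is not `a`, and then the pattern demands that it is. A tiny Python sketch, with sample inputs chosen arbitrarily:

```python
import re

NEVER = re.compile(r"(?!a)a")

samples = ["", "a", "aaa", "banana"]
results = [NEVER.search(s) for s in samples]

print(all(r is None for r in results))  # True
```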
Flavor-Specific Information
(Except for those marked with *, this section contains non-Stack Overflow links.)
- Java
  - Official documentation: [Pattern Javadoc ↪][l1], [Oracle's regular expressions tutorial ↪][l2]
  - The differences between functions in [`java.util.regex.Matcher`][l3]:
    - [`matches()`][l4]: The match must be anchored to both input-start and -end
    - [`find()`][l5]: A match may be anywhere in the input string (substrings)
    - [`lookingAt()`][l6]: The match must be anchored to input-start only
    - (For anchors in general, see the section "Anchors")
  - The only [`java.lang.String`][l7] functions that accept regular expressions: [`matches(s)`][l8], [`replaceAll(s,s)`][l9], [`replaceFirst(s,s)`][l10], [`split(s)`][l11], [`split(s,i)`][l13]
  - *[An (opinionated and) detailed discussion of the disadvantages of and missing features in `java.util.regex`][l14]
- .NET
  - [How to read a .NET regex with look-ahead, look-behind, capturing groups and back-references mixed together?][l31]
- Official documentation:
  - Boost regex engine: [General syntax][l14], [Perl syntax][l15] (used by TextPad, Sublime Text, UltraEdit, ...???)
  - JavaScript [general info][l16] and [RegExp object][l17]
  - [.NET][l18] ![][|] [MySQL][l19] ![][|] [Oracle][l20] ![][|] [Perl5 version 18.2][l21]
  - PHP: [pattern syntax][l22], [`preg_match`][l23]
  - Python: [Regular expression operations][l24], [`search` vs `match`][l25], [how-to][l26]
  - Rust: [crate `regex`][132], [struct `regex::Regex`][133]
  - Splunk: [regex terminology and syntax][l27] and [regex command][l28]
  - Tcl: [regex syntax][tclsyntax], [manpage][l29], [`regexp` command][l30]
  - Visual Studio Find and Replace
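The Java distinction between `matches()`, `find()` and `lookingAt()` has a close counterpart in Python, which is also what the `search` vs `match` link is about. The mapping in the comments below is an informal analogy, not an official equivalence, and the inputs are invented.

```python
import re

number = re.compile(r"\d+")

# fullmatch: the whole string must match (roughly Java's matches()).
whole = number.fullmatch("123abc")

# match: anchored at the start only (roughly Java's lookingAt()).
prefix = number.match("123abc")
prefix_miss = number.match("abc123")

# search: a match may be anywhere in the string (roughly Java's find()).
anywhere = number.search("abc123")

print(whole)             # None
print(prefix.group())    # 123
print(prefix_miss)       # None
print(anywhere.group())  # 123
```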
General information
(Links marked with * are non-Stack Overflow links.)
- Other general documentation resources: https://stackoverflow.com/q/4736, *[Regular-expressions.info][m1], *[Wikipedia entry][m2], *[RexEgg][m3], [Open-Directory Project][m4]
- [DFA versus NFA][m5]
- Generating Strings matching regex
- Books: Jeffrey Friedl's [Mastering Regular Expressions][m6]
- When to not use regular expressions:
- [Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.][m7] (blog post written by [Stack Overflow][so]'s founder)*
- Do not use regex to parse HTML:
- [Don't][m8]. ![][|] [Please, just don't][m9]
- [Well, maybe...if you're really determined][m10] (other answers in this question are also good)
Examples of regex that can cause regex engine to fail
Tools: Testers and Explainers
(This section contains non-Stack Overflow links.)
- Online (* includes replacement tester, + includes split tester):
- [Debuggex][n1] (Also has a repository of useful regexes) [tag:javascript], [tag:python], [tag:pcre]
- *[Regular Expressions 101][n2] [tag:php], [tag:pcre], [tag:python], [tag:javascript], [tag:java]
- [Regex Pal][n3], [regular-expressions.info][n4] [tag:javascript]
- [Rubular][n5] [tag:ruby] ![][|] [RegExr][n6] ![][|] [Regex Hero][n7] [tag:dotnet]
- *+ [regexstorm.net][n8] [tag:.net]
- *RegexPlanet: [Java][n9] [tag:java], [Go][n10] [tag:go], [Haskell][n11] [tag:haskell], [JavaScript][n12] [tag:javascript], [.NET][n13] [tag:dotnet], [Perl][n14] [tag:perl] [tag:php] [PCRE][n15] [tag:php], [Python][n16] [tag:python], [Ruby][n17] [tag:ruby], [XRegExp][n18] [tag:xregexp]
- [`freeformatter.com`][n19] [tag:xregexp]
- *+[`regex.larsolavtorvik.com`][n20] [tag:php] PCRE and POSIX, [tag:javascript]
- Offline:
- Microsoft Windows: [RegexBuddy][o1] (analysis), [RegexMagic][o2] (creation), [Expresso][o3] (analysis, creation, free)
[|]: https://i.stack.imgur.com/D41QM.png
[so]: https://stackoverflow.com/
[a1]: https://stackoverflow.com/a/6908745
[a2]: https://stackoverflow.com/a/6908745
[a3]: https://stackoverflow.com/a/48832215
[a4]: https://stackoverflow.com/a/4020821
[a5]: https://stackoverflow.com/a/6908745
[a6]: https://stackoverflow.com/a/6664167
[a7]: https://stackoverflow.com/q/21971701
[b1]: https://stackoverflow.com/a/10764399
[b2]: https://stackoverflow.com/a/17064242
[b3]: https://stackoverflow.com/q/4489551
[b4]: https://stackoverflow.com/a/17400486
[b5]: https://stackoverflow.com/a/17032985
[b6]: https://stackoverflow.com/a/17120435
[b7]: https://stackoverflow.com/a/17829727
[b8]: https://stackoverflow.com/q/5319840
[b9]: https://stackoverflow.com/a/3075532
[b10]: https://stackoverflow.com/q/18006093
[b11]: https://stackoverflow.com/q/1117467
[b12]: https://stackoverflow.com/q/5537513
[b13]: https://docs.oracle.com/javase/tutorial/essential/regex/quant.html
[b14]: https://www.regular-expressions.info/possessive.html
[c1]: https://stackoverflow.com/q/9801630
[c2]: https://stackoverflow.com/a/16621778
[c3]: https://stackoverflow.com/a/19011185
[c4]: https://stackoverflow.com/a/11874899
[c5]: https://stackoverflow.com/a/21067350
[d1]: https://stackoverflow.com/a/4910093
[d2]: https://stackoverflow.com/a/17950891
[d3]: https://stackoverflow.com/a/3451192
[d4]: https://stackoverflow.com/a/18992691
[d5]: https://stackoverflow.com/q/26972688
[d6]: https://stackoverflow.com/q/12290224
[d7]: https://stackoverflow.com/a/4275788
[f1]: https://stackoverflow.com/q/21880127
[f2]: https://stackoverflow.com/q/3512471
[f3]: https://stackoverflow.com/a/23062553
[f4]: https://stackoverflow.com/q/5984633
[f5]: https://stackoverflow.com/a/3812728
[f5a]: https://stackoverflow.com/questions/10059673/named-regular-expression-group-pgroup-nameregexp-what-does-p-stand-for
[f6]: https://stackoverflow.com/q/14411818
[f7]: https://stackoverflow.com/a/5333645
[f8]: https://stackoverflow.com/a/5378077
[f9]: https://blogs.oracle.com/xuemingshen/entry/named_capturing_group_in_jdk7
[f10]: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html#groupname
[f11]: https://stackoverflow.com/q/10059673
[f12]: https://stackoverflow.com/a/20355718
[f13]: https://stackoverflow.com/a/288989
[f14]: https://stackoverflow.com/q/8624345
[g1]: https://stackoverflow.com/a/1570916
[g2]: https://stackoverflow.com/a/12210820
[g3]: https://stackoverflow.com/a/11197672
[g4]: https://stackoverflow.com/a/22821726
[g5]: https://stackoverflow.com/a/20994257
[g6]: https://stackoverflow.com/a/11640500
[g7]: https://stackoverflow.com/a/13543042
[g8]: https://stackoverflow.com/a/11641102
[g9]: https://stackoverflow.com/a/11640862
[g10]: https://stackoverflow.com/a/35143111
[g11]: http://blog.stevenlevithan.com/archives/mimic-lookbehind-javascript
[h1]: https://stackoverflow.com/a/9622110
[h2]: https://stackoverflow.com/a/12411066
[h3]: https://stackoverflow.com/a/2553239
[h4]: https://stackoverflow.com/a/5978385
[h5]: https://stackoverflow.com/a/2710390
[h6]: https://stackoverflow.com/a/11395687
[h7]: https://stackoverflow.com/a/2468483
[h8]: https://stackoverflow.com/a/13334823
[h9]: https://stackoverflow.com/a/4257912
[h10]: https://stackoverflow.com/a/22438123
[h11]: https://stackoverflow.com/a/210027
[h12]: https://stackoverflow.com/a/13594017
[h13]: https://stackoverflow.com/a/1068308
[h14]: https://stackoverflow.com/q/16367404
[h15]: https://stackoverflow.com/a/43636
[h16]: https://javascript.info/regexp-sticky
[h17]: https://stackoverflow.com/a/61203075
[i1]: https://stackoverflow.com/a/22187948
[i2]: https://stackoverflow.com/q/399078
[i3]: https://stackoverflow.com/a/20008790
[i4]: https://stackoverflow.com/a/7374702
[i5]: https://stackoverflow.com/q/8440911
[i6]: https://stackoverflow.com/a/20569361
[i7]: https://stackoverflow.com/a/17845034
[i8]: https://stackoverflow.com/a/18151617
[j1]: https://stackoverflow.com/a/413077
[j2]: https://stackoverflow.com/q/23589174
[j3]: https://stackoverflow.com/q/201323
[j4]: https://stackoverflow.com/a/190405
[j5]: https://stackoverflow.com/a/22697740
[j6]: https://stackoverflow.com/a/24399003
[j7]: https://stackoverflow.com/a/3802238
[j8]: https://stackoverflow.com/a/4247184
[j9]: https://stackoverflow.com/a/22131040
[j10]: https://stackoverflow.com/q/123559
[j11]: https://stackoverflow.com/q/15491894
[k1]: https://codegolf.stackexchange.com/q/19262
[k2]: https://stackoverflow.com/a/17845034
[k3]: https://stackoverflow.com/a/17004406
[k4]: https://codegolf.stackexchange.com/questions/tagged/regular-expression?sort=votes&pageSize=50
[k5]: https://stackoverflow.com/q/1723182
[k6]: https://stackoverflow.com/q/23589174
[k7]: https://stackoverflow.com/a/17845034
[k8]: https://stackoverflow.com/a/17004406
[k9]: https://stackoverflow.com/a/47162099
[l1]: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
[l2]: https://docs.oracle.com/javase/tutorial/essential/regex/index.html
[l3]: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html
[l4]: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#matches--
[l5]: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#find--
[l6]: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#lookingAt--
[l7]: https://docs.oracle.com/javase/8/docs/api/java/lang/String.html
[l8]: https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#matches-java.lang.String-
[l9]: http://docs.oracle.com/javase/8/docs/api/java/lang/String.html#replaceAll-java.lang.String-java.lang.String-
[l10]: https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#replaceFirst-java.lang.String-java.lang.String-
[l11]: https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#split-java.lang.String-
[l13]: https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#split-java.lang.String-int-
[l14]: https://stackoverflow.com/a/5771326
[l15]: https://www.boost.org/doc/libs/1_55_0/libs/regex/doc/html/boost_regex/syntax.html
[l16]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
[l17]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
[l18]: https://msdn.microsoft.com/en-us/library/hs600312.aspx
[l19]: https://dev.mysql.com/doc/refman/5.1/en/regexp.html
[l20]: https://docs.oracle.com/cd/B19306_01/appdev.102/b14251/adfns_regexp.htm
[l21]: https://perldoc.perl.org/perlre.html
[l22]: https://www.php.net/manual/en/reference.pcre.pattern.syntax.php
[l23]: https://us2.php.net/preg_match
[l24]: https://docs.python.org/3/library/re.html
[l25]: https://docs.python.org/3/library/re.html#search-vs-match
[l26]: https://docs.python.org/3/howto/regex.html
[l27]: https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/AboutSplunkregularexpressions#Terminology_and_syntax
[l28]: https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Regex
[l29]: https://www.tcl.tk/man/tcl8.6/TclCmd/regexp.htm
[tclsyntax]: https://www.tcl.tk/man/tcl8.6/TclCmd/re_syntax.htm
[l30]: http://wiki.tcl.tk/986
[l31]: https://stackoverflow.com/q/36047988
[132]: https://docs.rs/regex/latest
[133]: https://docs.rs/regex/latest/regex/struct.Regex.html
[m1]: https://www.regular-expressions.info
[m2]: https://en.wikipedia.org/wiki/Regular_expression
[m3]: http://www.rexegg.com/
[m4]: http://www.dmoz.org/Computers/Programming/Languages/Regular_Expressions
[m5]: https://stackoverflow.com/q/3978438
[m6]: http://regex.info/book.html
[m7]: https://blog.codinghorror.com/regular-expressions-now-you-have-two-problems/
[m8]: https://stackoverflow.com/q/590747
[m9]: https://stackoverflow.com/a/1732454
[m10]: https://stackoverflow.com/a/4234491
[n1]: https://debuggex.com
[n2]: https://regex101.com
[n3]: https://regexpal.com
[n4]: http://www.regular-expressions.info/javascriptexample.html
[n5]: http://rubular.com/
[n6]: http://www.regexr.com/
[n7]: http://regexhero.net/tester
[n8]: http://regexstorm.net/tester
[n9]: http://www.regexplanet.com/advanced/java/index.html
[n10]: http://www.regexplanet.com/advanced/golang/index.html
[n11]: http://www.regexplanet.com/advanced/haskell/index.html
[n12]: http://www.regexplanet.com/advanced/javascript/index.html
[n13]: http://www.regexplanet.com/advanced/dotnet/index.html
[n14]: http://www.regexplanet.com/advanced/perl/index.html
[n15]: http://www.regexplanet.com/advanced/php/index.html
[n16]: http://www.regexplanet.com/advanced/python/index.html
[n17]: http://www.regexplanet.com/advanced/ruby/index.html
[n18]: http://www.regexplanet.com/advanced/xregexp/index.html
[n19]: http://www.freeformatter.com/regex-tester.html
[n20]: http://regex.larsolavtorvik.com/
[o1]: http://regexbuddy.com
[o2]: http://regexmagic.com
[o3]: http://www.ultrapico.com/expresso.htm