What is a regex "independent non-capturing group"?

Java, Regex

Java Problem Overview


From the Java 6 Pattern documentation:

> Special constructs (non-capturing)
>
> (?:X)   X, as a non-capturing group
>
> …
>
> (?>X)   X, as an independent, non-capturing group

What is the difference between (?:X) and (?>X)? What does "independent" mean in this context?

Java Solutions


Solution 1 - Java

It means that the grouping is atomic: once the group has matched, the engine throws away the backtracking information for it. The group is therefore possessive; it won't back off even if doing so is the only way for the regex as a whole to succeed. It's "independent" in the sense that it doesn't cooperate, via backtracking, with other elements of the regex to ensure a match.
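As a minimal sketch of these two properties (the class and method names are illustrative and not part of the original answer), the following Java snippet checks that neither (?:X) nor (?>X) creates a capturing group, and that only the atomic form refuses to back off:

```java
import java.util.regex.Pattern;

class GroupKindsDemo {
    // Number of capturing groups defined by the compiled pattern.
    static int captureGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // True if the regex matches the entire input.
    static boolean matchesAll(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(captureGroups("(a|ab)c"));      // 1
        System.out.println(captureGroups("(?:a|ab)c"));    // 0
        System.out.println(captureGroups("(?>a|ab)c"));    // 0

        // The non-capturing group can backtrack from "a" to "ab".
        System.out.println(matchesAll("(?:a|ab)c", "abc")); // true
        // The atomic group commits to "a", so the following "c" cannot match "b".
        System.out.println(matchesAll("(?>a|ab)c", "abc")); // false
    }
}
```

Both special constructs leave the group count at zero; they differ only in whether the alternation inside the group may be revisited after the group has been exited.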

Solution 2 - Java

I think this tutorial explains exactly what an "independent, non-capturing group", or "atomic grouping", is:

The regular expression a(bc|b)c (capturing group) matches abcc and abc. The regex a(?>bc|b)c (atomic group) matches abcc but not abc.

When applied to abc, both regexes will match a to a, bc to bc, and then c will fail to match at the end of the string. Here their paths diverge. The regex with the capturing group has remembered a backtracking position for the alternation. The group will give up its match, b then matches b and c matches c. Match found!

The regex with the atomic group, however, exited from an atomic group after bc was matched. At that point, all backtracking positions for tokens inside the group are discarded. In this example, the alternation's option to try b at the second position in the string is discarded. As a result, when c fails, the regex engine has no alternatives left to try.
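The walkthrough above can be reproduced with a short Java sketch. The class name and helper method are hypothetical; only the two patterns and the two inputs come from the tutorial text:

```java
import java.util.regex.Pattern;

class AtomicGroupDemo {
    // Capturing group: remembers a backtracking position for the alternation.
    static final Pattern CAPTURING = Pattern.compile("a(bc|b)c");
    // Atomic group: discards that position as soon as the group is exited.
    static final Pattern ATOMIC = Pattern.compile("a(?>bc|b)c");

    // True if the pattern matches the entire input.
    static boolean fullMatch(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"abcc", "abc"}) {
            System.out.println(input
                    + " capturing=" + fullMatch(CAPTURING, input)
                    + " atomic=" + fullMatch(ATOMIC, input));
        }
        // abcc capturing=true atomic=true
        // abc capturing=true atomic=false
    }
}
```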

Solution 3 - Java

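The pattern foo(?>(co)*)co will never match: the atomic group greedily consumes every co it can and, being atomic, never gives one back for the trailing co. I'm sure there are practical examples of when this is useful; try O'Reilly's book.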
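A small Java sketch makes this easy to check. The sample inputs and the non-atomic comparison pattern foo(?:co)*co are assumptions added for illustration:

```java
import java.util.regex.Pattern;

class NeverMatchDemo {
    // Pattern from the answer: the atomic group keeps every "co" it consumed.
    static final Pattern ATOMIC = Pattern.compile("foo(?>(co)*)co");
    // Illustrative comparison: the same pattern with an ordinary non-capturing group.
    static final Pattern PLAIN = Pattern.compile("foo(?:co)*co");

    // True if the pattern matches anywhere in the input.
    static boolean found(Pattern pattern, String input) {
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"fooco", "foococo", "foocococo"}) {
            System.out.println(input
                    + " atomic=" + found(ATOMIC, input)
                    + " plain=" + found(PLAIN, input));
        }
        // Every line prints atomic=false plain=true
    }
}
```

In the plain version, (?:co)* gives back its last co so that the trailing co can match; the atomic version has no saved position to return to.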

Solution 4 - Java

(?>X?) equals (?:X)?+, (?>X*) equals (?:X)*+, (?>X+) equals (?:X)++.

If X is a single token, so that no non-capturing group is needed to apply the quantifier to it, the preceding equivalences simplify to:

(?>X?) equals X?+, (?>X*) equals X*+, (?>X+) equals X++.
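The equivalences can be spot-checked in Java. This is only a sketch on a few hand-picked inputs, not a proof; the class name, helper, and test strings are assumptions chosen for illustration, with X taken to be the single token a:

```java
import java.util.regex.Pattern;

class PossessiveEquivalenceDemo {
    // True if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Each row: atomic form, grouped possessive form, bare possessive form,
        // plain greedy form, and an input that only the greedy form can match.
        String[][] rows = {
            {"(?>a?)a",  "(?:a)?+a",  "a?+a",  "a?a",  "a"},
            {"(?>a*)ab", "(?:a)*+ab", "a*+ab", "a*ab", "aaab"},
            {"(?>a+)a",  "(?:a)++a",  "a++a",  "a+a",  "aa"},
        };
        for (String[] row : rows) {
            String input = row[4];
            System.out.println(input
                    + " atomic=" + found(row[0], input)
                    + " groupedPossessive=" + found(row[1], input)
                    + " possessive=" + found(row[2], input)
                    + " greedy=" + found(row[3], input));
        }
        // Every row prints atomic=false groupedPossessive=false possessive=false greedy=true
    }
}
```

Note that when X is longer than one token, the group is still required: in ab*+ the possessive quantifier applies only to b, whereas (?:ab)*+ applies it to the whole sequence.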

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Peter Hart | View Question on Stackoverflow
Solution 1 - Java | erickson | View Answer on Stackoverflow
Solution 2 - Java | kajibu | View Answer on Stackoverflow
Solution 3 - Java | Jeff Stuart | View Answer on Stackoverflow
Solution 4 - Java | beibichunai | View Answer on Stackoverflow