How to make Regular expression into non-greedy?

JavascriptRegexFilterExpressionRegex Greedy

Javascript Problem Overview


I'm using jQuery. I have a string with a block of special characters (begin and end). I want get the text from that special characters block. I used a regular expression object for in-string finding. But how can I tell jQuery to find multiple results when have two special character or more?

My HTML:

<div id="container">
    <div id="textcontainer">
     Cuộc chiến pháp lý giữa [|cơ thử|nghiệm|] thtrường [|test2|đây là test lần 2|] chứng khoán [|Mỹ|day la nuoc my|] và ngân hàng đầu tư quyền lực nhất Phố Wall mới chỉ bắt đầu.
    </div>
</div>

and my JavaScript code:

$(document).ready(function() {
  var takedata = $("#textcontainer").text();
  var test = 'abcd adddb';
  var filterdata = takedata.match(/(\[.+\])/);
 
  alert(filterdata); 
 
  //end write js 
});

My result is: [|cơ thử|nghiệm|] thị trường [|test2|đây là test lần 2|] chứng khoán [|Mỹ|day la nuoc my|] . But this isn't the result I want :(. How to get [text] for times 1 and [demo] for times 2 ?


I've just done my work after searching info on internet ^^. I make code like this:

var filterdata = takedata.match(/(\[.*?\])/g);
  • my result is : [|cơ thử|nghiệm|],[|test2|đây là test lần 2|] this is right!. but I don't really understand this. Can you answer my why?

Javascript Solutions


Solution 1 - Javascript

The non-greedy regex modifiers are like their greedy counter-parts but with a ? immediately following them:

*  - zero or more
*? - zero or more (non-greedy)
+  - one or more
+? - one or more (non-greedy)
?  - zero or one
?? - zero or one (non-greedy)

Solution 2 - Javascript

You are right that greediness is an issue:

--A--Z--A--Z--
  ^^^^^^^^^^
     A.*Z

If you want to match both A--Z, you'd have to use A.*?Z (the ? makes the * "reluctant", or lazy).

There are sometimes better ways to do this, though, e.g.

A[^Z]*+Z

This uses negated character class and possessive quantifier, to reduce backtracking, and is likely to be more efficient.

In your case, the regex would be:

/(\[[^\]]++\])/

Unfortunately Javascript regex doesn't support possessive quantifier, so you'd just have to do with:

/(\[[^\]]+\])/
See also

Quick summary

*   Zero or more, greedy
*?  Zero or more, reluctant
*+  Zero or more, possessive

+   One or more, greedy
+?  One or more, reluctant
++  One or more, possessive

?   Zero or one, greedy
??  Zero or one, reluctant
?+  Zero or one, possessive

Note that the reluctant and possessive quantifiers are also applicable to the finite repetition {n,m} constructs.

Examples in Java:

System.out.println("aAoZbAoZc".replaceAll("A.*Z", "!"));  // prints "a!c"
System.out.println("aAoZbAoZc".replaceAll("A.*?Z", "!")); // prints "a!b!c"

System.out.println("xxxxxx".replaceAll("x{3,5}", "Y"));  // prints "Yx"
System.out.println("xxxxxx".replaceAll("x{3,5}?", "Y")); // prints "YY"

Solution 3 - Javascript

I believe it would be like this

takedata.match(/(\[.+\])/g);

the g at the end means global, so it doesn't stop at the first match.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionRuetaView Question on Stackoverflow
Solution 1 - JavascriptAsaphView Answer on Stackoverflow
Solution 2 - JavascriptpolygenelubricantsView Answer on Stackoverflow
Solution 3 - JavascriptiangrahamView Answer on Stackoverflow