RegExp.exec() returns NULL sporadically

Javascript, Regex

Javascript Problem Overview


I am seriously going crazy over this and I've already spent a disproportionate amount of time trying to figure out what's going on here. So please give me a hand =)

I need to do some RegExp matching of strings in JavaScript. Unfortunately it behaves very strangely. This code:

var rx = /(cat|dog)/gi;
var w = new Array("I have a cat and a dog too.", "There once was a dog and a cat.", "I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.");

for (var i in w) {
    var m = null;
    m = rx.exec(w[i]);
    if(m){
        document.writeln("<pre>" + i + "\nINPUT: " + w[i] + "\nMATCHES: " + m.slice(1) + "</pre>");
    }else{
        document.writeln("<pre>" + i + "\n'" + w[i] + "' FAILED.</pre>");
    }
}

It returns "cat" and "dog" for the first two elements, as it should, but then some exec() calls start returning null. I don't understand why.

I posted a Fiddle here, where you can run and edit the code.

So far I've tried this in Chrome and Firefox.

Javascript Solutions


Solution 1 - Javascript

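Oh, here it is. Because your regex is defined with the global flag, it remembers where the last match ended: it matches cat first, and on the second pass of the loop it resumes from that position and finds dog. So, basically you just need to reset your regex (its internal pointer) on each pass as well. One way is to create the regex inside the loop. Cf. this: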

var w = new Array("I have a cat and a dog too.", "I have a cat and a dog too.", "I have a cat and a dog too.", "I have a cat and a dog too.");

for (var i in w) {
    var rx = /(cat|dog)/gi;
    var m = null;
    m = rx.exec(w[i]);
    if(m){
        document.writeln("<p>" + i + "<br/>INPUT: " + w[i] + "<br/>MATCHES: " + w[i].length + "</p>");
    }else{
        document.writeln("<p><b>" + i + "<br/>'" + w[i] + "' FAILED.</b><br/>" + w[i].length + "</p>");
    }
    document.writeln(m);
}
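If the goal is to get every match in each string (both "cat" and "dog"), the global flag is the right tool, but exec() then has to be called in a loop until it returns null. The following is only a sketch: findAnimals is a hypothetical helper name, and console.log stands in for document.writeln so it can run outside a browser.

```javascript
// Hypothetical helper: collect every cat/dog match in one string.
function findAnimals(text) {
    var rx = /(cat|dog)/gi;   // created on each call, so lastIndex starts at 0
    var found = [];
    var m;
    // With the g flag, each exec() call resumes after the previous match
    // and returns null once the string is exhausted.
    while ((m = rx.exec(text)) !== null) {
        found.push(m[1]);     // first capture group
    }
    return found;
}

console.log(findAnimals("I have a cat and a dog too."));
console.log(findAnimals("There once was a dog and a cat."));
```

Because the regex never leaves the function, its position cannot leak from one input string to the next.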

Solution 2 - Javascript

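The regex object has a property lastIndex, which is updated when you run exec on a regex with the g flag. So when you exec the regex on e.g. "I have a cat and a dog too.", lastIndex is set to 12. The next time you run exec on the same regex object, it starts looking from index 12, even if the string is a different one. When no match is found from there, exec returns null and lastIndex goes back to 0, which is why only some of the calls fail. So you have to reset the lastIndex property between each run.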
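To see this happen, here is a small sketch that records lastIndex before and after each exec() call on one shared regex. The variable names (inputs, trace) are just for illustration, and console.log replaces document.writeln so it runs outside a browser.

```javascript
// Trace lastIndex across exec() calls on a single shared global regex.
var rx = /(cat|dog)/gi;
var inputs = [
    "I have a cat and a dog too.",
    "There once was a dog and a cat.",
    "I have a cat and a dog too."
];
var trace = [];

for (var i = 0; i < inputs.length; i++) {
    var start = rx.lastIndex;      // where this exec() call begins searching
    var m = rx.exec(inputs[i]);
    trace.push({
        start: start,
        match: m ? m[0] : null,
        next: rx.lastIndex         // set back to 0 after a failed match
    });
}

console.log(trace);
```

The third call starts at index 20, which is past the "d" of "dog" in the first sentence, so nothing is left to match and exec() returns null.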

Solution 3 - Javascript

Two things:

  1. The need, already mentioned, to reset the regex when using the g (global) flag. To solve this I recommend simply assigning 0 to the lastIndex property of the RegExp object. This performs better than destroying and recreating the object.
  2. Be careful when using the in keyword to walk an Array object, because it can lead to unexpected results with some libraries. Sometimes you should check with something like isNaN(i), or, if you know the array has no holes, use the classic for loop.

The code can be:

var rx = /(cat|dog)/gi;
var w = ["I have a cat and a dog too.", "There once was a dog and a cat.", "I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat."];

for (var i in w) {
    if (!isNaN(i)) {         // Optional: skip non-numeric keys in case the Array has extra enumerable members.
        var m = null;
        m = rx.exec(w[i]);   // Run
        rx.lastIndex = 0;    // Reset
        if (m) {
            document.writeln("<pre>" + i + "\nINPUT: " + w[i] + "\nMATCHES: " + m.slice(1) + "</pre>");
        } else {
            document.writeln("<pre>" + i + "\n'" + w[i] + "' FAILED.</pre>");
        }
    }
}
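To illustrate the second point, here is a sketch of what for...in does when something has added an enumerable property to Array.prototype, next to the classic for loop. The demoHelper property is made up for this example and stands in for whatever a library might add; console.log replaces document.writeln.

```javascript
// Simulate a library that extends Array.prototype (hypothetical property name).
Array.prototype.demoHelper = function () {};

var w = ["I have a cat and a dog too.", "There once was a dog and a cat."];

// for...in yields string keys and also picks up inherited enumerable properties.
var forInKeys = [];
for (var k in w) {
    forInKeys.push(k);
}

// The classic for loop only visits the numeric indexes 0 .. length - 1.
var rx = /(cat|dog)/gi;
var results = [];
for (var i = 0; i < w.length; i++) {
    rx.lastIndex = 0;              // reset before each new input string
    var m = rx.exec(w[i]);
    results.push(m ? m[1] : null);
}

delete Array.prototype.demoHelper; // remove the demonstration property again

console.log(forInKeys);
console.log(results);
```

The isNaN(i) check above filters out a key like "demoHelper" because it is not numeric, while the classic loop never sees it in the first place.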

Solution 4 - Javascript

I had a similar problem using /g only, and the solution proposed here did not work for me in Firefox 3.6.8. I got my script working with

var myRegex = new RegExp("my string", "g");

I'm adding this in case someone else has the same problem I did with the above solution.
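One plausible explanation, which is my assumption and not something confirmed above, is that older engines following ECMAScript 3 evaluated a regex literal to the same object every time, so a literal inside a loop still carried its lastIndex along. Calling the RegExp constructor builds a new object on every call. A sketch with the pattern from the question (the inputs and results names are only for illustration):

```javascript
var inputs = [
    "I have a cat and a dog too.",
    "There once was a dog and a cat.",
    "I have a cat and a dog too."
];
var results = [];

for (var i = 0; i < inputs.length; i++) {
    // The constructor returns a new object on each pass, with lastIndex at 0.
    var rx = new RegExp("(cat|dog)", "gi");
    var m = rx.exec(inputs[i]);
    results.push(m ? m[1] : null);
}

console.log(results);
```

Each string is searched from its beginning, so the first match is found in all three.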

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | cpak | View Question on Stack Overflow
Solution 1 - Javascript | SilentGhost | View Answer on Stack Overflow
Solution 2 - Javascript | Frode | View Answer on Stack Overflow
Solution 3 - Javascript | ESL | View Answer on Stack Overflow
Solution 4 - Javascript | Don | View Answer on Stack Overflow