Regex exec only returning first match

Javascript, Regex

Javascript Problem Overview


I am trying to implement the following regex search, found on the GolfScript syntax page.

var ptrn = /[a-zA-Z_][a-zA-Z0-9_]*|'(?:\\.|[^'])*'?|"(?:\\.|[^"])*"?|-?[0-9]+|#[^\n\r]*|./mg;
input = ptrn.exec(input);

After this, input only ever holds the first match of the regexp. For example, "hello" "world" should return ["hello", "world"], but it only returns ["hello"].

Javascript Solutions


Solution 1 - Javascript

RegExp.exec returns only a single match result per call.

In order to retrieve multiple matches you need to run exec on the expression object multiple times. For example, using a simple while loop:

var ptrn = /[a-zA-Z_][a-zA-Z0-9_]*|'(?:\\.|[^'])*'?|"(?:\\.|[^"])*"?|-?[0-9]+|#[^\n\r]*|./mg;

var match;
while ((match = ptrn.exec(input)) != null) {
    console.log(match);
}

This will log all matches to the console.

Note that for this to work, the regular expression must have the g (global) flag. With this flag, exec updates the expression's lastIndex property after each match, so the next call starts after the previous result. Without it, every call starts again at index 0 and returns the same first match, so the while loop above would never terminate.
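
As a minimal sketch of the loop above wrapped into a reusable function: collectMatches is a hypothetical helper name, not part of the original answer. It refuses non-global expressions (which would loop forever) and, as an extra precaution the original does not need, steps over zero-length matches.

```javascript
// Hypothetical helper: collect the full text of every match of a global regex.
function collectMatches(pattern, text) {
  if (!pattern.global) {
    // Without the g flag, exec never advances lastIndex and the loop would not end.
    throw new TypeError("pattern must have the g flag");
  }
  pattern.lastIndex = 0; // start from the beginning, regardless of earlier use
  const found = [];
  let match;
  while ((match = pattern.exec(text)) !== null) {
    // A zero-length match does not advance lastIndex, so advance it manually.
    if (match[0] === "") {
      pattern.lastIndex++;
    }
    found.push(match[0]);
  }
  return found;
}

const tokenPattern = /[a-zA-Z_][a-zA-Z0-9_]*|'(?:\\.|[^'])*'?|"(?:\\.|[^"])*"?|-?[0-9]+|#[^\n\r]*|./mg;
const tokens = collectMatches(tokenPattern, '"hello" "world"');
console.log(tokens); // [ '"hello"', ' ', '"world"' ]
```

The space between the two strings shows up as its own token because it is matched by the final . alternative of the expression.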

Solution 2 - Javascript

It is possible to call the match method on the string in order to retrieve the whole collection of matches at once:

var ptrn = /[a-zA-Z_][a-zA-Z0-9_]*|'(?:\\.|[^'])*'?|"(?:\\.|[^"])*"?|-?[0-9]+|#[^\n\r]*|./mg;
var results = "hello world".match(ptrn);

The result (according to the regular expression) is:

["hello", " ", "world"]

See the specification of String.prototype.match for details.
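
The " " entry in the result comes from the final . alternative, which also matches whitespace. If whitespace tokens are not wanted (an assumption about the asker's intent, not something the question states), one option is to filter them out afterwards. The names tokenRe, allTokens and nonBlank are illustrative only.

```javascript
const tokenRe = /[a-zA-Z_][a-zA-Z0-9_]*|'(?:\\.|[^'])*'?|"(?:\\.|[^"])*"?|-?[0-9]+|#[^\n\r]*|./mg;

// match returns null when nothing matches, so fall back to an empty array.
const allTokens = "hello world".match(tokenRe) || [];

// Drop tokens that consist only of whitespace.
const nonBlank = allTokens.filter((token) => token.trim() !== "");

console.log(allTokens); // [ 'hello', ' ', 'world' ]
console.log(nonBlank);  // [ 'hello', 'world' ]
```

Keep in mind that with the g flag, match returns only the matched strings. Capture groups and match positions are not available this way.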

Solution 3 - Javascript

It is not clear from the question whether "hello" "world" is user input or part of the regex. Either way, a RegExp object is stateful: its lastIndex property is the position from which the next search starts. exec does not return all results at once. It returns only the next match, and you need to call .exec again to get the remaining results, each call resuming from lastIndex:

const re1 = /^\s*(\w+)/mg; // find the first word of every line
const text1 = "capture discard\n me but_not_me"; // two lines of text
for (let match; (match = re1.exec(text1)) !== null;) {
    console.log(match, "next search at", re1.lastIndex);
}

prints

["capture", "capture"] "next search at" 7
[" me", "me"] "next search at" 19
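
To make the role of lastIndex more concrete, here is a small sketch with made-up sample data (wordRe and phrase are illustrative names). It sets lastIndex by hand to resume a search mid-string and shows that exec resets it to 0 once no match is left.

```javascript
const wordRe = /\w+/g;
const phrase = "alpha beta gamma";

// Resume the search at index 6, skipping "alpha ".
wordRe.lastIndex = 6;
const first = wordRe.exec(phrase);
const afterFirst = wordRe.lastIndex;
console.log(first[0], first.index, afterFirst); // beta 6 10

const second = wordRe.exec(phrase);
const afterSecond = wordRe.lastIndex;
console.log(second[0], second.index, afterSecond); // gamma 11 16

// No matches are left: exec returns null and resets lastIndex to 0.
const third = wordRe.exec(phrase);
const afterThird = wordRe.lastIndex;
console.log(third, afterThird); // null 0
```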

A functional ES6 way to build an iterator over the results is a generator function:

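// Yield every match of a global regex, one exec result at a time.
RegExp.prototype.execAllGen = function*(input) {
    for (let match; (match = this.exec(input)) !== null;) {
        yield match;
    }
};

// Collect all matches into an array.
RegExp.prototype.execAll = function(input) {
    return [...this.execAllGen(input)];
};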

Note that, unlike in poke's answer (Solution 1), the match variable here is declared inside the for loop, so it does not leak into the surrounding scope.

Now you can capture all matches in one line:

const log = console.log; // shorthand used below
const matches = re1.execAll(text1);

log("captured strings:", matches.map(m => m[1]));
log(matches.map(m => [m[1], m.index]));
for (const match of matches) log(match[1], "found at", match.index);

which prints

"captured strings:" ["capture", "me"]

[["capture", 0], ["me", 16]]
"capture" "found at" 0
"me" "found at" 16

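As an alternative sketch that avoids extending RegExp.prototype: on engines that support ES2020, the built-in String.prototype.matchAll returns an iterator over the same match objects. It requires the g flag and works on a copy of the expression, so the original's lastIndex is left untouched. The variable names below are illustrative only.

```javascript
const lineStartWord = /^\s*(\w+)/mg; // first word of every line
const sample = "capture discard\n me but_not_me";

// matchAll returns an iterator; spread it into an array of match objects.
const allMatches = [...sample.matchAll(lineStartWord)];

const captured = allMatches.map((m) => m[1]);
const positions = allMatches.map((m) => [m[1], m.index]);

console.log(captured);  // [ 'capture', 'me' ]
console.log(positions); // [ [ 'capture', 0 ], [ 'me', 16 ] ]
console.log(lineStartWord.lastIndex); // 0
```
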
Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | user181351 | View Question on Stackoverflow
Solution 1 - Javascript | poke | View Answer on Stackoverflow
Solution 2 - Javascript | Eadel | View Answer on Stackoverflow
Solution 3 - Javascript | Little Alien | View Answer on Stackoverflow