Return positions of a regex match() in Javascript?

Javascript, Regex, Match, String Matching

Javascript Problem Overview


Is there a way to retrieve the (starting) character positions inside a string of the results of a regex match() in Javascript?

Javascript Solutions


Solution 1 - Javascript

exec returns an object with an index property:

var match = /bar/.exec("foobar");
if (match) {
    console.log("match found at " + match.index);
}

And for multiple matches:

var re = /bar/g,
    str = "foobarfoobar",
    match;
// exec() on a global regex resumes from re.lastIndex on every call
while ((match = re.exec(str)) != null) {
    console.log("match found at " + match.index);
}
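
When both the start and the end of each match are needed, the loop above can be wrapped in a small helper. This is only a sketch and not part of the original answer: the name matchRanges and the { start, end } result shape are hypothetical, and it assumes that searching with a global copy of the regex is acceptable, so that a missing g flag cannot cause an endless loop.

```javascript
// Hypothetical helper: returns [{ start, end }] for every match of re in str.
function matchRanges(re, str) {
  // Copy the regex with the g flag so exec() advances and the caller's lastIndex is untouched.
  const flags = re.flags.includes("g") ? re.flags : re.flags + "g";
  const globalRe = new RegExp(re.source, flags);
  const ranges = [];
  let match;
  while ((match = globalRe.exec(str)) !== null) {
    // end is exclusive: str.slice(start, end) gives back the matched text.
    ranges.push({ start: match.index, end: match.index + match[0].length });
    // An empty match does not move lastIndex, so step over it manually.
    if (match[0].length === 0) globalRe.lastIndex++;
  }
  return ranges;
}

console.log(matchRanges(/bar/, "foobarfoobar"));
// [ { start: 3, end: 6 }, { start: 9, end: 12 } ]
```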

Solution 2 - Javascript

Here's what I came up with:

// Finds starting and ending positions of quoted text
// in double or single quotes with escape char support like \" \'
var str = "this is a \"quoted\" string as you can 'read'";

var patt = /'((?:\\.|[^'])*)'|"((?:\\.|[^"])*)"/igm;

var match;
while ((match = patt.exec(str)) !== null) {
  console.log(match.index + ' ' + patt.lastIndex);
}
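
The loop above reports where each whole quoted span starts and ends. If the positions of the text inside the quotes are also needed, the d (hasIndices) flag may help. The sketch below assumes an ES2022-capable engine; the helper name quotedSpans and the shape of its result are hypothetical, and the pattern is copied from the answer with only the flags changed.

```javascript
// Assumes an ES2022 engine; the d flag is a syntax error in older ones.
const quoted = /'((?:\\.|[^'])*)'|"((?:\\.|[^"])*)"/dg;

// Hypothetical helper: positions of each quoted span and of the text inside the quotes.
function quotedSpans(str) {
  return Array.from(str.matchAll(quoted), (match) => {
    // indices[0] is the [start, end) pair of the whole match.
    const [start, end] = match.indices[0];
    // Only one alternative takes part in a match; the other group's entry is undefined.
    const [innerStart, innerEnd] = match.indices[1] || match.indices[2];
    return { start, end, innerStart, innerEnd };
  });
}

console.log(quotedSpans("this is a \"quoted\" string as you can 'read'"));
// [ { start: 10, end: 18, innerStart: 11, innerEnd: 17 },
//   { start: 37, end: 43, innerStart: 38, innerEnd: 42 } ]
```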

Solution 3 - Javascript

In modern browsers, you can accomplish this with string.matchAll().

The benefit of this approach over RegExp.exec() is that it does not rely on the regex being stateful, as in @Gumbo's answer (Solution 1).

let regexp = /bar/g;
let str = 'foobarfoobar';

let matches = [...str.matchAll(regexp)];
matches.forEach((match) => {
    console.log("match found at " + match.index);
});
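
To make the point about statefulness concrete, here is a small sketch (the variable names are arbitrary and the strings are made up for illustration). It shows a global regex carrying lastIndex from one exec() call into the next, while matchAll() iterates over an internal copy and leaves the original regex's lastIndex where it was.

```javascript
const re = /bar/g;

// exec() resumes from re.lastIndex, so state carries over between calls.
const first = re.exec("foobar");   // finds "bar" at 3 and leaves re.lastIndex at 6
const second = re.exec("barfoo");  // starts looking at index 6, so it returns null
// A failed exec() resets re.lastIndex to 0.

// matchAll() iterates over an internal copy of the regex, so re.lastIndex is not advanced.
const indices = Array.from("foobarfoobar".matchAll(re), (m) => m.index);

console.log(first.index, second, indices, re.lastIndex);
// 3 null [ 3, 9 ] 0
```

Note that matchAll() still starts from whatever lastIndex the regex has when it is called, so it is safest to pass a regex whose lastIndex is 0.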

Solution 4 - Javascript

From developer.mozilla.org docs on the String .match() method:

> The returned Array has an extra input property, which contains the original string that was parsed. In addition, it has an index property, which represents the zero-based index of the match in the string.

When dealing with a non-global regex (i.e., no g flag on your regex), the value returned by .match() has an index property; all you have to do is access it.

var index = str.match(/regex/).index;

Here is an example showing it working as well:

var str = 'my string here';

var index = str.match(/here/).index;

console.log(index); // <- 10

I have successfully tested this all the way back to IE5.
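
One caveat: .match() returns null when nothing matches, so reading .index directly throws in that case, and with the g flag the result is an array of plain strings without an index. A defensive wrapper might look like the sketch below; the name firstMatchIndex and the -1 convention (borrowed from indexOf) are assumptions, not part of the answer.

```javascript
// Hypothetical helper: index of the first match, or -1 when there is none.
function firstMatchIndex(str, re) {
  // With the g flag, match() returns plain strings without an index,
  // so search with a non-global copy of the regex.
  const single = new RegExp(re.source, re.flags.replace("g", ""));
  const match = str.match(single);
  return match === null ? -1 : match.index;
}

console.log(firstMatchIndex("my string here", /here/));   // 10
console.log(firstMatchIndex("my string here", /there/));  // -1
console.log(firstMatchIndex("my string here", /r/g));     // 5
```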

Solution 5 - Javascript

You can use the search method of the String object. This will only work for the first match, but will otherwise do what you describe. For example:

"How are you?".search(/are/);
// 4
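
Two details of search() are worth knowing: it returns -1 when nothing matches, and it ignores the g flag and the regex's lastIndex, always searching from the start of the string. A short sketch (variable names are arbitrary):

```javascript
const re = /are/g;
re.lastIndex = 8; // exec() would start here and miss the match

const found = "How are you?".search(re);        // 4: search() always starts at index 0
const missing = "How are you?".search(/were/);  // -1: no match

console.log(found, missing, re.lastIndex);      // 4 -1 8 (lastIndex is left as it was)
```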

Solution 6 - Javascript

Here is a cool feature I discovered recently: the callback passed to replace() receives the position of each match. I tried this in the console and it seems to work:

var text = "border-bottom-left-radius";

var newText = text.replace(/-/g,function(match, index){
    return " " + index + " ";
});

Which returned: "border 6 bottom 13 left 18 radius"

So this seems to be what you are looking for.
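
One thing to watch for: when the regex has capture groups, the offset is no longer the second callback argument, because the captured groups come first. The sketch below collects offsets regardless of the number of groups; the helper name replaceOffsets is hypothetical, and it relies on the fact that captures are always strings or undefined, so the first numeric argument is the offset.

```javascript
// Hypothetical helper: offsets of all matches, collected through a replace() callback.
function replaceOffsets(str, re) {
  const offsets = [];
  str.replace(re, (...args) => {
    // args: match, p1..pn, offset, whole string[, named groups object]
    const offset = args.find((arg) => typeof arg === "number");
    offsets.push(offset);
    return args[0]; // return the match unchanged; the result string is discarded anyway
  });
  return offsets;
}

console.log(replaceOffsets("border-bottom-left-radius", /-/g));        // [ 6, 13, 18 ]
console.log(replaceOffsets("border-bottom-left-radius", /(-)(\w)/g));  // [ 6, 13, 18 ]
```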

Solution 7 - Javascript

I'm afraid the previous answers (based on exec) don't work when your regex produces zero-width matches. For instance (note: /\b/g is a regex that finds all word boundaries):

var re = /\b/g,
    str = "hello world";
var guard = 10, match;
while ((match = re.exec(str)) != null) {
    console.log("match found at " + match.index);
    if (guard-- < 0) {
      console.error("Infinite loop detected")
      break;
    }
}

One can try to fix this by having the regex match at least 1 character, but this is far from ideal (and means you have to manually add the index at the end of the string):

var re = /\b./g,
    str = "hello world";
var guard = 10, match;
while ((match = re.exec(str)) != null) {
    console.log("match found at " + match.index);
    if (guard-- < 0) {
      console.error("Infinite loop detected")
      break;
    }
}

A better solution (which only works in newer browsers and needs a polyfill in older ones such as IE) is to use String.prototype.matchAll():

var re = /\b/g,
    str = "hello world";
console.log(Array.from(str.matchAll(re)).map(match => match.index))

Explanation:

String.prototype.matchAll() expects a global regex (one with the g, or global, flag set). It then returns an iterator. In order to loop over and map() the iterator, it has to be turned into an array (which is exactly what Array.from() does). Like the result of RegExp.prototype.exec(), the resulting elements have an .index field according to the specification.

See the String.prototype.matchAll() and the Array.from() MDN pages for browser support and polyfill options.
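
Because matchAll() throws a TypeError when the regex lacks the g flag, a thin wrapper can add the flag to a copy first. This is a sketch only; the helper name allIndices is hypothetical, and it assumes an engine that supports matchAll() natively or through a polyfill.

```javascript
// Hypothetical helper: indices of all matches, accepting regexes with or without g.
function allIndices(str, re) {
  // matchAll() throws a TypeError for a regex without the g flag, so add it to a copy.
  const globalRe = re.global ? re : new RegExp(re.source, re.flags + "g");
  return Array.from(str.matchAll(globalRe), (match) => match.index);
}

console.log(allIndices("hello world", /\b/));   // [ 0, 5, 6, 11 ]
console.log(allIndices("hello world", /l/g));   // [ 2, 3, 9 ]
```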


Edit: digging a little deeper in search for a solution supported on all browsers

The problem with RegExp.prototype.exec() is that it updates the lastIndex pointer on the regex, and next time starts searching from the previously found lastIndex.

var re = /l/g,
    str = "hello world";
console.log(re.lastIndex);
re.exec(str);
console.log(re.lastIndex);
re.exec(str);
console.log(re.lastIndex);
re.exec(str);
console.log(re.lastIndex);

This works great as long as the regex match actually has a width. With a 0-width regex, this pointer does not increase, and so you get your infinite loop. (Note: /(?=l)/g is a lookahead for l; it matches the 0-width string before an l. So it correctly goes to index 2 on the first call of exec(), and then stays there.)

var re = /(?=l)/g,
    str = "hello world";
console.log(re.lastIndex);
re.exec(str);
console.log(re.lastIndex);
re.exec(str);
console.log(re.lastIndex);
re.exec(str);
console.log(re.lastIndex);

The solution (which is less nice than matchAll(), but should work in all browsers) is therefore to manually increase lastIndex if the match width is 0 (which may be checked in different ways):

var re = /\b/g,
    str = "hello world",
    match;
while ((match = re.exec(str)) != null) {
    console.log("match found at " + match.index);

    // alternative: if (match.index == re.lastIndex) {
    if (match[0].length == 0) {
      // we need to increase lastIndex -- this location was already matched,
      // we don't want to match it again (and get into an infinite loop)
      re.lastIndex++;
    }
}

Solution 8 - Javascript

This method, added to String.prototype, returns an array of the 0-based positions (if any) of the input word inside the string:

String.prototype.matching_positions = function( _word, _case_sensitive, _whole_words, _multiline )
{
   /*besides '_word' param, others are flags (0|1)*/
   /*the "i" flag makes the match case-insensitive, so add it only when case sensitivity is off*/
   var _match_pattern = "g"+(_case_sensitive?"":"i")+(_multiline?"m":"") ;
   var _bound = _whole_words ? "\\b" : "" ;
   var _re = new RegExp( _bound+_word+_bound, _match_pattern );
   var _pos = [], _chunk ;

   while( true )
   {
      _chunk = _re.exec( this ) ;
      if ( _chunk == null ) break ;
      _pos.push( _chunk['index'] ) ;
      _re.lastIndex = _chunk['index']+1 ;
   }

   return _pos ;
}

Now try

var _sentence = "What do doers want ? What do doers need ?" ;
var _word = "do" ;
console.log( _sentence.matching_positions( _word, 1, 0, 0 ) );
console.log( _sentence.matching_positions( _word, 1, 1, 0 ) );

You can also input regular expressions:

var _second = "z^2+2z-1" ;
console.log( _second.matching_positions( "[0-9]z+", 0, 0, 0 ) );

Here one gets the position index of the linear term (2z), which is 4.
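
The flip side of accepting regular expressions is that a literal search term containing metacharacters (such as ^ or +) is interpreted as a pattern. If the input should be matched literally, it can be escaped first. The sketch below is an illustration, not part of the answer: escapeRegExp and literalPositions are hypothetical names, and unlike the method above it does not report overlapping matches.

```javascript
// Escapes regex metacharacters so the input is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: 0-based positions of a literal substring.
function literalPositions(str, text) {
  const re = new RegExp(escapeRegExp(text), "g");
  return Array.from(str.matchAll(re), (match) => match.index);
}

console.log(literalPositions("z^2+2z-1", "^2+"));  // [ 1 ]
console.log(literalPositions("z^2+2z-1", "2"));    // [ 2, 4 ]
```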

Solution 9 - Javascript

var str = "The rain in SPAIN stays mainly in the plain";

function searchIndex(str, searchValue, isCaseSensitive) {
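  // The "i" flag makes the search case-insensitive, so add it only when case sensitivity is off
  var modifiers = isCaseSensitive ? 'g' : 'gi';
  var regExpValue = new RegExp(searchValue, modifiers);
  var matches = [];
  var startIndex = 0;
  // match() returns null when nothing matches; fall back to an empty array
  var arr = str.match(regExpValue) || [];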

  [].forEach.call(arr, function(element) {
    startIndex = str.indexOf(element, startIndex);
    matches.push(startIndex++);
  });

  return matches;
}

console.log(searchIndex(str, 'ain', true));

Solution 10 - Javascript

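function trimRegex(str, regex){
    // Drop everything before the first match, then reverse the rest
    var start = str.match(regex).index;
    var reversed = str.slice(start).split('').reverse().join('');
    // Repeat on the reversed string to drop the trailing run, then reverse back
    var end = reversed.match(regex).index;
    return reversed.slice(end).split('').reverse().join('');
}

let test = '||ab||cd||';
// Strings are immutable, so use the returned value
console.log(trimRegex(test, /[^|]/)); //output: ab||cd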

or

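function trimChar(str, trim){
	// Matches the first character that is not the one being trimmed
	let regex = new RegExp('[^'+trim+']');
	let start = str.match(regex).index;
	let reversed = str.slice(start).split('').reverse().join('');
	// Find the trailing run on the reversed string, then reverse back
	return reversed.slice(reversed.match(regex).index).split('').reverse().join('');
}

let test = '||ab||cd||';
// Strings are immutable, so use the returned value
console.log(trimChar(test, '|')); //output: ab||cd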

Solution 11 - Javascript

var str = 'my string here';

var index = str.match(/here/).index;

alert(index); // <- 10

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | stagas | View Question on Stackoverflow
Solution 1 - Javascript | Gumbo | View Answer on Stackoverflow
Solution 2 - Javascript | stagas | View Answer on Stackoverflow
Solution 3 - Javascript | brismuth | View Answer on Stackoverflow
Solution 4 - Javascript | Jimbo Jonny | View Answer on Stackoverflow
Solution 5 - Javascript | Jimmy | View Answer on Stackoverflow
Solution 6 - Javascript | felipeab | View Answer on Stackoverflow
Solution 7 - Javascript | Claude | View Answer on Stackoverflow
Solution 8 - Javascript | Sandro Rosa | View Answer on Stackoverflow
Solution 9 - Javascript | Yaroslav | View Answer on Stackoverflow
Solution 10 - Javascript | SwiftNinjaPro | View Answer on Stackoverflow
Solution 11 - Javascript | Thomas FONTAINE | View Answer on Stackoverflow