Regex to extract substring, returning 2 results for some reason

JavascriptRegexSubstring

Javascript Problem Overview


I need to do a lot of regex things in javascript but am having some issues with the syntax and I can't seem to find a definitive resource on this.. for some reason when I do:

var tesst = "afskfsd33j"
var test = tesst.match(/a(.*)j/);
alert (test)

it shows

"afskfsd33j, fskfsd33"

I'm not sure why its giving this output of original and the matched string, I am wondering how I can get it to just give the match (essentially extracting the part I want from the original string)

Thanks for any advice

Javascript Solutions


Solution 1 - Javascript

match returns an array.

The default string representation of an array in JavaScript is the elements of the array separated by commas. In this case the desired result is in the second element of the array:

var tesst = "afskfsd33j"
var test = tesst.match(/a(.*)j/);
alert (test[1]);

Solution 2 - Javascript

Each group defined by parenthesis () is captured during processing and each captured group content is pushed into result array in same order as groups within pattern starts. See more on http://www.regular-expressions.info/brackets.html and http://www.regular-expressions.info/refcapture.html (choose right language to see supported features)

var source = "afskfsd33j"
var result = source.match(/a(.*)j/);

result: ["afskfsd33j", "fskfsd33"]

The reason why you received this exact result is following:

First value in array is the first found string which confirms the entire pattern. So it should definitely start with "a" followed by any number of any characters and ends with first "j" char after starting "a".

Second value in array is captured group defined by parenthesis. In your case group contain entire pattern match without content defined outside parenthesis, so exactly "fskfsd33".

If you want to get rid of second value in array you may define pattern like this:

/a(?:.*)j/

where "?:" means that group of chars which match the content in parenthesis will not be part of resulting array.

Other options might be in this simple case to write pattern without any group because it is not necessary to use group at all:

/a.*j/

If you want to just check whether source text matches the pattern and does not care about which text it found than you may try:

var result = /a.*j/.test(source);

The result should return then only true|false values. For more info see http://www.javascriptkit.com/javatutors/re3.shtml

Solution 3 - Javascript

I think your problem is that the match method is returning an array. The 0th item in the array is the original string, the 1st thru nth items correspond to the 1st through nth matched parenthesised items. Your "alert()" call is showing the entire array.

Solution 4 - Javascript

Just get rid of the parenthesis and that will give you an array with one element and:

  • Change this line

    var test = tesst.match(/a(.*)j/);

  • To this

    var test = tesst.match(/a.*j/);

If you add parenthesis the match() function will find two match for you one for whole expression and one for the expression inside the parenthesis

  • Also according to developer.mozilla.org docs :

> If you only want the first match found, you might want to use > RegExp.exec() instead.

You can use the below code:

RegExp(/a.*j/).exec("afskfsd33j")

Solution 5 - Javascript

I've just had the same problem.

You only get the text twice in your result if you include a match group (in brackets) and the 'g' (global) modifier. The first item always is the first result, normally OK when using match(reg) on a short string, however when using a construct like:

while ((result = reg.exec(string)) !== null){
    console.log(result);
}

the results are a little different.

Try the following code:

var regEx = new RegExp('([0-9]+ (cat|fish))','g'), sampleString="1 cat and 2 fish";
var result = sample_string.match(regEx);
console.log(JSON.stringify(result));
// ["1 cat","2 fish"]

var reg = new RegExp('[0-9]+ (cat|fish)','g'), sampleString="1 cat and 2 fish";
while ((result = reg.exec(sampleString)) !== null) {
    console.dir(JSON.stringify(result))
};
// '["1 cat","cat"]'
// '["2 fish","fish"]'

var reg = new RegExp('([0-9]+ (cat|fish))','g'), sampleString="1 cat and 2 fish";
while ((result = reg.exec(sampleString)) !== null){
    console.dir(JSON.stringify(result))
};
// '["1 cat","1 cat","cat"]'
// '["2 fish","2 fish","fish"]'

(tested on recent V8 - Chrome, Node.js)

The best answer is currently a comment which I can't upvote, so credit to @Mic.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionRickView Question on Stackoverflow
Solution 1 - JavascriptJacob RelkinView Answer on Stackoverflow
Solution 2 - JavascriptJan StanicekView Answer on Stackoverflow
Solution 3 - JavascriptBillP3rdView Answer on Stackoverflow
Solution 4 - JavascriptMahyar EkramianView Answer on Stackoverflow
Solution 5 - JavascripttechturbulenceView Answer on Stackoverflow