Count number of matches of a regex in Javascript
JavascriptRegexJavascript Problem Overview
I wanted to write a regex to count the number of spaces/tabs/newline in a chunk of text. So I naively wrote the following:-
numSpaces : function(text) {
return text.match(/\s/).length;
}
For some unknown reasons it always returns 1
. What is the problem with the above statement? I have since solved the problem with the following:-
numSpaces : function(text) {
return (text.split(/\s/).length -1);
}
Javascript Solutions
Solution 1 - Javascript
tl;dr: Generic Pattern Counter
// THIS IS WHAT YOU NEED
const count = (str) => {
const re = /YOUR_PATTERN_HERE/g
return ((str || '').match(re) || []).length
}
For those that arrived here looking for a generic way to count the number of occurrences of a regex pattern in a string, and don't want it to fail if there are zero occurrences, this code is what you need. Here's a demonstration:
/* * Example */
const count = (str) => {
const re = /[a-z]{3}/g
return ((str || '').match(re) || []).length
}
const str1 = 'abc, def, ghi'
const str2 = 'ABC, DEF, GHI'
console.log(`'${str1}' has ${count(str1)} occurrences of pattern '/[a-z]{3}/g'`)
console.log(`'${str2}' has ${count(str2)} occurrences of pattern '/[a-z]{3}/g'`)
Original Answer
The problem with your initial code is that you are missing the global identifier:
>>> 'hi there how are you'.match(/\s/g).length;
4
Without the g
part of the regex it will only match the first occurrence and stop there.
Also note that your regex will count successive spaces twice:
>>> 'hi there'.match(/\s/g).length;
2
If that is not desirable, you could do this:
>>> 'hi there'.match(/\s+/g).length;
1
Solution 2 - Javascript
As mentioned in my earlier answer, you can use RegExp.exec()
to iterate over all matches and count each occurrence; the advantage is limited to memory only, because on the whole it's about 20% slower than using String.match()
.
var re = /\s/g,
count = 0;
while (re.exec(text) !== null) {
++count;
}
return count;
Solution 3 - Javascript
(('a a a').match(/b/g) || []).length; // 0
(('a a a').match(/a/g) || []).length; // 3
Based on https://stackoverflow.com/a/48195124/16777 but fixed to actually work in zero-results case.
Solution 4 - Javascript
('my string'.match(/\s/g) || []).length;
Solution 5 - Javascript
Here is a similar solution to @Paolo Bergantino's answer, but with modern operators. I'll explain below.
const matchCount = (str, re) => {
return str?.match(re)?.length ?? 0;
};
// usage
let numSpaces = matchCount(undefined, /\s/g);
console.log(numSpaces); // 0
numSpaces = matchCount("foobarbaz", /\s/g);
console.log(numSpaces); // 0
numSpaces = matchCount("foo bar baz", /\s/g);
console.log(numSpaces); // 2
?.
is the optional chaining operator. It allows you to chain calls as deep as you want without having to worry about whether there is an undefined/null along the way. Think of str?.match(re)
as
if (str !== undefined && str !== null) {
return str.match(re);
} else {
return undefined;
}
This is slightly different from @Paolo Bergantino's. Theirs is written like this: (str || '')
. That means if str
is falsy, return ''
. 0 is falsy. document.all is falsy. In my opinion, if someone were to pass those into this function as a string, it would probably be because of programmer error. Therefore, I'd rather be informed I'm doing something non-sensible than troubleshoot why I keep on getting a length of 0.
??
is the nullish coalescing operator. Think of it as ||
but more specific. If the left hand side of ||
evaluates to falsy, it executes the right-hand side. But ??
only executes if the left-hand side is undefined or null.
Keep in mind, the nullish coalescing operator in ?.length ?? 0
will return the same thing as using ?.length || 0
. The difference is, if length
returns 0, it won't execute the right-hand side... but the result is going to be 0 whether you use ||
or ??
.
Honestly, in this situation I would probably change it to ||
because more JavaScript developers are familiar with that operator. Maybe someone could enlighten me on benefits of ??
vs ||
in this situation, if any exist.
Lastly, I changed the signature so the function can be used for any regex.
Oh, and here is a typescript version:
const matchCount = (str: string, re: RegExp) => {
return str?.match(re)?.length ?? 0;
};
Solution 6 - Javascript
This is certainly something that has a lot of traps. I was working with Paolo Bergantino's answer, and realising that even that has some limitations. I found working with string representations of dates a good place to quickly find some of the main problems. Start with an input string like this:
'12-2-2019 5:1:48.670'
and set up Paolo's function like this:
function count(re, str) {
if (typeof re !== "string") {
return 0;
}
re = (re === '.') ? ('\\' + re) : re;
var cre = new RegExp(re, 'g');
return ((str || '').match(cre) || []).length;
}
I wanted the regular expression to be passed in, so that the function is more reusable, secondly, I wanted the parameter to be a string, so that the client doesn't have to make the regex, but simply match on the string, like a standard string utility class method.
Now, here you can see that I'm dealing with issues with the input. With the following:
if (typeof re !== "string") {
return 0;
}
I am ensuring that the input isn't anything like the literal 0
, false
, undefined
, or null
, none of which are strings. Since these literals are not in the input string, there should be no matches, but it should match '0'
, which is a string.
With the following:
re = (re === '.') ? ('\\' + re) : re;
I am dealing with the fact that the RegExp constructor will (I think, wrongly) interpret the string '.'
as the all character matcher \.\
Finally, because I am using the RegExp constructor, I need to give it the global 'g'
flag so that it counts all matches, not just the first one, similar to the suggestions in other posts.
I realise that this is an extremely late answer, but it might be helpful to someone stumbling along here. BTW here's the TypeScript version:
function count(re: string, str: string): number {
if (typeof re !== 'string') {
return 0;
}
re = (re === '.') ? ('\\' + re) : re;
const cre = new RegExp(re, 'g');
return ((str || '').match(cre) || []).length;
}
Solution 7 - Javascript
Using modern syntax avoids the need to create a dummy array to count length 0
const countMatches = (exp, str) => str.match(exp)?.length ?? 0;
Must pass exp
as RegExp
and str
as String
.
Solution 8 - Javascript
how about like this
function isint(str){
if(str.match(/\d/g).length==str.length){
return true;
}
else {
return false
}
}