Number of regex matches

PythonRegex

Python Problem Overview


I'm using the finditer function in the re module to match some things and everything is working.

Now I need to find out how many matches I've got. Is it possible without looping through the iterator twice? (one to find out the count and then the real iteration)

Some code:

imageMatches = re.finditer("<img src\=\"(?P<path>[-/\w\.]+)\"", response[2])
# <Here I need to get the number of matches>
for imageMatch in imageMatches:
    doStuff

Everything works, I just need to get the number of matches before the loop.

Python Solutions


Solution 1 - Python

If you know you will want all the matches, you could use the re.findall function. It will return a list of all the matches. Then you can just do len(result) for the number of matches.

Solution 2 - Python

If you always need to know the length, and you just need the content of the match rather than the other info, you might as well use re.findall. Otherwise, if you only need the length sometimes, you can use e.g.

matches = re.finditer(...)
...
matches = tuple(matches)

to store the iteration of the matches in a reusable tuple. Then just do len(matches).

Another option, if you just need to know the total count after doing whatever with the match objects, is to use

matches = enumerate(re.finditer(...))

which will return an (index, match) pair for each of the original matches. So then you can just store the first element of each tuple in some variable.

But if you need the length first of all, and you need match objects as opposed to just the strings, you should just do

matches = tuple(re.finditer(...))

Solution 3 - Python

#An example for counting matched groups
import re

pattern = re.compile(r'(\w+).(\d+).(\w+).(\w+)', re.IGNORECASE)
search_str = "My 11 Char String"

res = re.match(pattern, search_str)
print(len(res.groups())) # len = 4  
print (res.group(1) ) #My
print (res.group(2) ) #11
print (res.group(3) ) #Char
print (res.group(4) ) #String

Solution 4 - Python

If you find you need to stick with finditer(), you can simply use a counter while you iterate through the iterator.

Example:

>>> from re import *
>>> pattern = compile(r'.ython')
>>> string = 'i like python jython and dython (whatever that is)'
>>> iterator = finditer(pattern, string)
>>> count = 0
>>> for match in iterator:
        count +=1
>>> count
3

If you need the features of finditer() (not matching to overlapping instances), use this method.

Solution 5 - Python

I know this is a little old, but this but here is a concise function for counting regex patterns.

def regex_cnt(string, pattern):
    return len(re.findall(pattern, string))

string = 'abc123'

regex_cnt(string, '[0-9]')

Solution 6 - Python

For those moments when you really want to avoid building lists:

import re
import operator
from functools import reduce
count = reduce(operator.add, (1 for _ in re.finditer(my_pattern, my_string))) 

Sometimes you might need to operate on huge strings. This might help.

Solution 7 - Python

if you are using finditer method best way you can count the matches is to initialize a counter and increment it with each match

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionduttView Question on Stackoverflow
Solution 1 - PythonJoshDView Answer on Stackoverflow
Solution 2 - PythonintuitedView Answer on Stackoverflow
Solution 3 - PythonMods Vs RockersView Answer on Stackoverflow
Solution 4 - PythonRafe KettlerView Answer on Stackoverflow
Solution 5 - PythonTravis JonesView Answer on Stackoverflow
Solution 6 - PythonAdam GradzkiView Answer on Stackoverflow
Solution 7 - PythonRakesh DView Answer on Stackoverflow