Count number of occurrences of a substring in a string


Python Problem Overview


How can I count the number of times a given substring is present within a string in Python?

For example:

>>> 'foo bar foo'.numberOfOccurrences('foo')
2

Python Solutions


Solution 1 - Python

Use string.count(substring), as in:

>>> "abcdabcva".count("ab")
2

Update:

As pointed out in the comments, this counts non-overlapping occurrences only. If you need to count overlapping occurrences, check the answers at https://stackoverflow.com/questions/5616822/python-regex-find-all-overlapping-matches, or see my other answer below.
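
As a minimal sketch of the difference (the helper name count_overlapping is an assumption for illustration, not part of the standard library): str.count skips past each match, whereas restarting the search one character after each match start also counts overlaps.

```python
def count_overlapping(text, sub):
    """Count occurrences of sub in text, allowing overlaps (hypothetical helper)."""
    if not sub:
        return 0  # assumption: treat the empty substring as having no occurrences
    count = 0
    start = text.find(sub)
    while start != -1:
        count += 1
        # Resume one character after the previous match start.
        start = text.find(sub, start + 1)
    return count


print("aaaa".count("aa"))               # 2: non-overlapping matches only
print(count_overlapping("aaaa", "aa"))  # 3: matches at indices 0, 1 and 2
```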

Solution 2 - Python

s = 'arunununghhjj'
sb = 'nun'
results = 0
sub_len = len(sb)
for i in range(len(s)):
    if s[i:i+sub_len] == sb:
        results += 1
print(results)

Solution 3 - Python

Depending on what you really mean, I propose the following solutions:

  1. You mean a list of space-separated substrings and want to know the position of a substring among them:

     >>> s = 'sub1 sub2 sub3'
     >>> s.split().index('sub2')
     1

  2. You mean the character position of the substring in the string:

     >>> s.find('sub2')
     5

  3. You mean the (non-overlapping) number of appearances of a substring:

     >>> s.count('sub2')
     1
     >>> s.count('sub')
     3
    

Solution 4 - Python

A convenient way to count overlapping occurrences of a substring is to use a regular expression with a lookahead. Because the lookahead consumes no characters, re.findall reports every position at which the substring starts. The first argument is the pattern built from the substring, and the second is the string to search:

import re
print(len(re.findall('(?=aa)', 'caaaab')))
3
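
A hedged sketch of the same idea wrapped in a function. The name count_overlapping_regex is made up here, and re.escape is added on the assumption that the substring should be matched literally rather than as a pattern:

```python
import re


def count_overlapping_regex(text, sub):
    """Count overlapping literal occurrences of sub in text (hypothetical helper)."""
    # (?=...) is a zero-width lookahead, so consecutive matches may overlap.
    # re.escape makes characters such as '.' or '*' match literally.
    pattern = "(?=" + re.escape(sub) + ")"
    return len(re.findall(pattern, text))


print(count_overlapping_regex("caaaab", "aa"))  # 3
print(count_overlapping_regex("a.b.c", "."))    # 2, because '.' is escaped
```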

Solution 5 - Python

To count overlapping occurrences of a substring in a string in Python 3, this algorithm will do:

def count_substring(string, sub_string):
    length = len(sub_string)
    count = 0
    # Check every position at which the substring could still fit.
    for i in range(len(string) - length + 1):
        if string[i:i + length] == sub_string:
            count += 1
    return count

I myself checked this algorithm and it worked.

Solution 6 - Python

You can count the frequency in two ways:

  1. Using the count() in str:

    a.count(b)

  2. Or, you can use:

    len(a.split(b))-1

Here a is the string and b is the (non-empty) substring whose frequency is to be calculated. Both approaches count non-overlapping occurrences.
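
A quick check that the two expressions agree, using made-up sample values. Note that this only holds for a non-empty b, because str.split raises ValueError for an empty separator:

```python
a = "foo bar foo"
b = "foo"

# Splitting on b yields one more piece than there are (non-overlapping) matches.
print(a.count(b))           # 2
print(len(a.split(b)) - 1)  # 2

try:
    a.split("")
except ValueError as exc:
    print("split rejects an empty separator:", exc)
```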

Solution 7 - Python

Scenario 1: Occurrences of a word in a sentence. For example, let str1 = "This is an example and is easy" and count the occurrences of the word str2 = "is":

count = str1.count(str2)
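
Note that str.count counts substrings rather than whole words, so the "is" inside "This" is included. A small sketch of the difference, assuming words are separated by whitespace (the variable names substring_count and word_count are illustrative):

```python
str1 = "This is an example and is easy"
str2 = "is"

substring_count = str1.count(str2)     # also counts the "is" in "This"
word_count = str1.split().count(str2)  # counts whole whitespace-separated words only

print(substring_count)  # 3
print(word_count)       # 2
```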

Scenario 2: Occurrences of a pattern in a sentence (overlapping matches included).

string = "ABCDCDC"
substring = "CDC"

def count_substring(string,sub_string):
    len1 = len(string)
    len2 = len(sub_string)
    j =0
    counter = 0
    while(j < len1):
        if(string[j] == sub_string[0]):
            if(string[j:j+len2] == sub_string):
                counter += 1
        j += 1

    return counter


Solution 8 - Python

The current best answer, which uses the count method, doesn't count overlapping occurrences and doesn't give empty substrings any special treatment. For example:

>>> a = 'caatatab'
>>> b = 'ata'
>>> print(a.count(b)) #overlapping
1
>>> print(a.count('')) #empty string
9

The first answer should be 2, not 1, if we consider overlapping substrings. As for the second answer, it's better if an empty substring returns 0 as the answer.

The following code takes care of these things.

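def num_of_patterns(astr, pattern):
    astr, pattern = astr.strip(), pattern.strip()
    if pattern == '':
        return 0

    ind, count, start_flag = 0, 0, 0
    while True:
        try:
            if start_flag == 0:
                # First match: search from the beginning.
                ind = astr.index(pattern)
                start_flag = 1
            else:
                # Later matches: resume one character past the previous match.
                ind += 1 + astr[ind+1:].index(pattern)
            count += 1
        except ValueError:
            # str.index raises ValueError when there are no more matches.
            break
    return count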

Now when we run it:

>>> num_of_patterns('caatatab', 'ata') #overlapping
2
>>> num_of_patterns('caatatab', '') #empty string
0
>>> num_of_patterns('abcdabcva','ab') #normal
2

Solution 9 - Python

The question isn't very clear, but I'll answer what you are, on the surface, asking.

A string S, which is L characters long, and where S[1] is the first character of the string and S[L] is the last character, has the following substrings:

  • The null string ''. There is one of these.
  • For every value A from 1 to L, for every value B from A to L, the string S[A]..S[B] (inclusive). There are L + (L-1) + (L-2) + ... + 1 of these strings, for a total of 0.5L(L+1).
  • Note that the second item includes S[1]..S[L], i.e. the entire original string S.

So, there are 0.5L(L+1) + 1 substrings within a string of length L. Render that expression in Python, and you have the number of substrings present within the string.
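
Rendered in Python and checked against a brute-force enumeration, that could look like the sketch below. The function names are illustrative, and like the formula it counts substrings by position, so repeated content such as the two 'a' slices of 'aa' is counted twice:

```python
def num_substrings(s):
    """Number of substrings of s by position, including the empty string."""
    length = len(s)
    return length * (length + 1) // 2 + 1


def num_substrings_brute_force(s):
    """Enumerate every (start, end) slice explicitly, plus the empty string."""
    slices = [s[a:b] for a in range(len(s)) for b in range(a + 1, len(s) + 1)]
    return len(slices) + 1


print(num_substrings("abc"))              # 7
print(num_substrings_brute_force("abc"))  # 7
```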

Solution 10 - Python

One way is to use re.subn. For example, to count the number of occurrences of 'hello' in any mix of cases you can do:

import re
_, count = re.subn(r'hello', '', astring, flags=re.I)
print('Found', count, 'occurrences of "hello"')
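
A self-contained sketch of the same approach. The function name and sample text are assumptions, and re.escape is added so that the substring is treated literally:

```python
import re


def count_ignore_case(text, sub):
    """Count non-overlapping, case-insensitive occurrences of sub in text."""
    # re.subn returns (new_string, number_of_substitutions); only the count is used.
    _, count = re.subn(re.escape(sub), "", text, flags=re.I)
    return count


print(count_ignore_case("Hello, hello, HELLO world", "hello"))  # 3
```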

Solution 11 - Python

If you want to count all occurrences of the substring (including overlapping ones), use this method.

import re
def count_substring(string, sub_string):
    # re.escape keeps characters such as '.' or '*' in sub_string literal.
    regex = '(?=' + re.escape(sub_string) + ')'
    return len(re.findall(regex, string))

Solution 12 - Python

I will keep my accepted answer as the "simple and obvious way to do it"; however, it does not cover overlapping occurrences. Those can be found naively by checking every slice, as in:

sum("GCAAAAAGH"[i:].startswith("AAA") for i in range(len("GCAAAAAGH")))

(which yields 3). It can also be done with a regular-expression trick, as shown at https://stackoverflow.com/questions/5616822/python-regex-find-all-overlapping-matches, and it makes for fine code golfing. This is my "hand-made" count of overlapping occurrences of a pattern in a string, which tries not to be extremely naive (at least it does not create new string objects at each iteration):

from array import array

def find_matches_overlapping(text, pattern):
    lpat = len(pattern) - 1
    matches = []
    text = array("u", text)
    pattern = array("u", pattern)
    indexes = {}
    # Scan the whole text so that matches ending at the last character are completed.
    for i in range(len(text)):
        if text[i] == pattern[0]:
            indexes[i] = -1
        for index, counter in list(indexes.items()):
            counter += 1
            if text[i] == pattern[counter]:
                if counter == lpat:
                    matches.append(index)
                    del indexes[index]
                else:
                    indexes[index] = counter
            else:
                del indexes[index]
    return matches
            
def count_matches(text, pattern):
    return len(find_matches_overlapping(text, pattern))

Solution 13 - Python

How about a one-liner with a list comprehension? Technically it's 93 characters long, so spare me the PEP 8 purism. The re.findall answer is the most readable if it's a high-level piece of code. If you're building something low-level and don't want dependencies, this one is pretty lean and mean. I'm giving the overlapping answer; obviously, just use count like the highest-scored answer if there isn't overlap.

def count_substring(string, sub_string):
    return len([i for i in range(len(string)) if string[i:i+len(sub_string)] == sub_string])

Solution 14 - Python

Overlapping occurrences:

def olpcount(string,pattern,case_sensitive=True):
    if case_sensitive != True:
        string  = string.lower()
        pattern = pattern.lower()
    l = len(pattern)
    ct = 0
    for c in range(0,len(string)):
        if string[c:c+l] == pattern:
            ct += 1
    return ct

test = 'my maaather lies over the oceaaan'
print(test)
print(olpcount(test,'a'))
print(olpcount(test,'aa'))
print(olpcount(test,'aaa'))

Results:

my maaather lies over the oceaaan
6
4
2

Solution 15 - Python

For an overlapping count we can use:

def count_substring(string, sub_string):
    count=0
    beg=0
    while(string.find(sub_string,beg)!=-1) :
        count=count+1
        beg=string.find(sub_string,beg)
        beg=beg+1
    return count

For non-overlapping case we can use count() function:

string.count(sub_string)

Solution 16 - Python

Here's a solution that counts occurrences whether or not they overlap. To clarify: a substring can overlap with itself whenever one of its proper prefixes is also a suffix (for example 'aba' or 'abab'), so comparing only its first and last characters is not a reliable test. The version below therefore resumes the search one character after the start of each match.

def substr_count(st, sub):
    cnt = 0
    # Find the first occurrence; str.find returns -1 when there is none
    start = st.find(sub)
    while start != -1:
        cnt += 1
        # Resume one character after the start of the previous match,
        # so that overlapping occurrences are also found
        start = st.find(sub, start + 1)
    return cnt

Solution 17 - Python

If you're looking for a robust solution that works in every case, this function should work:

def count_substring(string, sub_string):
    ans = 0
    for i in range(len(string)-(len(sub_string)-1)):
        if sub_string == string[i:len(sub_string)+i]:
            ans += 1
    return ans

Solution 18 - Python

If you want to count the occurrences of a substring inside any string, use the code below. The code is easy to understand, which is why I skipped the comments. :)

string = input()
sub_string = input()
start = 0
answer = 0
length = len(string)
index = string.find(sub_string, start, length)
while index != -1:
    start = index + 1
    answer = answer + 1
    index = string.find(sub_string, start, length)
print(answer)

Solution 19 - Python

Risking a downvote because 2+ others have already provided this solution. I even upvoted one of them. But mine is probably the easiest for newbies to understand.

def count_substring(string, sub_string):
    slen  = len(string)
    sslen = len(sub_string)
    range_s = slen - sslen + 1
    count = 0
    for i in range(range_s):
        if (string[i:i+sslen] == sub_string):
            count += 1
    return count

Solution 20 - Python

You could use the startswith method:

def count_substring(string, sub_string):
    x = 0
    for i in range(len(string)):
        if string[i:].startswith(sub_string):
            x += 1
    return x

Solution 21 - Python

def count_substring(string, sub_string):
    inc = 0
    for i in range(0, len(string)):
        slice_object = slice(i,len(sub_string)+i)
        count = len(string[slice_object])
        if(count == len(sub_string)):
            if(sub_string == string[slice_object]):
                inc = inc + 1
    return inc

if __name__ == '__main__':
    string = input().strip()
    sub_string = input().strip()
    
    count = count_substring(string, sub_string)
    print(count)

Solution 22 - Python

def count_substring(string, sub_string):
	k=len(string)
	m=len(sub_string)
	i=0
	l=0
	count=0
	while l<k:
		if string[l:l+m]==sub_string:
			count=count+1
		l=l+1
	return count

if __name__ == '__main__':
	string = input().strip()
	sub_string = input().strip()

	count = count_substring(string, sub_string)
	print(count)

Solution 23 - Python

I'm not sure if this has been looked at already, but I thought of this as a solution for a word that is 'disposable':

count = 0
for i in range(len(word)):
    # Compare the front of the remaining word, then drop its first character.
    if word[:len(term)] == term:
        count += 1
    word = word[1:]

print(count)

Here word is the string you are searching in and term is the substring you are looking for.

Solution 24 - Python

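string = "abc"
mainstr = "ncnabckjdjkabcxcxccccxcxcabc"
count = 0
# Only start where the whole substring still fits, so mainstr[i+k] stays in range.
for i in range(len(mainstr) - len(string) + 1):
    k = 0
    while k < len(string):
        if string[k] == mainstr[i+k]:
            k += 1
        else:
            break
    if k == len(string):
        count += 1
print(count)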

Solution 25 - Python

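import re
# searching is the substring to look for and string is the text to search in;
# re.escape makes the substring match literally.
d = [m.start() for m in re.finditer(re.escape(searching), string)]
print(d)
print(len(d))

This finds the start index of every (non-overlapping) occurrence of the substring in the string; the length of the list is the number of occurrences.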

Solution 26 - Python

my_string = """Strings are amongst the most popular data types in Python. 
               We can create the strings by enclosing characters in quotes.
               Python treats single quotes the same as double quotes."""
               
count_string = my_string.lower().strip("\n").split(" ").count("string")
count_strings = my_string.lower().strip("\n").split(" ").count("strings")
print("The number of occurrences of the word 'string' is:", count_string)
print("The number of occurrences of the word 'strings' is:", count_strings)

Solution 27 - Python

The logic below removes all whitespace from both strings before counting, so it also works for strings that contain special characters:

def cnt_substr(inp_str, sub_str):
    inp_join_str = ''.join(inp_str.split())
    sub_join_str = ''.join(sub_str.split())

    return inp_join_str.count(sub_join_str)

print(cnt_substr("the sky is   $blue and not greenthe sky is   $blue and not green", "the sky"))

Solution 28 - Python

For a simple space-delimited string, using a dict would be quite fast; see the code below:

def getStringCount(mnstr: str, sbstr: str = '') -> int:
    """Count how often the word sbstr occurs in the
    space-delimited string mnstr.
    Returns the number of occurrences.
    """
    # Strip first so the key that is initialised is the key that is counted.
    sbstr = sbstr.strip()
    x = dict()
    x[sbstr] = 0
    for st in mnstr.split(' '):
        if st != sbstr:
            continue
        x[st] += 1
    return x[sbstr]

s = 'foo bar foo test one two three foo bar'
getStringCount(s,'foo')

Solution 29 - Python

Here's the solution in Python 3 and case insensitive:

s = 'foo bar foo'.upper()
sb = 'foo'.upper()
results = 0
sub_len = len(sb)
for i in range(len(s)):
    if s[i:i+sub_len] == sb:
        results += 1
print(results)

Solution 30 - Python

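def count_substring(string, sub_string):
    count = 0
    i = 0
    while i < len(string):
        # Slice a window of len(sub_string) characters starting at i.
        sub_string_out = string[i:i + len(sub_string)]
        if sub_string == sub_string_out:
            count += 1
        i += 1
    return count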

Solution 31 - Python

# Count occurrences of a substring in another string (overlapping or not)
s = input('enter the main string: ')  # e.g. 'bobazcbobobegbobobgbobobhaklpbobawanbobobobob'
p = input('enter the substring: ')  # e.g. 'bob'

counter = 0

for i in range(len(s) - len(p) + 1):
    c = 0  # characters matched at this start position; reset for every i
    for j in range(len(p)):
        if s[i+j] == p[j]:
            c = c + 1
            if c == len(p):
                counter += 1
                break
        else:
            break
print('number of occurrences of the substring in the main string is: ', counter)

Solution 32 - Python

s = input('enter the main string: ')
p=input('enter the substring: ')
l=[]
for i in range(len(s)):
    l.append(s[i:i+len(p)])
print(l.count(p))

Solution 33 - Python

This makes a list of all the occurrences (including overlapping ones) in the string and counts them:

def num_occ(str1, str2):
    l1, l2 = len(str1), len(str2)
    return len([str1[i:i + l2] for i in range(l1 - l2 + 1) if str1[i:i + l2] == str2])

Example:

str1 ='abcabcd'
str2 = 'bc'

This examines the following slices but keeps only those equal to str2 (the two bc entries):

[ab, bc, ca, ab, bc, cd]

that will return:

len([bc, bc])

Solution 34 - Python

def count_substring(string, sub_string):
    counterList=[ 1 for i in range(len(string)-len(sub_string)+1) if string[i:i+len(sub_string)] == sub_string]
    count=sum(counterList)
    return count
    
if __name__ == '__main__':
    string = input().strip()
    sub_string = input().strip()
    
    count = count_substring(string, sub_string)
    print(count)

Solution 35 - Python

If you're looking to count the characters of the whole string, this works:

stri_count="If you're looking to count the whole string this can works"
print(len(stri_count))

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author
Question | santosh
Solution 1 - Python | jsbueno
Solution 2 - Python | Arun Kumar Khattri
Solution 3 - Python | Don Question
Solution 4 - Python | Deepak Yadav
Solution 5 - Python | Bharath Kumar R
Solution 6 - Python | Anuj Gupta
Solution 7 - Python | Amith V V
Solution 8 - Python | Nuhman
Solution 9 - Python | Jim DeLaHunt
Solution 10 - Python | Eugene Yarmash
Solution 11 - Python | Rahul Verma
Solution 12 - Python | jsbueno
Solution 13 - Python | Ryan Dines
Solution 14 - Python | fyngyrz
Solution 15 - Python | Dhiraj Dwivedi
Solution 16 - Python | srm
Solution 17 - Python | Pedro Danna
Solution 18 - Python | Hemant
Solution 19 - Python | Babar-Baig
Solution 20 - Python | Trevor Maseleme
Solution 21 - Python | sooraj ks
Solution 22 - Python | Wild_Hunter_
Solution 23 - Python | Alan Vinton
Solution 24 - Python | kamran shaik
Solution 25 - Python | Bhaskar Reddi K
Solution 26 - Python | Vinay Kumar Kuresi
Solution 27 - Python | skay
Solution 28 - Python | Amit Gowda
Solution 29 - Python | attachPost
Solution 30 - Python | vengat
Solution 31 - Python | pawan kumar
Solution 32 - Python | pawan kumar
Solution 33 - Python | Elad L.
Solution 34 - Python | Md. Rizwan Rabbani
Solution 35 - Python | Jean de Dieu Nyandwi