2

I'm trying to write a program which counts how many times a substring appears within a string.

word = "wejmfoiwstreetstreetskkjoih"
streets = "streets"
count = 0

if streets in word:
    count += 1

print(count)

As you can see, "streets" appears twice, but the last "s" of the first occurrence is also the first "s" of the second. I can't think of a way to loop over the string so that both are counted.

Thanks!

2
  • Maybe start here, docs.python.org/2.7/tutorial/index.html Commented Jun 21, 2014 at 14:44
  • Just so you know, a proper substr function typically searches for only one occurrence of the substring and usually returns the index of its position in the string. Your example also searches for only one occurrence: if streets in word will only increment count once, no matter how many times the substring 'streets' shows up in your string. There are already some good answers here; I just wanted to provide some insight into why your code wasn't working. Commented Jun 21, 2014 at 14:53
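
To make the point in the comment above concrete, here is a small sketch using the question's own strings. It compares the membership test with two built-in string methods, str.find and str.count; the variable names found, first_index and non_overlapping are made up for illustration. Note that str.count only counts non-overlapping occurrences, so it does not solve the problem either.

```python
word = "wejmfoiwstreetstreetskkjoih"
streets = "streets"

# `in` only answers yes/no, so it can bump a counter at most once.
found = streets in word
print(found)            # True

# str.find returns the index of the first occurrence (or -1 if absent).
first_index = word.find(streets)
print(first_index)      # 8

# str.count skips past each match, so the overlapping second match is missed.
non_overlapping = word.count(streets)
print(non_overlapping)  # 1
```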

3 Answers

4

This can be done using a regex with a lookahead:

>>> import re
>>> text = 'streetstreets'
>>> len(re.findall('(?=streets)', text))
2

From the docs:

(?=...)

Matches if ... matches next, but doesn’t consume any of the string. This is called a lookahead assertion. For example, Isaac (?=Asimov) will match 'Isaac ' only if it’s followed by 'Asimov'.
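
Because the lookahead consumes no characters, the regex engine can report a match at every position where the substring starts, including overlapping ones. A minimal sketch of wrapping this in reusable functions follows; count_overlapping and overlapping_positions are hypothetical helper names, and re.escape is added on the assumption that the substring should be treated literally rather than as a pattern.

```python
import re

def count_overlapping(text, needle):
    """Count occurrences of needle in text, allowing overlaps."""
    # re.escape keeps characters such as '.' or '*' in needle literal.
    pattern = "(?=" + re.escape(needle) + ")"
    return sum(1 for _ in re.finditer(pattern, text))

def overlapping_positions(text, needle):
    """Return the start index of every (possibly overlapping) occurrence."""
    pattern = "(?=" + re.escape(needle) + ")"
    return [m.start() for m in re.finditer(pattern, text)]

word = "wejmfoiwstreetstreetskkjoih"
print(count_overlapping(word, "streets"))      # 2
print(overlapping_positions(word, "streets"))  # [8, 14]
```

The second match starts at index 14, which is the final "s" of the first match.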

2 Comments

That works, but this answer would be better if the regex were explained somewhat. In particular, what does the ?= do? It doesn't work with that omitted.
@EricWilson agreed, added the part from the docs where it's explained
2

Quick and dirty:

>>> word = "wejmfoiwstreetstreetskkjoih"
>>> streets = "streets"
>>> sum(word[start:].startswith(streets) for start in range(len(word)))
2
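
A possible variation on the one-liner above, not part of the original answer: str.startswith accepts an optional start offset, so the slice word[start:] (which copies the rest of the string each time) can be avoided. The function name count_startswith is made up for this sketch.

```python
def count_startswith(text, needle):
    """Count overlapping occurrences by testing every possible start index."""
    # Only indexes where needle can still fit need to be checked.
    last_start = len(text) - len(needle)
    # startswith(needle, start) checks at an offset without slicing text.
    return sum(text.startswith(needle, start) for start in range(last_start + 1))

word = "wejmfoiwstreetstreetskkjoih"
print(count_startswith(word, "streets"))  # 2
```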

0

A generic (though not as elegant) way would be a loop like this:

def count_substrings(stack, needle):
    idx = 0
    count = 0
    while True:
        idx = stack.find(needle, idx) + 1 # next time look after this idx
        if idx <= 0:
            break
        count += 1
    return count

In my measurement, it is about 8.5 times faster than the solution that calls startswith on a slice for every start position.
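
The answer does not show how the measurement was taken. One way to repeat it, as a rough sketch with the standard timeit module, is below. The function names count_with_startswith and count_with_find and the repetition count are assumptions for illustration, and the ratio will vary with the machine, the Python version and the input strings.

```python
import timeit

def count_with_startswith(stack, needle):
    # The "quick and dirty" approach: test a slice at every start index.
    return sum(stack[start:].startswith(needle) for start in range(len(stack)))

def count_with_find(stack, needle):
    # The loop from this answer: jump from match to match with str.find.
    idx = 0
    count = 0
    while True:
        idx = stack.find(needle, idx) + 1  # resume one character after the match
        if idx <= 0:  # find returned -1, so there are no more matches
            break
        count += 1
    return count

word = "wejmfoiwstreetstreetskkjoih"
needle = "streets"

# Both approaches should agree before their speed is compared.
assert count_with_startswith(word, needle) == count_with_find(word, needle)

runs = 10000  # arbitrary repetition count chosen for this sketch
t_startswith = timeit.timeit(lambda: count_with_startswith(word, needle), number=runs)
t_find = timeit.timeit(lambda: count_with_find(word, needle), number=runs)
print(f"startswith: {t_startswith:.4f}s  find: {t_find:.4f}s")
```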
