Return to Revisions

2 of 2

replaced http://codereview.stackexchange.com/ with https://codereview.stackexchange.com/

edited Apr 13, 2017 at 12:40

Mysterious special case

For any inputString with length 1, you print two lines of output: 0 followed by 1. The incorrect 0 comes from the special case

if (len(inputString) == 1):
  print frequency

which as far as I can tell should not be there at all.

Style

Consistent indentation is important, especially in Python, where whitespace matters. Use 4 spaces, as recommended in PEP 8.

inputString should be input_string to be consistent with the naming style and Python guidelines.

It is customary to omit parentheses after while, if, and for.

You should split your code according to responsibilities. I recommend the following outline:

def string_root(s):
    …
    return n

def lines(terminator='*'):
    s = raw_input()
    while s != terminator:
        yield s
        s = raw_input()

if __name__ == '__main__':
    for line in lines():
        print string_root(line)

Advantages include:

The main loop read like English. It makes it obvious that each line is treated as an independent problem, unaffected by all other lines.
The two calls to raw_input() are close to each other. (Unfortunately, there's no do-while loop in Python.)
The string_root() function is independently testable.

Algorithm

You've implemented a brute-force solution. The complexity is \$O(L^2)\$, there \$L\$ is the length of inputString. (You can tell because for current_symbol in inputString is \$O(L)\$, and scanned_string.startswith(matched_characters) is also \$O(L)\$, and there are no break statements anywhere to serve as shortcuts.)

To bring it down to \$O(L)\$, you would need a smarter string-matching algorithm. The Z Algorithm is useful here: it detects string self-similarities in \$O(L)\$ time.

>>> compute_z('abc' * 4)
[12, 0, 0, 9, 0, 0, 6, 0, 0, 3, 0, 0]

The result above means that 'abcabcabcabc' has

12 leading characters in common with itself
no leading characters in common with itself after dropping the initial 'a'
no leading characters in common with itself after dropping the initial 'ab'
9 leading characters in common with itself after dropping the initial 'abc'
…

This implementation works in \$O(L)\$ time.

def string_root(s):
    z = compute_z(s)
    z_iter = enumerate(z)
    _, s_len = next(z_iter)
    for block_len, remain_len in z_iter:
        if remain_len == s_len - block_len and remain_len % block_len == 0:
            return s_len // block_len
    return 1

Documentation and testing

A doctest would be useful here.

def string_root(s):
    """Returns the maximum number of repeating blocks into which string s
    can be split.
    
    >>> string_root('abcabcabcabc')
    4
    >>> string_root('abcdefgh012')
    1       
    >>> string_root('aaaaaaaaaa')
    10
    >>> string_root('abcabcabcabcd')
    1
    >>> string_root('abcabcdabcabcd')
    2
    >>> string_root('abcabcabcabx')
    1
    """
    z = compute_z(s)
    …

Then you can test the code using

import findsr
import doctest
doctest.testmod(findsr)

answered Dec 7, 2014 at 11:42

200_success

145.6k
22
191
481