Here is my code for this problem. Any bugs, performance in terms of algorithm time complexity, code style advice are appreciated.
This is a continued discussion from (Find smallest subset prefixes) and since it is new code, I post a new thread.
Problem:
Given a set of strings, return the smallest subset that contains the longest possible prefixes for every string.
If the list is ['foo', 'foog', 'food', 'asdf'] return ['foo', 'asdf']
The return is foo since foo is prefix for foo (itself), prefix for foog and prefix for food (in other words, foo could "represent" longer string like foog and food). Output also contains asdf because it is not prefix for any other words in the input list, so output itself.
The empty set is not a correct answer because it does not contain the longest possible prefixes.
Source code:
from collections import defaultdict
class TrieNode:
def __init__(self):
self.children = defaultdict(TrieNode)
self.isEnd = False
def insert(self, word):
node = self
for w in word:
node = node.children[w]
node.isEnd = True
def find_prefix(self, prefix, result):
if self.isEnd:
result.append(''.join(prefix[:]))
return
for w, child_node in self.children.items():
prefix.append(w)
child_node.find_prefix(prefix, result)
prefix.pop(-1)
if __name__ == "__main__":
words = ['foo', 'foog', 'food', 'asdf']
root = TrieNode()
for w in words:
root.insert(w)
result = []
root.find_prefix([], result)
print result