Skip to main content
Tweeted twitter.com/#!/StackCodeReview/status/450271148673335296
edited tags
Link
200_success
  • 145.6k
  • 22
  • 191
  • 481
Source Link

Use up all characters to form words

Task:

Given a dictionary of words, and a set of characters, judge if all the characters can form the words from the dictionary, without any characters left. For example, given the dictionary {hello, world, is, my, first, program}, if the characters set is "iiifrssst", you should return 'true' because you can form {is, is, first} from the set; if the character set is "eiifrsst", you should return 'false' because you cannot use all the characters from the set.

P.S. there may be tens of thousands of words in the dictionary, and the chars set length could be up to hundreds, so I really need some efficient algorithm.

Source: http://www.careercup.com/question?id=5841660122497024

Here is my solution in Python:

from collections import Counter
from itertools import combinations_with_replacement
import copy


# Check if word can be made up with the characters in char_count
def valid_word(word):
    # build character count for word
    word_char_count = Counter(word)

    # check if character count for word
    for char in word_char_count.keys():
        if char not in char_count or word_char_count[char] > char_count[char]:
            return False

    return True


# Only return words that can be made up from characters with char_set
def prune_words(words):
    res = []
    for word in words:
        if valid_word(word):
            res.append(word)

    return res


def main(words, char_set):
    global char_count
    char_count = Counter(char_set)  # count character occurrences

    # Prune words that cannot possible be made from the set of characters
    words = prune_words(words)
    words_len = [len(word) for word in words]

    # Maximum number of r with generating combinations
    max_r = len(char_set) / min(words_len)

    # Generate permutations with the valid words
    for r in range(1, max_r + 1):
        for comb in combinations_with_replacement(words, r):
            # If the total length of the combination matches length of char_set,
            # do further checks
            if sum(map(len, comb)) == len(char_set):
                temp_count = copy.deepcopy(char_count)

                # Subtract char count of each word in combination from char_set
                for word in comb:
                    temp_count.subtract(word)

                # If the remaining count are all zero, then we have found an
                # answer
                if all([count == 0 for count in temp_count.values()]):
                    print comb
                    return True

    return False

if __name__ == '__main__':
    print main(['hello', 'world', 'is', 'my', 'first', 'program'],
               'iiifrssst')

What improvements can be made to my code (e.g. algorithmic speed, make it more 'Python', style, etc etc)? Happy to hear any constructive comments!