Skip to main content
18 of 18
replaced http://stackoverflow.com/ with https://stackoverflow.com/

If the different kinds of elements are limited, you could build a table of flags. Each set is an array of bools where each position represents a word. The words are kept in a list, where their index is equal to the index of the bool that represents them. To find the supersets, you compare the values of the flags; to be a superset, every position with value True in the subset should contain True in the candidate set. All this can be done in O(n), which is not bad in my opinion.

Python:

WORDS = (
    'a',
    'b',
    'c',
    'f',
)

def values_to_words(s):
    return set(WORDS[i] for i, v in enumerate(s) if v)

def words_to_values(s):
    return tuple(True if w in s else False for i, w in enumerate(WORDS)) # Unoptimized

SETS = tuple(words_to_values(s) for s in (
    ('a','b',),
    ('a','b','c',),
    ('a','c',),
    ('a','c','f',),
))

def get_supersets(q):
    values = words_to_values(q)
    is_superset = lambda s: all(v1 or not v2 for v1, v2 in zip(s, values))
    return (values_to_words(s) for s in SETS if is_superset(s))

print list(get_supersets(('a','c',)))
# [set(['a', 'c', 'b']), set(['a', 'c']), set(['a', 'c', 'f'])]

Be of course careful not to build your own SQL engine. You could in fact use one for this pattern.

Also, you'll find https://stackoverflow.com/questions/1263524/superset-search useful - I just found it.

Thijs van Dien
  • 1.1k
  • 1
  • 7
  • 13