0

I have 8 lists (jan, feb, mar, apr, may, jun, jul, aug), each of which contains names as strings, e.g.

['John Smith', 'Cat Stevens', 'Andrew Alexander', 'El Gordo Baba', 'Louis le Roy']

etc.

How do I compare these lists in order and see when a name appeared (i.e. subscribed) and when it disappeared (i.e. unsubscribed)?

So, say John Smith didn't appear until February: I want to have this information. Let's say he unsubscribed in July: I want this information too (this is FAR more important than the former).

  • I feel like your data structure is not well suited to this. Perhaps ship these out to a database and then you can connect the name, time joined, and time left? Commented Sep 9, 2011 at 18:31
  • Are there multiple subscriptions for a name? If so, which one do you want to have? Commented Sep 9, 2011 at 18:32

5 Answers

6

Don't use lists; use sets instead.

You could find who (un)subscribed between jan and feb simply using set difference:

subs = feb - jan
unsubs = jan - feb
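If the months are still plain lists, they have to be converted to sets before the difference operator works. A minimal sketch, assuming Python 3; `jan_list` and `feb_list` are hypothetical stand-ins for the question's lists, filled with made-up sample names:

```python
# Hypothetical sample data standing in for the question's month lists
jan_list = ['John Smith', 'Cat Stevens', 'Andrew Alexander']
feb_list = ['John Smith', 'Louis le Roy']

# Convert each list to a set so the difference operator is available
jan = set(jan_list)
feb = set(feb_list)

subs = feb - jan      # in feb but not in jan: newly subscribed
unsubs = jan - feb    # in jan but not in feb: unsubscribed

print(sorted(subs))    # ['Louis le Roy']
print(sorted(unsubs))  # ['Andrew Alexander', 'Cat Stevens']
```

Sets are unordered, so the results are sorted here only to make the printed output stable.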

That being said, you would be better off following Daenyth's suggestion: put these in a database with joined and left date fields. You'll get finer granularity than just months, and you won't need to store duplicated data.
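As a rough illustration of the database idea, here is a sketch using Python's built-in `sqlite3` module. The table name `subscribers`, the column names `joined` and `left_on`, and the sample rows are all assumptions made up for this example, not something taken from the question:

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions
conn = sqlite3.connect(':memory:')
conn.execute("""
    CREATE TABLE subscribers (
        name    TEXT NOT NULL,
        joined  TEXT NOT NULL,  -- ISO date the name subscribed
        left_on TEXT            -- ISO date it unsubscribed, NULL if still active
    )
""")

# Made-up sample rows
conn.executemany(
    'INSERT INTO subscribers (name, joined, left_on) VALUES (?, ?, ?)',
    [
        ('John Smith', '2011-02-03', '2011-07-19'),
        ('Cat Stevens', '2011-01-10', None),
    ],
)

# Everyone who unsubscribed in July 2011 (ISO dates compare correctly as text)
rows = conn.execute(
    "SELECT name, left_on FROM subscribers "
    "WHERE left_on >= '2011-07-01' AND left_on < '2011-08-01'"
).fetchall()
print(rows)  # [('John Smith', '2011-07-19')]
```

One row per subscription period would also cover the case of a name that subscribes more than once.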


1
data = {
 'jan': ['John Smith', 'Cat Stevens', 'Andrew Alexander', 'El Gordo Baba'],
 'feb': ['Louis le Roy', 'John Smith'],
 'mar': ['Cat Stevens', 'Louis le Roy']
}

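# Python 3: the built-in zip replaces itertools.izip
keys = 'jan feb mar'.split()
for m1, m2 in zip(keys, keys[1:]):
    a = set(data[m1])
    b = set(data[m2])
    # b - a: names new in m2; a - b: names present in m1 but gone in m2
    print(m1, '\n\tsubscribed:', ','.join(sorted(b - a)), '\n\tquit:', ','.join(sorted(a - b)))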

result:

jan 
    subscribed: Louis le Roy 
    quit: Andrew Alexander,Cat Stevens,El Gordo Baba
feb 
    subscribed: Cat Stevens 
    quit: John Smith


0
data = {
 'jan': ['John Smith', 'Cat Stevens', 'Andrew Alexander', 'El Gordo Baba'],
 'feb': ['Louis le Roy', 'John Smith'],
 'mar': ['Cat Stevens', 'Louis le Roy']
}

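subs = {}    # name -> first month in which the name appears
unsubs = {}  # name -> latest later month in which the name still appears
# Walk the months in calendar order rather than relying on dict ordering
for mon in ['jan', 'feb', 'mar']:
    for name in data[mon]:
        if name not in subs:
            subs[name] = mon
        else:
            unsubs[name] = mon
>>> subs
{'John Smith': 'jan', 'Cat Stevens': 'jan', 'Andrew Alexander': 'jan', 'El Gordo Baba': 'jan', 'Louis le Roy': 'feb'}
>>> unsubs
{'John Smith': 'feb', 'Cat Stevens': 'mar', 'Louis le Roy': 'mar'}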


0

As a starter:

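from collections import defaultdict
dd = dict(jan=(0,jan), feb=(1, feb), ...)

appearances = defaultdict(list)

# Record every (month index, month name) pair in which each name appears
for k, (i, li) in dd.items():
    for name in li:
        appearances[name].append((i, k))

for name in appearances:
    # Sort by month index; a separate loop variable keeps `name` from being overwritten
    months = [(month, i) for i, month in sorted(appearances[name])]
    print(name, months)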

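For each name, this gives a list of (month, index) pairs sorted by index, where index is the position of the month in the year. From there you can check for gaps and find the minimum index (when the name first appeared) and the maximum index (the last month in which it was still present).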
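The gap and min/max check could look something like the sketch below, assuming Python 3. The helper name `summarize` is hypothetical, and it takes only the month indices for a single name:

```python
def summarize(indices):
    """Return (first, last, gaps) for the month indices a name appears in."""
    present = sorted(set(indices))
    first, last = present[0], present[-1]
    # Any index between first and last that is missing is a gap
    gaps = [i for i in range(first, last + 1) if i not in present]
    return first, last, gaps

# Hypothetical name seen in feb, mar and jun (indices 1, 2, 5)
first, last, gaps = summarize([1, 2, 5])
print(first, last, gaps)  # 1 5 [3, 4]
```

If `last` is smaller than the index of the final month, the name unsubscribed in the month after `last`; a non-empty `gaps` list means the name left and came back.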


0

Here is a quick example:

jan,feb,mar,apr,may,jun,jul,aug = [1],[1,2],[1,2,3],[1,2,3,4],[2,3,4],[3,4],[4],[]
months = [set(m) for m in [jan,feb,mar,apr,may,jun,jul,aug]]
changes = [(list(b-a), list(a-b)) for a, b in zip(months, months[1:])]

>>> changes
[([2], []), ([3], []), ([4], []), ([], [1]), ([], [2]), ([], [3]), ([], [4])]

Each element in changes is a transition from one month to the next, where the first item in the tuple is a list of all that were added, and the second item in the tuple is a list of all that left.
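To attach month names to those transitions and answer the "when did each name unsubscribe" part directly, something along these lines might work (Python 3). The `names` list and the `left_in` dictionary are names introduced for this sketch, and the integers again stand in for subscriber names:

```python
names = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug']
# Same sample data as above: integers stand in for subscriber names
months = [set(m) for m in [[1], [1, 2], [1, 2, 3], [1, 2, 3, 4], [2, 3, 4], [3, 4], [4], []]]

left_in = {}  # subscriber -> first month in which they are missing
# Pair each month (from feb on) with the previous and current membership sets
for month, prev, cur in zip(names[1:], months, months[1:]):
    for who in prev - cur:
        left_in[who] = month

print(left_in)  # {1: 'may', 2: 'jun', 3: 'jul', 4: 'aug'}
```

If a subscriber leaves, returns, and leaves again, this keeps only the most recent departure.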
