Comparing Lists

Question

I have 8 lists (jan, feb, mar, apr, may, jun, jul, aug), each of which contains names in list format, e.g.

['John Smith', 'Cat Stevens', 'Andrew Alexander', 'El Gordo Baba', 'Louis le Roy']

etc.

How do I compare these lists in order, and see when a name appeared (i.e. subscribed) and when a name disappeared (i.e. unsubscribed)?

So, say John Smith didn't appear until February: I want to have this information. Let's say he unsubscribed in July: I want this information too (this is FAR more important than the former).

I feel like your data structure is not well suited to this. Perhaps ship these out to a database and then you can connect the name, time joined, and time left?
– Daenyth, Commented Sep 9, 2011 at 18:31
Are there multiple subscriptions for a name? If so, which one do you want to have?
– Karoly Horvath, Commented Sep 9, 2011 at 18:32

NullUserException · Accepted Answer · 2011-09-09 18:44:26Z

6

Don't use lists; use sets instead.

You could find who (un)subscribed between jan and feb simply using set difference:

subs = feb - jan
unsubs = jan - feb
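
For instance, if jan and feb start out as lists, they need converting first. A minimal sketch, assuming made-up sample names rather than the asker's real data:

```python
# Illustrative month lists; the real jan/feb lists come from the question.
jan = ['John Smith', 'Cat Stevens', 'Andrew Alexander']
feb = ['Cat Stevens', 'Louis le Roy']

# Convert to sets so that "-" means set difference.
jan_set, feb_set = set(jan), set(feb)

subs = feb_set - jan_set    # in feb but not in jan: new subscribers
unsubs = jan_set - feb_set  # in jan but not in feb: people who left

print(sorted(subs))    # ['Louis le Roy']
print(sorted(unsubs))  # ['Andrew Alexander', 'John Smith']
```

Note that converting to a set drops duplicate names and ordering within a month, which should not matter for this comparison.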

That being said, you would be better off following Daenyth's suggestion. Put these in a database and add joined and left date fields: you'll have finer granularity than just months, and you won't need to store duplicated data.
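
A rough sketch of what the database route might look like, using Python's built-in sqlite3 module. The table name, the column names (joined_on, left_on) and the dates are all hypothetical, chosen only for illustration:

```python
import sqlite3

# In-memory database for illustration; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscribers ("
    " name TEXT PRIMARY KEY,"
    " joined_on TEXT NOT NULL,"  # ISO date string
    " left_on TEXT)"             # NULL while still subscribed
)

# Made-up sample rows: (name, joined_on, left_on).
rows = [
    ("John Smith", "2011-02-03", "2011-07-19"),
    ("Cat Stevens", "2011-01-10", None),
    ("Louis le Roy", "2011-02-21", "2011-03-30"),
]
conn.executemany("INSERT INTO subscribers VALUES (?, ?, ?)", rows)

# Who unsubscribed in July 2011? ISO dates compare correctly as strings.
left_in_july = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM subscribers"
        " WHERE left_on >= '2011-07-01' AND left_on < '2011-08-01'"
        " ORDER BY name"
    )
]
print(left_in_july)  # ['John Smith']
conn.close()
```

With one row per subscriber, the "who left in July" question becomes a single query instead of a comparison of two monthly snapshots.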

edited Sep 9, 2011 at 18:44

answered Sep 9, 2011 at 18:34


fabmilo · Accepted Answer · 2011-09-09 20:23:14Z

data = {
 'jan': ['John Smith', 'Cat Stevens', 'Andrew Alexander', 'El Gordo Baba'],
 'feb': ['Louis le Roy', 'John Smith'],
 'mar': ['Cat Stevens', 'Louis le Roy']
}

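keys = 'jan feb mar'.split()
# Compare each month with the one that follows it (zip replaces Python 2's izip).
for m1, m2 in zip(keys, keys[1:]):
    a = set(data[m1])
    b = set(data[m2])
    # sorted() keeps the output order stable, since sets are unordered.
    print(m1, '\n\tsubscribed:', ','.join(sorted(b - a)), '\n\tquit:', ','.join(sorted(a - b)))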

result:

jan 
    subscribed: Louis le Roy 
    quit: Andrew Alexander,Cat Stevens,El Gordo Baba
feb 
    subscribed: Cat Stevens 
    quit: John Smith

Karoly Horvath · Accepted Answer · 2011-09-09 18:36:46Z

data = {
 'jan': ['John Smith', 'Cat Stevens', 'Andrew Alexander', 'El Gordo Baba'],
 'feb': ['Louis le Roy', 'John Smith'],
 'mar': ['Cat Stevens', 'Louis le Roy']
}

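months = ['jan', 'feb', 'mar']  # explicit chronological order, not dict order
subs = {}    # name -> first month in which the name appears
unsubs = {}  # name -> latest later month in which the name is seen again
for mon in months:
    for name in data[mon]:
        if name not in subs:
            subs[name] = mon
        else:
            unsubs[name] = mon

>>> subs
{'John Smith': 'jan', 'Cat Stevens': 'jan', 'Andrew Alexander': 'jan', 'El Gordo Baba': 'jan', 'Louis le Roy': 'feb'}
>>> unsubs
{'John Smith': 'feb', 'Cat Stevens': 'mar', 'Louis le Roy': 'mar'}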

rocksportrocker · Accepted Answer · 2011-09-09 18:37:46Z

As a starter:

from collections import defaultdict
dd = dict(jan=(0,jan), feb=(1, feb), ...)

appearances = defaultdict(list)

# Record, for every name, each (month index, month key) pair it appears in.
for k, (i, li) in dd.items():
    for name in li:
        appearances[name].append((i, k))

for name in appearances:
    # Use distinct loop variables so that `name` is not overwritten.
    months = [(k, i) for i, k in sorted(appearances[name])]
    print(name, months)

For each name you get a list of (month, index) pairs, sorted by index, for the months in which the name appears; index is the position of the month in the sequence. Now you can check for gaps, and for the minimal and maximal index.
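
One way those checks might look, as a sketch only. The sample data, the month_order list and the summary dict are assumptions made for illustration, and only the month index is stored per name:

```python
from collections import defaultdict

# Illustrative data; month order is given explicitly.
month_order = ['jan', 'feb', 'mar', 'apr']
data = {
    'jan': ['John Smith', 'Cat Stevens'],
    'feb': ['Cat Stevens'],
    'mar': ['John Smith', 'Cat Stevens'],
    'apr': ['John Smith'],
}

# Map each name to the (already sorted) month indexes in which it appears.
appearances = defaultdict(list)
for i, month in enumerate(month_order):
    for name in data[month]:
        appearances[name].append(i)

summary = {}
for name, idxs in appearances.items():
    first, last = idxs[0], idxs[-1]  # minimal and maximal index
    # A gap is any month between first and last where the name is missing.
    gaps = [month_order[i] for i in range(first, last + 1) if i not in idxs]
    summary[name] = (month_order[first], month_order[last], gaps)

for name, (first, last, gaps) in sorted(summary.items()):
    print(name, 'first:', first, 'last:', last, 'gaps:', gaps)
```

A name whose last month is the final month in month_order is presumably still subscribed; otherwise the month after its last appearance is when it dropped out.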

Andrew Clark · Accepted Answer · 2011-09-09 18:38:47Z

Here is a quick example:

jan,feb,mar,apr,may,jun,jul,aug = [1],[1,2],[1,2,3],[1,2,3,4],[2,3,4],[3,4],[4],[]
months = [set(m) for m in [jan,feb,mar,apr,may,jun,jul,aug]]
changes = [(list(b-a), list(a-b)) for a, b in zip(months, months[1:])]

>>> changes
[([2], []), ([3], []), ([4], []), ([], [1]), ([], [2]), ([], [3]), ([], [4])]

Each element in changes is a transition from one month to the next, where the first item in the tuple is a list of all that were added, and the second item in the tuple is a list of all that left.
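
To make the transitions easier to read, they could be keyed by month name. A small sketch along the same lines, where the names list and the joined/left dicts are additions for illustration:

```python
names = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug']
jan, feb, mar, apr, may, jun, jul, aug = [1], [1, 2], [1, 2, 3], [1, 2, 3, 4], [2, 3, 4], [3, 4], [4], []
months = [set(m) for m in [jan, feb, mar, apr, may, jun, jul, aug]]

# Key each transition by the month in which the change first shows up.
joined = {}
left = {}
for month, prev, cur in zip(names[1:], months, months[1:]):
    joined[month] = sorted(cur - prev)
    left[month] = sorted(prev - cur)

print(joined['feb'])  # [2]
print(left['may'])    # [1]
```

Here left['jul'] lists everyone present in June but missing in July, which is the unsubscribe information the question cares most about.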
