Let's say that I have a list given by:
a = [
    'www.google.com',
    'google.com',
    'tvi.pt',
    'ubs.ch',
    'google.it',
    'www.google.com'
]
I want to remove the duplicates and the substrings to keep a list like:
b = [
    'www.google.com',
    'tvi.pt',
    'ubs.ch',
    'google.it'
]
Do you know an efficient way to do that?
The goal is to keep the longer string, which is why www.google.com is preferred over google.com.
www.google.com and not google.com?
set('.'.join(x.split('.')[-2:]) for x in a) gives {'tvi.pt', 'google.com', 'google.it', 'ubs.ch'}. Close enough?
3 levels, for example.
"substring" is a misused word here. As far as I can see, what OP really means is top level domains.
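For what it's worth, here is a minimal sketch of the literal reading of the question: drop exact duplicates, then drop any string that is contained in another, longer string in the list. The function name drop_duplicates_and_substrings is made up for illustration, and the check is a plain substring test (s in t), not anything domain-aware. It compares every pair, so it is quadratic in the list length; that should be fine for short lists but may not count as "efficient" for very large ones.

```python
def drop_duplicates_and_substrings(items):
    """Hypothetical helper: remove duplicates and strings contained in other strings."""
    # dict.fromkeys removes exact duplicates while keeping first-seen order
    unique = list(dict.fromkeys(items))
    # Keep s only if no *different* string in the list contains it
    return [s for s in unique if not any(s != t and s in t for t in unique)]


a = [
    'www.google.com',
    'google.com',
    'tvi.pt',
    'ubs.ch',
    'google.it',
    'www.google.com'
]

b = drop_duplicates_and_substrings(a)
print(b)  # ['www.google.com', 'tvi.pt', 'ubs.ch', 'google.it']
```

Note that 'google.it' survives because it is not a substring of 'www.google.com'. If the real intent is to group by domain rather than by raw substring, a different comparison would be needed.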