I want to remove the elements of a list that differ too much from the previous and next ones (for example, a difference greater than 5).

n=[1913, 2048, 2049, 2050, 2052, 2052, 2054, 2055]
[x for x,y in zip(n,n[1:]) if  y-x<5]

It nearly works; it returns [2048, 2049, 2050, 2052, 2052, 2054].

The problem is that the last element is omitted.

Is there a fast and efficient way to get [2048, 2049, 2050, 2052, 2052, 2054, 2055]?

Thanks in advance,
Dom

2 Answers

zip stops as soon as the shortest iterable is exhausted, which is why your last value is dropped. We can fix that with itertools.izip_longest, which by default pads the shorter iterable with None once it is exhausted.

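With the expression y or x, we fall back to the value of x itself when y is None. Note that y or x also falls back to x when y is 0, which does not matter for this data but could for lists containing zeros.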

from itertools import izip_longest as ex_zip
n = [1913, 2048, 2049, 2050, 2052, 2052, 2054, 2055]
print [x for x, y in ex_zip(n,n[1:]) if (y or x) - x < 5]
# [2048, 2049, 2050, 2052, 2052, 2054, 2055]
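For what it's worth, here is a rough Python 3 sketch of the same idea. On Python 3, izip_longest is named zip_longest. The function name drop_before_jump and the max_gap parameter are my own, purely for illustration, and the explicit `y is None` test is a swap for `y or x` so that a successor of 0 is not mistaken for padding.

```python
from itertools import zip_longest  # Python 3 name for izip_longest


def drop_before_jump(values, max_gap=5):
    """Keep each value whose successor is less than max_gap above it.

    The last value has no successor (zip_longest pads with None),
    so it is always kept. Hypothetical helper, not from the question.
    """
    return [
        x
        for x, y in zip_longest(values, values[1:])
        if y is None or y - x < max_gap
    ]


n = [1913, 2048, 2049, 2050, 2052, 2052, 2054, 2055]
print(drop_before_jump(n))
# [2048, 2049, 2050, 2052, 2052, 2054, 2055]
```

Like the original comprehension, this only looks at the next element and only at upward jumps; it assumes the list itself contains no None values.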
1 Comment

Please do not import it as zip; you will shadow the built-in zip.
You could fudge it by appending something to your slice which would ensure the last element gets added.

>>> n=[1913, 2048, 2049, 2050, 2052, 2052, 2054, 2055]
>>> [x for x,y in zip(n,n[1:]+[n[-1]]) if y-x<5]
[2048, 2049, 2050, 2052, 2052, 2054, 2055]
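A small variation on the padding trick, as a sketch rather than a drop-in: slicing with values[-1:] instead of [values[-1]] avoids an IndexError on an empty list. The function name keep_close_to_next and the max_gap parameter are made up for this example.

```python
def keep_close_to_next(values, max_gap=5):
    """Pad the shifted slice with a copy of the last value so zip sees it.

    values[-1:] is an empty list when values is empty, so no IndexError.
    Hypothetical helper name, chosen only for this illustration.
    """
    successors = values[1:] + values[-1:]
    return [x for x, y in zip(values, successors) if y - x < max_gap]


n = [1913, 2048, 2049, 2050, 2052, 2052, 2054, 2055]
print(keep_close_to_next(n))
# [2048, 2049, 2050, 2052, 2052, 2054, 2055]
```

The last element is paired with itself, so its difference is 0 and it always passes the test as long as max_gap is positive.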

On Python 2, using itertools.izip instead of zip gives a performance boost, since izip does not build the intermediate list. I compared my method with thefourtheye's, and with izip they are close:

>>> import timeit
>>> timeit.repeat(
        stmt="[x for x,y in zip(n,n[1:]+[n[-1]]) if y-x<5]",
        setup="n=[1913, 2048, 2049, 2050, 2052, 2052, 2054, 2055]")
[5.881312771296912, 5.983433510327245, 5.889796803416459]
>>> timeit.repeat(
         stmt="[x for x,y in izip(n,n[1:]+[n[-1]]) if y-x<5]",
         setup="from itertools import izip; n=[1913, 2048, 2049, 2050, 2052, 2052, 2054, 2055]")
[4.871789328236275, 4.895227617064933, 4.80257417537436]
>>> timeit.repeat(
         stmt="[x for x, y in ex_zip(n,n[1:]) if (y or x) - x < 5]",
         setup="from itertools import izip_longest as ex_zip; n=[1913, 2048, 2049, 2050, 2052, 2052, 2054, 2055]")
[4.3260582542764245, 4.375828323146993, 4.177447625285289]
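The timings above use izip and izip_longest, which only exist on Python 2. If you want to repeat the comparison on Python 3, something along these lines should do; it is only a sketch, the labels and the number=100000 setting are arbitrary choices of mine, and the results will depend on your machine and interpreter.

```python
import timeit

# Shared setup: the sample list from the question, plus the Python 3
# name for izip_longest.
setup = (
    "from itertools import zip_longest\n"
    "n = [1913, 2048, 2049, 2050, 2052, 2052, 2054, 2055]"
)

# The two approaches from the answers, adapted to Python 3.
statements = {
    "zip + padded slice": "[x for x, y in zip(n, n[1:] + n[-1:]) if y - x < 5]",
    "zip_longest": "[x for x, y in zip_longest(n, n[1:]) if y is None or y - x < 5]",
}

# Three repeats of 100000 executions each (arbitrary, to keep the run short).
timings = {
    label: timeit.repeat(stmt=stmt, setup=setup, repeat=3, number=100000)
    for label, stmt in statements.items()
}

for label, runs in timings.items():
    # The minimum of the repeats is usually the least noisy figure.
    print(label, min(runs))
```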

2 Comments

Is this not kind of a hack?
@AlexThornton And what if it is? It's straightforward and it works.
