
I'm new to Python, so I've decided to solve some common challenges to improve my knowledge of the language. I learned about numpy and its efficient ndarrays, so I attempted the following experiment:

Consider the two-sum problem (e.g. here) and let's solve it the naive way (the choice of algorithm doesn't matter for the purpose of this question). Here's a solution with Python lists:

from itertools import combinations

def twosum1(n_lst):
    # Enumerate all unordered pairs, then keep those that sum to the target.
    pairs = list(combinations(n_lst, 2))
    solutions = []
    for pair in pairs:
        if sum(pair) == 7:
            solutions.append(pair)
    return solutions

Then I created a version using np.arrays, expecting it to drastically speed up the calculation:

from itertools import combinations
import numpy as np

def twosum2(n_lst):
    # Build all pairs as an (n_pairs, 2) integer array, then filter with a boolean mask.
    pairs = np.array(list(combinations(n_lst, 2)), dtype=int)
    return pairs[pairs[:, 1] + pairs[:, 0] == 7]

However, after timing the two functions, twosum2 is about 2x slower than twosum1. So I thought the problem might be in the dynamic selection of elements, and I wrote an exact copy of twosum1, replacing lists with ndarrays ...

def twosum3(n_lst):
    pairs = np.array(list(combinations(n_lst, 2)))
    solutions = np.empty((0, 2))
    for pair in pairs:
        if np.sum(pair) == 7:
            # np.append copies the whole array on every call.
            solutions = np.append(solutions, [pair], axis=0)
    return solutions

... and the resulting function was 10x slower than the original!

How is this possible? What am I doing wrong here? Clearly, removing loops and replacing lists with ndarrays is not enough to gain speed (contrary to what I learned reading this).

Edit:

  • I use %timeit in Jupyter to time the functions (a sketch of an equivalent setup follows this list).
  • I use identical inputs for all the functions I'm timing.
  • The fact that I compute the combinations in the same way in all 3 functions tells me that the slowdown is due to numpy ... but I don't see how.
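
For reference, here is a minimal sketch of an equivalent timing harness using the standard-library timeit module (the input list of 100 integers is an assumption, not the exact benchmark I used):

import timeit

# Hypothetical benchmark input -- the actual sizes are not shown here.
n_lst = list(range(100))

# Time each variant on the identical input, as %timeit does in Jupyter.
for fn in (twosum1, twosum2, twosum3):
    per_call = timeit.timeit(lambda: fn(n_lst), number=100) / 100
    print(f"{fn.__name__}: {per_call * 1e3:.2f} ms per call")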

Comments

  • How large is n_lst? There is some copying overhead in the NumPy solutions when you create an array from a list. Could you also mention or show your timing methodology?
  • You are doing most of the work the same way in both functions: list(combinations(n_lst, 2)). Adding a numpy wrapper after forcing the whole generator into memory just clobbers your RAM for no good purpose. The actual comparison is not the bottleneck at all.
  • This is a really good example of why you sometimes have to be mindful to get a boost from using numpy.
  • @MadPhysicist It definitely does not; np.append does not work in-place anyway. It always creates new arrays.
  • np.append is just a confusing front end to np.concatenate. It should be deprecated. Building an array by repeated concatenate is slow. It's better to build a list and do one array construction at the end (a sketch of this pattern follows).
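
As a concrete sketch of that last suggestion (the function name twosum_list is made up for this example):

import numpy as np
from itertools import combinations

def twosum_list(n_lst, target=7):
    # Accumulate matches in a plain Python list: appends are cheap.
    solutions = [pair for pair in combinations(n_lst, 2) if sum(pair) == target]
    # One array construction at the end instead of np.append in a loop
    # (np.append copies the whole array on every call).
    # reshape keeps the (0, 2) shape even when there are no matches.
    return np.array(solutions, dtype=int).reshape(-1, 2)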

1 Answer


The costly operation is np.array(list(combinations(n_lst,2)),dtype=int), because Python must scan each member of the list, check whether it is int-compatible, convert it to an integer, and store it in the array.
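
One way to see this is to time the list construction and the list-to-array conversion separately (a sketch; the exact numbers will vary with machine and input size):

import timeit
import numpy as np
from itertools import combinations

n_lst = list(range(100))
pairs = list(combinations(n_lst, 2))

# The conversion alone typically dominates the runtime of twosum2.
t_build = timeit.timeit(lambda: list(combinations(n_lst, 2)), number=1000)
t_convert = timeit.timeit(lambda: np.array(pairs, dtype=int), number=1000)
print(f"build pairs list:   {t_build:.3f} s / 1000 runs")
print(f"convert to ndarray: {t_convert:.3f} s / 1000 runs")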

To reach numpy performance, you must express the whole algorithm in numpy. For example:

In [63]: n_lst=list(range(100))

In [64]: %timeit twosum1(n_lst)
11.2 ms ± 1.64 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [65]: np.vstack(np.where(np.add.outer(n_lst,n_lst)==7)).T
Out[65]: 
array([[0, 7],
       [1, 6],
       [2, 5],
       [3, 4],
       [4, 3],
       [5, 2],
       [6, 1],
       [7, 0]], dtype=int64)

In [66]: %timeit np.vstack(np.where(np.add.outer(n_lst,n_lst)==7)).T
306 µs ± 19 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

This way you win a factor of 30 to 100, depending on the problem.
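
For completeness, here is a sketch that wraps this idea in a function comparable to twosum1 (the name twosum4 is invented for this example; np.triu with k=1 keeps only one ordering of each pair and drops self-pairs, matching what combinations produces):

import numpy as np

def twosum4(n_lst, target=7):
    a = np.asarray(n_lst)
    # Table of all pairwise sums a[i] + a[j].
    sums = np.add.outer(a, a)
    # Strict upper triangle: each unordered pair appears exactly once.
    i, j = np.where(np.triu(sums == target, k=1))
    # Return the matching values as (value_i, value_j) rows.
    return np.column_stack((a[i], a[j]))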


Comments

Thanks for your answer. It's odd that numpy is systematically slower than plain Python lists, though. It makes me wonder about more complex problems, since this is a trivial scenario. I also don't know what to think anymore about the text by jakevdp I linked to above.
@xi45 It's not that numpy is slower; it's that translating between numpy and Python is slower. numpy's backend is written in C, which is what allows it to perform so quickly, but translating back and forth causes most of the performance penalties.
@xi45 numpy and scipy are definitely great. Numerical operations are easy to write, easy to understand, and perform well, especially if you compare numpy's performance to non-optimized C code. And as Mad Physicist pointed out, this "only one transformation" is exactly the problem. Lists do not have a homogeneous type, whereas numpy arrays do (except for the object dtype). This is the bottleneck. If you stick to numpy functions, your code will be several orders of magnitude faster.
When profiling the performance of operations, you need to consider several pitfalls: type conversions (a bottleneck of Python in general, so this problem is not numpy-specific), sample size, timing methods, etc. Just look at the answer of B. M. and notice how much faster his "pure numpy" method is compared to your Python method.
@xi45 You are welcome. Avoiding non-numpy functions is always a good idea when working with numpy. But, for example, most basic Python operators such as + and += are dispatched to the corresponding numpy functions automatically, so there is no need to worry about basic syntax. External modules that you have to import, as well as built-ins like itertools, in many cases cause type conversions. Just take a look at the module description to check whether the specific module can handle numpy inputs and whether its output is also a numpy array.
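
As a quick illustration of that operator dispatch (the arrays here are arbitrary examples):

import numpy as np

a = np.arange(5)
b = np.ones(5, dtype=int)

# On ndarrays, + dispatches to the np.add ufunc: same result,
# and no Python-level loop in either case.
assert np.array_equal(a + b, np.add(a, b))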
