
I've seen several posts here about accessing individual items in numpy arrays and python lists via a for loop.

My program is a little different. What I'm doing is copying a small array (or list) of about 10 elements and then using it. I'm doing this many times, so I want it to be fast. The application, if you're interested, is that I'm searching a tree, and each small array/list is a 'state' in the tree.

But I'm finding that the numpy.copy() function is slower than the Python list() function.

To demonstrate what I'm saying, here's a small program with timings:

import time

import numpy as np
def numPyArrays(iterations:int):

    initialArray = np.array([1,0,0,1,0,1,1,1,0,0])

    for i in range(iterations):
        nextArray = initialArray.copy()
    
    print(f"Numpy Arrays:\n{nextArray}")
    return    

def pythonLists(iterations:int):

    initialList = [1,0,0,1,0,1,1,1,0,0]

    for i in range(iterations):
        nextList = list(initialList)
    
    print(f"Python Lists:\n{nextList}")
    return

def main():

    numIterations = 10000000

    startTime = time.time()
    numPyArrays(numIterations)
    print(f"Time taken: {round(time.time() - startTime, 2)} seconds.\n")

    startTime = time.time()
    pythonLists(numIterations)
    print(f"Time taken: {round(time.time() - startTime, 2)} seconds.\n")

main()

Timings:

Numpy Arrays:
[1 0 0 1 0 1 1 1 0 0]
Time taken: 4.68 seconds.

Python Lists:
[1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
Time taken: 1.5 seconds.

I would have thought the numpy.copy function would have been as fast as a list copy.

EDIT: For those wanting to know what the underlying problem is, it's an Advent of Code problem. Day 19, 2022. https://adventofcode.com/2022/day/19

  • Is this just an observation, or do you have a specific question about it?
  • Creating a new np.array takes more time than creating a new list. With only 10 elements, you are mainly measuring that creation overhead. Compare your benchmark with arrays/lists of size 1, 10 (no measurable difference between 1 and 10), 100, 1000, 10000.
  • I suspect you have an XY problem. Numpy is for doing stuff with arrays. If you make a Python tree with Numpy arrays as the nodes, it's unlikely that you'll get any speed benefit from Numpy. OTOH, if it's a simple binary tree, you can efficiently "pack" it into an array, similar to what heapq does. docs.python.org/3/library/heapq.html
  • I reopened this. I think our answers provide a more focused response than the link. And comments suggest there's more to the question than simply copying. But maybe that's a topic for a new question.
  • @hpaulj "I think our answers provide a more focused response than the link" - I don't follow. The question here is about something less focused: the overall copy process. The bottleneck is the instantiation of Numpy arrays, and the other question is about specifically that.

2 Answers


Using ipython's %%timeit:

In [82]: %%timeit x= [1,0,0,1,0,1,1,1,0,0]
    ...: list(x)
185 ns ± 1.97 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

Another way of making a copy of a list:

In [83]: %%timeit x= [1,0,0,1,0,1,1,1,0,0]
    ...: z=x[:]
146 ns ± 1.98 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

A list consists of a databuffer that holds references to the list's objects (plus a small growth space). These copies make a new list with a simple copy of that databuffer, so the new list holds the same references.
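A quick sketch of that shared-reference behaviour (my own illustration, using a mutable element to make the sharing visible):

a = [1, 2]
x = [a, a]            # a list holding references
y = x[:]              # shallow copy: a new list object, same references
print(y is x)         # False - a new list was created
print(y[0] is a)      # True - it holds the same element objects
y[0].append(3)        # mutating a shared element is visible through x
print(x[0])           # [1, 2, 3]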

The array copy, as you note, is slower, but I consider anything timeit measures in nanoseconds to be trivially fast.

In [84]: %%timeit x=np.array([1,0,0,1,0,1,1,1,0,0])
    ...: x.copy()
756 ns ± 2.74 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

But let's look at a larger array/list:

In [88]: %%timeit x=np.arange(10000)
    ...: x.copy()
5.37 µs ± 8.77 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [89]: %%timeit x= list(range(10000))
    ...: x[:]
40.4 µs ± 100 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

The array copy scales better. That means a large part of the difference you saw comes from the higher "startup" cost of the array copy; the per-element copy is actually faster.
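To see where that startup cost gets amortized, here's a quick sweep over sizes (a sketch; the absolute numbers will vary by machine):

import timeit

import numpy as np

# time one copy at several sizes; the fixed overhead dominates for small n
for n in (1, 10, 100, 1000, 10000):
    lst = list(range(n))
    arr = np.arange(n)
    t_list = timeit.timeit(lambda: lst[:], number=100000)
    t_arr = timeit.timeit(lambda: arr.copy(), number=100000)
    print(f"n={n:>6}: list {t_list:.3f}s  array {t_arr:.3f}s")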

In general, operations on small lists are often faster than similar ones on arrays; it's when the length gets larger that arrays become advantageous. But don't go by speed alone: what you are doing with the list or array is just as important.

edit

Let's look at what tweaking a few values does to the times:

In [97]: %%timeit x= list(range(10))
    ...: y=x[:]; y[3] = 4; y[1]=0
243 ns ± 2.8 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [98]: %%timeit x=np.arange(10)
    ...: y=x.copy(); y[[3,1]] =[4,0]
9.89 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Even though the list tweaking has to be done one element at a time, it's still faster. In fact, accessing individual array items adds substantially to the time.

This addition doesn't change the times for the large example; there, the copying dominates for both.

1 Comment

Hi, thanks for all this. I'm basically coming around to the idea that I need to allocate memory upfront for my states, and then just reuse that memory over and over; something like globalStateList = np.zeros((MAX_STATES, 10), dtype=int), where MAX_STATES is some number large enough to hold all the states in my search tree.
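A minimal sketch of that preallocation idea (MAX_STATES and the push/pop pattern here are assumptions, not tested against the actual puzzle):

import numpy as np

MAX_STATES = 1000000                 # assumed upper bound on stored states
globalStateList = np.zeros((MAX_STATES, 10), dtype=int)
top = 0                              # stack pointer into the preallocated block

def push_state(state):
    # row-wise copy into existing memory; no new array object is created
    global top
    globalStateList[top] = state
    top += 1

def pop_state():
    # returns a view of the most recent state
    global top
    top -= 1
    return globalStateList[top]

One caveat with this pattern: pop_state returns a view into the shared buffer, so a popped row can be overwritten by a later push; copy it out if it must persist.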

The operation you're doing seems pointless: you just assign a copy of the same array or list to a new variable over and over. What happens if you actually make that many copies of the data in a useful manner?

How about this:

import time

import numpy as np


def numPyArrays(iterations: int):
    initialArray = np.array([1, 0, 0, 1, 0, 1, 1, 1, 0, 0])

    result = np.zeros((iterations, len(initialArray)), dtype=int)
    result[:, :] = initialArray

    print(result[0], result[1], result[-1])
    return


def pythonLists(iterations: int):
    initialList = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0]

    result = [list(initialList) for i in range(iterations)]

    print(result[0], result[1], result[-1])
    return


def main():
    numIterations = 10000000

    startTime = time.time()
    numPyArrays(numIterations)
    print(f"Time taken: {round(time.time() - startTime, 2)} seconds.\n")

    startTime = time.time()
    pythonLists(numIterations)
    print(f"Time taken: {round(time.time() - startTime, 2)} seconds.\n")


main()

Result:

[1 0 0 1 0 1 1 1 0 0] [1 0 0 1 0 1 1 1 0 0] [1 0 0 1 0 1 1 1 0 0]
Time taken: 0.13 seconds.

[1, 0, 0, 1, 0, 1, 1, 1, 0, 0] [1, 0, 0, 1, 0, 1, 1, 1, 0, 0] [1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
Time taken: 4.26 seconds.

All your example shows is that initialising an entirely new numpy array has some overhead compared to defining a new list. But in any realistic scenario, you'd only do that a few times anyway, while the copy operation itself is way faster in numpy, as this example shows.
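You can separate the two costs directly (a sketch; the numbers are machine-dependent):

import timeit

import numpy as np

x = np.array([1, 0, 0, 1, 0, 1, 1, 1, 0, 0])

# bare creation of a small array vs copying an existing one
t_new = timeit.timeit(lambda: np.empty(10, dtype=int), number=1000000)
t_copy = timeit.timeit(lambda: x.copy(), number=1000000)
print(f"np.empty: {t_new:.3f}s   x.copy(): {t_copy:.3f}s")

If the two times come out close, that supports the point: instantiating the array object, not moving the 10 elements, is the expensive part.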

Instead of copying the same array over and over to a completely new array, my code copies the array into a new row of one preallocated array that many times, and then does the same with lists. Here, numpy is more than an order of magnitude faster.

(Note that my example just prints the first two and then the last copy of each result, but of course that's arbitrary - you can write some code to check that all rows do in fact have the same values assigned; see the snippet below.)
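If you want that check, two assertions suffice (a sketch using the result, initialArray and initialList names from the functions above):

# inside numPyArrays: every row should equal the template array
assert (result == initialArray).all()

# inside pythonLists: every copied list should equal the template list
assert all(row == initialList for row in result)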

12 Comments

  • Yes, I know my program doesn't do anything useful; I was just highlighting the problem I'm having. As I said, I'm doing a tree search, so I need to be able to create these states (which can be arrays, lists, or even classes with integer data members). To create a state's children, I need to make new child states with a couple of numbers changed, reflecting the new state, and push them onto a stack for a depth-first search (see the sketch after these comments). I might be able to use what you have done here, which is really allocating a lot of memory up front.
  • So it's the new array creation that is slow compared to creating a new list, not the copying over of the data...
  • Per my own timing results, creating a new, empty Numpy array takes about as long as making the copy. That's where basically all the work goes.
  • How are you using the states? Looking at specific values, or doing math and other numpy array things? A test that includes tweaking a few values and doing some sort of state test would be more realistic than just copying.
  • That's OK, no offense was intended - I just wanted to make it clear that you need to tell us more about the actual problem (or some aspect of it you consider critical) for us to tell you whether you're right (lists would be better) or whether there's actually a better, more efficient way to use numpy to solve it. I'll have a quick look at that problem, though, and see what it's about.
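A minimal, self-contained sketch of that child-state pattern with lists (the "move" here - flipping one entry - is an illustrative stand-in, not the actual puzzle rule):

def dfs(initial_state, max_depth=3):
    stack = [(initial_state, 0)]
    visited = 0
    while stack:
        state, depth = stack.pop()
        visited += 1
        if depth == max_depth:
            continue
        for i in range(len(state)):
            child = state[:]          # fast shallow copy of the small state list
            child[i] ^= 1             # tweak one entry to form the child state
            stack.append((child, depth + 1))
    return visited

print(dfs([1, 0, 0, 1, 0, 1, 1, 1, 0, 0]))  # 1111 states visited at depth 3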