0

I was writing a code for the adjacency matrix for a graph and found this question:

Best and/or fastest way to create lists in python

So, I simply wrote this to initialize my adjacency matrix as it seemed the fastest way to initialize a list so far.

import time
t1 = time.time()
matrix = [[0]*5000]*5000
t2 = time.time()

t2-t1

0.0

But after doing some operations, I realized each time I changed/appended an element the effect is applied to all the sub-lists, which means each list is just a reference and this will not work for a real scenario even though it's fast.

I can't use numpy, as algorithmic sites don't allow external modules/libraries. I think numpy would've been the ideal solution for the general scenario.

Now, obviously most other 2d/multi-dim list initialization answers suggest list comprehension,

import time
t1 = time.time()
matrix = [[0 for i in range(5000)] for j in range(5000)]
t2 = time.time()
print(t2-t1)

0.7021145820617676

But, it seems slow (compared to other languages) considering the strict time limit for solving a graph problem in an algorithmic site.

Is there a faster way to initialize a 2d/3d list in python?

Let me know if there is a duplicate, so far I didn't find anything which shows some time comparison between methods for multi-dimensional list initialization.

5
  • how do you change an element of the list? I suppose when you append, there is no problem because it just appends that the element to the current list. Commented May 21, 2020 at 14:02
  • Not actually, append doesn't work too in the first code snippet. If I append 54 to let's say 0th sublist, all of the sub-lists looks like [0, 54]. Commented May 21, 2020 at 14:04
  • 1
    The reason I realized is * operation makes many copies of a reference, it does not actually make any list with numbers (0 in this case). Commented May 21, 2020 at 14:06
  • 1
    "it seems slow (compared to other languages)" Welcome to using Python. Do you have any other constraints? Can you use PyPy, Cython, Numba, ...? Do you actually need an adjacency matrix, or is an adjacency list (Dict[Node, [List[Node]]]) fine as well? Commented Jun 7, 2020 at 8:50
  • 1
    Thanks, @MisterMiyagi. I know I can optimize with using cython/numba/numpy etc. but they are not allowed in algo sites. Yes, I can definitely use the adjacency list in most of the algorithms, but sometimes I prefer the adjacency matrix as it results in shorter code. So, it was a general inquiry to know if there was anything that I didn't do but could do to improve speed in the same setup (without additional module, adjacency matrix). Commented Jun 7, 2020 at 8:57

2 Answers 2

2

The problem comes from the second level of list. [0] * 5000 is fine because is builds a list of immutable values. So if you change a value in that list, you will actually replace it. But as lists are mutable you cannot use that method to build a list of lists.

But this:

[[0] * 5000 for i in range(5000)]

is still correct. It build 5000 different lists of (immutable) numbers, so it will allow you to change any individual element with no unwanted side effect. And it is much more efficient than your current code.

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks, that clears up many confusions I had. I guess this is the best I can do in terms of speed.
0

I found a method that is faster than doing your comprehension list, but still much slower than doing matrix=[[0]*5000]*5000. I am not an expert, but I'd suspect that your method is so fast because it isn't really creating a list of size 5000x5000, but instead a list of size 5000 and repeating it, as you noticed.

Your method takes 2.098e-5 seconds in my computer. My method takes 0.1236 seconds in my computer (it is possible that my method takes longer in your computer, so it could be even slower).

mat = []
[mat.append([0]*5000) for i in range(5000)]

You could try it, maybe it mitigates your problem.

2 Comments

I would expect consistently appending to a list to be less efficient than building the same list with a comprehension. Did you try to benchmark both ways with timeit?
The timing difference to mat = [[0] * 5000 for i in range(5000)] is marginal (less than 5% difference). Readability of the directly used list comprehension is much better, though. Idiomatically, a for loop is preferable to a list comprehension to trigger side-effects.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.