How can I make this for loop more efficient and faster

Question

I'm trying to run this kind of loop (it's simplified in this example), which generates and adds up random consumptions for 1000 clients and takes approximately 1 hour 30 minutes.

import numpy as np

rand_array = np.random.rand(35000)
total_consumption = np.zeros(35000)

for t in range(0,1000):
   consumption = np.zeros(35000)
   consumption[0] = 0.5
   rand_array = np.random.rand(35000)

   for i in range(1,35000):
      consumption[i] = rand_array[i] * consumption[i-1]

   total_consumption = total_consumption + consumption

Is there a way I can make this faster and more efficient? I tried using a list comprehension, to no avail.

Have you tried sum()? Be careful with numpy.sum(), as it does not always raise an overflow error if your type is too small.
– jwal, Commented Sep 7, 2021 at 19:25
Creating brand new arrays every pass through can be prohibitively time consuming. Why do you need a rand array anyway? Can't you just generate a random number for each multiplication? For that matter, why is there a consumption array when you only need the previous consumption value?
– RufusVS, Commented Sep 7, 2021 at 19:26
I edited the code so that it runs without an overflow or syntax error. Please check that the modifications are correct. The result is an array of values close to zero, because the series quickly converges to 0 as it is repeatedly multiplied by values between 0 and 1.
– Jérôme Richard, Commented Sep 7, 2021 at 19:41
@RufusVs This is a very simplified example of my code; the original uses a random distribution and a complex algorithm built in Excel that I'm now trying to port to Python. It's simplified so that it's easier to understand.
– unamed19, Commented Sep 7, 2021 at 19:44
If your algorithm requires applying 35,000 values to 1000 customers, I can't see a shortcut. Any savings would be from finding a better algorithm, or implementing the bottleneck in a faster language that you can call from Python.
– John Bayko, Commented Sep 7, 2021 at 20:04

Jérôme Richard · Accepted Answer · 2021-09-07 20:12:39Z

2

You can use np.cumprod to vectorize the computation and make it much faster. Here is the resulting code:

total_consumption = np.zeros(35000)

for t in range(0,1000):
    rand_array = np.random.rand(35000)
    rand_array[0] = 0.5 # Needed for the cumprod
    consumption = np.cumprod(rand_array)
    total_consumption += consumption

This code takes 267 milliseconds on my machine, while the original one takes 11.8 seconds. Thus, it is about 44 times faster.
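The remaining Python loop over clients can also be folded into NumPy by drawing a 2D array and taking the cumulative product along the time axis. The following is only a sketch of that idea: the function name `total_consumption_batched` and the `chunk` and `seed` parameters are made up for illustration, and the chunking is there because a full 1000 x 35000 float64 array would need roughly 280 MB.

```python
import numpy as np

def total_consumption_batched(n_clients, n_steps, start=0.5, chunk=100, seed=None):
    """Sum the cumulative-product series of all clients, `chunk` clients at a time.

    Hypothetical helper, not part of the original answer.
    """
    rng = np.random.default_rng(seed)
    total = np.zeros(n_steps)
    for first in range(0, n_clients, chunk):
        rows = min(chunk, n_clients - first)
        # One row per client, one column per time step.
        factors = rng.random((rows, n_steps))
        factors[:, 0] = start  # every series starts at the same value
        # cumprod along axis 1 reproduces c[i] = r[i] * c[i-1] for each row.
        total += np.cumprod(factors, axis=1).sum(axis=0)
    return total

total = total_consumption_batched(n_clients=1000, n_steps=35000)
```

Whether this beats the single-loop version depends on memory bandwidth and the chunk size, so it is worth timing both on your own machine.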

answered Sep 7, 2021 at 20:12

Jérôme Richard


Mark Setchell · Accepted Answer · 2021-09-07 20:24:03Z

2

I had a try at doing the middle part with numba:

import numpy as np
from numba import jit

@jit(nopython=True)
def speedy(consumption, rand_array):
    # Start at 1: consumption[0] holds the initial value, and i = 0 would
    # read consumption[-1] (the last element, still 0) and overwrite it.
    for i in range(1, len(consumption)):
        consumption[i] = rand_array[i] * consumption[i-1]
    return consumption

total_consumption = np.zeros(35000)

for t in range(0,1000):
    consumption = np.zeros(35000)
    consumption[0] = 0.5
    rand_array = np.random.rand(35000)

    # The first call includes the JIT compilation time
    consumption = speedy(consumption, rand_array)
    total_consumption = total_consumption + consumption

The time was 259 ms versus 9.6 seconds for your code. I guess you could do more in numba too if you wanted to try.
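As one possible reading of "doing more in numba", the loop over clients could be moved inside the compiled function as well, so that no temporary `consumption` array is allocated per client. This is a sketch under assumptions: `total_consumption_all` is a hypothetical name, the random numbers are passed in as a 2D array rather than generated inside the function, and the `try`/`except` fallback is only there so the example still runs (slowly) where numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs as plain Python without numba.
    def njit(func):
        return func

@njit
def total_consumption_all(rand_matrix, start):
    """Accumulate every client's series; rows are clients, columns are time steps."""
    n_clients, n_steps = rand_matrix.shape
    total = np.zeros(n_steps)
    for t in range(n_clients):
        prev = start            # only the previous value is needed
        total[0] += prev
        for i in range(1, n_steps):
            prev = rand_matrix[t, i] * prev
            total[i] += prev
    return total

# Small demonstration input; column 0 of the random matrix is ignored.
demo = total_consumption_all(np.random.rand(10, 50), 0.5)
```

For the sizes in the question you would pass `np.random.rand(1000, 35000)`, which needs roughly 280 MB, or call the function on smaller batches of clients and add up the results.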

answered Sep 7, 2021 at 20:24

Mark Setchell


2 Comments

unamed19 Over a year ago

This solution seems interesting, but in my real algorithm I'm using consumption[i] = scipy.stats.beta.ppf(rand_array[r], 5 * d[r-1], 5 * (1 - d[r-1])). I'm new to Python and numba; is there a way to integrate the scipy.stats.beta.ppf function into numba?

Mark Setchell Over a year ago

I'm unfamiliar with that function. If you can find the source, you could try putting it into my numba function... maybe. Not sure.
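A different angle that does not need numba at all: when each value depends on the previous one through a function that cannot be compiled, you can still loop over the 35,000 time steps in Python and apply the function to all clients at once as an array. The sketch below uses the simplified multiplication from the question as the step; `simulate_all_clients` and `multiply_step` are hypothetical names. As far as I know `scipy.stats.beta.ppf` accepts array arguments, so a step built around it could be swapped in, but how `d` is defined in the real algorithm is not shown in the thread, so that part is left out.

```python
import numpy as np

def simulate_all_clients(step, n_clients, n_steps, start=0.5, seed=None):
    """Advance all clients together, one time step per loop iteration.

    `step(u, prev)` receives arrays of length n_clients and returns the
    next values. Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    prev = np.full(n_clients, start)   # current value for every client
    total = np.zeros(n_steps)
    total[0] = prev.sum()
    for i in range(1, n_steps):
        u = rng.random(n_clients)      # one random draw per client
        prev = step(u, prev)
        total[i] = prev.sum()          # sum over clients at this step
    return total

def multiply_step(u, prev):
    # The simplified recurrence from the question: c[i] = r[i] * c[i-1]
    return u * prev

total = simulate_all_clients(multiply_step, n_clients=1000, n_steps=200)
```

This turns 35,000,000 scalar Python operations into 35,000 array operations over 1000 elements each, which is usually a large saving even when the step function itself is expensive.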
