Parallel processing in Python for calculation of a function

Question

I have heavy code that takes a long time to run. I ran into the following code, but I don't really understand how it works. Suppose sigma(b) is a huge function in the middle of a script. It is the main reason the code runs slowly, and its results are used in other parts of the code. I have put a simplified version of it in the following code.

from multiprocessing import Pool
from multiprocessing import Process
import multiprocessing
def sigma(b):
    n=0
    for i in range (1,550):
        n=n+i+b
        return n
p = multiprocessing.Process(target=sigma)
p.start()

print(sigma(2))

Can anyone help me, please?

I do not understand your question. What is the problem? Is the sigma function actual code?
– Pax Vobiscum, Commented Jun 5, 2018 at 9:23
@PaxVobiscum Yes, I have code with hundreds of lines. One function in it solves an ODE and takes a long time to produce results. I want to speed up that calculation using multiprocessing.
– Ehsan, Commented Jun 5, 2018 at 9:26
The code you have now for the sigma function makes no sense, however. Also, is it supposed to return anything or change something in place?
– Pax Vobiscum, Commented Jun 5, 2018 at 9:27
Using multiprocessing would make sense only if the work done in sigma can be split into multiple "little" jobs. On the other hand, if your code has to (or can) do something else while sigma is executing, you could rewrite it a little so that it calls "the other parts of code" itself when the execution is completed (implement some kind of "event handler").
– Lohmar ASHAR, Commented Jun 5, 2018 at 9:30
I'm not able to understand your question. I'm not sure how the function's return value is used for multiprocessing in your case. Does it need to be? Your last print statement does exactly what it's supposed to and has nothing to do with multiprocessing. Are you trying to get the results of the function? To make linear calculations faster, you can just split them and run them asynchronously. Use Pool.map.
– Kevin Roy, Commented Jun 5, 2018 at 9:36

Mathieu · Accepted Answer · 2018-06-05 09:30:34Z

One way to do it (after fixing the indentation of sigma):

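# -*- coding: utf-8 -*-
import multiprocessing as mp

def sigma(b):
    # Accumulate i + b for i = 1..549.
    n = 0
    for i in range(1, 550):
        n = n + i + b
    return n

if __name__ == '__main__':
    inputs_b = [1, 2, 3, 4]

    # Two worker processes; map() calls sigma once per input
    # and returns the results in input order.
    with mp.Pool(processes=2) as p:
        res = p.map(sigma, inputs_b)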

One caveat with multiprocessing is that it may not run properly from within an IDE (such as Spyder), in which case you need to save the results and retrieve them later.

Saving can be done with numpy, pandas, pickle, or other tools.
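As a minimal sketch of the pickle option, assuming the results are a plain list of numbers: the helper names save_results and load_results and the file name sigma_results.pkl are hypothetical, chosen only for illustration.

```python
import multiprocessing as mp
import os
import pickle
import tempfile


def sigma(b):
    n = 0
    for i in range(1, 550):
        n = n + i + b
    return n


def save_results(results, path):
    """Write the results to disk with pickle (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(results, f)


def load_results(path):
    """Read back results written by save_results() (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    inputs_b = [1, 2, 3, 4]
    with mp.Pool(processes=2) as p:
        res = p.map(sigma, inputs_b)

    # Hypothetical output location; any writable path would do.
    out_path = os.path.join(tempfile.gettempdir(), "sigma_results.pkl")
    save_results(res, out_path)
    print(load_results(out_path))
```

The saved file can then be loaded from a separate script or an interactive session once the parallel run has finished.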

Then you might need to have multiple arguments. In this case, use starmap():

# -*- coding: utf-8 -*-
import multiprocessing as mp

def sigma(a, b):
    # Accumulate i + b + a for i = 1..549.
    n = 0
    for i in range(1, 550):
        n = n + i + b + a
    return n

if __name__ == '__main__':
    # Each tuple (a, b) is unpacked into the arguments of sigma.
    inputs_b = [(a, b) for a in range(5) for b in range(6, 10)]

    with mp.Pool(processes=2) as p:
        res = p.starmap(sigma, inputs_b)

N.B.: processes=N sets the number of worker processes to start. It is recommended to use the number of physical CPUs, or the number of CPUs minus 1.
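One way to pick that number at run time is sketched below. The helper default_worker_count is hypothetical. Note that os.cpu_count() reports logical CPUs, which can be higher than the number of physical cores on machines with hyper-threading; getting the physical count usually needs a third-party package (for example psutil.cpu_count(logical=False)), which is not used here.

```python
import os


def default_worker_count(reserve=1):
    """Return the logical CPU count minus `reserve`, but never less than 1.

    Hypothetical helper: the standard library only exposes the logical
    CPU count, so this is an approximation of "number of CPUs - 1".
    """
    logical = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, logical - reserve)


if __name__ == "__main__":
    print(default_worker_count())           # leaves one logical CPU free
    print(default_worker_count(reserve=0))  # uses every logical CPU
```

The returned value can be passed directly as the processes argument of mp.Pool.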

EDIT2: Your dummy example is a very simple case. You have two options: write your function to do an elementary task and parallelize the elementary tasks, or take your big function that runs for 72 hours and run four or more instances of it at the same time on different inputs.
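The first option can be sketched on the dummy sigma itself, assuming its loop can be cut into independent slices whose partial sums are added at the end. The names partial_sigma, split_range and sigma_parallel are hypothetical. For a loop this small the process start-up cost will likely outweigh any gain, so this only illustrates the structure.

```python
import multiprocessing as mp


def sigma(b):
    """Serial reference version from the answer."""
    n = 0
    for i in range(1, 550):
        n = n + i + b
    return n


def partial_sigma(start, stop, b):
    """Elementary task: the part of sigma's loop for i in range(start, stop)."""
    return sum(i + b for i in range(start, stop))


def split_range(start, stop, parts):
    """Split range(start, stop) into `parts` contiguous (lo, hi) chunks."""
    step, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for k in range(parts):
        # The first `extra` chunks get one additional element.
        hi = lo + step + (1 if k < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def sigma_parallel(b, workers=2):
    """Compute sigma(b) by summing partial results from worker processes."""
    tasks = [(lo, hi, b) for lo, hi in split_range(1, 550, workers)]
    with mp.Pool(processes=workers) as p:
        return sum(p.starmap(partial_sigma, tasks))


if __name__ == "__main__":
    print(sigma_parallel(2) == sigma(2))
```

This only works because each slice of the loop is independent of the others; a loop where every iteration depends on the previous one (as is common when integrating an ODE step by step) cannot be split this way.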

You also need to make sure that the processes do not use shared resources; otherwise you'll need a more complex implementation.

Finally, using multiprocessing on functions that generate a lot of data might end in a MemoryError (not enough RAM). This will depend on the application.
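One possible way to reduce memory pressure, assuming each result can be reduced or written out as soon as it arrives, is to iterate over Pool.imap() instead of collecting everything with map(). The sketch below keeps only a count and a running total; the helper summarize and the chunksize value are illustrative assumptions, and whether this helps depends on the application.

```python
import multiprocessing as mp


def sigma(b):
    n = 0
    for i in range(1, 550):
        n = n + i + b
    return n


def summarize(results):
    """Consume an iterable of results, keeping only a count and a total."""
    count = 0
    total = 0
    for value in results:
        count += 1
        total += value
    return count, total


if __name__ == "__main__":
    inputs_b = range(1000)
    with mp.Pool(processes=2) as p:
        # imap() yields results lazily, in input order; chunksize controls
        # how many inputs are sent to a worker at once.
        # The iterator must be consumed while the pool is still open.
        print(summarize(p.imap(sigma, inputs_b, chunksize=50)))
```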

Nowhere. You can place a print statement, print(res), after the with block (that is, at the same indentation as with), or save the variable to disk. However, since your example was really simple (and, I assumed, far from reality), I did not bother with the return value.
@KevinRoy Yes, but then you get into the case of shared resources, which becomes more complex. Not the best way to start with multiprocessing :)
@Mathieu Could I send you my code by e-mail to get the benefit of your advice, please?
@Ehsan I'm not sure I'll have time to look at the full code, and I don't think you have a way to reach my e-mail through SO. You should try to implement this yourself and ask about code optimization on Code Review (part of Stack Exchange). Although you have probably done tonnes of optimization, it's rather hard for me to believe that fully optimized code runs for 72 hours. I ran into the same problem, and with a few tricks I could speed up the programs by a factor of 1,000.
