
I have been struggling with multiprocessing in Python for a while. I would like to run two independent functions simultaneously, wait until both calculations are finished, and then continue with the output of both functions. Something like this:

# Function A:
def jobA(num):
    result = num * 2
    return result

# Function B:
def jobB(num):
    result = num ^ 3
    return result

# Parallel process function:
resultA, resultB = runInParallel(jobA(num), jobB(num))

I found other examples of multiprocessing, but they used only one function or didn't return an output. Does anyone know how to do this? Many thanks!

  • Are the functions I/O bound or processor bound? Commented Dec 5, 2017 at 1:28
  • have you tried Pool.apply_async? Commented Dec 5, 2017 at 4:55
  • Both functions are screen-scraping data from a website, so I guess it is I/O bound (see the thread-based sketch after these comments). Commented Dec 5, 2017 at 12:34
  • Raymond Hettinger, Keynote on Concurrency, PyBay 2017 Commented Dec 5, 2017 at 15:17
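
Since both jobs are I/O bound (screen scraping), threads are usually enough and sidestep process startup entirely. A minimal sketch using concurrent.futures, with placeholder jobs standing in for the real scraping functions:

from concurrent.futures import ThreadPoolExecutor

def jobA(num):
    return num * 2

def jobB(num):
    return num ** 3  # cubing with **; ^ would be bitwise XOR

with ThreadPoolExecutor() as executor:
    # submit() schedules each call and returns a Future immediately,
    # so both jobs run concurrently
    futureA = executor.submit(jobA, 10)
    futureB = executor.submit(jobB, 2)
    # result() blocks until the corresponding call has finished
    print(futureA.result(), futureB.result())  # 20 8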

1 Answer


I'd recommend creating processes manually (rather than as part of a pool) and sending the return values back to the main process through a multiprocessing.Queue. Such a queue can pass almost any picklable Python object between processes safely and relatively efficiently.

Here's an example, using the jobs you've posted.

import multiprocessing as mp

def jobA(num, q):
    # Push the result into the queue instead of returning it
    q.put(num * 2)

def jobB(num, q):
    # Note: ^ is bitwise XOR in Python, not exponentiation
    q.put(num ^ 3)

if __name__ == '__main__':
    q = mp.Queue()
    jobs = (jobA, jobB)
    args = ((10, q), (2, q))
    for job, arg in zip(jobs, args):
        mp.Process(target=job, args=arg).start()

    # get() blocks until a result is available
    for i in range(len(jobs)):
        print('Result of job {} is: {}'.format(i, q.get()))

This prints out:

Result of job 0 is: 20
Result of job 1 is: 1

But you can of course do whatever further processing you'd like using these values.
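
To match the runInParallel interface from the question, the same pattern folds into a small helper. Here's a sketch using the question's original return-value style for jobA/jobB (the _worker wrapper is a name made up here); tagging each result with the function name makes it clear which job produced which value, since results arrive in completion order:

import multiprocessing as mp

def jobA(num):
    return num * 2

def jobB(num):
    return num ^ 3  # bitwise XOR, as in the question

def _worker(func, arg, q):
    # Tag each result with the function name, since results
    # arrive in completion order, not submission order
    q.put((func.__name__, func(arg)))

def runInParallel(*jobs):
    # jobs: (function, argument) pairs; returns {name: result}
    q = mp.Queue()
    procs = [mp.Process(target=_worker, args=(f, a, q)) for f, a in jobs]
    for p in procs:
        p.start()
    results = dict(q.get() for _ in procs)  # block for one result per process
    for p in procs:
        p.join()
    return results

if __name__ == '__main__':
    print(runInParallel((jobA, 10), (jobB, 2)))  # e.g. {'jobA': 20, 'jobB': 1}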


Comments

Seems that the values/objects each function puts in the queue should contain information about where they came from, so that they can be used correctly: maybe a tuple like (func_name, return_value). On "...wait until both calculations are finished...": you probably want to join each process, or loop while either is_alive, before executing the last for loop in your example.
@wwii Sure, you can add information about the "source" of any of the data, if you want. But there's no need to call join(), since the queue is thread/process-safe, and the call to get() will block until a new object is available.
@georgexsh No. apply_async calls a single function in one of the worker processes of a pool. The question asks how to call multiple different functions in parallel.
@bnaecker Is there any difference between calling apply_async multiple times and starting enough processes in the pool? Basically, you implemented a simplified version of Pool. Besides, you don't need to pass the Queue object as an argument to the worker function (see the apply_async sketch after these comments).
When all the processes finish and stop putting things in the queue, will the q.get statement in the last for loop block indefinitely? (See the timeout note below.)
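
For comparison, the Pool.apply_async route discussed above might look like this sketch; each call returns an AsyncResult, so no explicit Queue is needed:

import multiprocessing as mp

def jobA(num):
    return num * 2

def jobB(num):
    return num ^ 3

if __name__ == '__main__':
    with mp.Pool(processes=2) as pool:
        resA = pool.apply_async(jobA, (10,))
        resB = pool.apply_async(jobB, (2,))
        # AsyncResult.get() blocks until that particular job is done
        print(resA.get(), resB.get())  # 20 1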
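And on the last question: yes, a bare q.get() blocks forever once the queue is empty and nothing more will be put. Passing a timeout makes it raise queue.Empty instead (assuming q is the Queue from the answer):

import queue  # multiprocessing.Queue.get raises queue.Empty on timeout

try:
    result = q.get(timeout=5)  # wait at most 5 seconds for a result
except queue.Empty:
    print('No more results')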
