I have a Python program which looks like this:

total_error = []
for i in range(24):
    error = some_function_call(parameters1, parameters2)
    total_error += error

The function 'some_function_call' takes a lot of time, and I can't find an easy way to reduce its time complexity. Is there a way to reduce the execution time by performing the tasks in parallel and then adding the results up in total_error? I tried using pool and joblib but could not successfully use either.

  • Is some_function_call() CPU or I/O bound? If it is CPU bound, take a look at the first answer to this post. That is a very simple implementation using the multiprocessing library. Commented Jan 3, 2018 at 17:30
  • I am not sure whether to classify it as CPU or I/O bound. The function basically calculates the least-cost walk in a graph. The input I give to the function is typically two vectors of length 1200. Commented Jan 3, 2018 at 17:34
  • The work is CPU bound. When determining whether work is CPU or I/O bound, think in terms of "If my CPU were faster, would the work be done faster?". For example, calculations are generally CPU bound, but reading from and writing to a file are not (they are either network I/O or disk I/O); see the sketch after these comments. Commented Jan 3, 2018 at 17:44
  • It seems like you should probably use pool; can you post your attempt at it so we can troubleshoot? Commented Jan 3, 2018 at 17:52
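
To make that distinction concrete, here is a minimal, self-contained sketch (not from the thread; the function and sizes are illustrative) showing that CPU-bound work speeds up with processes but not with threads, because the GIL serializes Python bytecode across threads:

import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_bound(n):
    # pure computation: limited by CPU speed, not by waiting on I/O
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    for executor_cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        start = time.perf_counter()
        with executor_cls(max_workers=4) as ex:
            list(ex.map(cpu_bound, [2_000_000] * 4))
        print(executor_cls.__name__, time.perf_counter() - start)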

2 Answers

You can also use concurrent.futures in Python 3, which provides a simpler interface than multiprocessing. See this for more details about the differences.

from concurrent import futures

total_error = 0

with futures.ProcessPoolExecutor() as pool:
    for error in pool.map(some_function_call, parameters1, parameters2):
        total_error += error

In this case, parameters1 and parameters2 should each be a list or other iterable whose length equals the number of times you want to run the function (24 in your example).
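
For example, if each of the 24 calls should receive the same two input vectors, you can build the iterables by repetition (a sketch; vector_a and vector_b are placeholder names for your inputs):

parameters1 = [vector_a] * 24  # vector_a, vector_b: placeholders for your two vectors
parameters2 = [vector_b] * 24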

If parameters1 and parameters2 are not iterables/mappable, but you just want to run the function 24 times, you can submit the job the required number of times and later collect the results using a callback.

class TotalError:
    def __init__(self):
        self.value = 0

    def __call__(self, r):
        # invoked when a future finishes; accumulate its result
        self.value += r.result()

total_error = TotalError()
with futures.ProcessPoolExecutor() as pool:
    for i in range(24):
        future_result = pool.submit(some_function_call, parameters1, parameters2)
        future_result.add_done_callback(total_error)

print(total_error.value)
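
If you would rather avoid the callback class, an equivalent pattern (a sketch under the same assumptions as the code above) is to keep the futures in a list and sum their results with futures.as_completed:

from concurrent import futures

total_error = 0
with futures.ProcessPoolExecutor() as pool:
    fs = [pool.submit(some_function_call, parameters1, parameters2)
          for _ in range(24)]
    for f in futures.as_completed(fs):
        # result() re-raises any exception raised in the worker
        total_error += f.result()

print(total_error)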

You can use Python's multiprocessing module:

from multiprocessing import Pool, freeze_support, cpu_count

all_args = [(parameters1, parameters2) for i in range(24)]

def wrapped_some_function_call(args):
    """
    We need to wrap the call to unpack the parameters
    we built above into a tuple, so that pool.map can be used.
    """
    return some_function_call(*args)

if __name__ == "__main__":
    # needed on Windows when the script is frozen into an executable
    freeze_support()

    # your machine's core count is usually a good choice (although maybe not the best)
    with Pool(cpu_count()) as pool:
        results = pool.map(wrapped_some_function_call, all_args)

    total_error = sum(results)
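
Note that Pool.starmap (available since Python 3.3) unpacks the argument tuples for you, so the wrapper is not strictly necessary. A sketch, reusing all_args and some_function_call from above:

from multiprocessing import Pool, cpu_count

if __name__ == "__main__":
    with Pool(cpu_count()) as pool:
        # starmap calls some_function_call(parameters1, parameters2) for each tuple
        results = pool.starmap(some_function_call, all_args)
    total_error = sum(results)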

2 Comments

Why def wrapped_some_function_call(args) if you never call it?
@pstatix, because I did not realize it; thanks for pointing it out :)
