0

I have pandas.DataFrame object called df, and I want to interpolate its missing values via parallelization. This is what I do:

def func(df):
    return df.interpolate(method='linear', axis=1)


ddf = dd.from_pandas(df, npartitions=8)
res = ddf.map_partitions(func)
res2 = res.compute()

The outcome is,

print(res2)
0    None
1    None
2    None
3    None
4    None
5    None
6    None
7    None
dtype: object

and

type(res)
dask.dataframe.core.Series

Edit 1 After following @mdurant suggestion, I've changed the function to this

def func(df):
    return df.interpolate(method='linear', axis=1, inplace=True) 

and now the result is the expected one.

However, I'm still having some newbie questions regarding this code. The benchmarks below show that the non parallel version is faster than the parallel one.

Non parallel:

%time df.interpolate(method='linear', axis=1, inplace=True)
Interpolating missing values.
CPU times: user 19.5 s, sys: 162 ms, total: 19.7 s
Wall time: 19.8 s

Parallel:

res = ddf.map_partitions(func)
%time res2 = res.compute()
Interpolating missing values.Interpolating missing values.
Interpolating missing values.Interpolating missing values.
Interpolating missing values.Interpolating missing values.
Interpolating missing values.Interpolating missing values.
CPU times: user 29.1 s, sys: 2.3 s, total: 31.4 s
Wall time: 26.5 s

res.visualize()

enter image description here

This interpolation is a row-wise operation (interpolation is in row=1), so any chunk(thread) show run without penalization (chunking happens between indices).

1
  • 1
    The func in Edit 1 is the same as before. Commented Oct 30, 2018 at 11:12

1 Answer 1

3

The problem here is inplace=True - with this, the call to interpolate doesn't return anything, so the output of func() is None, and you get the results you see. Generally, Dask functions should return processed data rather than trying to change data in-place. Simply remove the keyword, and things will probably work.

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.