Everything I can find indicates that dask's map_partitions should return a dask dataframe object. But the following code snippet and the corresponding output (logged with logzero) show that it does not. (Note: calc_delta returns a np.array of floats.)
    logger.debug(type(self.dd))  # line 352 in the log output
    self.dd = self.dd.map_partitions(
        lambda df: df.assign(
            duration1=lambda r: calc_delta(r['a'], r['b']),
            duration2=lambda r: calc_delta(r['a'], r['c']),
        )
    ).compute(scheduler='processes')
    logger.debug(type(self.dd))  # line 359 in the log output
    [D 200316 19:19:28 exploratory:352] <class 'dask.dataframe.core.DataFrame'>
    [D 200316 19:19:43 exploratory:359] <class 'pandas.core.frame.DataFrame'>
All the guidance I've found (plus a lot of trial and error) suggests that this is the way to add (logical) columns to a partitioned dask dataframe. But that doesn't help if the call doesn't actually return a dask dataframe.
What am I missing?