@sinhrks
Member

@sinhrks sinhrks commented Jul 30, 2016

  • tests added / passed
  • passes git diff upstream/master | flake8 --diff
  • whatsnew entry

When a ufunc is applied to a sparse array, it is applied only to the stored values and not to fill_value, so the results are incorrect.

On current master:

np.abs(pd.SparseArray([1, -2, -1], fill_value=-2))
# [1.0, -2, 1.0]
# Fill: -2
# IntIndex
# Indices: array([0, 2], dtype=int32)

np.add(pd.SparseArray([1, -2, -1], fill_value=-2), 1)
# [2.0, -2, 0.0]
# Fill: -2
# IntIndex
# Indices: array([0, 2], dtype=int32)
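To make the problem concrete, here is a minimal sketch in plain Python. It is a toy model of a sparse array (stored values, their positions, and a fill value), not pandas' actual implementation; the helper names `to_dense`, `apply_stored_only`, and `apply_with_fill` are hypothetical and exist only for illustration.

```python
def to_dense(sp_values, indices, fill_value, length):
    """Expand a toy sparse representation into a plain list."""
    dense = [fill_value] * length
    for pos, value in zip(indices, sp_values):
        dense[pos] = value
    return dense


def apply_stored_only(func, sp_values, fill_value):
    """Behaviour reported above: func touches only the stored values."""
    return [func(v) for v in sp_values], fill_value


def apply_with_fill(func, sp_values, fill_value):
    """Behaviour this PR aims for: func is also applied to fill_value."""
    return [func(v) for v in sp_values], func(fill_value)


# [1, -2, -1] with fill_value=-2 stores only the values at positions 0 and 2.
sp_values, indices, fill_value, length = [1, -1], [0, 2], -2, 3

buggy_values, buggy_fill = apply_stored_only(abs, sp_values, fill_value)
fixed_values, fixed_fill = apply_with_fill(abs, sp_values, fill_value)

buggy_dense = to_dense(buggy_values, indices, buggy_fill, length)
fixed_dense = to_dense(fixed_values, indices, fixed_fill, length)

print(buggy_dense)  # [1, -2, 1]
print(fixed_dense)  # [1, 2, 1]
```

In this model, the position covered by fill_value keeps its old value unless the function is applied to fill_value too, which mirrors the `-2` left behind in the `np.abs` output above.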

cc @gfyoung

@sinhrks sinhrks added Bug Sparse Sparse Data Type Compat pandas objects compatability with Numpy or Python functions labels Jul 30, 2016
@sinhrks sinhrks added this to the 0.19.0 milestone Jul 30, 2016
@codecov-io

codecov-io commented Jul 30, 2016

Current coverage is 85.27% (diff: 93.33%)

Merging #13853 into master will decrease coverage by <.01%

@@             master     #13853   diff @@
==========================================
  Files           139        139          
  Lines         50020      50031    +11   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          42657      42666     +9   
- Misses         7363       7365     +2   
  Partials          0          0          

Powered by Codecov. Last update 97de42a...a14f573

Member

Can you explain why this result is correct?

Member Author

Because assert_sp_array_equal compares the sparse internal representation, the expected value in the test is constructed so that its internal representation matches. You can see that the result is correct from its dense representation.

# test case
sparse = pd.SparseArray([1, -1, 2, -2], fill_value=1)
abs(sparse).to_dense()
# array([ 1.,  1.,  2.,  2.])
# result
pd.SparseArray([1, 2, 2], sparse_index=sparse.sp_index, fill_value=1).to_dense()
# array([ 1.,  1.,  2.,  2.])
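The distinction between internal and dense equality can be sketched with the same kind of toy model in plain Python. This is an illustration under stated assumptions, not how assert_sp_array_equal is implemented: the dictionaries and the `to_dense` helper are hypothetical, and the "compact" form assumes that values equal to fill_value are not stored.

```python
def to_dense(sp_values, indices, fill_value, length):
    """Expand a toy sparse representation into a plain list."""
    dense = [fill_value] * length
    for pos, value in zip(indices, sp_values):
        dense[pos] = value
    return dense


# abs() of [1, -1, 2, -2] with fill_value=1: positions 1, 2, 3 were stored
# before the call, so the result keeps three stored values on the same index.
result = {"sp_values": [1, 2, 2], "indices": [1, 2, 3], "fill_value": 1}

# The same dense data built from scratch (toy assumption): the 1 at
# position 1 equals fill_value, so only two values need to be stored.
compact = {"sp_values": [2, 2], "indices": [2, 3], "fill_value": 1}

same_dense = to_dense(length=4, **result) == to_dense(length=4, **compact)
same_internal = result == compact

print(same_dense, same_internal)  # True False
```

A comparison of internals would reject `compact` even though both forms describe the same data, which is why the expected value in the test reuses `sparse.sp_index`.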
Member

Okay, good to know. I wasn't 100% clear on how the sparse comparison worked. Thanks!

@gfyoung
Member

gfyoung commented Aug 1, 2016

LGTM

cc @jreback

Contributor

Can you put a comment here explaining what this is doing?

@jreback
Contributor

jreback commented Aug 1, 2016

I understand why this is needed, but it feels a tad unnatural. I am not sure users will expect the ufunc to be applied to the fill value as well. Can we add a section to the docs showing this?

@sinhrks
Member Author

sinhrks commented Aug 1, 2016

Sure, added a small section.

4 participants