How are pandas objects able to be passed into numpy functions?

Question

Curious how this works under the hood. Did numpy just build functionality to handle pandas objects or is there something else going on here?

import numpy
import pandas

data = pandas.Series([1,2,3,4])
numpy.sqrt(data) # returns pandas.Series

Mustafa Aydın · Accepted Answer · 2021-04-12 18:40:20Z

2

In addition to overriding __array__ as the other answer mentions:

pd.Series: it implements __array_ufunc__, which overrides the ufunc's behaviour, including what the output looks like.

pd.DataFrame: it doesn't implement that method, but it does implement __array_wrap__, which gives it control over what the output looks like.
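To make the __array_ufunc__ route concrete, here is a minimal sketch using a made-up class. `Labeled` and its `label` attribute are hypothetical and are not pandas code; the only real piece is the `__array_ufunc__` hook that NumPy looks for on ufunc inputs.

```python
import numpy as np

class Labeled:
    """Hypothetical container: values plus a label, not an ndarray subclass."""

    def __init__(self, values, label):
        self.values = np.asarray(values)
        self.label = label

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Only handle plain calls such as np.sqrt(x); defer on reduce, accumulate, etc.
        if method != "__call__":
            return NotImplemented
        # Unwrap any Labeled inputs to their raw arrays.
        raw = [x.values if isinstance(x, Labeled) else x for x in inputs]
        result = ufunc(*raw, **kwargs)
        # Re-wrap so the caller gets a Labeled back instead of an ndarray.
        return Labeled(result, self.label)

x = Labeled([1.0, 4.0, 9.0], label="demo")
y = np.sqrt(x)
print(type(y).__name__, y.values, y.label)
```

Because the class takes over the whole ufunc call, it decides both how the computation runs and what type comes back.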

See the NumPy ufunc documentation for how the output type is determined. The pandas docs also mention the Series case.
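The __array_wrap__ route can be sketched the same way. `Wrapped` below is a hypothetical class, not pandas code. It has no `__array_ufunc__`, so NumPy converts it with `__array__`, computes on the plain array, and then passes the result to `__array_wrap__`. The `copy` and `return_scalar` parameters are included on the assumption that the code may run under NumPy 2, which passes them.

```python
import numpy as np

class Wrapped:
    """Hypothetical container with only __array__ and __array_wrap__."""

    def __init__(self, values, label):
        self.values = np.asarray(values)
        self.label = label

    def __array__(self, dtype=None, copy=None):
        # NumPy calls this to obtain a plain ndarray to compute on.
        # copy is accepted for NumPy 2 compatibility and is not used here.
        return np.asarray(self.values, dtype=dtype)

    def __array_wrap__(self, array, context=None, return_scalar=False):
        # NumPy calls this with the raw ufunc result; re-attach the label.
        return Wrapped(array, self.label)

w = Wrapped([1.0, 4.0, 9.0], label="demo")
out = np.sqrt(w)
print(type(out).__name__, out.values, out.label)
```

Here the class does not control the computation itself, only the final packaging of the result.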


hpaulj · Accepted Answer · 2021-04-12 18:44:19Z

The dataframe (and Series) has an __array__ method:

In [138]: df
Out[138]: 
   Account1  Account2  m_solution
0       150        18 -117.857143
1       130      1200  104.586466
2       150        18 -117.857143
3       106      1200   88.793262
4       150        18 -117.857143
5       170      1200  127.810219
6       150       138   -6.250000
7      1056      1200   67.404255
In [139]: df.__array__()
Out[139]: 
array([[ 150.        ,   18.        , -117.85714286],
       [ 130.        , 1200.        ,  104.58646617],
       [ 150.        ,   18.        , -117.85714286],
       [ 106.        , 1200.        ,   88.79326187],
       [ 150.        ,   18.        , -117.85714286],
       [ 170.        , 1200.        ,  127.81021898],
       [ 150.        ,  138.        ,   -6.25      ],
       [1056.        , 1200.        ,   67.40425532]])

Equivalently you can get the array with:

In [140]: df.values
Out[140]: 
array([[ 150.        ,   18.        , -117.85714286],
       [ 130.        , 1200.        ,  104.58646617],
       [ 150.        ,   18.        , -117.85714286],
       [ 106.        , 1200.        ,   88.79326187],
       [ 150.        ,   18.        , -117.85714286],
       [ 170.        , 1200.        ,  127.81021898],
       [ 150.        ,  138.        ,   -6.25      ],
       [1056.        , 1200.        ,   67.40425532]])
In [141]: df.to_numpy()
Out[141]: 
array([[ 150.        ,   18.        , -117.85714286],
       [ 130.        , 1200.        ,  104.58646617],
       [ 150.        ,   18.        , -117.85714286],
       [ 106.        , 1200.        ,   88.79326187],
       [ 150.        ,   18.        , -117.85714286],
       [ 170.        , 1200.        ,  127.81021898],
       [ 150.        ,  138.        ,   -6.25      ],
       [1056.        , 1200.        ,   67.40425532]])

The pandas docs recommend to_numpy() over values.

The frame's data is stored in one or more arrays (depending on the dtypes). Whether the array you get back from these calls is that underlying array itself, a view of it, or a copy can vary.
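One way to probe the view-versus-copy question is np.shares_memory. The sketch below uses two made-up frames. What it prints depends on the pandas version and settings, so treat the results as something to check locally, not as a guarantee.

```python
import numpy as np
import pandas as pd

# Single dtype: all columns can live in one underlying array.
single = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
# Mixed dtypes: int and float columns must be combined into a new float64 array.
mixed = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})

for name, frame in [("single", single), ("mixed", mixed)]:
    first = frame.to_numpy()
    second = frame.to_numpy()
    # True: both calls returned views of the same memory.
    # False: at least one of the calls produced a copy.
    print(name, first.dtype, np.shares_memory(first, second))

# copy=True always returns a fresh array, independent of the frame's storage.
forced_copy = single.to_numpy(copy=True)
print(np.shares_memory(forced_copy, single.to_numpy()))
```

If you need to modify the result without touching the frame, passing copy=True removes the ambiguity.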

Code for __array__

Signature: df.__array__(dtype=None) -> 'np.ndarray'
Docstring: <no docstring>
Source:   
    def __array__(self, dtype=None) -> np.ndarray:
        return np.asarray(self._values, dtype=dtype)
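A quick way to see __array__ at work without calling it directly is np.asarray, which asks the object for its array form. The frame below is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.0, 4.0], "y": [9.0, 16.0]}, index=["r0", "r1"])

# np.asarray asks the frame for an ndarray, which goes through __array__.
arr = np.asarray(df)
print(type(arr).__name__, arr.shape)

# The explicit spelling gives the same values.
print(np.array_equal(arr, df.to_numpy()))

# A ufunc applied to the frame hands back a frame, with labels intact.
rooted = np.sqrt(df)
print(type(rooted).__name__, list(rooted.index), list(rooted.columns))
```

So __array__ explains how NumPy gets at the data, while the pandas return type comes from the other hooks discussed in this thread.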

See also Series.__array__. It's a bit different.

and Series.__array_wrap__:

S.__array_wrap__(
    result: 'np.ndarray',
    context: 'Optional[Tuple[Callable, Tuple[Any, ...], int]]' = None,
)
Docstring:
Gets called after a ufunc and other functions.

Parameters
----------
result: np.ndarray
    The result of the ufunc or other function called on the NumPy array
    returned by __array__

Thanks, it seems numpy functions operate on the result of the __array__ method, but how is the return type preserved? E.g. returning a pandas.Series instead of a numpy.ndarray?
Looking at the methods of a Series, I see __array_wrap__. A Series isn't a subclass of ndarray, but it appears to have many of the methods that let it behave like one.
