I am trying to understand exactly how numpy and pandas interact. In particular, according to the docstring, a pandas.Series object is an ndarray with labels. numpy methods seem to work just fine on these objects. Is there casting somewhere under the hood? The best I have been able to dig up is that some numpy functions call numpy.asanyarray() to convert the pandas.Series to an ndarray. Is there anything else happening internally?
1 Answer
No, the pandas containers are not numpy.ndarray objects, strictly speaking. That is, they do not inherit from numpy.ndarray. In other words:
In [5]: import pandas as pd
In [6]: df = pd.DataFrame()
In [7]: s = pd.Series(dtype=float)  # explicit dtype avoids the empty-Series warning in some pandas versions
In [8]: import numpy as np
In [9]: isinstance(df, np.ndarray)
Out[9]: False
In [10]: isinstance(s, np.ndarray)
Out[10]: False
Or, stated more directly:
In [12]: issubclass(pd.DataFrame, np.ndarray), issubclass(pd.Series, np.ndarray)
Out[12]: (False, False)
Simply put, these containers wrap numpy.ndarray objects, and expose many of the same methods. There is no casting going on. Casting is not a very useful concept in a language that uses duck-typing like Python.
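To make the "wrapping" concrete, here is a minimal sketch, assuming a reasonably recent pandas (one that provides Series.to_numpy()). The sample values and labels are made up for illustration. It shows that a Series is not an ndarray, that it can hand back its values as one, and that a numpy function applied to it returns a Series with the labels intact.

```python
import numpy as np
import pandas as pd

# Illustrative data; the values and labels are arbitrary.
s = pd.Series([1.0, 4.0, 9.0], index=["a", "b", "c"])

# The Series itself is not an ndarray...
is_array = isinstance(s, np.ndarray)

# ...but it can expose its values as one, either through its own
# method or through NumPy's array conversion.
values = s.to_numpy()
as_array = np.asarray(s)

# Applying a NumPy ufunc to the Series gives back a Series,
# with the original index labels preserved.
roots = np.sqrt(s)

print(is_array)
print(type(values).__name__, type(as_array).__name__)
print(type(roots).__name__, list(roots.index))
```

So numpy does not need the Series to be an ndarray; it only needs the object to be convertible to one, or to know how to handle the numpy function itself.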
It is the ._data attribute of pandas objects that does most of the actual "wrapping" around NumPy arrays.