
Simple question: why are these standard deviations different?

>>> import numpy
>>> import pandas as pd
>>>
>>> arr = [10, 386, 479, 627, 20, 523, 482, 483, 542, 699, 535, 617, 577, 471, 615, 583, 441, 562,
...        563, 527, 453, 530, 433, 541, 585, 704, 443, 569, 430, 637, 331, 511, 552, 496, 484, 566,
...        554, 472, 335, 440, 579, 341, 545, 615, 548, 604, 439, 556, 442, 461, 624, 611, 444, 578,
...        405, 487, 490, 496, 398, 512, 422, 455, 449, 432, 607, 679, 434, 597, 639, 565, 415, 486,
...        668, 414, 665, 763, 557, 304, 404, 454, 689, 610, 483, 441, 657, 590, 492, 476, 437, 483,
...        529, 363, 711, 543]
>>> elements = numpy.asarray(arr)
>>> arr_D = {"A":arr}
>>> df = pd.DataFrame(arr_D)
>>>
>>> print(numpy.std(elements, axis=0))
118.51857760182034
>>> print(numpy.std(df['A']))
118.5185776018204
>>> print(df['A'].std(axis=0))
119.15407050904474

Is this a problem with my understanding of the topic? As far as I know, pandas uses NumPy, so the std of a DataFrame column and the NumPy std of the same column should be the same.

Is it a bug?


2 Answers


pandas uses the unbiased (sample) estimator by default, dividing by n-1 (ddof=1), while NumPy defaults to the biased (population) estimator, dividing by n (ddof=0). Neither is incorrect; they just use different defaults to calculate std.
To make NumPy use the unbiased estimator, pass ddof=1 to std:

>>> import numpy
>>> import pandas

>>> df = pandas.DataFrame(numpy.random.rand(100))

>>> numpy.std(df[0])  # default ddof=0: biased estimate (divides by n)
0.2877601644414916

>>> numpy.std(df[0], ddof=1)  # ddof=1: unbiased estimate (divides by n-1)
0.2892098469889083

>>> df[0].std()  # pandas default is ddof=1, so it matches numpy.std with ddof=1
0.2892098469889083
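The same switch works in the other direction: `Series.std` also accepts `ddof`, so passing `ddof=0` should make pandas agree with NumPy's default. Below is a minimal sketch of that, assuming a small made-up data set (`values` is illustrative, not the data from the question) so the numbers are easy to check by hand.

```python
import numpy as np
import pandas as pd

# Illustrative data (not from the question): mean is 5 and the squared
# deviations sum to 32, so the population variance is 32 / 8 = 4.
values = [2, 4, 4, 4, 5, 5, 7, 9]
s = pd.Series(values)

population_std_numpy = np.std(values)        # NumPy default, ddof=0
sample_std_numpy = np.std(values, ddof=1)    # divide by n-1 instead

population_std_pandas = s.std(ddof=0)        # override the pandas default
sample_std_pandas = s.std()                  # pandas default, ddof=1

print(population_std_numpy, population_std_pandas)  # both 2.0
print(sample_std_numpy, sample_std_pandas)          # both about 2.138
```

Whichever library you use, stating `ddof` explicitly avoids depending on the defaults.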




NumPy uses the biased std by default and pandas the unbiased one. In other words, NumPy divides by n (the number of elements) and pandas divides by n-1. Rescaling the pandas result by sqrt((n-1)/n) shows that the two match:

print(df['A'].std(axis=0)/numpy.sqrt(len(arr))*numpy.sqrt((len(arr)-1)))
#118.51857760182033
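To see where that rescaling factor comes from, here is a rough from-scratch sketch using only the standard library. The helper `std` and the sample `values` are hypothetical names made up for illustration; this mirrors the divide-by-n versus divide-by-(n-1) definitions and is not how NumPy or pandas implement it internally.

```python
import math


def std(values, ddof=0):
    """Standard deviation that divides the squared deviations by n - ddof."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - ddof))


# Illustrative data (not from the question).
values = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(values)

biased = std(values, ddof=0)    # divides by n, like NumPy's default
unbiased = std(values, ddof=1)  # divides by n-1, like pandas' default

# Both share the same sum of squared deviations, so they differ only
# by the factor sqrt((n - 1) / n).
rescaled = unbiased * math.sqrt((n - 1) / n)

print(biased, unbiased, rescaled)
```

The gap between the two shrinks as n grows, which is why the results in the question (n = 94) differ by only about half a percent.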
