I updated to v0.2.8 today and noticed that my code runs much slower than before. This seems to be related to the inclusion of the `lqrt` test in the results. In the two runs below, printing `mean_diff` takes 564 ms with v0.2.7 and 2.47 s with v0.2.8.
- Test 1: virtual env with Python 3.7.5, pandas 0.24.0, dabest 0.2.7
import numpy as np
import pandas as pd
import dabest
np.random.seed(1234)
df = pd.DataFrame({'Group1':np.random.normal(loc=0, size=(1000,)),
'Group2':np.random.normal(loc=1, size=(1000,))})
test = dabest.load(df, idx=['Group1','Group2'])
%time print(test.mean_diff)
DABEST v0.2.7
Good morning!
The current time is Tue Dec 31 11:46:00 2019.
The unpaired mean difference between Group1 and Group2 is 1.03 [95%CI 0.941, 1.11].
The two-sided p-value of the Mann-Whitney test is 2.63e-97.
5000 bootstrap samples were taken; the confidence interval is bias-corrected and accelerated.
The p-value(s) reported are the likelihood(s) of observing the effect size(s),
if the null hypothesis of zero difference is true.
To get the results of all valid statistical tests, use .mean_diff.statistical_tests
CPU times: user 558 ms, sys: 5.83 ms, total: 564 ms
Wall time: 564 ms
- Test 2: virtual env with Python 3.7.5, pandas 0.25.3, dabest 0.2.8
import numpy as np
import pandas as pd
import dabest
np.random.seed(1234)
df = pd.DataFrame({'Group1':np.random.normal(loc=0, size=(1000,)),
'Group2':np.random.normal(loc=1, size=(1000,))})
test = dabest.load(df, idx=['Group1','Group2'])
%time print(test.mean_diff)
DABEST v0.2.8
Good morning!
The current time is Tue Dec 31 11:47:09 2019.
The unpaired mean difference between Group1 and Group2 is 1.03 [95%CI 0.941, 1.11].
The two-sided p-value of the Mann-Whitney test is 2.63e-97.
5000 bootstrap samples were taken; the confidence interval is bias-corrected and accelerated.
The p-value(s) reported are the likelihood(s) of observing the effect size(s),
if the null hypothesis of zero difference is true.
To get the results of all valid statistical tests, use .mean_diff.statistical_tests
CPU times: user 2.46 s, sys: 8.69 ms, total: 2.47 s
Wall time: 2.47 s
Would it be possible to defer the statistical tests until `effect_size.statistical_tests` is accessed, instead of computing all of them up front?
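To illustrate the kind of deferral I have in mind, here is a minimal sketch using a cached property. This is not dabest's actual code: the class `LazyEffectSize`, its attributes, and the permutation test standing in for the expensive tests are all hypothetical, and I am assuming the tests can be separated from the effect size calculation.

```python
import random
from statistics import mean


class LazyEffectSize:
    """Hypothetical stand-in for an effect-size object (not dabest's class)."""

    def __init__(self, control, test, n_permutations=1000, seed=1234):
        self.control = list(control)
        self.test = list(test)
        self.n_permutations = n_permutations
        self.seed = seed
        # The cheap quantity is computed eagerly.
        self.difference = mean(self.test) - mean(self.control)
        # Cache slot for the expensive tests; None means "not computed yet".
        self._statistical_tests = None
        self.n_test_runs = 0  # counter, for demonstration only

    def _run_statistical_tests(self):
        # Stand-in for the expensive work: a two-sided permutation test.
        self.n_test_runs += 1
        rng = random.Random(self.seed)
        pooled = self.control + self.test
        n_control = len(self.control)
        observed = abs(self.difference)
        extreme = 0
        for _ in range(self.n_permutations):
            rng.shuffle(pooled)
            diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
            if abs(diff) >= observed:
                extreme += 1
        # Add-one correction keeps the estimated p-value above zero.
        return {"pvalue_permutation": (extreme + 1) / (self.n_permutations + 1)}

    @property
    def statistical_tests(self):
        # Run the tests on first access only, then reuse the cached result.
        if self._statistical_tests is None:
            self._statistical_tests = self._run_statistical_tests()
        return self._statistical_tests


rng = random.Random(1234)
es = LazyEffectSize([rng.gauss(0, 1) for _ in range(50)],
                    [rng.gauss(1, 1) for _ in range(50)])
print(es.difference)         # no tests have run at this point
print(es.statistical_tests)  # tests run here, once
```

One caveat: the printed summary includes a Mann-Whitney p-value, so that particular test would presumably still have to run when the summary is printed. Only the remaining tests could be deferred this way.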