PERF: uses bincount instead of hash table in categorical value counts #10874
I think the solution in the issue is faster than this, no?
I get but this does not check for nulls, and the index is not categorical.
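For readers following the thread, here is a minimal sketch of the idea under discussion: counting a categorical's integer codes with `np.bincount` instead of hashing the values, while still checking for nulls and returning a categorical index. This is not the code from the PR. The helper name `bincount_value_counts` and its return shape are hypothetical, and it relies on pandas coding missing values as `-1` in `Categorical.codes`.

```python
import numpy as np
import pandas as pd


def bincount_value_counts(cat):
    """Hypothetical helper: count categories with np.bincount on the codes.

    Returns (counts, n_missing), where counts is a Series indexed by a
    CategoricalIndex in category order (not sorted by frequency).
    """
    codes = cat.codes                    # integer codes; -1 marks a missing value
    mask = codes == -1                   # null check: bincount rejects negative input
    n_categories = len(cat.categories)

    # minlength keeps categories that never occur in the data at a count of 0.
    counts = np.bincount(codes[~mask], minlength=n_categories)

    # Keep the result's index categorical, with the same categories and order.
    index = pd.CategoricalIndex(
        cat.categories, categories=cat.categories, ordered=cat.ordered
    )
    return pd.Series(counts, index=index), int(mask.sum())


cat = pd.Categorical(["a", "b", "a", None, "c", "a"], categories=["a", "b", "c", "d"])
counts, n_missing = bincount_value_counts(cat)
```

Because the codes are already small non-negative integers, a single pass with `np.bincount` avoids building a hash table. The sketch leaves the counts in category order and reports nulls separately rather than reproducing the full `value_counts` behaviour.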
with
24f1e3f to 436e96e
wow, this does even better!
ping when green
Maybe worth adding a benchmark?
I will add a benchmark later today
436e96e to 855b804
added the benchmark, all green.
asv_bench/benchmarks/categoricals.py
Outdated
These should have only 1 action per timing function (so make 2 functions)
why should it be only 1 action?
You get a timing per function. So if you want to track performance of both with dropna True and False, it has to be in two functions.
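As a rough illustration of that point, an asv benchmark class might split the two cases as below. This is a sketch, not the contents of `asv_bench/benchmarks/categoricals.py`: the class name, method names, data size, and seed are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


class CategoricalValueCounts:
    # asv reports one timing per ``time_*`` method, so each dropna
    # variant gets its own method instead of sharing one.

    def setup(self):
        # Hypothetical data: 100k strings drawn from 10k distinct labels.
        n = 10**5
        rng = np.random.RandomState(1234)
        labels = ["s%04d" % i for i in rng.randint(0, n // 10, size=n)]
        self.ts = pd.Series(labels).astype("category")

    def time_value_counts_dropna(self):
        self.ts.value_counts(dropna=True)

    def time_value_counts_keepna(self):
        self.ts.value_counts(dropna=False)
```

With both calls in a single method, a regression in only one code path would be averaged into a single number and harder to spot.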
added separate calls
855b804 to c5a47e3
PERF: uses bincount instead of hash table in categorical value counts
thank you sir!
closes #10804
on branch: