optimize sort by inlining comparison functions #17608

Merged
merged 7 commits into from
Mar 9, 2020

Member

@xenu xenu commented Mar 2, 2020

This makes special-cased forms such as sort { $b <=> $a }
even faster.

Also, since this commit removes PL_sort_RealCmp, it fixes the
issue with nested sort calls mentioned in #16129

PS. Because of b578cab commit (normalize indentation), I recommend using "hide whitespace changes" when reviewing.
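
As a rough illustration of why inlining the comparison helps, here is a minimal C sketch, not the code from this PR. It assumes the general technique is a force-inlined generic sort body that each specialized entry point calls with a fixed comparator, so the compiler can replace the indirect call with the comparison itself. `FORCE_INLINE`, `sort_impl`, and the `sort_num_*` wrappers are hypothetical names, and the insertion sort only stands in for whatever algorithm the real implementation uses.

```c
#include <stddef.h>

/* Hypothetical stand-in for a force-inline macro; not Perl's actual definition. */
#if defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE static inline __attribute__((always_inline))
#elif defined(_MSC_VER)
#  define FORCE_INLINE static __forceinline
#else
#  define FORCE_INLINE static inline
#endif

typedef int (*cmp_fn)(double a, double b);

/* Three-way comparisons, ascending and descending. */
FORCE_INLINE int cmp_num_asc(double a, double b)  { return (a > b) - (a < b); }
FORCE_INLINE int cmp_num_desc(double a, double b) { return (b > a) - (b < a); }

/* Generic sort body. The comparator is a parameter, but because this
 * function is inlined into each wrapper below, the comparator is a
 * compile-time constant there and the call through it can be inlined too. */
FORCE_INLINE void sort_impl(double *v, size_t n, cmp_fn cmp)
{
    for (size_t i = 1; i < n; i++) {
        double key = v[i];
        size_t j = i;
        while (j > 0 && cmp(v[j - 1], key) > 0) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}

/* One specialized entry point per special-cased comparison. */
void sort_num_asc(double *v, size_t n)  { sort_impl(v, n, cmp_num_asc); }
void sort_num_desc(double *v, size_t n) { sort_impl(v, n, cmp_num_desc); }
```

Because each wrapper passes its comparator directly, no shared variable has to remember which comparator is active. That is presumably the property that matters for the nested-sort issue, though the details are in #16129.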

@xenu
Member Author

xenu commented Mar 3, 2020

Some benchmarks (the source code is here), "fastblead" is the one with my changes:

Key:
    Ir   Instruction read
    Dr   Data read
    Dw   Data write
    COND conditional branches
    IND  indirect branches
    _m   branch predict miss
    _m1  level 1 cache miss
    _mm  last cache (e.g. L3) miss
    -    indeterminate percentage (e.g. 1/0)

The numbers represent relative counts per loop iteration, compared to
/home/xenu/blead/bin/perl5.31.10 at 100.0%.
Higher is better: for example, using half as many instructions gives 200%,
while using twice as many gives 50%.

nsort_asc
numeric sort ascending

       /home/xenu/blead/bin/perl5.31.10 /home/xenu/fastblead/bin/perl5.31.10
       -------------------------------- ------------------------------------
    Ir                           100.00                               115.45
    Dr                           100.00                               113.63
    Dw                           100.00                               139.99
  COND                           100.00                               100.00
   IND                           100.00                           1000090.00

COND_m                           100.00                                88.89
 IND_m                           100.00                               100.00

 Ir_m1                           100.00                               100.00
 Dr_m1                           100.00                               100.00
 Dw_m1                           100.00                               100.00

 Ir_mm                           100.00                               100.00
 Dr_mm                           100.00                               100.00
 Dw_mm                           100.00                               100.00

nsort_desc
numeric sort descending

       /home/xenu/blead/bin/perl5.31.10 /home/xenu/fastblead/bin/perl5.31.10
       -------------------------------- ------------------------------------
    Ir                           100.00                               119.64
    Dr                           100.00                               120.00
    Dw                           100.00                               153.32
  COND                           100.00                               104.97
   IND                           100.00                           2000080.00

COND_m                           100.00                               100.00
 IND_m                           100.00                               100.00

 Ir_m1                           100.00                               100.00
 Dr_m1                           100.00                               100.00
 Dw_m1                           100.00                               100.00

 Ir_mm                           100.00                               100.00
 Dr_mm                           100.00                               100.00
 Dw_mm                           100.00                               100.00

ssort_asc
string sort ascending

       /home/xenu/blead/bin/perl5.31.10 /home/xenu/fastblead/bin/perl5.31.10
       -------------------------------- ------------------------------------
    Ir                           100.00                               100.00
    Dr                           100.00                               101.25
    Dw                           100.00                               100.00
  COND                           100.00                               100.00
   IND                           100.00                               150.00

COND_m                           100.00                               100.00
 IND_m                           100.00                               100.00

 Ir_m1                           100.00                               100.00
 Dr_m1                           100.00                               100.00
 Dw_m1                           100.00                               100.00

 Ir_mm                           100.00                               100.00
 Dr_mm                           100.00                               100.00
 Dw_mm                           100.00                               100.00

ssort_desc
string sort descending

       /home/xenu/blead/bin/perl5.31.10 /home/xenu/fastblead/bin/perl5.31.10
       -------------------------------- ------------------------------------
    Ir                           100.00                               103.35
    Dr                           100.00                               106.23
    Dw                           100.00                               106.06
  COND                           100.00                               100.00
   IND                           100.00                               200.00

COND_m                           100.00                                99.99
 IND_m                           100.00                               100.00

 Ir_m1                           100.00                               100.00
 Dr_m1                           100.00                               100.00
 Dw_m1                           100.00                               100.00

 Ir_mm                           100.00                               100.00
 Dr_mm                           100.00                               100.00
 Dw_mm                           100.00                               100.00

AVERAGE

       /home/xenu/blead/bin/perl5.31.10 /home/xenu/fastblead/bin/perl5.31.10
       -------------------------------- ------------------------------------
    Ir                           100.00                               109.00
    Dr                           100.00                               109.82
    Dw                           100.00                               120.87
  COND                           100.00                               101.20
   IND                           100.00                               342.82

COND_m                           100.00                                96.97
 IND_m                           100.00                               100.00

 Ir_m1                           100.00                               100.00
 Dr_m1                           100.00                               100.00
 Dw_m1                           100.00                               100.00

 Ir_mm                           100.00                               100.00
 Dr_mm                           100.00                               100.00
 Dw_mm                           100.00                               100.00
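
The percentages in the tables above can be read with a one-line formula. This is a sketch inferred from the key's own examples (half as many instructions gives 200%, twice as many gives 50%), not taken from the benchmark tool's source; `relative_score` is a hypothetical name.

```c
/* Relative score as described in the key: the baseline is 100%,
 * and a lower raw count for the candidate yields a higher score. */
double relative_score(double baseline_count, double candidate_count)
{
    return 100.0 * baseline_count / candidate_count;
}
```

Read the other way, a score of 115.45 for Ir means the patched build executed about 100 / 115.45, or roughly 87%, of the baseline's instructions.
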
@xenu
Member Author

xenu commented Mar 3, 2020

Here are some non-scientific benchmarks of numeric sort (source code):

> time ~/blead/bin/perl5.31.10 num_sort.pl

real    0m2.988s
user    0m2.875s
sys     0m0.047s
> time ~/blead/bin/perl5.31.10 num_sort.pl

real    0m2.984s
user    0m2.922s
sys     0m0.047s
> time ~/blead/bin/perl5.31.10 num_sort.pl

real    0m2.982s
user    0m2.953s
sys     0m0.000s
> time ~/fastblead/bin/perl5.31.10 num_sort.pl

real    0m2.485s
user    0m2.453s
sys     0m0.016s
> time ~/fastblead/bin/perl5.31.10 num_sort.pl

real    0m2.481s
user    0m2.406s
sys     0m0.047s
> time ~/fastblead/bin/perl5.31.10 num_sort.pl

real    0m2.519s
user    0m2.438s
sys     0m0.047s

and reverse numeric sort (source code):

> time ~/blead/bin/perl5.31.10 reverse_num_sort.pl

real    0m3.198s
user    0m3.141s
sys     0m0.016s
> time ~/blead/bin/perl5.31.10 reverse_num_sort.pl

real    0m3.206s
user    0m3.172s
sys     0m0.016s
> time ~/blead/bin/perl5.31.10 reverse_num_sort.pl

real    0m3.197s
user    0m3.156s
sys     0m0.016s
> time ~/fastblead/bin/perl5.31.10 reverse_num_sort.pl

real    0m2.578s
user    0m2.516s
sys     0m0.047s
> time ~/fastblead/bin/perl5.31.10 reverse_num_sort.pl

real    0m2.577s
user    0m2.531s
sys     0m0.016s
> time ~/fastblead/bin/perl5.31.10 reverse_num_sort.pl

real    0m2.566s
user    0m2.484s
sys     0m0.063s

As you can see, on my machine the optimized numeric sort is ~15% faster and reverse numeric sort is ~20% faster.
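
To check those percentages against the raw timings, one can average the three `real` times per build and compare. The helper names below are hypothetical, and the sketch measures time saved relative to blead; a different definition of "faster" would give slightly different numbers.

```c
#include <stddef.h>

/* Arithmetic mean of n wall-clock timings, in seconds. */
double mean_seconds(const double *t, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += t[i];
    return sum / (double)n;
}

/* Percentage of run time saved by the candidate relative to the baseline. */
double percent_time_saved(const double *base, const double *cand, size_t n)
{
    return 100.0 * (1.0 - mean_seconds(cand, n) / mean_seconds(base, n));
}

/* "real" times copied from the runs above. */
static const double num_blead[]     = { 2.988, 2.984, 2.982 };
static const double num_fastblead[] = { 2.485, 2.481, 2.519 };
static const double rev_blead[]     = { 3.198, 3.206, 3.197 };
static const double rev_fastblead[] = { 2.578, 2.577, 2.566 };
```

With these inputs, `percent_time_saved` comes out at roughly 16% for the numeric sort and roughly 20% for the reverse numeric sort, in line with the estimates above.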

@xenu xenu force-pushed the xenu/faster-sort branch from 932ef98 to 8d7c259 Compare March 3, 2020 03:11
@xenu
Member Author

xenu commented Mar 3, 2020

I force pushed to improve a comment in embed.fnc.

Contributor

@khwilliamson khwilliamson left a comment

LGTM

@xenu xenu force-pushed the xenu/faster-sort branch from 8d7c259 to 8bc8473 Compare March 4, 2020 23:10
@@ -179,7 +175,7 @@ typedef SV * gptr; /* pointers in our lists */
 */


-static IV
+PERL_STATIC_FORCE_INLINE IV __attribute__always_inline__
Contributor

I think that here and in one other place, the explicit __attribute__always_inline__ is a relic of an earlier version, since the macro already generates it. Most functions declared this way in this file don't specify the attribute explicitly.

Member Author

I've added sortsv_flags_impl to embed.fnc and removed the attribute from its definition but I would rather not do the same for dynprep().

dynprep() takes arguments of types that exist only inside pp_sort.c, so I'd have to either move the typedefs to perl.h (which is an obfuscation) or remove them (which is beyond the scope of this PR).

@dur-randir
Member

I observe ~5% overall improvement, which is very nice for such a heavily used function.

xenu added 7 commits March 8, 2020 02:46
It's the same thing as PERL_STATIC_INLINE but it also adds
__attribute__((always_inline)) or __forceinline if the compiler
supports that.
According to perlhack, code should be indented with four spaces.

This commit doesn't contain any functional changes. If you're
seeing it in "git blame" output, try the -w switch, which
hides whitespace-only changes.
This will make the future changes a bit easier.
This makes special-cased forms such as sort { $b <=> $a }
even faster.

Also, since this commit removes PL_sort_RealCmp, it fixes the
issue with nested sort calls mentioned in gh Perl#16129
@xenu xenu force-pushed the xenu/faster-sort branch from 8bc8473 to a2dde53 Compare March 9, 2020 04:30
@khwilliamson khwilliamson merged commit 044d25c into Perl:blead Mar 9, 2020
@hvds
Contributor

hvds commented Mar 11, 2020

@xenu I note that this is now generating warning: unused parameter 'flags' on the renamed S_sortsv_flags_impl. It is not immediately clear to me whether it just wants a PERL_UNUSED_ARG or whether the interface should actually remove flags from _impl and all the specific sort functions.

@jkeenan
Contributor

jkeenan commented Mar 12, 2020

> @xenu I note that this is now generating warning: unused parameter 'flags' on the renamed S_sortsv_flags_impl. It is not immediately clear to me whether it just wants a PERL_UNUSED_ARG or whether the interface should actually remove flags from _impl and all the specific sort functions.

As reported in #17632
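
The two options mentioned for the unused `flags` parameter can be sketched in plain C. This is an illustration only: `UNUSED_ARG` is a hypothetical stand-in for Perl's `PERL_UNUSED_ARG` (whose real definition may differ), and the function names and bodies are invented for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for PERL_UNUSED_ARG: evaluates the argument
 * as a discarded expression so the compiler treats it as used. */
#define UNUSED_ARG(x) ((void)(x))

/* Option 1: keep `flags` in the signature and mark it unused. */
int is_sorted_keep_flags(const int *v, size_t n, unsigned flags)
{
    UNUSED_ARG(flags);
    for (size_t i = 1; i < n; i++)
        if (v[i - 1] > v[i])
            return 0;
    return 1;
}

/* Option 2: drop the parameter from the implementation entirely. */
int is_sorted(const int *v, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (v[i - 1] > v[i])
            return 0;
    return 1;
}
```

Option 1 leaves every caller untouched, while option 2 requires changing each call site, which is the trade-off raised in the comment.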

@xenu xenu deleted the xenu/faster-sort branch May 31, 2020 03:48