-
Notifications
You must be signed in to change notification settings - Fork 597
Move PL_curstack code from Perl_av_extend_guts to Perl_stack_grow #18014
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I have no idea if anything actually calls av_store with PL_curstack but it definitely has code to handle that case. If the stack handling code is removed from Perl_av_extend_guts, then it will have to be removed from av_store too. |
Thanks for pointing that out. I'd seen it but had a brain fart and ignored its implications. Will try to dig into it over the weekend. "XPUSH in disguise" goes all the way back to Perl 5.0.0: https://perl5.git.perl.org/perl5.git/blob/a0d0e21ea6ea90a22318550944fe6cb09ae10cda:/av.c |
I am not sure about the implications of this. It may break important stuff, it may not. |
FWIW I put a conditional Perl_croak at the start of av_store, it wasn't triggered by make test. Advanced Perl Programming, 1st Ed mentions that |
Looking more at when "XPUSH in disguise" was introduced:
A wild supposition:
|
On Sat, Aug 01, 2020 at 01:31:48PM -0700, Richard Leach wrote:
[stuff about PL_curstack etc ]
I think you're getting the wrong end of the stick here.
The av == PL_curstack checking is there because in some places,
such as the @ary = split(...) optimisation, perl cheats and takes an
existing array (@ary in that example) and tells the core to temporarily use
that as the stack. In other words, it temporarily sets PL_curstack to
point to @ary (and updates PL_stack_base etc).The problem is that if
something grows @ary while this is happening, e.g.
@ary = split(/(?{ push @ary, 1 })/, ....);
Then PL_stack_sp etc may be left dangling pointing to freed memory.
Hence the checks for PL_stack_sp in av.c
…--
Diplomacy is telling someone to go to hell in such a way that they'll
look forward to the trip
|
So not only there are no comments explaining that but it also isn't being tested by our tests. |
Well volunteered! ;-) |
Thanks, Dave!
Bleugh, I see. If you have time for questions:
Also, while I'm asking likely-dumb questions about av.c:
|
Happy to help with this....once I understand it more. |
First draft at a comment block for av_extend_guts:
Comments/suggestions welcome. |
Minor tweak.
|
Started working on a test, but my patch doesn't break this (even under ASan): Devel::Peek shows that, at the end, @ary:
PL_stack_sp has somehow survived or a better/bigger test case is needed? |
On Mon, Aug 03, 2020 at 03:23:57PM -0700, Richard Leach wrote:
1. I'll try to walk through pp_split, but what is the split optimisation
in a nutshell? Saving the need to copy the resulting elements from the
stack into the array? Or is there more to it than that? (Other than not
potentially making the stack massive, if that can be avoided.)
That optimisation has been there since 5.000, well before my time, but I
assume it helps performance a bit and as you say, avoids overgrowing the
stack on a large split. Personally I don' like it - it seems too tricksy.
2. Is that the only sane way to achieve the split optimisation?
Of the top of my head, yes.
It seems to make others pay for split's cleverness. :-( e.g. `perl -e
"push @gg, 1 for (1..50); print @gg"` incurs the cost of checking
whether ***@***.*** == PL_curstack`) multiple times as @gg gets expanded.
The check in av_store is cheap(): it is guarded by if (!AvREAL(av)),
and all normal AVs are AvREAL(), so the extra PL_curstack will only be
done occasionally.
3. Do you know of anything that uses PL_curstack with av_store in
https://github.com/Perl/perl5/blob/blead/av.c#L365? As noted above,
nothing in the test suite does.
if (av == PL_curstack && key > PL_stack_sp - PL_stack_base)
PL_stack_sp = PL_stack_base + key; /* XPUSH in disguise */
There appears to be a bug here. That check in av_store() only applies
to !AvREAL() arrays, but in the case of split, the array being temporarily
used as a stack is still AvREAL(), so the check isn't done. Indeed on a
vanilla debugging perl:
$ perl5320 -e'@ary = split(/\w(?{ @ary[1000] = 1 })/, "abc")'
Split loop at -e line 1.
panic: POPSTACK
$
4. Is using the stack like this supported for XS modules or embedded use
cases, or is it core-only fun?
Its undocumented. No idea if any XS uses it.
Also, while I'm asking likely-dumb questions about av.c:
5. Is there a compelling reason nowadays why array elements are initialised in a loop, rather than potentially more efficiently with the Zero macro?
```
if (av && AvREAL(av)) {
while (tmp)
ary[--tmp] = NULL;
}
```
Probably not.
6. Related to (5): OO applications, e.g. a Mojolicious app, surely
create far more non-stack arrays than stacks, so can arrays not just be
created with _Newxz_ rather than doing the following? Or are there also
some creates-lots-of-stacks use cases?
```
Newx(*allocp, newmax+1, SV*);
ary = *allocp + 1;
tmp = newmax;
*allocp[0] = NULL; /* For the stacks */
That looks another bug to me. That line was originally
*allocp[0] = &PL_sv_undef; /* For the stacks */
, since every stack has &PL_sv_undef as its fist element. FC changed it
with
commit ce0d59f
Author: Father Chrysostomos <[email protected]>
AuthorDate: Tue Jul 2 13:07:45 2013 -0700
Commit: Father Chrysostomos <[email protected]>
CommitDate: Tue Aug 20 21:38:07 2013 -0700
[perl #7508] Use NULL for nonexistent array elems
Since something somewhere else is still setting stack[0] to &PL_sv_undef
anyway, I suspect that line is just redundant.
…--
This email is confidential, and now that you have read it you are legally
obliged to shoot yourself. Or shoot a lawyer, if you prefer. If you have
received this email in error, place it in its original wrapping and return
for a full refund. By opening this email, you accept that Elvis lives.
|
Yeah, but when the array has to be extended, av_extend_guts still checks for PL_curstack. (Sorry for not being clearer above.)
Huh. So the check in av_store() should apply to all arrays if it is to be effective? Or, in the wee, small, heat-insomnia hours I wondered if the following could work:
Still tricksy, but then only pp_split has to care?
Thanks, I'll add it to the ideas-to-try pile. |
On Tue, 11 Aug 2020 at 10:28, Richard Leach ***@***.***> wrote:
Is there a compelling reason nowadays why array elements are initialised
in a loop, rather than potentially more efficiently with the Zero macro? if
(av && AvREAL(av)) { while (tmp) ary[--tmp] = NULL; }
Probably not.
Thanks, I'll add it to the ideas-to-try pile.
Could it be because of overflow on some systems? I mean, calloc() exists
for a reason distinct from malloc(), is there a bzero equivalent that takes
the size of element independently from the number of elements? Would that
ever even matter?
Also, could it simply be faster on some systems? It would be zeroing 8
bytes at a time on a 64 bit box. I dont know enough portability of C to say
myself, but it seems conceivable that on some systems it could be a faster
way to deal with aligned data than a byte oriented operation like bzero().
cheers,
Yves
…--
perl -Mre=debug -e "/just|another|perl|hacker/"
|
I don't know. I can't imagine that, but perhaps my imagination - and exposure to multiple platforms - isn't good enough. ;) Any change would definitely have a dependence on the underlying malloc/calloc implementations, as well as the ability of a compiler to recognize what's being done and convert it anyway during optimization.
It seems very conceivable that the existing code used to be faster on some platforms, don't know if it still is or if perl still supports those platforms. Modern memset()/memzero(), apart from perhaps fixing up a handful of unaligned bytes, should do at least a pointer-width of bytes at a time. On x86-64, it's very likely that even with a non-targeted build, it will do more than that (e.g. using SSE2 instructions). Glibc can memset 64 bytes at a time using AVX instructions - but I suppose that code would have to be compiled for the native/targeted architecture to get that? I should check if at -02, the likes of Clang and GCC already convert |
@iabyn - also anything that equates to this, which tweaking the av_store code wouldn't fix:
Feels like a game of whack-a-mole. ;) Presumably there are some scenarios that will always end badly, e.g. if the reference count on @ ary is dropped to zero in the But if you think the following - or some such - might be worth trying, I'm happy to find out and prepare a PR if it pans out:
|
Couldn't sleep, so I did this. No test failures, but needs tidying up before any PR. |
On Thu, Aug 20, 2020 at 01:52:53AM -0700, Richard Leach wrote:
> Indeed on a vanilla debugging perl:
>
> $ perl5320 ***@***.*** = split(/\w(?{ @ary[1000] = 1 })/, "abc")'
> Split loop at -e line 1.
> panic: POPSTACK
@iabyn - also anything that equates to this, which tweaking the _av_store_ code wouldn't fix:
```
# perl -e ***@***.*** = split(/\w(?{ undef @ary })/, "abc")'
Segmentation fault
```
Feels like a game of whack-a-mole. ;) Presumably there are some scenarios that will always end badly, e.g. if the reference count on @ ary is dropped to zero in the `?{}` block.
But if you think the following - or some such - might be worth trying, I'm happy to find out and prepare a PR if it pans out:
- When the optimization kicks in, create a new AV
- SWITCHSTACK that temp AV
- Sometime after the SWITCHSTACK has been undone, probably in the if (realarray) section, the AvARRAY/AvALLOC/AvMAX details of @ ary and the temp AV could be swapped over. (More likely, I'd swap the stash pointers over in the xpvav bodies, then swap the xpvav pointers in the SV heads.)
- mortalize the the temp AV so that the next FREETMPS will clear it up
Sorry I should have replied to this earlier, before you went to the
trouble of producing a PR.
I don't like this idea. For a start, pp_slit() is already overly-complex;
adding more complexity isn't good.
I think the best approach is to just to partially remove the optimisation.
I.e. split onto to the stack as normal, then at the end, if
OPpSPLIT_ASSIGN, then empty the array, extend it, and Copy() the stack to
AvARRAY. This means the code is still fast due to not needing to execute
padav and aassign ops. The only downsize is that that stack's high water
mark for the rest of the program's execution might be excessive.
You might want to add some more split entries to t/perf/benchmarks then
use Porting/bench.pl to see what affect any changes have on performance.
Although I'm rejecting your PR, I'll make some quick comments for your
future benefit.
1) There were about 400 lines separating tmp4array being created and it
being mortalised. That's lots of scope for something to die inbetween and
for the array to be leaked.
2) I didn't like your choice of variable names: tmp4ary, tmp1, tmp2, tmp3,
tmp4. Not very meaningful.
…--
In England there is a special word which means the last sunshine
of the summer. That word is "spring".
|
That's fair enough. I wasn't sure if partially removing the optimisation was an option, hence #18090.
Ok, happy to start work on that at the weekend.
Do you think it's worth comparing the stack size before/after and potentially shrinking the stack if it grows excessively? (For some value of "excessively".)
Thanks, I appreciate that.
Would mortalising it early have been the correct approach?
Noted. :) |
On Thu, Aug 27, 2020 at 04:26:38AM -0700, Richard Leach wrote:
Do you think it's worth comparing the stack size before/after and potentially shrinking the stack if it grows excessively? (For some value of "excessively".)
Probably not.
Would mortalising it early have been the correct approach?
Would it have to be pushed to the tmps stack as well - like below - and then taken off at the end?
```
EXTEND_MORTAL(1);
PL_tmps_stack[++PL_tmps_ix] = SvREFCNT_inc_simple_NN(av);
orig_ix = PL_tmps_ix;
Well that's effectively what sv2mortal() does anyway. The popping is
usually done via bracketing the whole thing in SAVETMPS/FREETMPs to set a
new high water mark for the temps stack then popping it back to that.
Often with the whole thing wrapped in an ENTER/LEAVE. But it all depends
on the circumstances and what else is being done at the time with the
tmps and mortals stacks. For example things pushed onto the args stack are
often mortalised, so freeing the tmps stack before any args have been
popped and embedded into an array would prematurely free them.
…--
Music lesson: a symbiotic relationship whereby a pupil's embellishments
concerning the amount of practice performed since the last lesson are
rewarded with embellishments from the teacher concerning the pupil's
progress over the corresponding period.
|
On Thu, Aug 27, 2020 at 05:04:48AM -0700, iabyn wrote:
tmps and mortals stacks.
s/mortals/save/
…--
"You're so sadly neglected, and often ignored.
A poor second to Belgium, When going abroad."
-- Monty Python, "Finland"
|
118ac16
to
dc44b36
Compare
#18232 has removed the use of SWITCHSTACK by the @ary = split(...) optimisation, so like a dog with a bone, I'd like to return to the original discussion. i.e. Two
For av_store, @iabyn noted earlier that "There appears to be a bug here. That check in av_store() only applies to !AvREAL() arrays, but in the case of split, the array being temporarily used as a stack is still AvREAL(), so the check isn't done." The remaining users of SWITCHSTACK in core (PUSHSTACKi, POPSTACK, SAVESWITCHSTACK) and on CPAN seem to be of the type Can anyone point to known usage of those two code paths now? Or should I try smoking CPAN to see if anything breaks when they are removed? |
@richardleach what do you want to do with this P.R.? |
I'd like to smoke cpan with the |
This pull request has had a "do not merge" label on it since July 2020. If it's not under active development, I recommend that we close it and open up a new ticket when and if needed. Thank you very much. |
Guess I should have pushed for merging this at the start of the 5.37 cycle to see what broke. I'll close it for now and make a note to revisit it later. |
The extend-an-existing-array code in Perl_av_extend_guts contains the following:
However, the stack probably represents a small proportion of calls to Perl_av_extend_guts.
At least nowadays, the
via av_store()?
comment seems to be a red herring, as all in-core instances where (av == PL_curstack) seem to originate from Perl_stack_grow.This commit therefore moves the PL_stack* allocations to Perl_stack_grow.
Local test builds of this commit were successful, but this commit should be: