Skip to content

Dedicated SV copying code in place of Perl_sv_setsv_flags #23202

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: blead
Choose a base branch
from

Conversation

richardleach
Copy link
Contributor

@richardleach richardleach commented Apr 15, 2025

Perl_sv_setsv_flags is the heavyweight function for assigning the value(s) of
a source SV to a destination SV. It contains many branches for preparing the
destination SV prior to assignment. However:

  • If the destination SV has just been created, much of that logic isn't needed.
  • When cloning a SV, simple assignments (particularly IVs and PVs) dominate.

This set of commits:

  • Extracts the "is this CoWable?" test from Perl_sv_setsv_flags into a macro.
  • Adds Perl_sv_freshcopy_flags and two static helper functions.
  • Modifies Perl_newSVsv_flags and Perl_sv_mortalcopy_flags to use them.
  • Standardizes a number of call sites that did their own things but really
    should use Perl_newSVsv_flags or Perl_sv_mortalcopy_flags.

Using perl's test harness as a guide:

  • Bodyless code handles 45% of calls to Perl_newSVsv_flags and
    57% of calls to Perl_sv_mortalcopy_flags.
  • The SVt_PV/SVp_POK code handles 32% of calls to
    Perl_newSVsv_flags and 36% of calls to Perl_sv_mortalcopy_flags.
  • S_sv_freshcopy_flags code handles 95% of the remainder in
    Perl_newSVsv_flags and 91% of the remainder in to Perl_sv_mortalcopy_flags.

With these changes compared with a build of blead:

  • perl -e 'for (1..100_000) { my $x = [ (1) x 1000 ]; }' runs 10% faster

  • perl -e 'for (1..100_000) { my $x = [ ("Perl") x 250 ]; }' runs 45% faster


  • This set of changes does require a perldelta entry and I'll write one post-merge.
@richardleach richardleach added the defer-next-dev This PR should not be merged yet, but await the next development cycle label Apr 15, 2025
@Leont
Copy link
Contributor

Leont commented Apr 29, 2025

Cloning is rather unfortunate choice of words, given that it has a very specific meaning in our codebase that is quite different from what this PR is about. Renaming the PR may be helpful.

@richardleach richardleach changed the title Dedicated SV cloning code in place of Perl_sv_setsv_flags Dedicated SV copying code in place of Perl_sv_setsv_flags Apr 29, 2025
@richardleach richardleach force-pushed the S_sv_freshcopy_flags branch 2 times, most recently from c392526 to 91c2b99 Compare May 8, 2025 16:54
@richardleach
Copy link
Contributor Author

I've made a lot of changes following earlier comments - thanks for those - and have finally force-pushed.

These changes aren't complete. For example:

  • Measured performance seems worse than in the PR version, so I need to look into that
  • Might change sflag handling/ SvFLAGS(dsv) setting
  • Not settled on struct membet initialisation
  • Might still rename the function that is currently Perl_newSVsv_flags and have newSVsv_flags be a macro that checks (ssv) before calling the sv.c function.
sv.c Outdated
Comment on lines 4833 to 4840
if (sflags & SVf_IsCOW) {
sv_buf_to_rw(ssv);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unclear if there's a bug here carried over from Perl_sv_setsv_flags.

If (!(sflags & SVf_IsCOW)) above, sflags isn't updated to reflect that SVf_Is_COW is now set on ssv, but why would we not want to do sv_buf_to_rw(ssv) in that situation?

Copy link
Contributor

@bulk88 bulk88 Jun 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to make another PR about a very sketchy function in Perl 2003-2025 with 1 parent caller, its name is sv_setsv_cow(). I might be mistaken, but your code highlight, and what I think 1 branch in sv_setsv_cow() is doing, in my eyes, is a virgin never been HW ROed buffer, that for the first time will be getting HW RO perms added, therefore no ring 0 call is needed to mark it RW perms. verify I am making any sense, this feature isn't available on Windows and I don't see the point/personal necessity to port it to Windows. It would take me 2-5 hours to do that, somewhere between easy and monotonous/unchallenging on the difficulty scale.

Edit:

Since my Win32::API::Callback for many years already has MsMalloc() blocks that are inter-ithread safe, clone safe, Perl ref counted, executable perm-ed (x86/x64 mach code) read only perm-ed (DEP/being a good citizen). If I wrote such an API, I would probably write the patch for WinPerl to have HW RO COW buffers for production grade default WinPerl stable. Win32::API::Callback flips the RO RW perm bit manually, with minor help from the MS-only Public API HeapCreate().

AFAIK Father C's (???) #define PERL_DEBUG_READONLY_COW is un-usable in a production stable perl b/c extreme memory waste. I think it give each 'Newx/malloc()block an entire 4096 byte long page valgrind/dr memory style. Win 7's undocumented turn-key HW RO malloc feature,RtlProtectHeap()has no !!! memory/RAM overhead, since the blocks still have WinOS's default 8 byte long headers. RtlProtectHeap()'s runtime CPU overhead, ...., well ....., that is a different story. RtlProtectHeap()internally is a large tree of if/else branches and some loops. B/c its closed source, I can't describe exactly what all the asm code logic does. ButRtlProtectHeap()is very simple internally, the only 2 functions it is capable of calling areZwQueryVirtualMemory()andZwProtectVirtualMemory()which areinterrupt 0x2e` system calls.

More realistically, Perl 5 needs to just verbatim copy Chrome Browser's hardware RO COW/Heap implementation. I'm sure by now Webkit and Mozilla also use Chrome's HW RO COW/Heap design.

Remember Chrome's design is penetration tested and extensively fuzzed, to prevent rootkit-ed/exploited chroot jailed OS processes aka same origin browser tabs aka V8 isolates (my_perl in Perl-ese) aka JS VMs instances, where blackhat javascript code escaped the javascript sandbox and now has C lang/Asm machine code control over an OS thread in a chroot jailed OS process.

Chrome's design is simple, 2 mmaps at different address ranges, or in 2 different OS processes, to the same phy RAM pages, or to the same pagefile virtual RAM pages. The JS runtime/runloop engine only knows the address of RO-perm-ed view or address range of mmap() object. The hypervisor, which is a same process different OS thread, or a different OS process, knows the location of BOTH address ranges, the RO public address range, and the RW secret address range. The "hypervisor" (in Perl-ese that means the "proprietary undocumented internals of sv.c and libperl.so"), so the "hypervisor", gets called from untrusted JS bytecode, to RC++ or RC-- an JS data-ish object or a JS Function object, and the hypervisor uses the secret RW memory window to change the reference count on a ref counted/color counted JS object, Untrusted byte code can read the RC and other JS object header fields (SV * header), but untrusted JS byte code, even if exploited to machine code/C level, can't rewrite a COWed/RO/immutable/static/const JS object.

Since libperl.so and PP code fundamentally, as designed, are a https://en.wikipedia.org/wiki/Arbitrary_code_execution exploit, P5P's only duty is user error/dev error/idiot mitigation. Perl and P5P have minimal security obligation to think of, P5P would just embed the RO ptr mmap to RW ptr mmap conversion equation into macro #define share_hek_hek(hek) and mission accomplished,

If a user has suffered damages from Perl or /bin/sh being an arbitrary code execution exploit, please send all legal threats to https://kernel.org/ and https://www.opengroup.org and https://canonical.com/ .

Since around the 5.10 era, `Perl_sv_setsv_flags` has unconditionally set
`SvPOK_only(dsv)` in the `SVp_POK` branch. The associated comment reads:

    /* Whichever path we take through the next code, we want this true,
       and doing it now facilitates the COW check.  */

Things have changed since 5.10 though, in particular using the `‎SVf_POK`
to distinguish between a value that started off as a string from one that
was originally an integer/float and later stringified.

This commit:
* Removes the `SvPOK_only(dsv)` in favour of `SvOK_off(dsv)` and hoisting
  the copying of `sflags` over.
* Transforms the subsequent now-redundant `SVf_POK` toggles into asserts
  (to help reduce) the chance of inadvertent behaviour changes.
`Perl_sv_setsv_flags` is a hot function that contains liberal sprinklings
of `SvOK_off()`. This commit changes two instances, where the operand SV
cannot possibly be using the OOK hack, to do direct flag twiddling instead.

`SvOK_off()` does two things:
1. Toggles off some flags:
    SvFLAGS(sv) &= ~(SVf_OK|SVf_IVisUV|SVf_UTF8)

2. Checks for use of the OOK hack and undoes it:
    ((void)(SvOOK(sv) && (sv_backoff(sv),0)))

At least some compilers seem to struggle to figure out when `SvOOK(sv)`
cannot be true and to then elide the call to `sv_backoff()`. This is
desirable when:

1. ssv & dsv are both lower types than SVt_PV and cannot support OOK
2. inside a block following a conditional check that OOK is not in use

In the two cases identified, the flag toggling is now done explicitly.
Perl_sv_freshcopy_flags creates a fresh SV that contains the
values of it source SV argument. It's like calling `new_SV(dsv)`
followed by `sv_setsv_flags(dsv, ssv, flags`, but is optimized
for a brand new destination SV and the most common code paths.

The intended initial users for this new function were:
* Perl_sv_mortalcopy_flags (still in sv.c)
* Perl_newSVsv_flags (now a simple function in sv_inline.h)

Perl_sv_freshcopy_flags handles the following cases:
* SVt_NULL
* SVt_IV
* SVt_NV
* SVt_PV
* SVt_LAST

For everything else, it calls S_sv_freshcopy_PVxx which handles:
* SVt_INVLIST
* SVt_PVIV
* SVt_PVNV
* SVt_PVMG - with no GET magic

For everything else, there's a fall back to sv_setsv_flags.

S_sv_freshcopy_POK is a dedicated helper for string swipe/COW/copy
logic and is called from both Perl_sv_freshcopy_flags and
S_sv_freshcopy_PVxx.

With these changes compared with the previous commit:

* `perl -e 'for (1..100_000_0) { my $x = { (1) x 1000 }; }'` runs about 20% faster

* `perl -e 'for (1..100_000_0) { my $x = { ("Perl") x 250 }' runs about 40% faster

* `perl -e 'for (1..100_000_0) { my $x = { a => 1, b => 2, c => 3, d => 4, e => 5 }; }'`
   is a touch faster, but within the margin for error

* `perl -e 'for (1..100_000_0) { my $x = { a => "Perl", b => "Perl", c => "Perl", d => "Perl", e => "Perl" } ; }'`
   runs about 17% faster
Besides using the just-introduced faster path for SV copying, this
allows the check for SV_GMAGIC to be pushed into the called function
without having to worry about SV leaks.

Two additional micro-optimizations are also in this commit:
* A pointer to xav_fill is cached for use in the loop. This can
  be used directly to update AvFILLp(av), rather than having to
  get there from av's SV* each time.

* The value of the loop iterator, i, is directly written into
  xav_fill, rather than getting the value in that slot,
  incrementing it (to get the same value as i), and writing it back.
@richardleach richardleach force-pushed the S_sv_freshcopy_flags branch from 05d5efa to bf95ea0 Compare June 8, 2025 22:28
#define S_SvPV_shared_hkey_or_CoWable(ssv, dsv, sflags, cur, len) \
(sflags & SVf_IsCOW \
? (!len || \
( (CHECK_COWBUF_THRESHOLD(cur,len) || SvLEN(dsv) < cur+1) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK the answer id have to compile this brach and do heavy single step to learn an tech correct answer.

Amateur question, shouldn't the bounds check be SvLEN(dsv) < cur+2) not SvLEN(dsv) < cur+1)?
dsv can't cow propogate further unless minimum 2 invisible bytes. 1 invis byte is '\0'.

/* many COW "copies" are possible. */ \
&& CowREFCNT(ssv) != SV_COW_REFCNT_MAX )) \
: ( (sflags & CAN_COW_MASK) == CAN_COW_FLAGS \
&& !(SvFLAGS(dsv) & SVf_BREAK) \
Copy link
Contributor

@bulk88 bulk88 Jun 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is SVf_BREAK on a SvPINok()? ive never grepped an answer from the repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not followed the actual code, but perlhacktips.pod mentions use:

  • during global destruction
  • on DEBUGGING builds to help detect leaked SVs

https://github.com/Perl/perl5/blob/blead/pod/perlhacktips.pod#leaked-sv-spotting-sv_mark_arenas-and-sv_sweep_arenas

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in a production stable/default perl build, expresssion !(SvFLAGS(dsv) & SVf_BREAK) means to make a new Newx() block and do a Copy() if sv_setsv() is calling during global destruction phase? Correct? IDK enough.

&& !(SvFLAGS(dsv) & SVf_BREAK) \
&& CHECK_COW_THRESHOLD(cur,len) && cur+1 < len \
&& (CHECK_COWBUF_THRESHOLD(cur,len) || SvLEN(dsv) < cur+1) \
))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, +1 logic or +2 logic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just extracting this pre-existing condition code. I don't want to change it in this PR. (I suspect it's probably okay as-is, but would have to sit down with lots of post-its to be sure.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I look around the commit, i see where you copy pasted it from, so I guess its fine for this commit. Any flaws or bugs or concerns about +1 or +2 logic are not part of the PR and are probably already discussed in a random stalled open p5p gh ticket.

#else
#define S_SvPV_shared_hkey_or_CoWable(ssv, dsv, sflags, cur, len) \
( sflags & SVf_IsCOW \
&& !(SvFLAGS(dsv) & SVf_BREAK) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesnt look like the usual macro combo to detect SVPV HEK*s

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's straight out of Perl_sv_setsv_flags.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is correct, testing fCOW by itself, in isolation, confirms its 1 of 3 things, COW STATIC, COW 255 or COW HEK.

2 permutations of "dont Safefree() it" are ignored, SvLEN 0 and OOK_on. But we dont care about those 2 permutation at this particular LOC.

@@ -4306,25 +4328,19 @@ Perl_sv_setsv_flags(pTHX_ SV *dsv, SV* ssv, const I32 flags)
/* Should preserve some dsv flags - at least SVs_TEMP, */
/* so cannot just set SvFLAGS(dsv) = new_dflags */
/* First clear the flags that we do want to clobber */
(void)SvOK_off(dsv);
SvFLAGS(dsv) &= ~SVTYPEMASK;
SvFLAGS(dsv) &= ~(SVTYPEMASK|SVf_OK|SVf_IVisUV);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(void)SvOK_off(dsv); is a dtor-ing function call involving a malloc block, are you sure this change is correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm 99.9% sure in this bodyless SVs code block. The only thing these SVs should be capable of pointing to is an RV to another SV head.

SvOK_off is this:
SvFLAGS(sv) &= ~(SVf_OK|SVf_IVisUV|SVf_UTF8), SvOOK_off(sv))

with SvOOK_off being:
((void)(SvOOK(sv) && (sv_backoff(sv),0)))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are correct, 80-150 lines upwards, I see the if() test for type <= IV or <= NV or <= BODYLESS. my 1st comment about (void)SvOK_off(dsv) is N/A.

(void)SvPOK_only(dsv);
(void)SvOK_off(dsv);
SvFLAGS(dsv) |= sflags &
(SVf_POK|SVp_POK|SVf_IOK|SVp_IOK|SVf_NOK|SVp_NOK|SVf_UTF8);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why isnt SVF_ISUV copied over from sflags? did anything earlier wipe the SVf_UTF8 inside SvFLAGS(dsv)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out, I think the flag handling around this bit of code could be tidied up further. Will look at it again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember (void)SvOK_off(dsv); being a func call was IIRC a Father C 2014 or 2012 quick fix for random memory corruption/leaks involving SvOOK() back when the OOK delta count was stored in a SVt_PVIV inside field SvIVX(). Someone later on changed OOK()'s design to be a BER/ASN.1/SMI/compressed integer stored in the PV buffer. Eliminating the former design limitation, that an OOK POK SV, can NEVER EVER EVER be a dual var with POK + OOK + IOK.

I'm not gonna touch cactus that is sv_backoff()/SvOK_off(dsv) unless I'm at a YAPC or a Toolchain summit with another qualified Perl C human 3-6 feet away.

/* Passes the swipe test. */
SvLEN_set(dsv, len);
SvCUR_set(dsv, cur);
SvPV_set(dsv, SvPVX_mutable(ssv));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split mem read SvPVX_mutable(ssv) from SvPV_set(dsv, ), and move the mem read SvPVX_mutable(ssv) and save it to a C auto, right before SvLEN_set(dsv, len);. pointer aliasing rules in C. C abstract machine doesn't know if SvCUR_set(dsv, cur) will write ontop of 8 bytes of memory backing SvPVX_mutable(ssv).

/* (void)SvOK_off(ssv); but without the superfluous SvOOK_off(ssv)) */
SvFLAGS(ssv) &= ~(SVf_OK|SVf_IVisUV|SVf_UTF8|SVs_TEMP);

SvPV_set(ssv, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this line to below SvCUR_set(ssv, 0);, pointer aliasing, ILP, pipeline stall reasons. The value of SvANY(ssv) is unknown until SvPV_set(ssv, NULL); is 100% completed by the CPU pipeline/ CPU conveyor belt.

#else
dsv->sv_u.svu_rv = SvREFCNT_inc( old->sv_u.svu_rv );
#endif
SvFLAGS(dsv) = SVt_IV|SVf_ROK;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move SvFLAGS(dsv) = SVt_IV|SVf_ROK; up to between these 2 lines

            assert(SvANY(dsv));
///////////////    here
#if defined (DEBUGGING) || defined (PERL_DEBUG_COW)

that way

            dsv->sv_u.svu_iv = old->sv_u.svu_iv;
           dsv->sv_u.svu_rv = SvREFCNT_inc( old->sv_u.svu_rv );

are likely to get some cpu ops deduped by the CC

#if defined (DEBUGGING) || defined (PERL_DEBUG_COW)
dsv->sv_u.svu_rv = SvREFCNT_inc(SvRV(old));
#else
dsv->sv_u.svu_rv = SvREFCNT_inc( old->sv_u.svu_rv );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ADD SvREFCNT_inc_simple_NN(sv) its faster

* taint and vstring magic, which are also handled here. */

static void
S_sv_freshcopy_PVxx(pTHX_ SV* dsv, SV* ssv, const I32 flags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change retval type to SV* and return SV* dsv as the retval. Never waste a free register. Tailcall or reducing liveness or reducing peak non volatile reg usage opportunities exist now, or will exist in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did try it and (IIRC) it performed either worse or no better. I'll look again before force-pushing changes for some of the other comments above though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wouldn't be benchmark able, Its just a provision for the future. A human still needs to single step the asm code and manually look for callsites where that "setter" call is very close or right above a possible return some_size_t; statement.

I have a never finished branch turning all to Perl_sv_set**v**_*mg*() to return their SV* arg, sv_2mortal() style. Doing the change I described would allow 25%-50% of all XSUBs on CPAN, to use 100% volatile CPU registers, and zero reads/writes to the C stack, and zero save/restores of non-vol registers, since that 25%-50% group of XSUBs don't need to carry any C land variables, around any function call in their body.

* This function is marked for inlining, also to benefit the hot SVt_PV case.
*/
PERL_STATIC_FORCE_INLINE void
S_sv_freshcopy_POK(pTHX_ SV* dsv, SV* ssv, const I32 flags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change void to SV*, return as retval SV* dsv see oth cmt 4 rational

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See reply to other comment. ;) I will still look at it again though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defer-next-dev This PR should not be merged yet, but await the next development cycle
4 participants