Yay, a performance question that comes with profiler measurements! Excellent.
function call overhead
Thank you for those three tiny helper functions; they do a good job of explaining what's going on.
However, maybe we'd like to inline them? With your current optimization flags, the compiler may not be able to "see through" the function call boundary, so it could be having a tough time proving some very basic arithmetic facts.
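One way to sketch that (the helper's name and signature here are my invention, not from your code): define tiny helpers `static inline` in the same translation unit, or in a header, so the optimizer can replace the call with the arithmetic itself at `-O2`.

```cpp
// Hypothetical helper, purely illustrative. With the definition visible
// at the call site, the optimizer can fold the call away and prove facts
// like "result < width * height" about the surrounding loop.
static inline int CellIndex(int x, int y, int width) {
    return y * width + x;
}
```

Cross-translation-unit calls can get the same treatment with link-time optimization (`-flto` on gcc/clang).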
Which leads us to...
1-D versus 2-D
Passing in a one-dimensional array seems like a pretty inconvenient way of representing your higher-level business concepts. Couldn't we cast it to a two-dimensional array? I'd even be willing to suffer the cost of a single big memcpy() if it meant the compiler could easily see that (x, y) remain within bounds.
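A minimal sketch of the 2-D view, assuming a compile-time width (the width, function name, and element type are all assumptions for illustration):

```cpp
constexpr int kWidth = 8;  // assumed compile-time width, purely illustrative

// View a flat buffer as rows of kWidth. With the row type visible, the
// compiler can reason about x and y separately, instead of about one big
// multiplied-and-divided index.
inline int ReadCell(const int* flat, int x, int y) {
    const int (*grid)[kWidth] =
        reinterpret_cast<const int (*)[kWidth]>(flat);
    return grid[y][x];  // same storage as flat[y * kWidth + x]
}
```

If the width is only known at runtime, a thin accessor class (or C++23 `std::mdspan`) plays the same role without the cast.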
I know division isn't quite as expensive as it used to be on earlier CPUs. But still, that cellY / anchorStepY expression to recover a y-coordinate seems inconvenient. I'm sad that the row number isn't already available for use inside that loop. (Or perhaps you've arranged for anchorStepY to be a power of two, so it reduces to a simple bit shift.)
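Both alternatives can be sketched like this; the loop shape and the `kAnchorStepY` value are assumptions, only the cellY / anchorStepY expression comes from your code:

```cpp
constexpr int kAnchorStepY = 8;  // assumed power of two, 2^3
constexpr int kShift = 3;        // log2(kAnchorStepY)

// For non-negative cellY the division is just a shift (and the compiler
// will do this rewrite itself once it can prove cellY >= 0).
inline int RowOf(int cellY) {
    return cellY >> kShift;  // == cellY / kAnchorStepY for cellY >= 0
}

// Better yet: never divide at all; carry the row counter along.
inline int SumOfRows(int totalCells) {
    int sum = 0;
    int row = 0;
    for (int cellY = 0; cellY < totalCells; cellY += kAnchorStepY, ++row)
        sum += row;  // the row number is already available, no division
    return sum;
}
```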
in-bounds by construction
if (!IsInBounds(index0, min, max) || ... )
continue;
Does this check ever actually trigger? Does it ever report out-of-bounds?
The y < lastYCell + max and x < width + max guards seem like they're already performing the same work, no?
If that predicate does sometimes report false, then consider restructuring the loop, not unlike loop splitting: iterate over slightly fewer cells with no check at all, and then at the end take care of the remaining cells while carefully checking.
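Here is a sketch of that split on a deliberately simple stand-in problem (the function and data are hypothetical, not your code): summing each cell with its right-hand neighbour. The hot interior loop is in-bounds by construction; only the final cell takes the careful path.

```cpp
#include <vector>

int SumWithRightNeighbour(const std::vector<int>& cells) {
    const int n = static_cast<int>(cells.size());
    int sum = 0;
    // Hot loop: i + 1 < n guarantees both reads are in bounds,
    // so no per-element predicate is needed.
    for (int i = 0; i + 1 < n; ++i)
        sum += cells[i] + cells[i + 1];
    // Checked epilogue: only the edge cell, which has no right neighbour.
    if (n > 0)
        sum += cells[n - 1];
    return sum;
}
```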
I wonder if a few judicious assert statements would help the compiler to see that indexes are provably within-bounds. Comparing generated object code for very slightly different functions over at https://godbolt.org may prove instructive.
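For instance (names hypothetical), an assert both documents the invariant for readers and, in debug builds, verifies it; comparing the emitted code with and without it on godbolt shows whether your compiler exploits the hint:

```cpp
#include <cassert>

// The assert states the "in-bounds by construction" invariant explicitly.
// In NDEBUG builds it compiles to nothing, but some optimizers will use
// the implied range information while it is present.
inline int GetCell(const int* cells, int index, int count) {
    assert(index >= 0 && index < count);
    return cells[index];
}
```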
This code achieves most of its design goals.
I would be willing to delegate or accept maintenance tasks on it.