J_H

Yay, a performance question that comes with profiler measurements! Excellent.

function call overhead

Thank you for those three tiny helper functions, they do a good job of explaining what's going on.

However, maybe we'd like to inline them? I mean, with your current optimization flags, maybe the compiler cannot "see through" the function call boundary, so it's having a tough time proving some very basic arithmetic facts?
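To make that concrete, here is a minimal sketch in C++ (the language is my assumption, and this `IsInBounds` body is hypothetical, since I can only guess at what yours does). Defining the helper as `constexpr` in the same translation unit as the loop means the optimizer can see its body at each call site. `CountInBounds` is an invented caller, just there to exercise it.

```cpp
#include <cassert>

// Hypothetical stand-in for the original helper; the real one may differ.
// constexpr functions are implicitly inline, so the body is visible to the
// optimizer wherever the helper is called.
constexpr bool IsInBounds(int index, int min, int max) {
    return index >= min && index < max;
}

// Invented caller: counts how many indexes in [first, last) pass the check.
// Once the helper is inlined, the comparison can be reasoned about together
// with the loop condition instead of hiding behind a call.
int CountInBounds(int first, int last, int min, int max) {
    assert(first <= last);
    int count = 0;
    for (int i = first; i < last; ++i) {
        if (IsInBounds(i, min, max)) {
            ++count;
        }
    }
    return count;
}
```

Whether your compiler was already inlining the helpers is something only the generated code can tell you, so treat this as something to try rather than a guaranteed win.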

Which leads us to...

1-D versus 2-D

Passing in a one-dimensional array seems like a pretty inconvenient way of representing your higher-level business concepts. Couldn't we cast it to a two-dimensional array? I'd even be willing to suffer the cost of a single big memcpy() if it meant the compiler could easily see that (x, y) remain within bounds.

I know / division isn't quite as expensive as it used to be on earlier CPUs. But still, that cellY / anchorStepY expression to recover a y-coordinate seems inconvenient. I'm sad that in that loop we don't have the row number already available for use. (Or perhaps you've arranged for anchorStepY to be a power of two, leading to simple bit shifting.)
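Here is a rough sketch of what I mean, again assuming C++. `GridView` and `SumAnchorColumn` are names I made up, and the loop shape is a guess at yours. The view lays 2-D indexing over the flat buffer without copying anything, and the loop carries a `row` counter alongside `cellY`, so the quotient never has to be computed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical 2-D view over a flat, row-major buffer. No copy is made.
struct GridView {
    const int* data;
    int width;
    int height;

    int at(int x, int y) const {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        return data[y * width + x];
    }
};

// Invented example: sum the first cell of every anchorStepY-th row.
// The row counter advances in lockstep with cellY, so it always equals
// cellY / anchorStepY without performing the division in release builds.
int SumAnchorColumn(const std::vector<int>& flat, int width, int height,
                    int anchorStepY) {
    assert(anchorStepY > 0);
    assert(static_cast<int>(flat.size()) == width * height);
    const GridView grid{flat.data(), width, height};
    int total = 0;
    int row = 0;
    for (int cellY = 0; cellY < height; cellY += anchorStepY, ++row) {
        assert(row == cellY / anchorStepY);  // documents the invariant
        total += grid.at(0, cellY);
    }
    return total;
}
```

For a 2-wide, 4-high grid holding 1 through 8 and a step of 2, this visits rows 0 and 2 and adds 1 + 5.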

in-bounds by construction

                if (!IsInBounds(index0, min, max) || ... )
                    continue;

Does this even trigger? Does it ever report out-of-bounds? The y < lastYCell + max and x < width + max guards seem like they're already performing the same work, no? If that predicate does sometimes report false, then consider changing the loop around, not unlike a loop unroll. Maybe you'd like to iterate over slightly fewer cells with no check, and then at the end take care of those remaining cells while carefully checking.
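A one-dimensional toy version of that split might look like the following. It is C++ by assumption, and `BlurChecked`, `BlurSplit`, and this `IsInBounds` are all hypothetical stand-ins rather than your code. The first function checks every neighbor access. The second runs the interior cells with no check at all, then handles the two border cells with the careful check.

```cpp
#include <vector>

// Hypothetical stand-in for the original helper.
constexpr bool IsInBounds(int index, int min, int max) {
    return index >= min && index < max;
}

// Baseline: every neighbor access is guarded.
std::vector<int> BlurChecked(const std::vector<int>& in) {
    const int n = static_cast<int>(in.size());
    std::vector<int> out(in.size(), 0);
    for (int i = 0; i < n; ++i) {
        for (int d = -1; d <= 1; ++d) {
            if (!IsInBounds(i + d, 0, n)) {
                continue;
            }
            out[i] += in[i + d];
        }
    }
    return out;
}

// Split version: interior cells are in bounds by construction, so only
// the first and last cells pay for the check.
std::vector<int> BlurSplit(const std::vector<int>& in) {
    const int n = static_cast<int>(in.size());
    if (n < 3) {
        return BlurChecked(in);  // too small to have an interior
    }
    std::vector<int> out(in.size(), 0);
    for (int i = 1; i < n - 1; ++i) {
        out[i] = in[i - 1] + in[i] + in[i + 1];
    }
    for (int i : {0, n - 1}) {
        for (int d = -1; d <= 1; ++d) {
            if (IsInBounds(i + d, 0, n)) {
                out[i] += in[i + d];
            }
        }
    }
    return out;
}
```

Both functions should produce identical output, which makes the checked one a handy oracle for testing the fast one.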

I wonder if a few judicious assert statements would help the compiler to see that indexes are provably within-bounds. Comparing generated object code for very slightly different functions over at https://godbolt.org may prove instructive.
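As a sketch of the kind of experiment I have in mind (C++ assumed, and `SumRange` is an invented function, not yours): state the precondition once up front, then use a checked accessor inside the loop. With assertions enabled, the optimizer knows the condition holds past the `assert`, so it may be able to drop the per-element check inside `at()`. Whether it actually does depends on the compiler and flags, which is exactly what comparing the two variants on godbolt would show.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Invented example: sum `count` cells starting at `first`.
// The assert states the whole-range precondition once. Note that it
// compiles to nothing when NDEBUG is defined, so in that configuration
// the compiler learns nothing from it.
int SumRange(const std::vector<int>& cells, std::size_t first,
             std::size_t count) {
    assert(first <= cells.size() && count <= cells.size() - first);
    int total = 0;
    for (std::size_t i = first; i < first + count; ++i) {
        total += cells.at(i);  // bounds-checked access; throws if out of range
    }
    return total;
}
```

Try compiling it once with the `assert` line and once without, and diff the loop bodies.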


This code achieves most of its design goals.

I would be willing to delegate or accept maintenance tasks on it.
