-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Ruby: Remove canonical return nodes #12964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
5a3b101 to
7dc4956
Compare
7dc4956 to
1696d7e
Compare
a7a4417 to
37efc63
Compare
37efc63 to
fbc34d2
Compare
fbc34d2 to
1788c54
Compare
|
There's quite a few lost results in the Ruby evaluation. In particular, there are projects that didn't lose any call edges or taint sources, but where we lost a bunch of tainted nodes. That would imply that we lost some steps somehow, but it's not clear why. Have you investigated the underlying reason? Given the changes to type-tracking, one possible explanation could be that API graphs can no longer track use/def nodes to the same extent it did before, and so some summaries no longer apply to the same call sites. For now I've opened #13299 which adds some meta-queries for measuring calls to summarized callables. Seems useful to have in general, and perhaps this could help narrow down the root cause of these changes. Otherwise the code LGTM although it's honestly a bit hard to review. There seems to be an unfortunate increase in complexity overall, but it seems to be internal, while the public API is actually getting simplified so I'm ok with that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Python changes look fine, and the Python evaluation looks like a small speed-up, even :-)
I agree that the code looks a bit more complicated now, but the rewrite is quite systematic and I wonder if it calls for being packed into parameterised modules (although not necessarily in this PR).
In particular the first commit seems to call for a module for folding over a relation and get the right join order, while the two following commits suggest a module for recursive type tracking.
I checked the changes to |
I'll do this. Putting this PR back into draft, until I see whether the approach works. |
025471d to
3569561
Compare
3569561 to
48ac3e5
Compare
|
PR is ready for review again; only the last three commits are new. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the extra packaging work, the diff now looks extremely nice!
I really like how we can just use the interface in the Python code base and have the join-order logic happen for free behind the scenes.
| pragma[nomagic] | ||
| private predicate stepProj(TypeTrackingNode nodeFrom, StepSummary summary) { | ||
| step(nodeFrom, _, summary) | ||
| } | ||
|
|
||
| bindingset[nodeFrom, t] | ||
| pragma[inline_late] | ||
| pragma[noopt] | ||
| private TypeTracker stepInlineLate(TypeTracker t, TypeTrackingNode nodeFrom, TypeTrackingNode nodeTo) { | ||
| exists(StepSummary summary | | ||
| stepProj(nodeFrom, summary) and | ||
| result = t.append(summary) and | ||
| step(nodeFrom, nodeTo, summary) | ||
| ) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pattern looks like it could also be hidden in a parameterised module. I think of it as one of the micro modules that @jbj envisioned when he spoke in Nice about "the future with parameterised modules".
It does not have to happen in this PR, though :-)
| bindingset[nodeFrom, t] | ||
| pragma[inline_late] | ||
| pragma[noopt] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Jamen, det er jo hele tre ting på én gang! Det går da virkelig ikke."
(With apologies to those in the audience who did not grow up with Danish commercials for Kinder eggs.)
|
@asgerf : Would you like another look before merge? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! LGTM and thanks for investigating the result changes.
|
|
||
| pragma[nomagic] | ||
| private Node track(Input::State state, TypeTracker t) { | ||
| t.start() and Input::start(result, state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this PR is merged already, but shouldn't the start have had the filter applied as well? @hvitved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That will not work in the Ruby use-case, but I realize that in principle we ought to restrict all the Python start predicates using ignoreForCallGraph, yet as soon as we take a single step things will be filtered away, so I don't think it will matter in practice.


This PR is about porting the join-order improvement from #12979 to the type tracking library shared between Ruby and Python. The motivation for Ruby is to be able to get rid of canonical return nodes, which were only introduced to avoid bad joins during type tracking (and dually, to have steps from parameter nodes to their corresponding SSA node be simple local flow steps instead of call steps).
The join-order from #12979 is implemented in the first commit, and it is about getting exactly this join-order in the various step relations:
The above, however, does not work when type tracking is used in mutual recursion with the call graph construction, because
stepbecomes a recursive call (or, actually the subset ofstepthat isstepCallbecomes recursive), which means that we will have a conjunct with 3 recursive calls (the twostepCallcalls and the recursive type tracking call itself). The solution is to split this non-linear recursion into two non-linear predicates: one that first joins with the projectedstepCallrelation (e.g. 13ada1e#diff-26e3a30a43f4e4f576971a851f3a70fd7c796624c81c5387d6cf6b2db3ed58edR506-R507), followed by a predicate that joins with the fullstepCallrelation (e.g. 13ada1e#diff-26e3a30a43f4e4f576971a851f3a70fd7c796624c81c5387d6cf6b2db3ed58edR493).The second commit does the above for Ruby, and additionally (1) gets rid of canonical return nodes (
SynthReturnNode), (2) no longer treats SSA definitions for parameters as local source nodes, (3) treats the step from parameter to SSA definition as a normal local flow step in type tracking, as in normal data flow, and (4) exposesSelfParameterNodepublicly, for use in queries/libraries.The third commit applies the same call graph join-order improvements to Python, as mentioned above.
CC: @aschackmull