Context
First of all, Graydon clarified on Reddit the implementation they had in mind:
This is in fact what I was talking about -- not separate stacks, but a persistent portion of the shared stack carved out for the iteratee. actually implemented in rustboot, but abandoned along the way when we moved to rustc, in part because llvm didn't support anything like it at the time. It does now.
The idea is to avoid function call overhead without inlining by jumping back and forth between coroutine code and iterator code during iteration.
External woes: stack of calls
It is common, in Rust, to create pipelines of iterators. Imagine something like:
    collection
        .into_iter() // Returns Person elements
        .enumerate() // Returns a tuple (index, Person)
        .filter(|i| i.1.age >= 18)
        .filter(|i| i.1.address.zip % 100 == 63)
        .for_each(|i| send_letter(i.0, i.1)); // or a for loop, for external.
With external iteration, this will result in:
- N calls to the age predicate.
- N' calls to the zip predicate.
- N'' calls to the send letter callback.
- N'' calls to Filter's next (zip).
- N' calls to Filter's next (age).
- N calls to Enumerate next.
- N calls to Iterator next.
- (3 calls to iterator methods (1 each): enumerate, filter, filter, no for_each)
With internal iteration (naive), this will result in:
- N calls to the age predicate.
- N' calls to the zip predicate.
- N'' calls to the send letter callback.
- N' calls to the wrapper around the zip predicate + tail.
- N calls to the wrapper around the age predicate + tail.
- (4 calls to iterator methods (1 each): enumerate, filter, filter, for_each)
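To make "the wrapper around the predicate + tail" concrete, here is a minimal Rust sketch of naive internal iteration. The names `for_each_item` and `filter_then` are hypothetical and invented for illustration; they are not standard library APIs. Each filter stage wraps its predicate and the rest of the pipeline (the tail) into a new callback, so every element that reaches a stage costs one wrapper call unless the optimizer inlines it.

```rust
/// Internal iteration over a slice: the loop lives here, not in the caller.
/// (Hypothetical helper, for illustration only.)
fn for_each_item<T>(items: &[T], mut callback: impl FnMut(&T)) {
    for item in items {
        callback(item);
    }
}

/// Naive internal filter: returns a wrapper that runs `predicate`
/// and forwards matching elements to `tail` (the rest of the pipeline).
/// (Hypothetical helper, for illustration only.)
fn filter_then<T, P, F>(predicate: P, mut tail: F) -> impl FnMut(&T)
where
    P: Fn(&T) -> bool,
    F: FnMut(&T),
{
    move |item: &T| {
        if predicate(item) {
            tail(item);
        }
    }
}
```

A pipeline is then built from the inside out, for example `for_each_item(&ages, filter_then(|a: &u32| *a >= 18, |a: &u32| println!("{a}")))`.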
With internal iteration (Graydon style), this will result in:
- N calls to the age predicate.
- N' calls to the zip predicate.
- N'' calls to the send letter callback.
- (4 calls to iterator methods (1 each): enumerate, filter, filter, for_each)
So, just comparing the number of non-user function calls:
- External: N + N + N' + N'' + 3.
- Internal (naive): N + N' + 4.
- Internal (Graydon): 4.
External iteration simply makes far more function calls, especially when compared to Graydon-style internal iteration.
Since inlining is the optimization that eliminates function call overhead, it benefits external iteration disproportionately.
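The external counts can be checked empirically with a small counting adapter. The sketch below is an illustration under assumptions: `Counted` and `count_external_calls` are hypothetical names, the pipeline is reduced to a single age filter over plain `u32` ages, and the counts describe unoptimized call structure, not what remains after inlining. Each counter ends up one higher than in the list above, because the final call that returns `None` is counted too.

```rust
use std::cell::Cell;

/// Hypothetical adapter (not part of std): forwards to `inner`
/// and counts how many times `next` is called on it.
struct Counted<'a, I> {
    inner: I,
    calls: &'a Cell<usize>,
}

impl<'a, I: Iterator> Iterator for Counted<'a, I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        self.calls.set(self.calls.get() + 1);
        self.inner.next()
    }
}

/// Drives `source -> enumerate -> filter(age >= 18)` with a `for` loop.
/// Returns (calls to the source's next, calls to the filter's next, adults seen).
fn count_external_calls(ages: &[u32]) -> (usize, usize, usize) {
    let source_calls = Cell::new(0);
    let filter_calls = Cell::new(0);
    let mut adults = 0;

    let source = Counted { inner: ages.iter(), calls: &source_calls };
    let filtered = Counted {
        inner: source.enumerate().filter(|(_, age)| **age >= 18),
        calls: &filter_calls,
    };
    for _ in filtered {
        adults += 1;
    }
    (source_calls.get(), filter_calls.get(), adults)
}
```

With N input elements and N' adults, the source's `next` is called N + 1 times and the filter's `next` N' + 1 times, which matches the shape of the tally above.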
External woes: lost context
This is best illustrated, I find, using the Chain iterator. A Chain iterator is simple: it takes two iterators and chains them into a single one, yielding all elements of the first, then all elements of the second.
With internal iteration, it's quite simple; in Python:
    class Chain:
        ...
        def for_each(self, callback):
            self.first.for_each(callback)
            self.second.for_each(callback)
And that's it.
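For readers following along in Rust, the same idea can be sketched with a hand-rolled internal-iteration trait. Everything here is an assumption made for illustration: `InternalIterator` and `InternalChain` are hypothetical names, and this is not how the standard `Iterator` trait is defined.

```rust
/// Hypothetical internal-iteration trait, for illustration only
/// (this is not the standard `Iterator` trait).
trait InternalIterator {
    type Item;

    /// Pushes every element into `callback`; the iterator owns the loop.
    fn for_each<F: FnMut(Self::Item)>(self, callback: F);
}

impl<T> InternalIterator for Vec<T> {
    type Item = T;

    fn for_each<F: FnMut(T)>(self, mut callback: F) {
        for item in self {
            callback(item);
        }
    }
}

/// Hypothetical chain of two internal iterators.
struct InternalChain<A, B> {
    first: A,
    second: B,
}

impl<A, B> InternalIterator for InternalChain<A, B>
where
    A: InternalIterator,
    B: InternalIterator<Item = A::Item>,
{
    type Item = A::Item;

    fn for_each<F: FnMut(A::Item)>(self, mut callback: F) {
        // No flag and no per-element check: run the first to completion,
        // then the second.
        self.first.for_each(&mut callback);
        self.second.for_each(callback);
    }
}
```

Note that the chain stores no "which half am I in" state at all; the position in the function body is the state.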
With external iteration, however, every time the next element is queried, one must check which iterator to yield from:
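    class Chain:
        ...
        def next(self):
            if not self.use_second:
                n = self.first.next()
                # None signals exhaustion; a plain truthiness test would
                # wrongly stop on falsy elements such as 0 or "".
                if n is not None:
                    return n
                self.use_second = True
            return self.second.next()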
The reason is that, for any complex iteration protocol, the context is lost between queries to next, and therefore (1) must be saved internally and (2) must be reconstructed with every call.
The only way to avoid saving and reconstructing at every call is to eliminate the call by inlining it, and then have the optimizer remove the redundant bookkeeping [1].
Once again, inlining benefits external iteration more than it does internal iteration.
[1] As of LLVM 16, LLVM still doesn't optimize away the state check in Chain's next calls, and instead keeps a single loop with a flag check at every iteration.
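In Rust itself the two styles can be compared directly on the standard `Chain` adapter. The functions below are a small sketch with hypothetical names (`sum_external`, `sum_internal`); they compute the same result, and no performance numbers are claimed here. The standard library documentation for `Iterator::for_each` does note that it may be faster than a `for` loop in some cases, because it uses internal iteration on adapters like `Chain`.

```rust
/// External iteration: every step of the `for` loop goes through `Chain::next`,
/// which has to work out which half it is currently in.
fn sum_external(a: &[u64], b: &[u64]) -> u64 {
    let mut total = 0;
    for x in a.iter().chain(b.iter()) {
        total += *x;
    }
    total
}

/// Internal iteration: `for_each` hands the closure to the chain,
/// which is free to drive each half in turn.
fn sum_internal(a: &[u64], b: &[u64]) -> u64 {
    let mut total = 0;
    a.iter().chain(b.iter()).for_each(|x| total += *x);
    total
}
```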
Appendix: why External?
One key advantage of external iteration shows up in non-trivial loops. Because control flow stays in the hands of the caller, the caller can continue or break (potentially targeting outer loops), return early, zip several iterators together, and so on.
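Two short Rust sketches of that flexibility follow; the function names `find_position` and `first_mismatch` are hypothetical. The first breaks out of an outer loop from inside an inner one, and the second advances two iterators in lockstep by calling `next` by hand, which is awkward to express when each iterator insists on owning the loop.

```rust
/// Returns the first (row, column) holding `needle`,
/// leaving both loops at once with a labeled break.
fn find_position(grid: &[Vec<i32>], needle: i32) -> Option<(usize, usize)> {
    let mut found = None;
    'rows: for (r, row) in grid.iter().enumerate() {
        for (c, value) in row.iter().enumerate() {
            if *value == needle {
                found = Some((r, c));
                break 'rows;
            }
        }
    }
    found
}

/// Returns the index of the first position where the two sequences differ
/// (including one ending before the other), or None if they are identical.
fn first_mismatch<I, J>(mut left: I, mut right: J) -> Option<usize>
where
    I: Iterator,
    J: Iterator<Item = I::Item>,
    I::Item: PartialEq,
{
    let mut index = 0;
    loop {
        match (left.next(), right.next()) {
            (Some(a), Some(b)) if a == b => index += 1,
            (None, None) => return None,
            _ => return Some(index),
        }
    }
}
```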
External iteration offers more flexibility, and the greater implementation complexity cost does not necessarily result in a performance loss with contemporary optimizers -- at least on the simpler iterators. This makes it a good trade-off, but it is a trade-off.