Check allocation scheme for primitive types with kebs-tagged #46
I conducted some research. My testing procedure was as follows: I copied the tagging snippet from Kebs and applied various modifications to it, then inspected the generated bytecode of a class with a method returning a tagged Integer. I used Scala 2.12. The code I used was:

    trait Tagged[+T, +U]
    type @@[T, +U] = T with Tagged[T, U]
    implicit class TaggingExtensions[T](val t: T) extends AnyVal {
      def taggedWith[U]: T @@ U = t.asInstanceOf[T @@ U]
    }
    trait Tag
    case class Boxed(i: Int)
    case class Unboxed(i: Int) extends AnyVal
    class Test {
      def testNormal = 2
      def testTagged = 2.taggedWith[Tag]
      def testTaggedRaw = 2.asInstanceOf[Int @@ Tag]
      def testBoxed = Boxed(2)
      def testUnboxed = Unboxed(2)
      def testInteger = new Integer(2)
      def testOption = Some(2)
      def testOptionUnboxed = Some(Unboxed(2))
      def testOptionTagged = Some(2.taggedWith[Tag])
    }

and the relevant parts of the bytecode were:
As we can see, the tagging mechanism as currently implemented (called via the .taggedWith extension method) boxes the integer, calls a virtual method that creates the TaggingExtensions object, calls the virtual method taggedWith on it (since all it does is cast a generic type to a generic type, it's a no-op due to type erasure), and then unboxes the result. The important part here is that the returned value is unboxed (just as in scala-newtype when using the newsubtype macro). The sad part is that in the process it gets boxed and unboxed back and forth.

Can we do better? Our goal is to have the generated bytecode look identical to the "normal" version, i.e. simply returning the int, without any virtual calls and without boxing. It turns out that this is what we get when we use plain old value classes. So yes, value classes are probably more efficient than tagged types.

If we call asInstanceOf directly (i.e. without the extension method), the generated bytecode is reduced to only 2 static calls responsible for boxing and unboxing (which are obviously redundant, because nothing happens in between).

I looked for ways to reduce the footprint even further. I tried specializing the tagged type for integers (https://scalac.io/specialized-generics-object-instantiation/), making the implicit class final, and marking the taggedWith method as @inline. None of that worked. It turns out that @inline is only a suggestion for the compiler (in Dotty it'll become a guarantee, but that's a whole different story).

What worked, and what results in (to my knowledge) the best bytecode we can get, was enabling compiler optimization flags. Here I used what was written on Lightbend's blog.

If compiled with flags
Now even in the case of the extension method, there are no method calls (either virtual or static)! They got replaced with redundant accesses to some fields (it would be worth checking in the Scala compiler's internals why they are not removed completely...), but AFAIK that's much faster.

So, to sum it up, the way that tagged types are implemented now is fine, but we should enable inlining optimizations in our code (and probably also compile kebs itself with them, as it shrinks the bytecode a bit in case the end user doesn't enable them).

taggedWith (with inlining)
taggedWith (without inlining)
where taggedWith$extension is a no-op
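
To make the comparison above easier to reproduce, here is a minimal sketch, assuming Scala 2.12 as in my tests. The wrapper object TaggingSketch and the value names are hypothetical and exist only so the snippet compiles on its own; the tagging code itself is the snippet from above. It only shows the three variants side by side (extension method, direct cast, value class) and does not demonstrate the bytecode itself.

```scala
// Sketch only: TaggingSketch and the val names are illustrative, not part of kebs.
object TaggingSketch {
  trait Tagged[+T, +U]
  type @@[T, +U] = T with Tagged[T, U]

  // Extension-method variant, as in the snippet above.
  implicit final class TaggingExtensions[T](val t: T) extends AnyVal {
    def taggedWith[U]: T @@ U = t.asInstanceOf[T @@ U]
  }

  trait Tag

  // Value-class alternative discussed in the comment.
  final case class Unboxed(i: Int) extends AnyVal

  // Tagging through the extension method.
  val viaExtension: Int @@ Tag = 2.taggedWith[Tag]

  // Tagging through a direct cast, without the extension method.
  val viaCast: Int @@ Tag = 2.asInstanceOf[Int @@ Tag]

  // A tagged Int is a subtype of Int, so it widens without conversion.
  val widened: Int = viaExtension

  // The value class needs an explicit wrap and unwrap instead.
  val wrapped: Unboxed = Unboxed(2)
  val unwrapped: Int = wrapped.i
}
```

Compiling a file like this and disassembling the resulting class files (for example with javap -c) should let you repeat the comparison for each variant.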
Many thanks @agluszak!

We need to check whether kebs-tagged adds any overhead for primitive types. If there is an overhead, consider changing the representation. See: https://github.com/estatico/scala-newtype