Project Valhalla’s Inline Types – Codes Like a Class, Works Like an int

source link: https://blog.oio.de/2021/03/30/project-valhallas-inline-types-codes-like-a-class-works-like-an-int/

After nearly 7 years, it’s time to take a short look at Project Valhalla. The project’s main goal is to increase the performance of Java on modern hardware. While it is still unclear when the project will finish, several things have already been revealed.

Some of you may have heard of the new “value” or “data” type that should be introduced with Valhalla. However, the name was changed to “inline” type. This post will focus on the current status of this part.

Another goal of the project is specialized generics. As the status of this branch is even less clear, I will not cover it here.

What is it about?

It’s all about performance! Valhalla aims to make Java more performant when working with data structures on current hardware. Currently, everything that does not fit into a primitive type needs to be represented as an object. An object comes with identity and a header, is stored on the heap, and needs to be dereferenced, which leads to a memory indirection. On modern hardware, however, indirections are costly: you pay with latency when fetching data from different memory locations.

As Java is an object-oriented language, it works with a lot of pointers. Arrays are a good example. While the references are nicely arranged within the array, the referenced objects can be located quite far apart from each other on the heap. You would rather have all the relevant data, the actual objects, more localized or flat.

In addition, each object comes with its own header containing meta information and providing identity. For large amounts of data, this can add up to quite some memory overhead. For cases in which we don’t actually need identity, we want a denser layout. A good example is an array of a class Point.

Here we would like to keep the common advantages of working with an object, such as providing methods. However, we do not want to pay the price of indirections and headers when handling large numbers of them. When we load the data, we want the value itself inside the array, not just a pointer! We want something that:

Codes like a class, works like an int!

Brian Goetz

We basically want to have a third option between a reference type and a primitive type!

The original name ‘value type’ came from the type of class it was meant for (small ‘value’ or ‘data’ objects). The current name ‘inline type’ originates from the way it works. Other names that you may find out there are “faster classes” or “user-definable primitives”, which pretty much describe the goal of the new approach.

An object of the new ‘inline’ type gives up identity and can therefore be inlined inside a container, in this case an array. This leads to the desired flattened and dense memory layout. The array no longer contains references, but the actual content. This enables the envisaged performance gain. Working with this array will be comparable to working with an array of int in most cases. Giving up identity, and the way this is achieved, comes with various side effects though.
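To make the difference between the two layouts a bit more tangible, here is a minimal sketch in today's Java. It flattens a Point[] by hand into an int[], which is roughly the layout an inline Point array is meant to give us for free. The class FlatLayoutSketch and its methods are hypothetical names made up for this illustration; they are not part of Valhalla.

```java
import java.util.Arrays;

// Hypothetical demo class; none of these names come from Valhalla itself.
final class FlatLayoutSketch {

    // Today's Java: a class with identity, so a Point[] holds references.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Pointer layout: every element access follows a reference into the heap.
    static long sumXBoxed(Point[] points) {
        long sum = 0;
        for (Point p : points) {
            sum += p.x;
        }
        return sum;
    }

    // Hand-flattened layout: x and y of element i live at indices 2*i and 2*i + 1.
    static int[] flatten(Point[] points) {
        int[] flat = new int[points.length * 2];
        for (int i = 0; i < points.length; i++) {
            flat[2 * i] = points[i].x;
            flat[2 * i + 1] = points[i].y;
        }
        return flat;
    }

    // Flat layout: the data is read directly from one contiguous int[].
    static long sumXFlat(int[] flat) {
        long sum = 0;
        for (int i = 0; i < flat.length; i += 2) {
            sum += flat[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        Point[] points = { new Point(1, 2), new Point(3, 4), new Point(5, 6) };
        int[] flat = flatten(points);
        System.out.println(Arrays.toString(flat));                     // [1, 2, 3, 4, 5, 6]
        System.out.println(sumXBoxed(points) + " " + sumXFlat(flat));  // 9 9
    }
}
```

The hand-flattened version loses everything that makes Point pleasant to use (methods, type safety, a name). That is exactly the trade-off inline types are supposed to remove.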

In the case of arrays, Point[] will be larger, as its elements now contain the values instead of pointers. It must be known on startup how much space to reserve. Therefore, inline classes need to be pre-loaded. As the element size differs, accessing the array will also be slower. So there are some performance downsides as well.

Remark on migration of legacy code

Perfect candidates for inline types already exist in current code. Some actual candidates are numerics, dates and wrappers like Optional and Integer. This is why Valhalla aims to migrate those classes retroactively and to enable an easy migration for other classes as well, without breaking existing code!

As already mentioned, the new inline type should close the gap between primitives and objects. However, the intention is not to end up with a third kind of type, but for primitives to be subsumed as inline types in the future. This massively inflates the implementation effort of the new type. The current attempt to deliver such a solution is inline widening.

Needless to say, this goal affects not only the language model but also the JVM, and it reaches all the way down to the metal.

How does it work?

I want to emphasize that I am only talking about the current status and the effects on the language model. The agenda of the project includes several phases where things will be explored, which may result in future changes. In addition, the goal is to relax some of the restrictions for inline classes, always taking backwards compatibility into account.

In the Valhalla world there will be two kinds of classes: the current reference types with identity and the new inline types. On the language level the distinction is achieved by two new interfaces, InlineObject and IdentityObject. Each class will implicitly implement the corresponding interface.

How can we now code such a new class? It’s that simple:

inline public class Point {
    // instance fields of an inline class are implicitly final
    public int x;
    public int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

It’s just a single word: “inline”. What happens behind the scenes is a different story though. At the bytecode level there is now an additional Q-type descriptor and two new bytecodes, defaultvalue and withfield. Other bytecodes needed to be extended as well.

What are the properties and side effects of the new type? The loss of identity basically means that certain identity-sensitive operations are not permitted on inline objects and that they support polymorphism only partly. Some key facts are listed below.

Inline classes …

  • … are immutable,
    • therefore they are implicitly final and cannot be abstract.
    • all instance fields of an inline class are implicitly final.
  • … are not nullable, instead they have a default value.

    Like primitives, a variable of the new type does not hold a reference to the value but the value itself. It can therefore not be null. Instead, each inline class has a default value that can be accessed directly with default (e.g. Point.default). It is made up of the default values of each of its fields. A drawback of this solution is that an array which would currently contain mostly null will, in the Valhalla world, hold actual (default) values. Non-nullability is also one of the major obstacles for backward compatibility and for migrating existing classes to inline classes. This is tackled by inline widening as well.
  • … can be compared with the “==” operator.

    Comparing two inline objects with the “==” operator is still possible. Unlike for objects with identity, one cannot simply check whether two references point to the same memory address. For inline objects it does not need to be the same object; instead, the bit pattern of the data must be identical. In other words, they should be substitutable! Therefore, “==” needs to check whether both have the same type and whether all fields are equal. This can be costly for large objects that have inline types as fields.
    (Equality -> Substitutability)
  • … cannot declare instance fields of their own type.
  • … extend java.lang.Object.
  • … cannot inherit from another class and nothing can inherit from them.
  • … can implement regular interfaces.
  • javac generates hashCode(), equals() and toString()
  • javac does not allow clone(), finalize(), wait() or notify()
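Several of the properties above can be approximated in today's Java, which may help to see what changes. The following is only a sketch under assumptions: PointValue and defaultValue() are hypothetical names for this post, the class still has identity, and equals() is written by hand to imitate the field-by-field check that “==” is described to perform on inline objects.

```java
import java.util.Objects;

// Hypothetical stand-in for an inline class, written in today's Java.
// It is final and immutable, but unlike a real inline class it still has identity.
final class PointValue {
    private final int x;
    private final int y;

    PointValue(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // Hypothetical stand-in for Point.default: every field holds its default value.
    static PointValue defaultValue() {
        return new PointValue(0, 0);
    }

    // Imitates substitutability: same type and all fields equal.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof PointValue)) {
            return false;
        }
        PointValue that = (PointValue) other;
        return x == that.x && y == that.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    @Override
    public String toString() {
        return "PointValue[x=" + x + ", y=" + y + "]";
    }

    public static void main(String[] args) {
        PointValue a = new PointValue(1, 2);
        PointValue b = new PointValue(1, 2);
        System.out.println(a == b);      // false: today's == compares identity
        System.out.println(a.equals(b)); // true: same type, same field values
    }
}
```

With a real inline class, the boilerplate for equals(), hashCode() and toString() would be generated, and the “==” comparison in main would be expected to behave like the equals() call here.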

These are currently the major properties and restrictions. Each implies several cases where inline types may or may not be used. However, they mostly affect the way a migration of existing classes can be achieved.

Use cases

Think back to the example of Point[]. The containing array will actually be larger than before. When the objects are ‘large’, i.e. contain inline fields as well, it may be hard to find enough contiguous space. Then, it would be more performant to have an array of pointers instead. Therefore, the new type will actually only be flattenable/inlinable!

This means the JVM decides whether the object will be flattened or not. In terms of performance, inline classes do not always guarantee an improvement; the benefit is really restricted to ‘small’ objects. This flexibility will be achieved by the new inline widening approach.

Inline widening to the rescue

As already mentioned, allowing the migration of existing classes like wrappers (Optional, Integer, …) to inline classes without breaking backward compatibility is the major show stopper. Providing nullability and enabling the JVM to decide whether to flatten or not can be tackled by the same approach.

The attempted solution is to add a companion interface to each inline class. The interface is automatically generated by the compiler. This basically means there is a reference type representation (reference projection) of each inline type (value projection). This can be compared to the boxing we already know from wrappers like int <-> Integer. The difference, however, is that converting an inline type to its reference projection yields a reference to the actual object and not an identity object. The conversion between an inline type and its companion happens automatically, like auto-boxing. This solution is faster than the current boxing implementation. While it provides nullability, it also allows the JVM to flatten or not to flatten, and it would work with erased generics as well.
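Since the post compares the two projections to int <-> Integer, the analogy can be tried out with exactly these two types in today's Java. The sketch below is only an analogy, not the inline widening mechanism itself, and ProjectionAnalogy is a hypothetical name for this illustration. It shows the two points that matter here: the value side always holds a (default) value, while the reference side can be null.

```java
// Hypothetical demo class; it only uses int and Integer as they exist today.
final class ProjectionAnalogy {

    // Counts the slots that hold no value; only a reference array can have any.
    static int countMissing(Integer[] values) {
        int missing = 0;
        for (Integer v : values) {
            if (v == null) {
                missing++;
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        int[] values = new int[3];       // "value" side: slots start at the default 0
        Integer[] refs = new Integer[3]; // "reference" side: slots start at null

        refs[0] = values[0];             // auto-boxing: int -> Integer
        int back = refs[0];              // auto-unboxing: Integer -> int

        System.out.println(values[1]);          // 0
        System.out.println(refs[1]);            // null
        System.out.println(back);               // 0
        System.out.println(countMissing(refs)); // 2
    }
}
```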

In terms of migrating classes like Integer (or Optional), there is still a way to go. However, the approach of inline widening also offers a solution here. As these classes are currently used as reference types, they would rather represent the reference projection of an inline type in the Valhalla world. The possibility of converting these instances to the value projection would retrospectively allow the desired increase in performance.

Summary

It is important to realize that Valhalla is a project that goes all the way down through the language and VM and reaches the metal. This means that it might look just like one new construct (inline class) to the programmer, but there are so many layers the feature depends upon.

There are some preview releases, but it’s totally unclear when the project will finish.

Short URL for this post: https://blog.oio.de/wK742
