
John Fremlin's blog: Java: What a horrible virtual machine

source link: http://john.freml.in/what-a-horrible-vm

Posted 2016-04-18 02:36:00 GMT

JVM bashing is an activity rarely supported by facts. "I don't actually know the details. I mean Java I really don't care about. What a horrible language. What a horrible VM. So, I am like whatever, you are barking about all this crap, go away. I don't care." This quote from Linus Torvalds upsets people: there is a school of thought that participation medals should be handed to everybody in the race and that one should never be nasty at all, so all criticism is wrong and should never be listened to. Given that Linus himself is an exceptional technical leader who started multiple huge billion-dollar industries (Linux and Git), this attitude is extraordinarily arrogant. Is there another person whose technical opinion on these subjects one should respect more?

An argument is often advanced in the JVM's defense: that it must be good, just because so many resources have been dedicated to it. Unfortunately, software doesn't work like that. Even experienced big software companies that are accustomed to managing big projects can pour billions of development dollars into duds; this Wikipedia list of failed custom projects is salutary reading. There are other VMs, and while the JVM makes bold promises and did have a brief competitive period, it is now effectively a monoculture around the Oracle implementation (OpenJDK).

Lack of dynamic memory allocation. When starting the Sun (Oracle) or OpenJDK JVM, people pass an -Xmx flag saying how much memory it should use. This is crazy: decades ago in FORTRAN people had to predeclare the maximum size of their datastructures. Dynamic memory allocation with malloc was a big deal (FORTRAN 90 standardized dynamic allocations). And it definitely makes sense: a program should scale its memory usage according to the amount of memory it needs. Declaring the overall space usage is indeed better than going through and annotating each array, but it's incredible to me that we are still discussing this in 2016. The default value is 256MB, which is crazily low given how memory hungry Java programs typically are (a text editor probably uses more), and insane when running on a server with 256GB of RAM. The trouble with raising it is that by default the JVM will then not worry too much about freeing up unused heap memory if it isn't close to its limit. There is this hilarious question on Serverfault where a poor ex-JVM refugee is introduced to the concept of dynamic memory allocation (the default in .NET is 60% of RAM, which is so, so, so much more sensible). There are smaller VMs that have this issue (Common Lisp SBCL, for example), but others try to employ sensible heuristics (like Haskell GHC, which just has a suggested heap size). There are indeed pros and cons to different approaches, and a mature VM should implement sensible heuristics by default and allow configuration. The JVM does not even attempt the former.
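
To see what ceiling a given JVM has actually picked, you can ask it at runtime. The following is only a minimal sketch: the class name HeapLimit and the helper maxHeapMiB are made up for illustration, and the numbers printed depend entirely on your JVM version, flags and machine.

```java
// Hypothetical helper class (name is an assumption, not from the post).
class HeapLimit {
    // Upper bound on the heap that this JVM will try to use, in MiB.
    // Runtime.maxMemory() reflects -Xmx if it was passed, otherwise
    // whatever default the JVM selected.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (MiB):          " + maxHeapMiB());
        // Memory currently reserved from the OS for the heap.
        System.out.println("committed heap (MiB):    " + rt.totalMemory() / (1024L * 1024L));
        // Portion of the committed heap that is not in use right now.
        System.out.println("free of committed (MiB): " + rt.freeMemory() / (1024L * 1024L));
    }
}
```

Running it as `java -Xmx512m HeapLimit` and then again without the flag shows how the ceiling is fixed at startup rather than growing with demand.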

Lacking ABI and poor bytecode design. The virtual machine lacks basic features like generics and unsigned arithmetic, has poor support for dynamic languages, lacks value types (a huge performance issue), and so on. Some of these issues are being addressed, e.g. by the invokedynamic op, but it's telling that JavaScript V8 can beat the JVM on some microbenchmarks.
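
Two of these gaps are easy to observe from plain Java. This is a small sketch, with the class and method names invented for illustration: generics exist only in the compiler, so two differently parameterized lists share one runtime class, and because there is no unsigned int type, unsigned operations go through static helper methods (available since Java 8).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class (name is an assumption, not from the post).
class ErasureAndUnsigned {
    // Type parameters are erased at compile time: at runtime both
    // lists are plain ArrayList, so their classes are identical.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    // The bit pattern of -1 read as an unsigned 32-bit value.
    static String allOnesAsUnsigned() {
        return Integer.toUnsignedString(-1);
    }

    // Division that treats both operands as unsigned 32-bit values.
    static int unsignedDivide(int a, int b) {
        return Integer.divideUnsigned(a, b);
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());    // true
        System.out.println(allOnesAsUnsigned());   // 4294967295
        System.out.println(-2 / 2);                // -1 (signed division)
        System.out.println(unsignedDivide(-2, 2)); // 2147483647
    }
}
```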

Poor foreign function interface. Even small VMs like SBCL Common Lisp or Haskell have high-performance and easy interfaces to C code. JNA makes a better interface than the built-in JNI. Sharing datastructures back and forth with native code is a big deal, and the CLR from Microsoft invests a huge amount of effort into P/Invoke. The JVM should too!

Lack of performance isolation. Ideally, a VM would let you run untrusted code. This is what JavaScript VMs in browsers do really well. The CLR has a concept of AppDomain, sort of a .NET container, which can be configured to limit memory usage and other things. Even PHP lets you limit memory usage per request. In a multithreaded JVM application, one bad request can cause an OutOfMemoryError in other requests on the same machine and there's no way to stop it. You can't even track the memory usage of a thread in a multithreaded program. Also, the JVM does not allow fork()ing, so you can't use the OS isolation.

Another issue, unfortunately without supporting links, is that in my experience JVM deployments get stuck on old versions. Every enterprise I've worked in that uses Java has had some specific old version of the JVM (sometimes incredibly specific, like 1.5.0_05) that they were stuck on and could not upgrade from, causing the usual problems with not being able to use new tools. Almost always the version used would be no longer supported, and weird installers for it would be stored in odd places. Upgrades are always hard, but this is something that FreeBSD, Linux and even Microsoft Windows operating systems do better, and Intel does better with real, physical machines. Virtual machines were sold as more flexible and manageable than physical ones! In my limited experience with it, Microsoft CLR does a much better job here. This is exactly something that one would expect a mature VM with big development budgets to really care about.

It's great that the JVM ecosystem is improving. Lambdas and invokedynamic are good steps forward, but we need more! The concept of a virtual machine promised so much that it's now hard not to find the Oracle VM disappointing.

