Java Performance

Wednesday, 29 October – 13:30-15:00

13:30 - 14:00

Dynamic Metrics for Java

Bruno Dufour, McGill University, bdufou1@cs.mcgill.ca
Karel Driesen, McGill University, karel@cs.mcgill.ca
Laurie Hendren, McGill University, hendren@cs.mcgill.ca
Clark Verbrugge, McGill University, clump@cs.mcgill.ca

In order to perform meaningful experiments in optimizing compilation and run-time system design, researchers usually rely on a suite of benchmark programs of interest to the optimization technique under consideration. Programs are described as numeric, memory-intensive, concurrent, or object-oriented, based on a qualitative appraisal, in some cases with little justification. We believe it is beneficial to quantify the behavior of programs with a concise and precisely defined set of metrics, in order to make these intuitive notions of program behavior more concrete and subject to experimental validation. We therefore define a set of unambiguous, dynamic, robust and architecture-independent metrics that can be used to categorize programs according to their dynamic behavior in five areas: size, data structure, memory use, concurrency, and polymorphism. A framework computing some of these metrics for Java programs is presented along with specific results.
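To make the idea of a dynamic, architecture-independent metric concrete, here is a minimal sketch of one possible polymorphism measure: the fraction of executed call sites that observe more than one receiver type. It is not the framework or a metric definition from the paper; the class `CallSiteProfile`, its methods, and the call-site labels are all hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a dynamic polymorphism metric; not the authors' framework. */
final class CallSiteProfile {
    // Maps a call-site label to the set of receiver class names observed there.
    private final Map<String, Set<String>> receiversBySite = new HashMap<>();

    /** Records one dynamic dispatch event observed at the given call site. */
    void record(String callSite, Object receiver) {
        receiversBySite
            .computeIfAbsent(callSite, k -> new HashSet<>())
            .add(receiver.getClass().getName());
    }

    /** Fraction of executed call sites that saw more than one receiver type. */
    double polymorphicSiteFraction() {
        if (receiversBySite.isEmpty()) {
            return 0.0;
        }
        long polymorphic = receiversBySite.values().stream()
            .filter(types -> types.size() > 1)
            .count();
        return (double) polymorphic / receiversBySite.size();
    }

    public static void main(String[] args) {
        CallSiteProfile profile = new CallSiteProfile();
        profile.record("Main.java:10", "text");             // receiver is a String
        profile.record("Main.java:10", Integer.valueOf(1)); // second type: site is polymorphic
        profile.record("Main.java:20", "more text");        // only Strings: monomorphic
        System.out.println(profile.polymorphicSiteFraction()); // prints 0.5
    }
}
```

Because the value depends only on which receiver types reach which call sites, it would be the same on any hardware, which is the sense of "architecture-independent" this sketch assumes.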

14:00 - 14:30

How Java Programs Interact with Virtual Machines at the Microarchitectural Level

Lieven Eeckhout, Ghent University, leeckhou@elis.rug.ac.be
Andy Georges, Ghent University, ageorges@elis.rug.ac.be
Koen De Bosschere, Ghent University, kdb@elis.rug.ac.be

Java workloads are becoming increasingly prominent on platforms ranging from embedded systems and general-purpose computers to high-end servers. Understanding all the aspects involved in running Java workloads is thus extremely important when designing a system that must run such workloads and still meet its design goals. In other words, understanding the interaction between the Java application, its input, and the virtual machine it runs on is key to a successful design. The goal of this paper is to study this complex interaction at the microarchitectural level, e.g., by analyzing branch behavior and cache behavior. This is done by measuring a large number of performance characteristics using performance counters on an AMD K7 Duron microprocessor. These performance characteristics are measured for seven virtual machine configurations and a collection of Java benchmarks with corresponding inputs, drawn from the SPECjvm98 benchmark suite, the SPECjbb2000 benchmark suite, the Java Grande Forum benchmark suite, and an open-source raytracer called Raja with 19 scene descriptions. This large amount of data is further analyzed using statistical data analysis techniques, namely principal components analysis and cluster analysis, which summarize the measurements in an understandable way.

From our experiments, we conclude that (i) the behavior observed at the microarchitectural level is primarily determined by the virtual machine for small input sets, e.g., the SPECjvm98 s1 input set; (ii) the behavior can be quite different for various input sets, e.g., short-running versus long-running benchmarks; (iii) for long-running benchmarks with few hot spots, the behavior can be primarily determined by the Java program and not the virtual machine, i.e., all the virtual machines optimize the hot spots to similarly behaving native code; (iv) in general, the behavior of a Java application running on one virtual machine can differ significantly from its behavior on another virtual machine. These conclusions caution researchers working on Java workloads against relying on a limited number of Java benchmarks or virtual machines, since this might lead to biased conclusions.
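As a rough illustration of the principal components analysis mentioned in this abstract, the sketch below finds only the first principal component of a small data matrix (one row per benchmark run, one column per performance characteristic) using power iteration on the covariance matrix. Power iteration is swapped in here for brevity and is not claimed to be the authors' procedure; the class name `FirstPrincipalComponent` and the sample data in the usage note are assumptions. In practice the columns would typically also be normalized first, since counters have very different scales.

```java
/** Hypothetical sketch: first principal component via power iteration. */
final class FirstPrincipalComponent {
    /**
     * Returns a unit-length estimate of the direction of greatest variance.
     * Assumes at least two rows and that all rows have the same length.
     */
    static double[] compute(double[][] data, int iterations) {
        int n = data.length;
        int d = data[0].length;

        // Mean of each column (one column per performance characteristic).
        double[] mean = new double[d];
        for (double[] row : data) {
            for (int j = 0; j < d; j++) {
                mean[j] += row[j] / n;
            }
        }

        // Sample covariance matrix of the centered data.
        double[][] cov = new double[d][d];
        for (double[] row : data) {
            for (int j = 0; j < d; j++) {
                for (int k = 0; k < d; k++) {
                    cov[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]) / (n - 1);
                }
            }
        }

        // Power iteration: repeatedly apply cov and renormalize.
        // Note: this stalls if the start vector is orthogonal to the answer.
        double[] v = new double[d];
        java.util.Arrays.fill(v, 1.0 / Math.sqrt(d));
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[d];
            double norm = 0.0;
            for (int j = 0; j < d; j++) {
                for (int k = 0; k < d; k++) {
                    next[j] += cov[j][k] * v[k];
                }
                norm += next[j] * next[j];
            }
            norm = Math.sqrt(norm);
            if (norm == 0.0) {
                break; // no variance along the current direction; keep the last estimate
            }
            for (int j = 0; j < d; j++) {
                next[j] /= norm;
            }
            v = next;
        }
        return v;
    }
}
```

For example, calling `FirstPrincipalComponent.compute(new double[][] {{1, 2}, {2, 4}, {3, 6}}, 50)` on three made-up points lying on the line y = 2x should return a vector proportional to (1, 2), since all the variance lies along that line.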

14:30 - 15:00

Effectiveness of Cross-Platform Optimizations for a Java Just-In-Time Compiler

Kazuaki Ishizaki, IBM Research, Tokyo Research Laboratory, ishizaki@trl.ibm.com
Mikio Takeuchi, IBM Research, Tokyo Research Laboratory, mtake@jp.ibm.com
Kiyokuni Kawachiya, IBM Research, Tokyo Research Laboratory, kawatiya@jp.ibm.com
Toshio Suganuma, IBM Research, Tokyo Research Laboratory, suganuma@jp.ibm.com
Osamu Gohda, IBM Research, Tokyo Research Laboratory, gohda@jp.ibm.com
Tatsushi Inagaki, IBM Research, Tokyo Research Laboratory, E29253@jp.ibm.com
Akira Koseki, IBM Research, Tokyo Research Laboratory, akoseki@jp.ibm.com
Kazunori Ogata, IBM Research, Tokyo Research Laboratory, ogatak@jp.ibm.com
Motohiro Kawahito, IBM Research, Tokyo Research Laboratory, jl25131@jp.ibm.com
Toshiaki Yasue, IBM Research, Tokyo Research Laboratory, yasue@jp.ibm.com
Takeshi Ogasawara, IBM Research, Tokyo Research Laboratory, takeshi@jp.ibm.com
Tamiya Onodera, IBM Research, Tokyo Research Laboratory, tonodera@jp.ibm.com
Hideaki Komatsu, IBM Research, Tokyo Research Laboratory, komatsu@jp.ibm.com
Toshio Nakatani, IBM Research, Tokyo Research Laboratory, nakatani@jp.ibm.com

We give a system overview of our Java JIT compiler, which has been the basis for the latest production version of the IBM Java virtual machine, which supports a diversity of processor architectures including both 32-bit and 64-bit modes and CISC, RISC, and VLIW architectures. In particular, we focus on the design and evaluation of the cross-platform optimizations that are common across different architectures. We study the effectiveness of each optimization by selectively disabling it in our JIT compiler on three different platforms: IA32, IA64, and PowerPC. Based on the detailed statistics, we classify our optimizations and identify a small set of the most cost-effective ones, treating performance improvement as the benefit and compilation time as the cost. In summary, we demonstrate that, with a selected set of optimizations, we can achieve 90% of the peak performance for SPECjvm98 at the expense of only 33% of the total compilation time, compared with the case in which all the optimizations are enabled.
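The cost-benefit idea in this abstract can be sketched as a simple greedy selection: rank optimizations by benefit per unit of compilation time and keep adding them until a target share of the total benefit is reached. This is only an illustration under strong assumptions, not the authors' classification method: it assumes benefits are additive and independent, and the class `OptimizationSelector`, the pass names, and all numbers are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of cost-benefit selection; not the method used in the paper. */
final class OptimizationSelector {
    /** One optimization pass with made-up, assumed-additive measurements. */
    static final class Pass {
        final String name;
        final double benefit;     // performance improvement, arbitrary units
        final double compileCost; // compilation time, arbitrary units, must be > 0

        Pass(String name, double benefit, double compileCost) {
            this.name = name;
            this.benefit = benefit;
            this.compileCost = compileCost;
        }
    }

    /** Picks passes by descending benefit/cost until targetFraction of total benefit is reached. */
    static List<Pass> select(List<Pass> passes, double targetFraction) {
        double totalBenefit = 0.0;
        for (Pass p : passes) {
            totalBenefit += p.benefit;
        }
        List<Pass> sorted = new ArrayList<>(passes);
        sorted.sort(Comparator.comparingDouble((Pass p) -> p.benefit / p.compileCost).reversed());

        List<Pass> chosen = new ArrayList<>();
        double reached = 0.0;
        for (Pass p : sorted) {
            if (reached >= targetFraction * totalBenefit) {
                break;
            }
            chosen.add(p);
            reached += p.benefit;
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Illustrative numbers only; they are not taken from the paper.
        List<Pass> passes = new ArrayList<>();
        passes.add(new Pass("passA", 50.0, 10.0));
        passes.add(new Pass("passB", 40.0, 20.0));
        passes.add(new Pass("passC", 10.0, 70.0));
        for (Pass p : select(passes, 0.9)) {
            System.out.println(p.name);
        }
    }
}
```

Real optimizations interact, so measured benefits are rarely additive; selectively disabling each optimization, as the abstract describes, is one way to obtain per-optimization measurements that a ranking like this could start from.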

Technical Papers