Technical Papers

Chair: Guy L. Steele Jr., Sun Microsystems Laboratories, papers@oopsla.acm.org

Technical paper presentations discuss new research as well as valuable empirical results in object-oriented languages, systems, and applications. The 26 technical papers were selected after a rigorous peer review of 147 submissions by an international program committee consisting of 25 experts representing various areas of object technology. Each paper was assigned to at least four reviewers. Each paper co-authored by a member of the program committee was assigned to least seven reviewers on the basis of strict anonymity; moreover, such papers were held to a higher standard of review through a specific voting process designed to be auditable by the conference chair. The committee met face-to-face for two days in Chicago, Illinois. While unfortunate unforeseen circumstances (such as illness) prevented a small number of the reviewers from attending in person, for every paper there were at least three reviewers for that paper present at the meeting. Each paper was discussed and judged on novelty and the degree of contribution to object technology. The 26 papers selected for presentation at this conference should advance the state of the art of object technology in significant ways.

Tuesday, 28 October

10:30-12:00	Refactoring and Reflection
15:30-16:30	Smalltalkiana

Wednesday, 29 October

10:30-12:00	Technical Papers and Onward!: Error Repair Generics
13:30-15:00	Java Performance
15:30-17:00	Language Design

Thursday, 30 October

8:30-10:00	Garbage Collection 1
10:30-12:00	Garbage Collection 2
10:30-11:30	Analysis
13:30-15:00	Transactions and Persistence

Refactoring and Reflection

Tuesday, 28 October – 10:30-12:00

10:30 - 11:00

Language-Independent Aspect-Oriented Programming

Donal Lafferty, Trinity College Dublin, Donal.Lafferty@cs.tcd.ie
Vinny Cahill, Trinity College Dublin, Vinny.Cahill@cs.tcd.ie

The term aspect-oriented programming (AOP) has come to describe the set of programming mechanisms developed specifically to express crosscutting concerns. Since crosscutting concerns cannot be properly modularized within object-oriented programming (OOP), they are expressed as aspects and are composed, or woven, with traditionally encapsulated functionality referred to as components.

Many AOP models exist, but their implementations are typically coupled with a single language. To allow weaving of existing components with aspects written in the language of choice, AOP requires a language-independent tool.

This paper presents Weave.NET, a load-time weaver that allows aspects and components to be written in a variety of languages and freely intermixed. Weave.NET relies on XML to specify aspect bindings and ECMA Common Language Infrastructure to avoid coupling aspects or components with a particular language.

By demonstrating language-independence, Weave.NET provides a migration path to the AOP paradigm by preserving existing developer knowledge, tools and software components. The tool's capabilities are demonstrated with logging aspects written with and applied to Visual Basic and C# components.

11:00 - 11:30

Refactoring for Generalization Using Type Constraints

Frank Tip, IBM T.J. Watson Research Center, tip@watson.ibm.com
Adam Kiezun, IBM Research OTI Labs, adam_kiezun@ch.ibm.com
Dirk Baeumer, IBM Research OTI Labs, dirk_baeumer@ch.ibm.com

Refactoring is the process of applying behavior-preserving transformations (called "refactorings") in order to improve a program's design. Associated with a refactoring is a set of preconditions that must be satisfied to guarantee that program behavior is preserved, and a set of source code modifications. An important category of refactorings is concerned with generalization (e.g., "Extract Interface" for re-routing the access to a class via a newly created interface, and "Pull Up Members" for moving members into a superclass). For these refactorings, both the preconditions and the set of allowable source code modifications depend on interprocedural relationships between types of variables. We present an approach in which type constraints are used to verify the preconditions and to determine the allowable source code modifications for a number of generalization-related refactorings. This work is implemented in the standard distribution of Eclipse (see www.eclipse.org).

11:30 - 12:00

Partial Behavioral Reflection: Spatial and Temporal Selection of Reification

Eric Tanter, University of Chile, École des Mines de Nantes/INRIA, Eric.Tanter@emn.fr
Jacques Noyé, École des Mines de Nantes/INRIA, Jacques.Noye@emn.fr
Denis Caromel, Université de Nice, Denis.Caromel@inria.fr
Pierre Cointe, École des Mines de Nantes/INRIA, Pierre.Cointe@emn.fr

Behavioral reflection is a powerful approach for adapting the behavior of running applications. In this paper we present and motivate partial behavioral reflection, an approach to more efficient and flexible behavioral reflection. We expose the spatial and temporal dimensions of such reflection. In the context of Java, we present a reflective architecture offering appropriate interfaces for static and dynamic configuration of partial behavioral reflection at various levels, as well as an open reflective extension for Java implementing this architecture. Reflex is the first extension that fully supports partial behavioral reflection in a portable manner, and that seamlessly integrates load-time and runtime behavioral reflection, along with static optimizations. The paper shows preliminary benchmarks and examples supporting the approach. The examples, dealing with the observer pattern and asynchronous communication via transparent futures, also show the interest of partial behavioral reflection as a tool for open dynamic aspect-oriented programming.

Smalltalkiana

Tuesday, 28 October – 15:30-16:30

15:30 - 16:00

Applying Traits to the Smalltalk Collection Classes

Andrew P. Black, OGI School of Science & Engineering, Oregon Health & Science University, black@cse.ogi.edu
Nathanael Schärli, University of Bern, schaerli@iam.unibe.ch
Stéphane Ducasse, University of Bern, ducasse@iam.unibe.ch

Traits are a programming language technology modeled after mixins but avoiding their problems. This paper reports on a refactoring of the Smalltalk collections classes using traits. We observed that the original collection classes contained much duplication of code; traits let us remove all of it. We also found places where the protocols of the collections lacked uniformity; traits allow us to correct these non-uniformities without code duplication. In addition, traits make possible more general reuse of collection code outside of the existing hierarchy; for example, they make it easy to convert other collection-like things into true collections. Our refactoring reduced the number of methods in the collection classes by approximately 10 per cent. More importantly, understandability and reusability of the code was significantly improved.

16:00 - 16:30

OOPAL: Integrating Array Programming in Object-Oriented Programming

Philippe Mougin, pmougin@acm.org
Stéphane Ducasse, University Of Bern, ducasse@iam.unibe.ch

Array programming shines in its ability to express computations at a high-level of abstraction, allowing one to manipulate and query whole sets of data at once. This paper presents the OOPAL model that enhances object-oriented programming with array programming features. The goal of OOPAL is to determine a minimum set of modifications that must be made to the traditional object model in order to take advantage of the possibilities of array programming. It is based on a minimal extension of method invocation and the definition of a kernel of methods implementing the fundamental array programming operations. The model is validated in F-Script, a new scripting language.

Technical Papers and Onward!: Error Repair

Wednesday, 29 October – 10:30-12:00

11:30 - 12:00

Automatic Detection and Repair of Errors in Data Structures

Brian Demsky, MIT Laboratory for Computer Science, bdemsky@mit.edu
Martin Rinard, MIT Laboratory for Computer Science, rinard@lcs.mit.edu

We present a system that accepts a specification of key data structure constraints, then dynamically detects and repairs violations of these constraints, enabling the program to continue to execute productively even in the face of otherwise crippling errors. Our experience using our system indicates that the specifications are relatively easy to develop once one understands the data structures. Furthermore, for our set of benchmark applications, our system can effectively repair errors to deliver consistent data structures that allow the program to continue to operate successfully within its designed operating envelope.

Generics

Wednesday, 29 October – 10:30-12:00

10:30 - 11:00

A First-Class Approach to Genericity

Eric Allen, Rice University, eallen@cs.rice.edu
Jonathan Bannet, Rice University, jbannet@rice.edu
Robert Cartwright, Rice University, cork@cs.rice.edu

This paper describes how to add first class generic types—including mixins—to strongly-typed, object-oriented languages with nominal (name-based) subtyping such as Java and C#. A generic type system is "first-class" if generic types can appear in any context where conventional types can appear. In this context, a mixin is simply a generic class that extends one its type parameters, e.g., a class C<T> that extends T. Although mixins of this form are widely used in C++ (via templates), they are clumsy and error-prone because C++ treats mixins as macros, forcing each mixin instantiation to be separately compiled and type-checked. The abstraction embodied in a mixin is never separately analyzed.

Our formulation of mixins using first class genericity accommodates sound local (class-by-class) type checking, in the same sense that Java supports local (class-by-class) compilation. A mixin can be fully type-checked given symbol tables for each of the classes that it directly references—the same context in which Java performs incremental class compilation. To our knowledge no previous formal analysis of first-class genericity in languages with nominal type systems has been conducted, which is surprising because nominal type systems have become predominant in mainstream object-oriented programming languages.

What makes our treatment of first class genericity particularly interesting and important is the fact that it can be added to the existing Java language without any change to the underlying Java Virtual Machine. Moreover, the extension is backward compatible with legacy Java source and class files. Although our discussion of a practical implementation strategy focuses on Java, the same scheme could be applied to other object-oriented languages such as C# or Eiffel that support incremental compilation, dynamic class loading, and a static type system with nominal subtyping.

11:00 - 11:30

A Comparative Study of Language Support for Generic Programming

Ronald Garcia, Indiana University, garcia@osl.iu.edu
Jaakko Jarvi, Indiana University, jajarvi@osl.iu.edu
Andrew Lumsdaine, Indiana University, lums@osl.iu.edu
Jeremy Siek, Indiana University, jsiek@osl.iu.edu
Jeremiah Willcock, Indiana University, jewillco@osl.iu.edu

Many modern programming languages support basic generic programming, sufficient to implement type-safe polymorphic containers. Some languages have moved beyond this basic support to a broader, more powerful interpretation of generic programming, and their extensions have proven valuable in practice. This paper reports on a comprehensive comparison of generics in six programming languages: C++, Standard ML, Haskell, Eiffel, Java (with its proposed generics extension), and Generic C#. By implementing a substantial example in each of these languages, we identify eight language features that support this broader view of generic programming. We find these features are necessary to avoid awkward designs, poor maintainability, unnecessary run-time checks, and painfully verbose code. As languages increasingly support generics, it is important that language designers understand the features necessary to provide powerful generics and that their absence causes serious difficulties for programmers.

11:30 - 12:00

Lightweight Confinement for Featherweight Java

Tian Zhao, University of Wisconsin, Milwaukee, tzhao@cs.uwm.edu
Jens Palsberg, Purdue University, palsberg@cs.purdue.edu
Jan Vitek, Purdue University, jv@cs.purdue.edu

Confinement properties impose a structure on object graphs which can be used to enforce encapsulation— which is essential to certain program optimizations, to modular reasoning, and in many cases to software assurance. This paper formalizes the notion of confined type in the context of Featherweight Java. A static type system that mirrors the informal rules of Vitek and Bokopwski is proposed and proven sound. The definition of confined types is extended to confined instantiation of generic classes. Thus allowing for confined collection types in Java and for classes that can be confined post hoc. Confinement types rules are given for Generic Featherweight Java, and proven sound.

Java Performance

Wednesday, 29 October – 13:30-15:00

13:30 - 14:00

Dynamic Metrics for Java

Bruno Dufour, McGill University, bdufou1@cs.mcgill.ca
Karel Driesen, McGill University, karel@cs.mcgill.ca
Laurie Hendren, McGill University, hendren@cs.mcgill.ca
Clark Verbrugge, McGill University, clump@cs.mcgill.ca

In order to perform meaningful experiments in optimizing compilation and run-time system design, researchers usually rely on a suite of benchmark programs of interest to the optimization technique under consideration. Programs are described as numeric, memory-intensive, concurrent, or object-oriented, based on a qualitative appraisal, in some cases with little justification. We believe it is beneficial to quantify the behavior of programs with a concise and precisely defined set of metrics, in order to make these intuitive notions of program behavior more concrete and subject to experimental validation. We therefore define a set of unambiguous, dynamic, robust and architecture-independent metrics that can be used to categorize programs according to their dynamic behavior in five areas: size, data structure, memory use, concurrency, and polymorphism. A framework computing some of these metrics for Java programs is presented along with specific results.

14:00 - 14:30

How Java Programs Interact with Virtual Machines at the Microarchitectural Level

Lieven Eeckhout, Ghent University, leeckhou@elis.rug.ac.be
Andy Georges, Ghent University, ageorges@elis.rug.ac.be
Koen De Bosschere, Ghent University, kdb@elis.rug.ac.be

Java workloads are becoming increasingly prominent on various platforms ranging from embedded systems, over general-purpose computers to high-end servers. Understanding the implications of all the aspects involved when running Java workloads, is thus extremely important during the design of a system that will run such workloads, to meet its design goals. In other words, understanding the interaction between the Java application, its input and the virtual machine it runs on, is key to a successful design. The goal of this paper is to study this complex interaction at the microarchitectural level, e.g., by analyzing the branch behavior, the cache behavior, etc. This is done by measuring a large number of performance characteristics using performance counters on an AMD K7 Duron microprocessor. These performance characteristics are measured for seven virtual machine configurations, and a collection of Java benchmarks with corresponding inputs coming from the SPECjvm98 benchmark suite, the SPECjbb2000 benchmark suite, the Java Grande Forum benchmark suite and an open-source raytracer, called Raja with 19 scene descriptions. This large amount of data is further analyzed using statistical data analysis techniques, namely principal components analysis and cluster analysis. These techniques provide useful insights in an understandable way.

From our experiments, we conclude that (i) the behavior observed at the microarchitectural level is primarily determined by the virtual machine for small input sets, e.g., the SPECjvm98 s1 input set; (ii) the behavior can be quite different for various input sets, e.g., short-running versus long-running benchmarks; (iii) for long-running benchmarks with few hot spots, the behavior can be primarily determined by the Java program and not the virtual machine, i.e., all the virtual machines optimize the hot spots to similarly behaving native code; (iv) in general, the behavior of a Java application running on one virtual machine can be significantly different from running on another virtual machine. These conclusions warn researchers working on Java workloads to be careful when using a limited number of Java benchmarks or virtual machines since this might lead to biased conclusions.

14:30 - 15:00

Effectiveness of Cross-Platform Optimizations for a Java Just-In-Time Compiler

Kazuaki Ishizaki, IBM Research, Tokyo Research Laboratory, ishizaki@trl.ibm.com
Mikio Takeuchi, IBM Research, Tokyo Research Laboratory, mtake@jp.ibm.com
Kiyokuni Kawachiya, IBM Research, Tokyo Research Laboratory, kawatiya@jp.ibm.com
Toshio Suganuma, IBM Research, Tokyo Research Laboratory, suganuma@jp.ibm.com
Osamu Gohda, IBM Research, Tokyo Research Laboratory, gohda@jp.ibm.com
Tatsushi Inagaki, IBM Research, Tokyo Research Laboratory, E29253@jp.ibm.com
Akira Koseki, IBM Research, Tokyo Research Laboratory, akoseki@jp.ibm.com
Kazunori Ogata, IBM Research, Tokyo Research Laboratory, ogatak@jp.ibm.com
Motohiro Kawahito, IBM Research, Tokyo Research Laboratory, jl25131@jp.ibm.com
Toshiaki Yasue, IBM Research, Tokyo Research Laboratory, yasue@jp.ibm.com
Takeshi Ogasawara, IBM Research, Tokyo Research Laboratory, takeshi@jp.ibm.com
Tamiya Onodera, IBM Research, Tokyo Research Laboratory, tonodera@jp.ibm.com
Hideaki Komatsu, IBM Research, Tokyo Research Laboratory, komatsu@jp.ibm.com
Toshio Nakatani, IBM Research, Tokyo Research Laboratory, nakatani@jp.ibm.com

We describe the system overview of our Java JIT compiler, which has been the basis for the latest production version of IBM Java virtual machine that supports a diversity of processor architectures including both 32-bit and 64-bit modes, CISC, RISC, and VLIW architectures. In particular, we focus on the design and evaluation of the cross-platform optimizations that are common across different architectures. We study the effectiveness of each optimization by selectively disabling it in our JIT compiler on three different platforms: IA32, IA64, and PowerPC. Based on the detailed statistics, we classify our optimizations and identify a small set of the most cost-effective ones in terms of the performance improvement as the benefit and the compilation time as the cost. In summary, we demonstrate that, with a selected set of optimizations, we can achieve 90% of the peak performance for SPECjvm98 at the expense of only 33% of the total compilation time in comparison to the case in which all the optimizations are enabled.

Language Design

Wednesday, 29 October – 15:30-17:00

15:30 - 16:00

HydroJ: Object-Oriented Pattern Matching for Evolvable Distributed Systems

Keunwoo Lee, University of Washington, klee@cs.washington.edu
Anthony LaMarca, Intel Research Seattle, lamarca@intel-research.net
Craig Chambers, University of Washington, chambers@cs.washington.edu

In an evolving software system, components must be able to change independently while remaining compatible with their peers. One obstacle to independent evolution is the brittle parameter problem}: the ability of two components to communicate can depend on a number of inessential details of the types, structure, and/or contents of the values communicated. If these details change, then the components can no longer communicate, even if the essential parts of the message remain unaffected.

We present HydroJ, an extension of Java that addresses this problem. In HydroJ, components communicate using self-describing, semi-structured messages, and programmers use pattern matching to define the handling of messages. This design stems from two central ideas: first, that self-describing messages reduce dependence on inessential message format details; and second, that object-oriented pattern matching naturally focuses on the essential information in a message and is insensitive to inessential information.

We have developed these ideas in the context of Rain, a distributed, heterogeneous messaging system for ubiquitous computing. To evaluate the design, we have constructed a prototype HydroJ compiler, implemented some Rain services in HydroJ, and formalized HydroJ's key features in a core language.

16:00 - 16:30

Relaxed MultiJava: Balancing Extensibility and Modular Typechecking

Todd Millstein, University of Washington, todd@cs.washington.edu
Mark Reay, University of Washington, mreay@cs.washington.edu
Craig Chambers, University of Washington, chambers@cs.washington.edu

We present the rationale, design, and implementation of Relaxed MultiJava (RMJ), a backward-compatible extension of Java that allows programmers to add new methods to existing classes and to write multimethods. Previous languages supporting these forms of extensibility either restrict their usage to a limited set of programming idioms that can be modularly typechecked (and modularly compiled) or simply forego modular typechecking altogether. In contrast, RMJ supports the new language features in a virtually unrestricted form while still providing modular static typechecking and compilation. In some cases, the RMJ compiler will warn that the potential for a type error exists, but it will still complete compilation. In that case, a custom class loader transparently performs load-time checking to verify that the potential error is never realized. RMJ's compiler and custom loader cooperate to keep load-time checking costs low. We report on our qualitative and quantitative experience with our implementation of RMJ.

16:30 - 17:00

ModJava: A Rational Module System for Java and Its Applications

John Corwin, IBM, jcorwin@us.ibm.com
David Bacon, IBM, bacon@us.ibm.com
David Grove, IBM, groved@us.ibm.com
Chet Murthy, IBM, chet@watson.ibm.com

While Java provides many software engineering benefits, it lacks a coherent module system and instead provides only packages (which are primarily a name space mechanism) and classloaders (which are very low-level). As a result, large Java applications suffer from unexpected interactions between independent components, require complex CLASSPATH definitions, and are often extremely complex to install and maintain.

We have implemented a module system for Java that is implemented with class loaders but provides a much higher-level interface. High-level properties can be specified in a module definition and are enforced by the module system as new modules are loaded.

To experimentally validate the ability of ModJava to properly handle the complex module inter-relationships found in large Java server systems, we replaced the classloader mechanisms of Apache Tomcat 4.1.18 with 30 ModJava modules. The modified Tomcat is functionally identical to the original, but requires no CLASSPATH definitions, and will operate correctly even if user code loads a different version of a module used by Tomcat, such as the Xerces XML parser. Furthermore, by making a small change to the Java core libraries enabled by ModJava, we obtained a 30% performance improvement in a servlet microbenchmark.

Garbage Collection 1

Thursday, 30 October – 8:30-10:00

8:30 - 9:00

Mostly Concurrent Garbage Collection Revisited

Katherine Barabash, IBM Haifa Research Laboratory, kathy@il.ibm.com
Yoav Ossia, IBM Haifa Research Laboratory, yossia@il.ibm.com
Erez Petrank, Technion - Israel Institute of Technology, erez@cs.technion.ac.il

The mostly concurrent garbage collection was presented in the seminal paper of Boehm et al. With the deployment of Java as a portable, secure and concurrent programming language, the mostly concurrent garbage collector turned out to be an excellent solution for Java's garbage collection task. The use of this collector is reported for several modern production Java Virtual Machines, and it has been investigated further in academia.

In this paper, we present a modification of the mostly concurrent collector which improves the throughput, the memory footprint, and the cache behavior of the collector without foiling the other good qualities (such as short pauses and high scalability). We implemented our solution on the IBM production JVM and obtained a performance improvement of up to 22.8%, a reduction in the heap consumption by up to 13.6%, and no substantial change in the (short) pause times. The modified algorithm has subsequently been incorporated into the IBM production JVM.

9:00 - 9:30

An On-the-Fly Mark and Sweep Garbage Collector Based on Sliding Views

Hezi Azatchi, Technion - Israel Institute of Technology, hezia@cs.technion.ac.il
Yossi Levanoni, Technion - Israel Institute of Technology, ylevanon@microsoft.com
Harel Paz, Technion - Israel Institute of Technology, pharel@cs.technion.ac.il
Erez Petrank, Technion - Israel Institute of Technology, erez@cs.technion.ac.il

With concurrent and garbage collected languages like Java and C# becoming popular, the need for a suitable non-intrusive, efficient, and concurrent multiprocessor garbage collector has become acute. We propose a novel mark and sweep on-the-fly algorithm based on the sliding views mechanism of Levanoni and Petrank. We have implemented our collector on the Jikes Java Virtual Machine running on a Netfinity multiprocessor and compared it to the concurrent algorithm and to the stop-the-world collector supplied with Jikes JVM. The maximum pause time that we measured with our benchmarks over all runs was 2ms. In all runs, the pause times were smaller than those of the stop-the-world collector by two orders of magnitude and they were also always shorter than the pauses of the Jikes concurrent collector. Throughput measurements of the new garbage collector show that it outperforms the Jikes concurrent collector by up to 60%. As expected, the stop-the-world does better than the on-the-fly collectors with results showing about 10% difference.

On top of being an effective mark and sweep on-the-fly collector standing on its own, our collector may also be used as a backup collector (collecting cyclic data structures) for the Levanoni-Petrank reference counting collector. These two algorithms perfectly fit sharing the same allocator, a similar data structure, and a similar JVM interface.

9:30 - 10:00

Heap Compression for Memory-Constrained Java Environment

Guangyu Chen, The Pennsylvania State University, gchen@cse.psu.edu
Mahmut Kandemir, The Pennsylvania State University, kandemir@cse.psu.edu
Naraya Vijaykrishnan, The Pennsylvania State University, vijay@cse.psu.edu
Janie Irwin, The Pennsylvania State University, mji@cse.psu.edu
Bernd Mathiske, Sun Microsystems, Inc., Bernd.Mathiske@sun.com
Mario Wolczko, Sun Microsystems, Inc., mario@eng.sun.com

Java is becoming the main software platform for consumer and embedded devices such as mobile phones, PDAs, TV set-top boxes, and in-vehicle systems. Since many of these systems are memory constrained, it is extremely important to keep the memory footprint of Java applications under control.

The goal of this work is to enable the execution of Java applications using a smaller heap footprint than that possible using current embedded JVMs. We propose a set of memory management strategies to reduce heap footprint of embedded Java applications that execute under severe memory constraints. Our first contribution is a new garbage collector, referred to as the Mark-Compact-Compress (MCC) collector, that allows an application to run with a heap smaller than its footprint. An important characteristic of this collector is that it compresses objects when heap compaction is not sufficient for creating space for the current allocation request. In addition to employing compression, we also consider a heap management strategy and associated garbage collector, called MCL (Mark-Compact-Lazy Allocate), based on lazy allocation of object portions. This new collector operates like the conventional Mark-Compact (MC) collector, but takes advantage of the observation that many Java applications create large objects, of which only a small portion is actually used. In addition, we also combine MCC and MCL, and present MCCL (Mark-Compact-Compress-Lazy Allocate), which outperforms both MCC and MCL.

We have implemented these collectors using KVM, and performed extensive experiments using a set of ten embedded Java applications. We have found our new garbage collection strategies to be useful in two main aspects. First, they reduce the minimum heap size necessary to execute an application without out-of-memory exception. Second, our strategies reduce the heap occupancy. That is, at a given time, they reduce the heap memory requirement of the application being executed. We have also conducted experiments with a more aggressive object compression strategy and discussed its main advantages.

Garbage Collection 2

Thursday, 30 October – 10:30-12:00

10:30 - 11:00

MarkCopy: Fast Copying GC with Less Space Overhead

Narendran Sachindran, University of Massachusetts Amherst, naren@cs.umass.edu
Eliot Moss, University of Massachusetts Amherst, moss@cs.umass.edu

Copying garbage collectors have several advantages over non-copying collectors, including cheap allocation and prevention of fragmentation. However, in order to provide completeness (the guarantee to reclaim each garbage object eventually), standard copying collectors require space equal to twice the size of the maximum live data for a program.

We present a mark-copy collection algorithm that extends generational copying collection and significantly reduces the heap space required to run a program. The new collector runs in space that is nearly a factor of two smaller than that required by standard copying garbage collectors, increasing the range of applications that can use copying garbage collection. We show that when the collector is given the same amount of space as a generational copying collector, it improves program execution time of Java benchmarks by 20-85% in tight heaps and by 5-10% in moderate size heaps.

We also compare the performance of the mark-copy collector with that of a mark-sweep collector. We find that, for several benchmarks, the mark-copy collector can run in heaps comparable in size to the minimum heap space required by the mark-sweep collector. We also find that for some benchmarks the mark-copy collector is significantly faster than a mark-sweep collector in tight heaps.

11:00 - 11:30

Ulterior Reference Counting: Fast Garbage Collection without a Long Wait

Steve Blackburn, Australian National University, Steve.Blackburn@anu.edu.au
Kathryn McKinley, University of Texas at Austin, mckinley@cs.utexas.edu

General purpose garbage collectors have yet to combine short pause times with fast throughput. For example, non-concurrent generational collectors can achieve high throughput and have modest average pause times, but occasionally collect the whole heap and consequently incur long pauses. On the other extreme, concurrent collectors, including reference counters, attain short pause times with significant performance penalties. This paper introduces GenRC, which combines copying generational collection and reference counting the older generation to achieve both goals in a uniprocessor setting. Key to our algorithm is a generalization of deferred reference counting which allows GenRC to safely ignore mutations to nursery objects. GenRC thus restricts copying and reference counting to the object demographics for which they perform well. The contiguous allocation of a copying collector is extremely fast, and collection time is proportional to the live objects whose survival rates are usually low in a fixed size nursery. Reference counting time is proportional to object mutations and the number of dead objects, both of which are typically low for older objects. We further restrict the time spent reference counting with collection triggers and buffering. We compare GenRC using a fixed nursery (FG-RC) with pure reference counting (RC), high performance copying and mark-sweep fixed nursery generational (FG-MS), and pure mark-sweep collectors (MS). We show that FG-RC is competitive with FG-MS on throughput, and attains much lower average and maximum pause times.

11:30 - 12:00

Connectivity-Based Garbage Collection

Martin Hirzel, University of Colorado at Boulder, hirzel@cs.colorado.edu
Amer Diwan, University of Colorado at Boulder, diwan@cs.colorado.edu
Matthew Hertz, University of Massachusetts Amherst, hertz@cs.umass.edu

We introduce a new family of connectivity-based garbage collectors (CBGC) that are based on potential object-connectivity properties. The key feature of these collectors is that the placement of objects into partitions is determined by performing one of several forms of connectivity analyses on the program. This enables partial garbage collections, as in generational collectors, but without the need for any write barrier.

The contributions of this paper are 1) a novel family of garbage collection algorithms based on object connectivity; 2) a detailed description of an instance of this family; and 3) an empirical evaluation of CBGC using simulations. Using simulations allows us to explore a broad range of possibilities for CBGC, ranging from simplistic ones that determine connectivity based on type information to oracular ones that use run-time information to determine connectivity. Our experiments with the oracular CBGC configurations give us an indication of the potential for CBGC and also identify weaknesses in the realistic configurations. We found that even the simplistic implementations beat state-of-the-art generational collectors with respect to some metrics (e.g., pause times and memory footprint).

Analysis

Thursday, 30 October – 10:30-11:30

10:30 - 11:00

Declaring and Checking Non-null Types in an Object-Oriented Language

Manuel Fähndrich, Microsoft Research, maf@microsoft.com
Rustan Leino, Microsoft Research, leino@microsoft.com

Distinguishing non-null references from possibly null references at the type level can detect null-related errors in object-oriented programs at compile-time. This paper gives a proposal for retrofitting a language such as C# or Java with non-null types. It addresses the central complications that arise in constructors, where declared non-null fields may not yet have been initialized, but the partially constructed object is already accessible. The paper reports experience with an implementation for annotating and checking null-related properties in C# programs.

11:00 - 11:30

Object Equality Profiling

Darko Marinov, MIT Laboratory for Computer Science, marinov@lcs.mit.edu
Robert O'Callahan, IBM T.J. Watson Research Center, roca@us.ibm.com

We present Object Equality Profiling (OEP), a new technique for helping programmers discover optimization opportunities in programs. OEP discovers opportunities for replacing a set of equivalent object instances with a single representative object. Such a set represents an opportunity for automatically or manually applying optimizations such as hash consing, heap compression, lazy allocation, object caching, invariant hoisting, and more. To evaluate OEP, we implemented a tool to help programmers reduce the memory usage of Java programs. Our tool performs a dynamic analysis that records all the objects created during a particular program run. The tool partitions the objects into equivalence classes, and uses collected timing information to determine when elements of an equivalence class could have been safely collapsed into a single representative object without affecting the behavior of that program run. We report the results of applying this tool to benchmarks, including two widely used Web application servers. Many benchmarks exhibit significant amounts of object equivalence, and in most benchmarks our profiler identifies optimization opportunities clustered around a small number of allocation sites. We present a case study of using our profiler to find simple manual optimizations that reduce the average space used by live objects in two SpecJVM benchmarks by 47% and 38% respectively.

Transactions and Persistence

Thursday, 30 October – 13:30-15:00

13:30 - 14:00

Saving the World from Bad Beans: Deployment-Time Confinement Checking

Dave Clarke, Utrecht University, dave@cs.uu.nl
Michael Richmond, IBM Almaden Research Center, mrichmon@acm.org
James Noble, Victoria University of Wellington, kjx@mcs.vuw.ac.nz

The Enterprise JavaBeans (EJB) framework requires developers to preserve architectural integrity constraints when writing EJB components. Breaking these constraints allows components to violate the transaction protocol, bypass security mechanisms, disable object persistence, and be susceptible to malicious attacks from other EJBs. We present an object confinement discipline that allows static verification of component integrity as they are deployed into an EJB server. The confinement rules are simple for developers to understand, require no annotation to the code of EJB components, and enforcement of these rules can be incorporated efficiently into existing EJB servers.

14:00 - 14:30

Language Support for Lightweight Transactions

Timothy Harris, University of Cambridge Computer Laboratory, tim.harris@cl.cam.ac.uk
Keir Fraser, University of Cambridge Computer Laboratory, keir.fraser@cl.cam.ac.uk

Concurrent programming is notoriously difficult. Current abstractions are intricate to use and make it hard to design computer systems that are reliable and scalable. We argue in favour of a practical declarative style of concurrency control in which programmers directly indicate the safety properties that they require. In essence, the programmer demarks sections of code which execute within lightweight software-based transactions which commit atomically and exactly once.

These transactions can update shared data, instantiate objects, invoke library features and so on. They can also block, waiting for arbitrary boolean conditions to become true. These features support general-purpose shared data structures such as hashtables, queues and buffers. In general, no performance penalty is incurred for memory accesses outside transactions. Furthermore, transactions which do not access the same shared memory locations can commit concurrently.

We present a detailed design of this proposal along with an implementation and evaluation. We argue that the resulting system (i) is easier for mainstream programmers to use, (ii) prevents lock-based priority-inversion and deadlock problems and (iii) offers compelling performance.

14:30 - 15:00

Lazy Modular Upgrades in Persistent Object Stores

Chandrasekhar Boyapati, MIT Laboratory for Computer Science, chandra@lcs.mit.edu
Barbara Liskov, MIT Laboratory for Computer Science, liskov@lcs.mit.edu
Liuba Shrira, MIT Laboratory for Computer Science, liuba@lcs.mit.edu
Chuang-Hue Moh, MIT Laboratory for Computer Science, chmoh@lcs.mit.edu
Steve Richman, MIT Laboratory for Computer Science, richman@lcs.mit.edu

Persistent object stores require a way to automatically upgrade persistent objects. Automatic upgrades are a challenge for such systems. Upgrades must be performed in a way that is efficient both in space and time, and that does not stop application access to the store. In addition, however, the approach must be modular: it must allow programmers to reason locally about the correctness of their upgrades similar to the way they would reason about regular code. This paper provides solutions to both problems.

The paper first defines upgrade modularity conditions that any upgrade system should satisfy to support local reasoning about upgrades. The paper then describes a new approach for executing upgrades efficiently while satisfying the upgrade modularity conditions. The approach exploits object encapsulation properties in a novel way. The paper also describes a prototype implementation and shows that our upgrade system imposes only a small overhead on application performance.