Eye on performance: Referencing objects



developerWorks > Java technology




How you reference objects can seriously affect the garbage collector

Level: Intermediate

Jack Shirazi (jack@JavaPerformanceTuning.com), Director, JavaPerformanceTuning.com
Kirk Pepperdine (kirk@JavaPerformanceTuning.com), CTO, JavaPerformanceTuning.com

26 Aug 2003

Intrepid optimizers Jack Shirazi and Kirk Pepperdine, Director and CTO of JavaPerformanceTuning.com, follow performance discussions all over the Internet, expanding and clarifying the issues they encounter in this column. This month, they set their sights on the Java Games Web site to see how game developers identify and then resolve problems that appear when their application doesn't release objects for garbage collection.

If you think of game developers as the Formula One drivers of the Java programming world, then you can understand why this group places so much emphasis on performance. The performance problems that these developers face on a daily basis often stretch the bounds of what we mere mortals typically see. Where do you find these people? One place is at the Java Games community Web site (see Resources). Though there may not be a lot of server-side activity on this site, looking at what these bit-twiddlers face every day can often yield precious nuggets that we all can benefit from. So let's get our game on!

Object leaks
Game programmers are no different from other programmers -- they still need to understand the subtleties of the Java runtime environment, such as garbage collection. Garbage collection can be one of the more difficult concepts to wrap your mind around, as it isn't always obvious how to debug heap problems. It seems like there are a lot of discussions that start with, or end with, "I'm having a problem with garbage collection."

Let's say you're getting out-of-memory errors. You've fired up your profiler to look for the problem, but you've gotten nowhere. At that stage it is tempting to believe that the bug is in the JVM's heap management rather than in your application. But, as explained more than once by the resident experts at Java Games, the JVM doesn't have any substantiated object leaks. The garbage collector has proved to be generally accurate in determining which objects are dead and then reclaiming their space. So if you are getting out-of-memory errors, it is extremely likely that your application is experiencing "unintentional object retention."

Memory leaks versus unintentional retention
What is the difference between a memory leak and unintentional object retention? When it comes to programs written in the Java language, nothing really. They both basically mean that your application is retaining references to objects that you don't intend to reference. The classic example is adding objects into a collection to keep track of those objects, but forgetting to remove them from the collection when you no longer need to keep track of them. Because the collection can keep growing without bound, and doesn't ever get smaller, at some point you can have so many objects in the collection (or referenced by elements in the collection) that you fill up the heap and get out-of-memory errors. The garbage collector cannot reclaim those objects you think you are finished with, because as far as the garbage collector is concerned, the application can still access them at any time through that collection, so they couldn't possibly be garbage.
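The classic collection example can be sketched in a few lines. This is only an illustration: the SessionRegistry class and its open, close, and tracked methods are hypothetical names invented here, and the code uses generics, which postdate the JVM versions discussed in this article.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry, invented for illustration only.
class SessionRegistry {
    // Long-lived collection: everything added here stays reachable
    // until it is explicitly removed.
    private static final List<byte[]> ACTIVE = new ArrayList<>();

    static byte[] open(int sizeInBytes) {
        byte[] session = new byte[sizeInBytes];
        ACTIVE.add(session); // start tracking
        return session;
    }

    // Forgetting to call this is the unintentional retention bug.
    static void close(byte[] session) {
        ACTIVE.remove(session); // stop tracking so the object can become garbage
    }

    static int tracked() {
        return ACTIVE.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            open(1024); // leaky: opened but never closed
        }
        System.out.println("tracked after leaky loop = " + tracked());

        for (int i = 0; i < 1000; i++) {
            close(open(1024)); // balanced: the count does not grow
        }
        System.out.println("tracked after balanced loop = " + tracked());
    }
}
```

The first loop leaves 1000 arrays reachable through the static list even though the caller has dropped every reference to them; the second loop shows that pairing each add with a remove keeps the collection from growing.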

In languages without garbage collection, like C++, there is a difference between a memory leak and unintentional object retention. C++ programs can have unintentional object retention just like Java programs can. But C++ programs can also have real memory leaks, where objects are no longer reachable by the application but the memory never gets released back to the system. Thankfully, in Java programs, this latter type of memory leak is not possible. We prefer to use the term "unintentional object retention" for the memory problems that make Java programmers tear their hair out, so we can distinguish ourselves from all those other programmers who have to deal with more retrograde languages.

Tracking retained objects
So what do you do if you have unintentional object retention? Well, first you need to determine which objects are being unintentionally retained, and then you need to find which objects are referencing them. Then you've got to figure out where they should be released. The easiest way to identify these objects is to use a profiler that can snapshot the heap, compare object counts between snapshots, track objects, find back-references to objects, and force garbage collections. With such a profiler, the procedure to follow is relatively straightforward:

1. Wait until the application has reached the steady state, where you would expect most new objects are temporary objects that can be garbage collected; typically this is after all the application initializations have finished.
2. Force a garbage collection, and take an object snapshot of the heap.
3. Do whatever work it is that is causing unintentionally retained objects.
4. Force another garbage collection and then take a second object snapshot of the heap.
5. Compare the two snapshots to see which objects have increased in number from the first snapshot to the next. Because you forced garbage collections before the snapshots, the objects left should all be objects referenced by the application, and comparing the two snapshots should identify exactly those newly created objects that are being retained by the application.
6. Using your knowledge of the application, determine from the snapshot comparison which of the objects are being unintentionally retained.
7. Track back-references to find which objects are referencing the unintentionally retained objects, until you reach the root object that is causing the problem.

After following this procedure, you will know which reference is keeping the objects alive, and therefore where it needs to be released to cure the problem.
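Without a profiler, the snapshot-and-compare idea can still be approximated very crudely from inside the program. The sketch below is an assumption-laden stand-in, not a replacement for a real heap snapshot: the HeapDelta class and its methods are hypothetical, System.gc() is only a request that the JVM may ignore, and the result is a byte count rather than a per-class object count.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, invented for illustration only.
class HeapDelta {
    // Deliberately retained objects, standing in for the leak under investigation.
    static final List<byte[]> RETAINED = new ArrayList<>();

    // Request a collection, then report the heap currently in use.
    // System.gc() is only a hint, so the figure is approximate.
    static long usedHeapAfterGc() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Snapshot, run the suspect work, snapshot again, and compare.
    static long retainedBytes(Runnable suspectWork) {
        long before = usedHeapAfterGc();
        suspectWork.run();
        long after = usedHeapAfterGc();
        return after - before;
    }

    public static void main(String[] args) {
        // Temporary objects: nothing keeps them alive after the work finishes.
        long temporary = retainedBytes(() -> {
            for (int i = 0; i < 256; i++) {
                byte[] scratch = new byte[64 * 1024];
            }
        });
        // Retained objects: the static list keeps all of them reachable.
        long leaked = retainedBytes(() -> {
            for (int i = 0; i < 256; i++) {
                RETAINED.add(new byte[64 * 1024]);
            }
        });
        System.out.println("approx. bytes retained by temporary work = " + temporary);
        System.out.println("approx. bytes retained by leaky work     = " + leaked);
    }
}
```

On a typical JVM the second figure should be close to the 16 MB added to the list, while the first should be near zero, but the exact numbers vary from run to run. A real profiler is still needed to find the back-references.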

Explicitly nulling variables
Staying on the subject of garbage collection, one really fascinating discussion concerned whether there was a performance advantage to explicitly nulling variables. Nulling a variable is simply explicitly assigning null to the variable, as opposed to just letting references go out of scope.

Listing 1. Local scope


public static String scopingExample(String string) { 
  StringBuffer sb = new StringBuffer(); 
  sb.append("hello ").append(string); 
  sb.append(", nice to see you!"); 
  return sb.toString(); 
}

When the method is executing, the runtime stack holds a reference to the StringBuffer object created in the first line. As long as the method is executing, the reference to the StringBuffer object prevents that object from being considered as garbage. After the method is terminated, the variable sb goes out of scope, and the runtime stack eliminates the reference to that StringBuffer object. There is no longer any reference to the StringBuffer object, and now it can be garbage collected. This elimination of the reference is equivalent to nulling the sb variable just after the method completes.

Wrong scoping
So if the JVM does the equivalent of the nulling for you, how can explicitly nulling a variable ever help? For correctly scoped variables, there is no benefit. But let's look at another version of the scopingExample method, and this time we'll incorrectly scope the sb variable.

Listing 2. Static scope


static StringBuffer sb = new StringBuffer(); 
public static String scopingExample(String string) { 
  sb = new StringBuffer(); 
  sb.append("hello ").append(string); 
  sb.append(", nice to see you!"); 
  return sb.toString(); 
}

Now sb is a static variable, so it lasts as long as the class remains loaded in the JVM. Every time the method is executed, a new StringBuffer object is created and referenced by the variable. At that point the StringBuffer object previously referenced by the sb variable becomes dead, making it a candidate for garbage collection. The newly created StringBuffer, however, stays referenced by sb after the method returns, so the application holds onto it for much longer than it needs to -- possibly forever if no one ever calls scopingExample again.
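A small sketch makes the lingering reference visible. The ScopingDemo class and the greetStatic and greetLocal method names are hypothetical, chosen only to put the two scoping styles side by side.

```java
// Hypothetical class, invented for illustration only.
class ScopingDemo {
    // Mis-scoped: the field outlives every call to greetStatic.
    static StringBuffer sb = new StringBuffer();

    static String greetStatic(String name) {
        sb = new StringBuffer();
        sb.append("hello ").append(name);
        return sb.toString();
    }

    // Correctly scoped: the buffer is unreachable once the method returns.
    static String greetLocal(String name) {
        StringBuffer local = new StringBuffer();
        local.append("hello ").append(name);
        return local.toString();
    }

    public static void main(String[] args) {
        greetStatic("world");
        // The buffer from the last call is still reachable through the field.
        System.out.println("still reachable: \"" + sb + "\"");
    }
}
```

After greetStatic returns, the buffer's contents can still be read through the static field, which is exactly why the garbage collector must treat it as live. Nothing comparable exists for greetLocal.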

A pathological example
Even so, would explicitly nulling that variable improve performance? We would have found it difficult to believe that one object more or less could have much effect on performance, until we saw an example given by a Sun engineer at Java Games involving an unfortunately sized object.

Listing 3. Object in old space


private static Object bigObject;

public static void test(int size) { 
  long startTime = System.currentTimeMillis(); 
  long numObjects = 0; 
  while (true) { 
    //bigObject = null; //explicit nulling 
    //SizableObject could simply be a large array, e.g. byte[] 
    //In the JavaGaming discussion it was a BufferedImage 
    bigObject = new SizableObject(size); 
    long endTime = System.currentTimeMillis(); 
    ++numObjects; 
    // We print stats for every two seconds 
    if (endTime - startTime >= 2000) { 
      System.out.println("Objects created per 2 seconds = " + numObjects); 
      startTime = endTime; 
      numObjects = 0; 
    } 
  } 
}

This example simply loops, creating a large object and assigning it to the same variable, reporting the number of objects created every two seconds. Modern JVMs use a generational garbage collection scheme, creating young objects in one space (called Eden) and then moving them to another space if they survive past the first garbage collection. Collecting objects in Eden, the young generation space where new objects are created, is much faster than garbage collecting objects in the "old" generation space. But if Eden is full and no space can be reclaimed, the live objects in Eden must be moved to the old generation to make room for new objects. Without the explicit null assignment, if the object being created is large enough, then Eden gets full and the garbage collector cannot reclaim the currently referenced object. Consequently, the object gets moved to the old generation space and takes more time to garbage collect.

With the explicit null assignment, Eden gets freed each time before the new object is created, so garbage collection is much faster. In fact, with the explicit nulling, the loop creates five times as many objects in two seconds as without the explicit nulling -- but only if you choose objects that are big enough to just fill Eden, about 500 kilobytes for the default 1.4 JVM configuration on Windows. That's a fivefold performance difference due to one null assignment! But do note that this difference arises only because the variable is scoped incorrectly, for which the null assignment is simply a workaround, and because the object is very large. The whole discussion is further extended in the article "Nulling variables and garbage collection" (see Resources).
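To try the experiment yourself, a bounded variant of the loop is easier to run than the endless one. The sketch below is an adaptation, not the code from the Java Games discussion: the NullingBenchmark class, the run method, and its parameters are hypothetical, and a plain byte[] stands in for the large object. Whether any difference shows up depends heavily on the JVM version, the collector, and the heap settings, so the fivefold figure quoted above should not be expected on other configurations.

```java
// Hypothetical benchmark harness, invented for illustration only.
class NullingBenchmark {
    private static Object bigObject;

    // Allocates byte arrays of the given size for roughly durationMillis
    // and returns how many were created.
    static long run(int size, boolean nullFirst, long durationMillis) {
        long deadline = System.currentTimeMillis() + durationMillis;
        long created = 0;
        while (System.currentTimeMillis() < deadline) {
            if (nullFirst) {
                bigObject = null; // drop the old object before allocating the new one
            }
            bigObject = new byte[size];
            ++created;
        }
        return created;
    }

    public static void main(String[] args) {
        // Default size is the roughly 500 KB figure mentioned in the text.
        int size = args.length > 0 ? Integer.parseInt(args[0]) : 500 * 1024;
        System.out.println("without nulling: " + run(size, false, 2000));
        System.out.println("with nulling:    " + run(size, true, 2000));
    }
}
```

Passing a different size on the command line lets you look for the point, if any, at which the two counts diverge on your own JVM.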

Best practice
That was an interesting example, but it is worth emphasizing that the best practice is to correctly scope variables, and to not explicitly null them. Although explicitly nulling variables should normally have no effect, there are also pathological examples where it could have a significant negative effect on performance. For example, iteratively or recursively nulling elements of a collection where the collection object would otherwise be eligible for garbage collection actually adds overhead to a program rather than helping the garbage collector. Keep in mind that the example here was a deliberately mis-scoped one, essentially a case of unintentional object retention.
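The collection case can be illustrated as follows. This is a sketch under stated assumptions: the DiscardDemo class and its methods are hypothetical, and the list is a local variable that becomes unreachable as soon as the method returns.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, invented for illustration only.
class DiscardDemo {
    static List<Integer> fill(int n) {
        List<Integer> items = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            items.add(i);
        }
        return items;
    }

    // Unnecessary work: every slot is cleared even though the whole list
    // becomes unreachable when the method returns.
    static long sumThenNull(int n) {
        List<Integer> items = fill(n);
        long sum = 0;
        for (int value : items) {
            sum += value;
        }
        for (int i = 0; i < items.size(); i++) {
            items.set(i, null); // extra pass that does not help the collector
        }
        return sum;
    }

    // Preferred: rely on scoping; the list and its elements become
    // unreachable together when the method returns.
    static long sumOnly(int n) {
        List<Integer> items = fill(n);
        long sum = 0;
        for (int value : items) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumThenNull(1000) == sumOnly(1000));
    }
}
```

Both methods return the same result and leave the same objects eligible for collection; the first simply does an extra pass over the list to get there.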

Resources

Read all the tips in the Eye on performance series.
The Java Games community at java.net, formerly at www.javagaming.org, is a great place to pick up performance tips.
Take a look at this recent HotSpot article on garbage collection.
The issue of nulling references was discussed in "Nulling variables and garbage collection," in issue 060 of The Java Specialists' Newsletter.
"Handling memory leaks in Java programs" (developerWorks, February 2001) provides an excellent review of what causes Java memory leaks and when these leaks should be of concern.
Todd Sundsted's "The practice of peer-to-peer computing: Introduction and history" (developerWorks, March 2001) examines the humble beginnings of P2P and where it fits into the broader distributed computing landscape.
The authors' Web site, Java Performance Tuning, offers a wealth of performance tips and suggested reading.
Visit the Developer Bookstore for a comprehensive listing of technical books, including hundreds of Java-related titles.
Find hundreds of articles about every aspect of Java programming in the developerWorks Java technology zone.

About the authors
Jack Shirazi is the Director of JavaPerformanceTuning.com and author of Java Performance Tuning (O'Reilly). Jack was an early adopter of Java, and for the last few years has consulted primarily for the financial sector, focusing on Java performance. Contact Jack at jack@JavaPerformanceTuning.com.

Kirk Pepperdine is the Chief Technical Officer at JavaPerformanceTuning.com and has been focused on Object technologies and performance tuning for the last 15 years. Kirk is a co-author of the book ANT Developer's Handbook (SAMS). Contact Kirk at kirk@JavaPerformanceTuning.com.
