Garbage collection in the JVM is often treated as a dark art. We know that we’re supposed to be thankful to the JVM for freeing us from worrying about the intricacies of memory management. At the same time, we’d like to retain some amount of control over this process as well. The challenge for most developers is that we’re not sure how.

In this series of posts, I’m going to discuss the garbage collection mechanisms available in Java 6, and how we might train them to do our bidding.

Languages such as C required the programmer to be keenly aware of the memory requirements of a program. You knew exactly how many bytes you needed for a data structure, as well as how to request those bytes from the operating system. With that kind of power came the responsibility of ensuring that you freed this memory once you were done with it. If the memory was never explicitly freed, the data structure stayed locked in memory, unavailable both to your program and to the operating system. It is easy to imagine how a poorly written program could starve itself by being profligate in its use of memory.

A new world order was established with the introduction of Java and its managed memory model. No longer did the application control the allocation and freeing of memory – all that was now the purview of the virtual machine. All that an application did was use the new operator to allocate an object, and all the magic happened behind the scenes. Not only was the memory allocated automatically from the heap, but it was also freed automatically once that object was no longer needed. This article describes the magic behind this mechanism.

The Java heap is an area in memory that is allocated to the virtual machine, and is used to meet the memory needs of a running application. The size of this block of memory is specified using the -Xms and -Xmx VM parameters, as shown:

java -Xms256m -Xmx512m …

This tells the JVM that we are requesting a starting heap size of 256MB. If the application’s memory needs exceed that size, the JVM may request additional memory from the operating system until the maximum limit of 512MB is reached. If the application requires memory beyond this maximum limit, the JVM throws a java.lang.OutOfMemoryError. A common optimization is to set both values to the same number. The heap is then sized at its maximum from the start, and the JVM does not have to grow it dynamically as the application runs.
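To see what these two parameters actually did, we can ask the running JVM for its heap sizes. The following is a minimal sketch that uses only the standard java.lang.Runtime methods; the class name HeapSizes is made up for illustration.

```java
// Prints the heap sizes the JVM is currently working with.
class HeapSizes {
    private static final long MB = 1024L * 1024L;

    /** Returns {committed, maximum, free} heap sizes in bytes, as reported by Runtime. */
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.totalMemory(), rt.maxMemory(), rt.freeMemory() };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        // totalMemory() is the heap currently committed; it starts near -Xms.
        System.out.println("committed heap: " + s[0] / MB + " MB");
        // maxMemory() is the ceiling the heap may grow to; it tracks -Xmx.
        System.out.println("maximum heap:   " + s[1] / MB + " MB");
        // freeMemory() is the unused part of the committed heap.
        System.out.println("free heap:      " + s[2] / MB + " MB");
    }
}
```

Running it as java -Xms256m -Xmx512m HeapSizes should report a committed heap of roughly 256 MB and a maximum of roughly 512 MB. The figures are usually a little lower than the flags suggest, because some collectors leave part of the young generation out of these numbers.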

Heap Structure

The JVM’s heap is not simply a linear byte array. Instead, it is divided into the following three areas:

Permanent Area

This area contains the Class and Method objects for the classes required by the application. This area is not constrained by the limit imposed by the -Xmx parameter.

The size of this area is managed using the -XX:PermSize and -XX:MaxPermSize JVM parameters. The former sets the initial size, while the latter sets the maximum size of this area. Again, in order to prevent the dynamic growing of this space, and the resulting slowdown as the garbage collector kicks in, you could set both these parameters to the same value. Further, make sure you set this space large enough to hold all the classes needed by your application. Otherwise your application will fail with "java.lang.OutOfMemoryError: PermGen space", even though your heap may have plenty of headroom available.

This area was called “permanent” because older JVMs (prior to 1.4) would never garbage collect this area. Objects loaded into here were locked in place until the JVM exited. Newer VMs provide the -Xnoclassgc parameter that lets you tune this behavior. If this parameter is not set, the JVM will garbage collect within this area if it needs memory, especially during a Full Collection cycle (we’ll see more about this in a bit).
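One way to check how the permanent area relates to the rest of the heap is to list the memory pools that the JVM exposes through the standard java.lang.management API. The sketch below prints every pool with its type, current usage and maximum size. The class name MemoryPools and the helper mb are made up for illustration. Pool names are not standardized: they depend on the JVM vendor, version and collector in use, so treat any specific name you see (such as a pool containing "Perm Gen" on a Java 6 HotSpot VM) as an observation about your setup, not a guarantee.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Lists every memory pool the running JVM exposes, with its type and limits.
class MemoryPools {
    private static final long MB = 1024L * 1024L;

    /** Formats a byte count in MB; the management API reports -1 when a value is undefined. */
    static String mb(long bytes) {
        return bytes < 0 ? "undefined" : (bytes / MB) + " MB";
    }

    /** Builds one line per pool: name, type (heap or non-heap), used and maximum size. */
    static List<String> report() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage == null) {
                continue;
            }
            lines.add(pool.getName() + " [" + pool.getType() + "]"
                    + " used=" + mb(usage.getUsed())
                    + " max=" + mb(usage.getMax()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : report()) {
            System.out.println(line);
        }
    }
}
```

The type column is the interesting part. Pools that count against -Xmx are reported as heap memory, while a pool governed by its own limit, such as -XX:MaxPermSize, would be expected to show up as non-heap memory.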

Young Generation Area

Most objects created by an application are ephemeral – they only live for a very short period of time. It makes sense therefore for such objects to be confined to a fairly small sandbox that can be combed through on a very frequent basis. This improves the efficiency of the collection operation for two reasons – first, collections tend to be much faster when the area to comb is small; and second, the results of a collection tend to be much more productive as most of the objects created here are short lived and so their space can be readily reclaimed.
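We can watch this behavior with a small experiment: allocate a large number of objects that become garbage immediately, and count how many collections the JVM ran while we did so. This is a sketch, not a benchmark. The class name ShortLivedChurn, the 1 KB array size and the iteration count are arbitrary choices for illustration, and the JIT compiler is free to optimize some of these allocations away, so the numbers will vary from run to run.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Creates many short-lived objects and reports how many collections they triggered.
class ShortLivedChurn {

    /** Sums the collection counts of all collectors the JVM exposes. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Allocates 'count' small arrays; each one is unreachable after its loop iteration. */
    static long churn(int count) {
        long checksum = 0;
        for (int i = 0; i < count; i++) {
            byte[] scratch = new byte[1024];
            scratch[0] = (byte) i;
            checksum += scratch[0]; // use the array so the allocation is not trivially dead
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        long checksum = churn(2000000); // roughly 2 GB allocated in total, 1 KB at a time
        long after = totalCollections();
        System.out.println("checksum: " + checksum);
        System.out.println("collections during churn: " + (after - before));
    }
}
```

Even though the program allocates far more memory in total than a typical heap can hold, it should run without an OutOfMemoryError, because each array is reclaimable almost as soon as it is created. Adding -verbose:gc to the command line shows the individual collections as they happen.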

Old Generation Area

This area generally contains objects that are fairly long lived, i.e., those that have survived multiple collection cycles within the young generation area. The idea behind promoting long lived objects here is to avoid the overhead of continually managing these objects in the young generation area. In other words, young generation collections are cheaper when they do not have to bother with objects that are known to be long lived. The old generation is often much larger than the young generation area, and so garbage collection is much more involved here.
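By way of contrast with short lived objects, here is a sketch of data that stays reachable across collections, which is the kind of data that would be expected to end up in the old generation. The class name LongLivedData and the sizes used are made up for illustration. Note that System.gc() is only a request to the JVM, and whether and when a given object is promoted depends on the collector and its settings, so the printed figures are indicative at best.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Holds on to a block of data and reports heap usage while it is reachable and after it is dropped.
class LongLivedData {
    private static final long MB = 1024L * 1024L;

    /** Heap bytes currently in use, as reported by the memory MXBean. */
    static long usedHeap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    /** Allocates 'chunks' arrays of 'chunkSize' bytes; they live as long as the caller keeps the list. */
    static List<byte[]> retain(int chunks, int chunkSize) {
        List<byte[]> kept = new ArrayList<byte[]>(chunks);
        for (int i = 0; i < chunks; i++) {
            kept.add(new byte[chunkSize]);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<byte[]> cache = retain(16, 1024 * 1024); // about 16 MB that we keep referencing
        System.gc(); // a hint only; the JVM may ignore it
        System.out.println("used while retained: " + usedHeap() / MB + " MB");
        System.out.println("retained chunks:     " + cache.size());

        cache = null; // drop the only reference, making the data collectable
        System.gc();
        System.out.println("used after release:  " + usedHeap() / MB + " MB");
    }
}
```

As long as the list is referenced, no collection can reclaim its contents, however many times the young generation is combed. Once the reference is dropped, the space becomes eligible for reclamation, though it may take a collection of the old generation to actually free it.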

<End of part 1>
