Damodar's Musings

web development and miscellany

Browsing Posts published in April, 2010

The image below demonstrates what the JVM heap might look like after an application has been running for some time.

As we saw in the last article, the JVM’s heap is divided into 3 main areas – the permanent generation (purple), the old generation (yellow), and the young generation. The young generation area is further subdivided into Eden (green), and two survivor spaces (blue and orange).

The root set of an application is a set of objects that define the highest level references available to the application. The most common members of this set are any static references available to the application, as well as any local variable references to objects that have been allocated by the application itself.

An object on the heap is considered to be “live” if a path can be traced from a root set member to that given object; otherwise the object is considered “dead”. Live objects are said to have passed the “reachability” test, whereas dead objects have failed that test. Live objects are important and must be preserved, whereas dead objects are considered garbage, and must be reclaimed during the next garbage collection cycle.

In the image below, live objects are indicated by a bright brick pattern, while dead objects are represented using a washed out wavy pattern.

We will use this diagram for further exploration of the GC mechanism.

Garbage collectors can typically be described using these dimensions:

  1. Tracing garbage collectors
    This describes a collector that relies on the reachability test to determine which objects should be preserved, and which ones are garbage and so need to be reclaimed.
  2. Stop the World or Concurrent collectors
    Collectors often rely on the fact that no mutations of the heap may occur once the garbage collector has begun its operation. A Stop the World collector freezes all activity in the JVM until it has completed its cycle. On the other hand, a concurrent collector attempts to run the garbage collection process concurrently with the running application. Allowing concurrent operation ratchets up the complexity of its implementation significantly. In addition, such collectors may not be as aggressive at cleaning up garbage as Stop the World collectors.
  3. Compacting collectors
    As memory allocations and deallocations occur over the life of an application, the heap tends to get rather fragmented. This degrades the ability of the JVM to satisfy requests for large contiguous blocks of memory. Compacting collectors address this by defragmenting the heap – by moving live objects so that they are adjacent to each other, maximizing the contiguous free space available on the heap. This results in more efficient memory allocation.

    An alternative to compacting is to allow an object to span non contiguous blocks of memory. The tradeoff is increased implementation complexity and memory management overhead.

  4. Generational Collectors
    Most objects allocated by an application are short lived. Usually, these objects survive only for the duration of a single method call, before being eligible for reclamation.

    Further, the effort required to manage memory is directly proportional to the size of the memory block that needs to be managed. In other words, the smaller the area to be managed, the faster the collection will complete, thereby increasing the amount of CPU time available to the application itself.

    A generational collector takes advantage of both these principles.

    First, it divides the heap into multiple areas – to reduce the area that must be combed during a garbage collection cycle.

    Second, it ensures that each area holds objects of a given generation. The permanent generation hold objects that are relatively permanent; the old generation holds objects that are considered elderly; and the young generation holds objects that are relative newcomers (either newborn or middle aged). This ensures that most of the collection activity can be focused on the small area of the heap that contains the volatile young generation; with old generation collection being required only rarely.

Garbage collectors in the Java world, contain one or more of the above characteristics. For instance, the standard serial collector is a generational, tracing, Stop-the-World, and compacting, type of collector.

The Young Generation area is sub divided further into Eden and two Survivor spaces. At any given time, there is only one “active” survivor space (labeled “From”). The inactive space is labeled “To”. After each successful garbage collection cycle within the Young Generation, the active and inactive survivor spaces switch roles.

Let us assume that 2 new allocations (the green checkerboard pattern) are being requested by the application. If there is enough memory available in Eden, the allocation is made in Eden (as shown), and the application proceeds with no latency being experienced.

The diagram shows that the first allocation succeeded and now resides on the heap.

However, the second allocation request was larger than available space in the Eden! The JVM is unable to fulfill the request, and so it awakens the garbage collector to come to its aid. This results in a Young Generation collection cycle.

The Young Generation collector begins by performing the reachability test on all objects in Eden and in the currently active Survivor space (labeled From). The live objects are identified using the reachability test. Next, it begins copying the active objects out into either the inactive Survivor space (if the object is fairly new) or into the Old Generation area (if the object has survived a number of previous young generation collections). This results in the heap as shown:

A few things to note here:

  1. The newly allocated object is created in Eden (as we’d expect)
  2. The active and inactive Survivor spaces switch roles (and are re-labeled) at the end of the garbage collection cycle.
  3. The previous contents of the Young Generation [Eden + previous From] have now been copied over either to the inactive Survivor space or to the Old Generation area. Here, 2 middle-aged active objects (blue) from the previous From, and 2 newborn active objects (green) from Eden have been copied over to the inactive Survivor space; and 1 elderly active object (blue) from the previous From has been copied over to the Old Generation.
  4. The requested object allocation (checkerboard green) was now satisified from Eden, and it now resides there.
  5. Both Eden and the previously inactive survivor space (To) are nicely defragmented, with active objects on one side and free space on the other. (I’ve not shown the areas marked as garbage in these areas).

This is a fairly quick and painless collection operation, and even if the collector were to stop the world, the pause would be rather unnoticeable. Hence, a Young Generation collection is also termed a “Minor Collection”.

So, what happens when the Old Generation itself becomes full? Let’s reimagine our heap as shown below, with a new object allocation waiting to be satisfied.

In this case, a Young Generation collection is called for because there isn’t enough space available in Eden to satisfy this allocaiton request. However, unlike the earlier scenario, the promotion of the middle-aged object from the active Survivor space to being an elderly object in the old generation cannot proceed because there is no free space available in the Old Generation. In this case, the Young Generation collection is put on hold, while an Old Generation collection cycle is spun up to free up space in the Old Generation.

This results in what is termed a Major Collection, since this activity is expensive (given the size of the entire heap), and the pause suffered by the application (for Stop The World collectors) is noticeable.

However, in the end, the net result of the heap should be exactly as shown as the result of the Young Collection.

This ends part 2 of this series.

Continue on to part 3 >

Garbage collection in the JVM is often treated as a dark art. We know that we’re supposed to be thankful to the JVM for freeing us from worrying about the intricacies of memory management. At the same time, we’d like to retain some amount of control over this process as well. The challenge for most developers is that we’re not sure how.

In this series of posts, I’m going to discuss the garbage collection mechanisms available in Java 6, and how we might train it to do our bidding.

Languages such as C required the programmer to be keenly aware of the memory requirements of a program. You knew exactly how many bytes you needed for a data structure, as well as how to request the operating system for those bytes in memory. With that kind of power came the responsibility of ensuring that you freed this memory once you were done with it. Without the explicit freeing of this memory, the data structure was locked in memory, remaining unavailable both to your program as well as to the operating system. It is easy to imagine how a poorly written program could starve itself by being profligate in its misuse of memory.

A new world order was established with the introduction of Java and the managed memory model it introduced. No more did the application control the allocation and freeing of memory – all that was now the purview of the virtual machine. All that an application did was use the new operator to allocate an object, and all the magic happened behind the scenes. Not only was the memory allocated automatically from the heap, but also was freed automatically whenever that object was no longer needed. This article describes the magic behind this mechanism. The Java heap is an area in memory that is allocated to the virtual machine, and is used to meet the memory needs of a running application. This block of memory is specified using the -Xms and -Xmx VM parameters, as shown:

java -Xms256m -Xmx512m …

This informs the java interpreter that we are requesting a starting heap size of 256MB. If the application’s memory needs exceed that limit, then the JVM may request additional memory from the operating system, until the maximum limit is reached, at 512MB. If an application requires memory beyond this maximum limit, then an Out of Memory error will result. An optimization is to set both these values to the same number. This ensures that the maximum allowable memory is allocated at one shot, and no further dynamic expansion of the heap has to happen.

Heap Structure

The JVM’s heap is not simply a linear byte array. Instead, it is comprised of the following 3 areas:

Permanent Area

This area contains Class and  Method objects for the classes required by the application. This area is not constrained by the limit imposed by the -Xmx parameter.

The size of this area is managed using the -XX:PermSize and -XX:MaxPermSize JVM parameters. The former sets the initial size, while the latter sets the maximum size of this area. Again, in order to prevent the dynamic growing of this space, and the resulting slowdown as the garbage collector kicks in, you could set both these parameters to the same value. Further, make sure you set this space large enough to hold all the classes needed by your application – else your application will fail with an error that indicates that you are out of PermGen space – even though your heap may have plenty of headroom available. This area was called “permanent” because older JVMs (prior to 1.4) would never garbage collect this area. Objects loaded into here were locked in place until the JVM exited. Newer VMs provide the -noclassgc parameter that lets you tune this behavior. If this parameter is not set, the JVM will garbage collect within this area if it needs memory, especially during a Full Collection cycle (we’ll see more about this in a bit).

Young Generation Area

Most objects created by an application are ephemeral – they only live for a very short period of time. It makes sense therefore for such objects to be confined to a fairly small sandbox that can be combed through on a very frequent basis. This improves the efficiency of the collection operation for two reasons – first, collections tend to be much faster when the area to comb is small; and second, the results of a collection tend to be much more productive as most of the objects created here are short lived and so their space can be readily reclaimed.

Old Generation Area

This area generally contains objects that are fairly long lived, i.e., those that have survived multiple collection cycles within the young generation area. The idea behind promoting long lived objects here is to avoid the overhead of continually managing these objects in the young generation area. I.e., it helps optimize the young generation collections to not have to bother with objects that are known to be long lived. This is often much larger than the new generation area, and so garbage collection is much more involved here.

<End of part 1>

Continue on to part 2 >

Powered by WordPress Web Design by SRS Solutions © 2012 Damodar's Musings Design by SRS Solutions