DEV Community

Nikos Katsanos

For Loops, Allocations and Escape Analysis

For Java applications in certain domains it is essential that object creation, and therefore garbage, stays at a minimum. Those applications usually cannot afford GC pauses, so they use specific techniques
and methodologies to avoid creating garbage. One of those techniques concerns iterating over a collection or an array of items: the classic for loop is preferred, and the enhanced-for loop
is avoided because 'it creates garbage' by using the collection's Iterator under the covers.

To test this claim I played around with loops, as I wanted to better understand the differences and measure the amount of garbage created by the enhanced-for loop, which
arguably has the better, more intuitive syntax.

Before experimenting, I had made some (false?) assumptions:

  • A normal for loop over an array or a collection does not create any new allocations
  • An enhanced-for loop over an array(?) or a collection does allocate
  • An enhanced-for loop over an array or a collection of primitives that accidentally autoboxes the primitive values creates new objects at a pretty high rate

To better understand the differences, and especially how the enhanced-for loop works on an array given that an array has no iterator, I followed the steps below.

Step 1: Enhanced-for Loop Under The Cover

An enhanced-for loop is just syntactic sugar, but what does it actually translate into when used on an array and when used on a collection of items?

The answer to this can be found in the Java Language Specification.

The two main points from the specification (section 14.14.2, The enhanced for statement) are:

If the type of Expression is a subtype of Iterable, then the translation is as follows.
If the type of Expression is a subtype of Iterable<X> for some type argument X, then let I be the type java.util.Iterator<X>; otherwise, let I be the raw type java.util.Iterator.
The enhanced for statement is equivalent to a basic for statement of the form:

for (I #i = Expression.iterator(); #i.hasNext(); ) {
    {VariableModifier} TargetType Identifier =
        (TargetType) #i.next();
    Statement
}

and

Otherwise, the Expression necessarily has an array type, T[].
Let L1 ... Lm be the (possibly empty) sequence of labels immediately preceding the enhanced for statement.
The enhanced for statement is equivalent to a basic for statement of the form:

T[] #a = Expression;
L1: L2: ... Lm:
for (int #i = 0; #i < #a.length; #i++) {
    {VariableModifier} TargetType Identifier = #a[#i];
    Statement
}

From the above one can see that the Iterator is indeed used by the enhanced-for loop on collections. For an array, however, the enhanced-for loop is just syntactic sugar equivalent to a normal indexed for loop.
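As a minimal sketch of the two translations, the class below writes each enhanced-for loop next to a hand-written version that follows the JLS form. The class and method names are my own and are not taken from the original test.

```java
import java.util.Iterator;
import java.util.List;

// Illustrative only: names are hypothetical, not from the original test.
class LoopDesugaring {

    // Enhanced-for over an Iterable: the compiler inserts a call to iterator().
    static long sumEnhanced(List<Integer> values) {
        long sum = 0;
        for (Integer v : values) {
            sum += v;
        }
        return sum;
    }

    // Hand-written equivalent, following the JLS translation for Iterable.
    static long sumWithIterator(List<Integer> values) {
        long sum = 0;
        for (Iterator<Integer> it = values.iterator(); it.hasNext(); ) {
            Integer v = it.next();
            sum += v;
        }
        return sum;
    }

    // Enhanced-for over an array: no Iterator is involved.
    static long sumEnhanced(int[] values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Hand-written equivalent, following the JLS translation for arrays.
    static long sumIndexed(int[] values) {
        long sum = 0;
        int[] a = values;
        for (int i = 0; i < a.length; i++) {
            int v = a[i];
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = List.of(1, 2, 3, 4);
        int[] array = {1, 2, 3, 4};
        System.out.println(sumEnhanced(list) + " " + sumWithIterator(list));
        System.out.println(sumEnhanced(array) + " " + sumIndexed(array));
    }
}
```

Each pair returns the same result. The only difference that matters for allocation is the call to iterator() in the collection case.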

After understanding how the compiler translates the enhanced-for loop in each case, our assumptions have changed:

  • A normal for loop over an array or a collection does NOT create any new allocations
  • An enhanced-for loop over an array does NOT create any new allocations
  • An enhanced-for loop over a collection does allocate
  • An enhanced-for loop over an array or a collection of primitives that accidentally autoboxes the primitive values creates new objects at a pretty high rate
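The accidental autoboxing case in the last bullet is easy to miss, so here is a small sketch of it. It assumes an int[] source, and the names are hypothetical. Declaring the loop variable as Integer makes the compiler box every element, and values outside the Integer cache (-128 to 127 by default) generally become new objects.

```java
// Illustrative only: names are hypothetical, not from the original test.
class AutoboxingLoop {

    // The loop variable is Integer, so each int element is boxed
    // (via Integer.valueOf) and then unboxed again for the addition.
    static long sumBoxed(int[] values) {
        long sum = 0;
        for (Integer v : values) {
            sum += v;
        }
        return sum;
    }

    // Same loop with a primitive loop variable: no boxing takes place.
    static long sumPrimitive(int[] values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] values = {1_000, 2_000, 3_000};
        System.out.println(sumBoxed(values) + " " + sumPrimitive(values));
    }
}
```

Both methods compute the same sum. The only difference is the declared type of the loop variable.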

Step 2: Defining The Test

In order to test the different scenarios I have created a very simple test which can be seen here

The test itself is very simple. The main points to notice are:

  • The test creates a static array and a static ArrayList and pre-populates them with 100,000 integers. In the case of the array those are primitives, but in the case of the collection, as we use a plain ArrayList, those are actual Integer objects
  • The test executes the different for loop example scenarios 1,000,000 times
  • The memory used is read before the iterations start and is compared throughout the execution (every 100 invocations) of the program in order to determine if the memory profile has changed
  • The test scenarios include:
    • A for loop over an array
    • An enhanced-for loop over an array
    • An enhanced-for loop over an array, by also autoboxing the elements
    • A for loop over a collection
    • An enhanced-for loop over a collection
    • An iterator based for loop over a collection, replicating the behaviour of enhanced-for loop's syntactic sugar
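The setup described above can be sketched roughly as follows. This is not the original test: the class name, method names, and the way memory changes are counted are my own assumptions, and only the enhanced-for-over-a-collection scenario is shown. Used memory is read through Runtime, which is a coarse measure because the JVM hands out heap to threads in chunks.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a simplified stand-in for the linked test.
class AllocationProbe {
    static final int SIZE = 100_000;
    static final int[] ARRAY_VALUES = new int[SIZE];
    static final List<Integer> LIST_VALUES = new ArrayList<>(SIZE);
    static int memoryChanges; // how many checks saw a different used-memory value

    static {
        for (int i = 0; i < SIZE; i++) {
            ARRAY_VALUES[i] = i;   // primitives
            LIST_VALUES.add(i);    // boxed Integer objects
        }
    }

    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Scenario under test: enhanced-for loop over a collection.
    static long forEachLoopList() {
        long sum = 0;
        for (Integer v : LIST_VALUES) {
            sum += v;
        }
        return sum;
    }

    // Runs the scenario and checks used memory every 100 invocations.
    // The running total is returned so the loops cannot simply be discarded.
    static long run(int iterations) {
        long total = 0;
        long last = usedMemory();
        for (int i = 1; i <= iterations; i++) {
            total += forEachLoopList();
            if (i % 100 == 0) {
                long now = usedMemory();
                if (now != last) {
                    memoryChanges++;
                    last = now;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long total = run(1_000);
        System.out.println("total=" + total + ", memory changes seen=" + memoryChanges);
    }
}
```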

Step 3: Running The Test

We ran the test with the below setup:

  • OS: macOS Catalina (10.15.3), Core i5 @2.6GHz, 8GB DDR3
  • JDK: openjdk version "13.0.2" 2020-01-14
  • JVM_OPTS: -Xms512M -Xmx512M -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC

We use EpsilonGC in order to eliminate any garbage collection and let the memory usage simply increase.

When running the test, some scenarios were easy to verify according to our expectations:

  • A for loop over an array or a collection does not create any allocations
  • An enhanced-for loop over an array does not create any allocations
  • An enhanced-for loop over an array with autoboxing does indeed create new objects

However, the remaining scenarios, and the assumption that an enhanced-for loop over a collection allocates a new Iterator on every loop, could not be confirmed by running the above test with the above JVM options. No matter what,
the memory profile was steady: no new allocations were taking place on the heap.

The first step of the investigation was to make sure that the bytecode indicates that a new object gets created. Below is the bytecode, which verifies that a call to get the iterator takes place at offset 5:

  private static long forEachLoopListIterator();
    Code:
       0: lconst_0
       1: lstore_0
       2: getstatic     #5                  // Field LIST_VALUES:Ljava/util/List;
       5: invokeinterface #9,  1            // InterfaceMethod java/util/List.iterator:()Ljava/util/Iterator;
      10: astore_2
      11: aload_2
      12: invokeinterface #10,  1           // InterfaceMethod java/util/Iterator.hasNext:()Z
      17: ifeq          39
      20: lload_0
      21: aload_2
      22: invokeinterface #11,  1           // InterfaceMethod java/util/Iterator.next:()Ljava/lang/Object;
      27: checkcast     #8                  // class java/lang/Integer
      30: invokevirtual #4                  // Method java/lang/Integer.intValue:()I
      33: i2l
      34: ladd
      35: lstore_0
      36: goto          11
      39: lload_0
      40: lreturn

As we are using an ArrayList, the next step is to see what the call to #iterator() does. It does indeed create a new iterator object, as can be seen in the ArrayList source code:

    public Iterator<E> iterator() {
        return new Itr();
    }

Given the above, a steady memory profile does not make much sense. Something else is definitely going on. It might be that the test is wrong (e.g. some code is removed by the JIT because the returned value of that block is never used).
This should not be happening, though, as the returned value of every method that exercises a loop is used to make a decision further down in the program, hence the loops must be executed.

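My final thought was the 'unlikely' scenario that the objects were being placed on the stack. HotSpot is known to perform this kind of optimization using the output of Escape Analysis (strictly speaking, the C2 compiler replaces a non-escaping object with its individual fields, known as scalar replacement, rather than literally allocating the object on the stack).
To be honest I had never seen it happening (or at least I never had the time to verify it was indeed happening) until now.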
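To make the idea of an object 'escaping' more concrete, here is a small sketch with hypothetical names. In the first method the object never leaves the method, so the JIT is free to avoid the heap allocation. In the second it is stored in a static field, so it escapes and has to live on the heap. Whether the optimization actually kicks in depends on the JIT having compiled and inlined the code, so this is an illustration rather than a guarantee.

```java
// Illustrative only: names are hypothetical.
class EscapeExample {

    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static Point lastSeen; // storing an object here makes it escape

    // The Point is only used inside this method: a candidate for elimination.
    static int nonEscaping(int a, int b) {
        Point p = new Point(a, b);
        return p.x + p.y;
    }

    // The Point is published through a static field, so it must be heap allocated.
    static int escaping(int a, int b) {
        Point p = new Point(a, b);
        lastSeen = p;
        return p.x + p.y;
    }

    public static void main(String[] args) {
        System.out.println(nonEscaping(2, 3) + " " + escaping(2, 3));
    }
}
```

The Iterator in an enhanced-for loop resembles the first method: it is created, used, and dropped within the loop, provided the iterator() call and its methods are inlined.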

Step 4: Running Without Escape Analysis

The easiest and fastest way to verify the above assumption, that Escape Analysis was feeding into the JIT and causing the objects to be allocated on the stack, is to turn off Escape Analysis. This can
be done by adding -XX:-DoEscapeAnalysis to our JVM options.

Indeed, running the same test again, this time we can see that the memory profile for an enhanced-for loop over a collection is steadily increasing. The Iterator objects created by ArrayList#iterator()
are being allocated on the heap on each loop.
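A different way to observe the same effect, which the original test does not use, is to ask the JVM how many bytes the current thread has allocated. The sketch below assumes a HotSpot-based JDK, where the thread MXBean can be cast to com.sun.management.ThreadMXBean. The class name, list size, and loop counts are arbitrary choices of mine.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: an alternative measurement, not the original test.
class IteratorAllocationCheck {
    static final List<Integer> LIST_VALUES = new ArrayList<>();
    static long sink; // keeps the computed sums observable

    static {
        for (int i = 0; i < 1_000; i++) {
            LIST_VALUES.add(i);
        }
    }

    static long forEachLoopList() {
        long sum = 0;
        for (Integer v : LIST_VALUES) {
            sum += v;
        }
        return sum;
    }

    // Bytes allocated so far by the current thread (-1 if unsupported).
    static long allocatedBytes() {
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        return bean.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    // Bytes allocated by the current thread while running the loop `loops` times.
    static long measureAllocatedBytes(int loops) {
        long before = allocatedBytes();
        long sum = 0;
        for (int i = 0; i < loops; i++) {
            sum += forEachLoopList();
        }
        long after = allocatedBytes();
        sink = sum;
        return after - before;
    }

    public static void main(String[] args) {
        measureAllocatedBytes(20_000); // warm up so the JIT can compile the loop
        long bytes = measureAllocatedBytes(1_000);
        System.out.println("bytes allocated over 1,000 loops: " + bytes);
    }
}
```

Running this once with the default options and once with -XX:-DoEscapeAnalysis should show the difference described above, although I would treat the exact numbers as JVM and version dependent.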

Conclusion

For me at least, the above finding was quite interesting. On many occasions, mainly because of lack of time, we just make assumptions and empirically follow practices that are "known to be working". Especially for people who
work in a delivery-oriented environment, without the luxury of doing some research, I would think this is normal. It is interesting, though, to actually do some research from time to time and try to prove or better understand a point.

Finally, it is worth saying that the above behaviour was observed in an experiment rather than in actual code. I would imagine that the majority of cases in a production system do not exhibit this behaviour (i.e. allocating on the stack), but the
fact that the JIT is such a sophisticated piece of software is very encouraging, as it can proactively optimize our code without us even realizing the extra gains.

This post is also hosted on my personal blog nikoskatsanos.com
