DEV Community

Ted M. Young

Stop Trying to Outsmart the Java Compiler

When Java 9 was (finally) released, it was considered a disappointment, or a disaster, depending on who you talked to. Lost among the debates about Project Jigsaw (the Java Module System) was a big improvement to the way strings are concatenated.

When I talk to even seasoned Java coders, many aren't aware of the changes that JEP 280 brought to the world. (JEP stands for "JDK Enhancement Proposal", part of the process of bringing new features into Java.) Unfortunately, the title of that proposal, "Indify String Concatenation", didn't exactly help. (For what it's worth, "indify" here means using invokedynamic, which doesn't help if you don't know what that means either.)

I'll summarize the changes and benefits, but leave the detailed explanations and performance benchmarks to the experts.

Before Java 9

When you concatenated strings, like this:

String fullName = first + " " + last;

the Java compiler would automatically change it to be:

String fullName = new StringBuilder().append(first).append(" ").append(last).toString();

The idea here is that StringBuilder is generally more performant. The nice thing is, you didn't have to write that StringBuilder code yourself! But what happens in the future when there are new and faster ways to concatenate strings? Do we have to then replace all of our StringBuilders? (For those who remember, we did that with the move from StringBuffer to StringBuilder when we realized that synchronizing everything wasn't a good idea.)
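As a quick sanity check of that rewrite, here is a minimal sketch that puts the two forms side by side. The class and method names (ConcatEquivalence, withPlus, withBuilder) are made up for illustration; the point is only that both produce the same String.

```java
class ConcatEquivalence {
    // What you write: plain + concatenation.
    static String withPlus(String first, String last) {
        return first + " " + last;
    }

    // Roughly what a pre-Java 9 compiler generated for the method above.
    static String withBuilder(String first, String last) {
        return new StringBuilder().append(first).append(" ").append(last).toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("Ada", "Lovelace"));
        System.out.println(withBuilder("Ada", "Lovelace"));
    }
}
```

Both calls return equal strings, so the hand-written StringBuilder version buys nothing over + for a single expression like this.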

With Java 9 and JEP 280, the compiler doesn't try to pick the best approach (StringBuilder? StringBuffer? Plain concatenation?). It just emits a call to a method that the runtime (the JVM) can replace with something optimized for the runtime behavior of the code. Essentially, the code above would look like:

String fullName = makeConcatWithConstants(first, last, SPACE_CONSTANT);

(Though that's not exactly right, because there's more going on -- read JEP 280 for the details -- it's close enough for this discussion.)
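For the curious, the real entry point is java.lang.invoke.StringConcatFactory.makeConcatWithConstants, which the compiler references as the bootstrap method of an invokedynamic instruction. The sketch below calls it by hand (something normal code never needs to do) just to show the "recipe" idea: the character U+0001 marks where each argument goes, and the space is baked into the recipe as a constant. It assumes Java 9 or later, and the class and method names (IndyConcatSketch, fullName) are hypothetical.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

class IndyConcatSketch {
    // Recipe for first + " " + last: two argument slots (U+0001) around a literal space.
    private static final String RECIPE = "\u0001 \u0001";

    static String fullName(String first, String last) {
        try {
            // A real invokedynamic call site is bootstrapped once and then reused;
            // this sketch bootstraps on every call to keep the example short.
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "makeConcatWithConstants",
                    MethodType.methodType(String.class, String.class, String.class),
                    RECIPE);
            return (String) site.getTarget().invokeExact(first, last);
        } catch (Throwable t) {
            // invokeExact declares Throwable, so wrap whatever comes out.
            throw new IllegalStateException("string concat bootstrap failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(fullName("Ada", "Lovelace"));
    }
}
```

The interesting part is what is missing: nothing here says StringBuilder. The strategy for gluing the pieces together is chosen by the JDK you run on, not frozen into your class files.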

As Claes Redestad says in the "String concatenation, redux" post: "optimize the runtime, not the bytecode".

Not For Loops

While the compiler is smart enough to optimize the string concatenation above, if you're in a loop, especially a very large one, you'll need to manually create a StringBuilder outside of the loop and use its .append() method for the concatenation. For example, if you write

    String line = "";
    for (int i = 0; i < 20; i++) {
      line = line + "line (" + i + ")\n";
    }
    System.out.println(line);

then the compiler will only optimize the concatenation going on inside the loop body, which is still not as efficient as the "manual" optimization of pulling the StringBuilder outside the loop, like this:

    StringBuilder lineBuilder = new StringBuilder();
    for (int i = 0; i < 20; i++) {
      lineBuilder.append("line (").append(i).append(")\n");
    }
    String line = lineBuilder.toString();

On the other hand, if your loop is guaranteed to be pretty small (like this one, with the constant of 20 lines), then I'd lean towards the more readable + concatenation rather than StringBuilder.
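To see why + costs more inside a big loop, here is roughly what a pre-Java 9 compiler turned the loop body into. This is a sketch with hypothetical names (LoopExpansionSketch, expandedLoop), not decompiled output. Java 9 and later route the same expression through invokedynamic instead, but each pass still produces a brand-new String that contains everything accumulated so far.

```java
class LoopExpansionSketch {
    static String expandedLoop(int count) {
        String line = "";
        for (int i = 0; i < count; i++) {
            // A fresh builder on every pass: the whole accumulated string is
            // copied in by append(line) and copied out again by toString().
            line = new StringBuilder()
                    .append(line)
                    .append("line (")
                    .append(i)
                    .append(")\n")
                    .toString();
        }
        return line;
    }

    public static void main(String[] args) {
        System.out.print(expandedLoop(3));
    }
}
```

The copying grows with the length of the string built so far, which is why hoisting a single StringBuilder out of the loop pays off for large iteration counts and barely matters for 20.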

Don't Trust Me and Maybe Not Benchmarks

You can try "micro" benchmarks on the above code (using the JMH tool), but these days there are so many interactions among the JVM (including its garbage collector and optimizer), the operating system, and the chip you're running on that it's difficult to generalize from a micro-benchmark. I'd much rather rely on higher-level measurements, such as load testing, to find bottlenecks. I often tell my Java students that you're much more likely to have slow systems because you're slinging JSON around, which you have to parse and serialize, than because your string concatenation isn't optimized.

Slower in Java 11?

While all of this (and other things) has resulted in better String concatenation performance (see Heinz Kabutz's presentations linked in the references section), not everything has been smooth. It turns out there is a performance regression (unresolved when I wrote this) in Java 11.0.2 where string concatenation might be a bit slower. See this bug for details.

References:

  1. JEP 280 - Indify String Concatenation
  2. Video: "Enough java.lang.String to Hang Ourselves" by Heinz Kabutz
  3. Slides from "Enough java.lang.String to Hang Ourselves"
  4. "JDK 9/JEP 280: String Concatenations Will Never Be the Same" by Dustin Marx
  5. "String concatenation, redux" from Claes Redestad
  6. "Digging into JEP 280: Indify String Concatenation" by Mete Balci
