In this article, we benchmark the performance of the String and StringBuilder classes in Java and discuss how to modify strings efficiently.
Strings in Java are immutable; modifying the String creates a new String object in the heap memory with the latest content, and the original String is never changed.
Immutability
String str = "You cannot modify ";
str = str + "me";
When we append the value "me" to the str variable, a new String object gets created with the new value "You cannot modify me" and gets assigned to str. The original string "You cannot modify " does not change.
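The behavior described above can be checked with a small sketch. The class and variable names here are hypothetical and chosen only for illustration; the point is that a second reference to the original string still sees the old content after the concatenation.

```java
// Illustrative sketch; ImmutabilityDemo and its variable names are hypothetical.
class ImmutabilityDemo {

    // Returns the value still visible through the first reference
    // after a second reference has been "modified" with +.
    static String originalAfterConcat() {
        String original = "You cannot modify ";
        String modified = original;   // both names refer to the same object
        modified = modified + "me";   // + builds a new String; original is untouched
        return original;
    }

    public static void main(String[] args) {
        String original = "You cannot modify ";
        String modified = original + "me";
        System.out.println("[" + original + "]");  // prints "[You cannot modify ]"
        System.out.println("[" + modified + "]");  // prints "[You cannot modify me]"
        System.out.println(original == modified);  // prints "false": two distinct objects
    }
}
```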
Performance
Frequently modifying strings, for example with the + operator, has significant performance costs: every time + is used to append, a new String object gets created and reassigned.
To modify strings efficiently, we should consider StringBuilder, which appends into an internal buffer instead of creating a new String object for every change. The buffer is reallocated only occasionally, when it runs out of capacity.
String Modification
Use the StringBuilder class to modify the string; an append changes the existing StringBuilder object rather than creating a new String object.
StringBuilder str = new StringBuilder("You can modify ");
str.append("me");
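As a minimal sketch of what "changes the existing one" means, the example below relies on the documented fact that append returns a reference to the same StringBuilder. The class and method names are hypothetical.

```java
// Illustrative sketch; InPlaceAppendDemo and its method names are hypothetical.
class InPlaceAppendDemo {

    // append returns a reference to the builder it was called on,
    // so no second builder object is involved.
    static boolean appendReturnsSameObject() {
        StringBuilder builder = new StringBuilder("You can modify ");
        StringBuilder returned = builder.append("me");
        return returned == builder;
    }

    // A String is produced only once, when toString() is called at the end.
    static String build() {
        StringBuilder builder = new StringBuilder("You can modify ");
        builder.append("me");
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(appendReturnsSameObject()); // prints "true"
        System.out.println(build());                   // prints "You can modify me"
    }
}
```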
Performance Benchmarking
To benchmark the concatenation operation with String and StringBuilder, consider the following setup.
- Consider 10 data points.
- inputSample = [100k, 200k, 300k, 400k, 500k, 600k, 700k, 800k, 900k, 1m].
- Start with an empty string and concatenate the string "a" n times, where n = inputSample[i], e.g., n = 700k.
- We want to know how long it takes to concatenate the string "a" n times for each value in inputSample using String and StringBuilder.
String Class
public class StringBenchmark {

    public static void main(String[] args) {
        String appendCharacter = "a";
        int[] inputSample = {
            100000, 200000, 300000, 400000,
            500000, 600000, 700000, 800000,
            900000, 1000000};
        for (int n : inputSample) {
            // System.nanoTime() returns a long; keep it as a long until the final conversion
            long startTime = System.nanoTime();
            testStringAppend(n, "", appendCharacter);
            long endTime = System.nanoTime();
            // Convert elapsed nanoseconds to seconds
            double duration = (endTime - startTime) / 1_000_000_000.0;
            String seconds = String.format("%.2f", duration);
            System.out.println("n = " + n + ": seconds: " + seconds);
        }
    }

    // Appends appendCharacter to str n times; every += creates a new String
    static void testStringAppend(int n, String str, String appendCharacter) {
        for (int i = 1; i <= n; i++) {
            str += appendCharacter;
        }
    }
}
String Class Results
n = 100000: seconds: 0.38
n = 200000: seconds: 0.99
n = 300000: seconds: 2.14
n = 400000: seconds: 3.74
n = 500000: seconds: 5.79
n = 600000: seconds: 8.28
n = 700000: seconds: 11.16
n = 800000: seconds: 14.65
n = 900000: seconds: 18.29
n = 1000000: seconds: 22.43
StringBuilder Class
public class StringBuilderBenchmark {

    public static void main(String[] args) {
        String appendCharacter = "a";
        int[] inputSample = {
            100000, 200000, 300000, 400000,
            500000, 600000, 700000, 800000,
            900000, 1000000};
        for (int n : inputSample) {
            // System.nanoTime() returns a long; keep it as a long until the final conversion
            long startTime = System.nanoTime();
            testStringAppend(n, new StringBuilder(), appendCharacter);
            long endTime = System.nanoTime();
            // Convert elapsed nanoseconds to seconds
            double duration = (endTime - startTime) / 1_000_000_000.0;
            // Four decimal places, matching the precision of the reported results
            String seconds = String.format("%.4f", duration);
            System.out.println("n = " + n + ": seconds: " + seconds);
        }
    }

    // Appends appendCharacter to the same StringBuilder n times
    static void testStringAppend(int n, StringBuilder str, String appendCharacter) {
        for (int i = 1; i <= n; i++) {
            str.append(appendCharacter);
        }
    }
}
StringBuilder Class Results
n = 100000: seconds: 0.0027
n = 200000: seconds: 0.0013
n = 300000: seconds: 0.0015
n = 400000: seconds: 0.0015
n = 500000: seconds: 0.0018
n = 600000: seconds: 0.0022
n = 700000: seconds: 0.0026
n = 800000: seconds: 0.0029
n = 900000: seconds: 0.0032
n = 1000000: seconds: 0.0036
Execution Time: Append
Input (n) | String (s) | StringBuilder (s) |
---|---|---|
100k | 0.38 | 0.0027 |
200k | 0.99 | 0.0013 |
300k | 2.14 | 0.0015 |
400k | 3.74 | 0.0015 |
500k | 5.79 | 0.0018 |
600k | 8.28 | 0.0022 |
700k | 11.16 | 0.0026 |
800k | 14.65 | 0.0029 |
900k | 18.29 | 0.0032 |
1m | 22.43 | 0.0036 |
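As a rough reading of the table above, the ratio between the two columns can be computed directly. The sketch below only divides the measured values already listed; the class and method names are hypothetical. On these numbers the String version is roughly 140 times slower at 100k and roughly 6,200 times slower at 1m, and the String time grows by a factor of about 59 for a 10x larger input, which is consistent with the quadratic growth discussed in the comments.

```java
// Illustrative sketch; SpeedupFromTable is a hypothetical name.
// The arrays hold the measurements from the table above, in seconds.
class SpeedupFromTable {

    static final int[] N = {
        100_000, 200_000, 300_000, 400_000, 500_000,
        600_000, 700_000, 800_000, 900_000, 1_000_000};

    static final double[] STRING_SECONDS = {
        0.38, 0.99, 2.14, 3.74, 5.79, 8.28, 11.16, 14.65, 18.29, 22.43};

    static final double[] BUILDER_SECONDS = {
        0.0027, 0.0013, 0.0015, 0.0015, 0.0018,
        0.0022, 0.0026, 0.0029, 0.0032, 0.0036};

    // How many times slower String was than StringBuilder for data point i.
    static double speedup(int i) {
        return STRING_SECONDS[i] / BUILDER_SECONDS[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < N.length; i++) {
            System.out.println("n = " + N[i] + ": ratio = " + Math.round(speedup(i)));
        }
    }
}
```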
Conclusion
- StringBuilder executes significantly faster than the String class when performing concatenation or modification operations.
- Modifying a String creates a new String in the heap memory. To change the content of a string repeatedly, we should consider the StringBuilder class.
- Every attempt to modify a String creates a new object in the heap memory, which has significant performance drawbacks when it happens repeatedly, as in a loop.
- StringBuilder is ideal for modifying string content; it appends into an internal buffer and does not create a new String object for each change.
Top comments (14)
You are benchmarking 100k+ concatenations, and that's fine.
But for me, the more interesting result would be: below what size does the performance gap stop mattering, so that we should use the cleaner API, String?
I've seen people using StringBuilder to avoid a few concatenations of small strings, and to me that's the pinnacle of premature optimization.
@kaleemniz I agree with Jean-Michel here. Under the hood, Java's StringBuilder is implemented as a partially filled array. Doing n appends to a partially filled array requires time that is linear in n. On the other hand, concatenating n equal-length strings with + requires time that is quadratic in n, since each concat requires filling an array of increasing length (length 2, then 3, then 4, and so on, the sum of which is quadratic in n).
So it is no surprise that with a huge n like the ones you are using, the StringBuilder is faster. You don't need to time anything for that. Linear time is asymptotically faster than quadratic time. Big-O, however, hides the effects of low-order terms, constants, etc., since it is focused on what happens for large inputs.
Microbenchmarks of alternatives with asymptotically different runtimes are far more interesting for smaller input sizes, to discover where the break-even point is. If n is 2, for example, concatenating the 2 Strings with + is almost certainly faster than the overhead of creating a StringBuilder, as is likely the case for the next few n as well.
But where is the break-even point? When does the StringBuilder actually become faster? Your lowest n is 100000. For this task, where you are comparing a linear-runtime and a quadratic-runtime alternative, that may as well be infinity, as it doesn't provide any more information than an asymptotic analysis.
I'd be interested to see what you'll find with small n and using a microbenchmarking framework. When is String concatenation with + faster than using StringBuilder, and when does StringBuilder become faster?
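The linear-versus-quadratic argument in the comment above can be sketched by counting character copies instead of measuring time. This is a simplified model, not the JDK's actual implementation: it assumes one-character appends, an initial buffer capacity of 16, and exact doubling on overflow. The class and method names are hypothetical.

```java
// Illustrative cost model; CopyCountModel and its method names are hypothetical.
class CopyCountModel {

    // Repeated + with a one-character string: the k-th concatenation fills a
    // fresh array of length k, so the total is 1 + 2 + ... + n = n(n+1)/2.
    static long copiesWithPlus(int n) {
        long total = 0;
        for (int k = 1; k <= n; k++) {
            total += k;
        }
        return total;
    }

    // Growable buffer (assumed: capacity starts at 16 and doubles when full).
    // Each append writes one character; a reallocation copies the current contents.
    static long copiesWithDoublingBuffer(int n) {
        long total = 0;
        int capacity = 16;
        int length = 0;
        for (int k = 1; k <= n; k++) {
            if (length == capacity) {
                total += length;  // existing contents move to the larger array
                capacity *= 2;
            }
            total += 1;           // the appended character itself
            length++;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 100_000;
        System.out.println("+ copies:      " + copiesWithPlus(n));
        System.out.println("buffer copies: " + copiesWithDoublingBuffer(n));
    }
}
```

Under this model the buffer version stays below 3n copies for any n, while the + version grows with n squared, which is why the gap widens as n increases.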
I think this is too general to make any type of rule, as every situation is different. I use a simple rule: use your intuition, and micro-benchmark the particular situation when in doubt ;)
To be a little more specific: when I know that I'm adding string concatenation to code which is guaranteed to be called often, hundreds of times per second, and I'm not too concerned about worse readability, I will optimize the hell out of it. A good example was when I was writing logging wrappers: logging classes will process hundreds of thousands of strings from every part of the application, so every small piece matters.
But when I'm writing error message strings or email bodies sent from code which is executed a few times a minute, I don't care, and readability and maintainability are in the driver's seat.
And with modern JDKs the +/StringBuilder balance has shifted very much toward using the + sign almost all the time (depending on the type of application, obviously).
Those were slightly extreme examples, but that's the general way I approach it.
This is a noteworthy point: I did not measure the pivot point n = k at which StringBuilder becomes faster than String.
Not trying to be rude or a smart-ass or anything, but I may have some tips for better benchmarks ;)
This measurement has a few problems, so you may not be getting relevant results.
Also, I would argue that the relevance of micro-benchmarks is limited if you don't measure exactly the thing you're then using in real code. Isolated micro-benchmarks of course have a purpose, but they can mislead, as they may not tell you much about the real situation where many more things are in play, and modern compilers do not make it simpler, as they introduce many tricks ;) In other words, are you concatenating that many such Strings in this loop in your real code? ;)
I'm not saying that's the only way, but I always micro-benchmark with a very narrow focus on some particular situation or problem where I need to decide which way to go, and even then I'm always very careful about interpreting the results.
Some tips (see the link at the end for deeper info and links):
Good tips on java microbenchmarking: stackoverflow.com/questions/504103...
@kaleemniz there are a couple issues with your comparison. Check out Pavel's comment above on microbenchmarking frameworks. They handle the warmup phase that your comparison overlooks.
Also, the StringBuilder version isn't entirely fair. Ultimately if using a StringBuilder you'll eventually call toString, but yours does not.
I'd also rather see both versions have only n and the appended character as parameters. And instead of void, return a String. And then use a microbenchmarking framework.
Why this suggestion on returning a string and not passing the StringBuilder as a parameter? The version with repeated + is updating parameter variable which is not observable external to the method due to pass by value, and thus even the final string is subject to garbage collection. While in the StringBuilder version, the calls to append are changing state of the StringBuilder you passed, so those changes are externally observable.
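A minimal sketch of the method shapes suggested in this comment might look as follows: both versions take only n and the piece to append, build their result internally, and return a String, so the StringBuilder version pays for toString() too. The class and method names are hypothetical, and this is not a substitute for running the methods under a microbenchmarking framework.

```java
// Illustrative sketch; FairerAppendMethods and its method names are hypothetical.
class FairerAppendMethods {

    // Repeated + concatenation; the result is returned so it is observable to the caller.
    static String appendWithPlus(int n, String piece) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += piece;
        }
        return result;
    }

    // StringBuilder created inside the method; toString() is part of the measured work.
    static String appendWithBuilder(int n, String piece) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < n; i++) {
            builder.append(piece);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Both versions must produce identical output before their timings are compared.
        System.out.println(appendWithPlus(5, "a"));     // prints "aaaaa"
        System.out.println(appendWithBuilder(5, "a"));  // prints "aaaaa"
    }
}
```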
These resources are super helpful, and these are great tips. Thanks for writing such a detailed response.
I can't see the use of StringBuilder in the 2nd code snippet.
Thank you so much for pointing that out. It was a copy-paste mistake, fixed now.
But after the use of StringBuilder, we usually call toString() method. Can you put it in the benchmark?
I tried to see if it makes a difference out of curiosity, but it does not really.
Reading the invaluable responses here is the highlight; the highlights will be helpful if there is a part two on this subject.
Use a microbenchmarking framework like JMH to see realistic results.
StringBuilder has a more complex API than String, so it's worth identifying the pivot point "k" at which StringBuilder becomes faster than String, making it easy to decide whether to use String or StringBuilder.
Do not pass String and StringBuilder as method parameters; instead, create the String and StringBuilder inside the test functions and at the end use toString() to return the result.
I guess what this boils down to is memory allocation. With string concatenation, new memory is allocated each time, while with StringBuilder the buffer approximately doubles in size when needed.
Right, modifying a String creates a new String instance in the heap memory, which makes repeated String appends slow.
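The buffer growth mentioned here can be observed through StringBuilder.capacity(). The sketch below counts how often the capacity changes while appending single characters; the exact growth policy is an implementation detail of the JDK in use, so no specific capacity values are assumed beyond the documented default of 16. The class and method names are hypothetical.

```java
// Illustrative sketch; CapacityGrowthDemo and its method names are hypothetical.
class CapacityGrowthDemo {

    // Counts how many times capacity() changes while appending n characters.
    static int countReallocations(int n) {
        StringBuilder builder = new StringBuilder(); // documented default capacity: 16
        int lastCapacity = builder.capacity();
        int reallocations = 0;
        for (int i = 0; i < n; i++) {
            builder.append('a');
            if (builder.capacity() != lastCapacity) {
                reallocations++;                     // the internal buffer was replaced
                lastCapacity = builder.capacity();
            }
        }
        return reallocations;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("appends: " + n + ", reallocations: " + countReallocations(n));
    }
}
```

Because the buffer grows geometrically, the number of reallocations stays tiny compared with the number of appends, whereas repeated + allocates a new String on every step.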