
Concatenation performance boost

Amin ・ 8 min read

All credit for the cover image goes to Alessio Barbanti.

You have probably encountered the situation where you wanted to concatenate two arrays. And you probably know that, for this particular case, the Array.prototype.concat method is often the answer.

If you are not familiar with Array.prototype.concat, here are some examples.

"use strict";

const xs = [1, 2, 3];
const ys = [4, 5, 6];
const zs = xs.concat(ys);

console.log(xs); // [ 1, 2, 3 ]
console.log(ys); // [ 4, 5, 6 ]
console.log(zs); // [ 1, 2, 3, 4, 5, 6 ]

Here we define two constants that are arrays: one is called xs and contains the numbers from one to three; the other is called ys and represents the range of numbers from four to six. Then we define a third constant, zs, which holds the concatenation of xs and ys. Note that you call the Array.prototype.concat method on one array to merge it with another. Since xs and ys are both arrays, there is no problem doing xs.concat(ys). The result is, unsurprisingly, another array containing the numbers from one to six.
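As an aside, concat is not limited to a single argument: it accepts any number of arrays (and even bare values) and merges them in order, always returning a new array. A quick sketch:

```javascript
"use strict";

const xs = [1, 2, 3];

// Several arrays can be merged in one call...
const merged = xs.concat([4, 5], [6]);
console.log(merged); // [ 1, 2, 3, 4, 5, 6 ]

// ...and plain values are simply appended.
const appended = xs.concat(4, 5);
console.log(appended); // [ 1, 2, 3, 4, 5 ]
```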

What is going on here?

If you still don't understand how this can happen, it can be helpful to try to define our own concat function.

"use strict";

function concatenate(xs, ys) {
    const zs = [];

    for (const x of xs) {
        zs.push(x);
        // [1]
        // [1, 2]
        // [1, 2, 3]
    }

    for (const y of ys) {
        zs.push(y);
        // [1, 2, 3, 4]
        // [1, 2, 3, 4, 5]
        // [1, 2, 3, 4, 5, 6]
    }

    return zs; // [1, 2, 3, 4, 5, 6]
}

const xs = [1, 2, 3];
const ys = [4, 5, 6];
const zs = concatenate(xs, ys);

console.log(xs); // [ 1, 2, 3 ]
console.log(ys); // [ 4, 5, 6 ]
console.log(zs); // [ 1, 2, 3, 4, 5, 6 ]

So, what is going on here? First, we define our function, which takes two arrays (remember, concatenation merges two arrays together). We then create a variable called zs, initialized with an empty array, which will hold all the values of our two arrays. Then we loop through all the items of the first array, xs, and push them into our final array zs. At this point, zs contains the values [1, 2, 3]. We do the same for ys, looping through its items and pushing them into zs. We end up with a zs array that looks like [1, 2, 3, 4, 5, 6]. Great! We can now return zs, leaving the two arrays xs and ys untouched. We did it!

Unpack our pack

What if I told you that there is another way of doing this, thanks to the ECMAScript 2015 standard now implemented in JavaScript engines? It looks like this.

"use strict";

const xs = [1, 2, 3];
const ys = [4, 5, 6];
const zs = [...xs, ...ys];

console.log(xs); // [ 1, 2, 3 ]
console.log(ys); // [ 4, 5, 6 ]
console.log(zs); // [ 1, 2, 3, 4, 5, 6 ]

Of course the result is the same, but what is going on here? To understand it, I like to think of the [] syntax as something that packs values, like numbers: to pack the number 1, we would write [1]. Easy, right? Well, the spread operator ... is just the inverse, meaning it unpacks our pack: applying ... to [1] gives us back 1. That is not quite the whole story, though, because you cannot unpack your values without putting them in a certain context. For instance, doing this will fail.

"use strict";

const xs = [1];
const x = ...xs;

You will just end up with this error.

$ node main.js
SyntaxError: Unexpected token ...

But we can use it to put the values in another box (or pack, or context, just synonyms), like another array.

"use strict";

const xs = [1, 2, 3];
const ys = [...xs];

console.log(xs); // [ 1, 2, 3 ]
console.log(ys); // [ 1, 2, 3 ]

So now we know that we can spread one array into another, and that it is equivalent to unpacking all the values of an array and packing them back into another one. And as we saw in the previous example, we can do this for two, three, or N arrays as well.

"use strict";

const as = ['a', 'b', 'c'];
const bs = ['d', 'e', 'f'];
const cs = ['g', 'h', 'i'];
const ds = [...as, ...bs, ...cs];

console.log(as); // [ 'a', 'b', 'c' ]
console.log(bs); // [ 'd', 'e', 'f' ]
console.log(cs); // [ 'g', 'h', 'i' ]
console.log(ds); // [ 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i' ]
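Arrays are not the only context where unpacking is allowed. A function call is another valid "box": spreading an array passes its elements as individual arguments. For instance with Math.max, which expects separate numbers rather than an array:

```javascript
"use strict";

const xs = [1, 9, 4];

// Passing the array itself does not work: it is coerced to a
// string and then to NaN.
console.log(Math.max(xs));    // NaN

// Unpacking the array passes its elements as separate arguments.
console.log(Math.max(...xs)); // 9
```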

Great! But this article promised a performance boost, and some of you may think that I tricked you into reading it. I assure you that you will not be disappointed.

The results were quite impressive

Lately, I have been preparing slides for a conference I will give in France at my school, and the theme is web performance. Obviously, I couldn't resist making a slide about the JavaScript language. This is when I started experimenting with an awesome website called JSPerf. It allows you to write test cases for just about anything and compare the benchmarks.

I was really curious, since we have multiple ways of doing a concatenation in JavaScript, such as the two solutions shown in this article. So I went on JSPerf and wrote the test cases as follows.

"use strict";

// functions definitions
const concatenate = (xs, ys) => xs.concat(ys);
const concatenate2 = (xs, ys) => [...xs, ...ys];

// test variables
const xs = [1, 2, 3];
const ys = [4, 5, 6];

// tests
concatenate(xs, ys);
concatenate2(xs, ys);

Dead simple test. Notice that I used arrow functions just for the sake of compact code. Since I am not referring to any surrounding context, this makes absolutely no difference compared to writing a full function definition. I was just being lazy here.

Now that this is written, let's run some benchmark, shall we?

[Screenshot: JSPerf benchmark results comparing the concat method and the spread operator]

Unfortunately, I couldn't test it on other browsers. But the results were quite impressive from my point of view, and we can draw a few conclusions from them.

We can see that, in both of these browsers, it is better to use the spread operator than the concat method if you need performance. The first reason is that the spread operator is a language construct, so the engine knows exactly what to do with it, while concat is a method. When a method is called, the JavaScript engine needs to run various checks first, such as verifying that the concat method actually exists on the value we are calling it on. Here it obviously exists, since the value is an array whose prototype is Array.prototype. But the engine is not a human, and it still needs to perform this check. It also needs to perform the call itself, which has a cost (a slight one, though). Put together, all of this can make it a little slower.

But most importantly, we can see that it is way, way better to use the spread operator on Chrome. It seems that the Chrome dev team has made some huge performance improvements to the spread operator compared to the concat method. In fact, on my Chrome version, using the concat method is 68% slower than using the spread operator.

My conclusion so far would be to use the spread operator if you can, meaning in an environment that supports at least the ECMAScript 2015 standard. For older environments, you have no choice other than the concat method. But is that true? To find out, I wanted to try our custom homemade version of the concat function, but with a slight change.
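As an aside, pre-ES2015 environments also have the classic trick of borrowing push with apply. Note that, unlike concat, this mutates the first array in place, so it is only a drop-in replacement when that is acceptable:

```javascript
"use strict";

var xs = [1, 2, 3];
var ys = [4, 5, 6];

// push accepts multiple arguments, so apply can spread the
// elements of ys onto xs (ES5-compatible, mutates xs).
Array.prototype.push.apply(xs, ys);

console.log(xs); // [ 1, 2, 3, 4, 5, 6 ]
```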

Just concatenating two arrays together

See, we are dealing with arrays that have a finite length. If you have done some C++, you know that there are roughly two basic kinds of arrays: those with a fixed length, and those without one (which are often referred to as vectors). But in our case, we are in JavaScript, a dynamic language, so an array must always appear dynamic in order to provide features like pushing new values at any time, right? Under the hood, though, the JavaScript engine performs hidden optimizations. For instance, it can store an array of only number values in an optimized representation until you start pushing a string into it (unlike C++ arrays, JavaScript arrays can be heterogeneous). At that moment, it adds some overhead, because it needs to switch to another kind of internal array to hold values of different types together. And this can be costly.

As we said, we are only dealing with two arrays of finite length. There is no vector-like structure being pushed new values here. We are just concatenating two arrays together, nothing more. So let's use that insight to update our concatenate function. We will call it concatenate3 in order to compare it with the two others.

function concatenate3(xs, ys) {
    const xsl = xs.length;
    const ysl = ys.length;
    const zs = new Array(xsl + ysl);

    for (let i = 0; i < xsl; i++) {
        zs[i] = xs[i];
    }

    for (let i = 0; i < ysl; i++) {
        zs[i + xsl] = ys[i];
    }

    return zs;
}

We said, again, that our arrays have a finite length, so we used the Array constructor with the sum of the two lengths to create an array of xs.length + ys.length elements. From there, our array has a fixed length in the eyes of the JavaScript engine. Then we simply loop and assign each element to the final array, just like we did earlier, except that now we are not using the push method but writing directly to an index, sparing the engine the whole process of calling push. This forces us to think a little differently, though: for our second array, we cannot start writing at index 0 but at index i + xs.length. Since we never push anything, our array keeps its fixed length. Finally, we return the array in the last instruction, once more leaving the two input arrays untouched.

Take a seat, ladies and gentlemen: what you are about to witness is another level of performance boost.

[Screenshot: JSPerf benchmark results including the custom concatenate3 function]

This is just awesome. Who would have thought that our custom homemade function for concatenating two arrays would be so much faster than both the language construct and the built-in method? This is a huge performance boost, and on Chrome, the concat method call is now about 80% slower than our function.

Premature optimizations of our source-code can be really poisonous

In conclusion, I'll say that we made some great performance improvements here, but at the cost of research and development. In a real-world case it won't be that easy, because here we used a dead-simple example. Premature optimization of our source code can be really poisonous for the completion of our tasks. The JavaScript engine already performs huge optimizations under the hood to make all our JavaScript code coexist and perform well. Only optimize when you actually witness issues in your website's or server's script execution.

What do you think of these results? Let's talk about it in the comments section below! Also, if you want to contribute by testing it on other browsers, I would be glad to check your numbers. You can check out my test suite for this particular case here on JSPerf.

Thanks for reading and keep being curious!

Discussion (1)

Khalyomede • Edited

Perfect conclusion! Your last optimization reminds me of an article I wrote on how indexes in databases can help reduce the time spent by the database engine; maybe you could take inspiration from it for your conference:


Best of luck 😉