DEV Community

Arnaud
Arnaud

Posted on

Using Array.prototype.includes() vs Set.prototype.has() to filter arrays

During a code review, I noticed a piece of code that was calling includes inside filter.

The code looked something like this:

const valuesToKeep = [2, 4];
const valuesToTest = [1, 2, 3, 4];
const values = valuesToTest.filter((n) => valuesToKeep.includes(n)); // [2, 4]

The comment I left was along the lines:

"Hey, be careful, filter is O(n) and includes is O(n) so this is O(n2). This would perform as O(n) if you used a Set, BUT since the arrays are so small it probably does not make a difference".

To illustrate the comment with code, this is what I had in mind:

const valuesToKeep = [2, 4];
const valuesToTest = [1, 2, 3, 4];
const setToKeep = new Set(valuesToKeep);
const values = valuesToTest.filter((n) => setToKeep.has(n)); // [2, 4]

My comment didn't sit well with me because saying "hey this will perform OK because the data is a certain way" is not a good idea: the data may change, or maybe I'm just wrong.

So I decided to put this to test. I'm going to generate two arrays containing random integers: an array of values to keep, and an array of values to test. The premise is that the array of values to keep is much smaller than the array of values to test, so we're going to make the array of values to test 10 times bigger than the array of values to keep.

// array of values to keep
const valuesToKeep = Array.from({ length: LENGTH }, () => getRandomInt());

// array of values to check
const valuesToTest = Array.from({ length: LENGTH * 10 }, () =>
  getRandomInt()
);

Then we're going to run two tests: one using includes, and one using has, and we're going to start with LENGTH at 10, and increase it every time, as my premise is that for small array it won't matter much, but we want to see WHEN it starts mattering:

// filter using includes
console.time("includes");
valuesToTest.filter((v) => valuesToKeep.includes(v)); // n2
console.timeEnd("includes");

// filter using has
console.time("has");
const valuesToKeepSet = new Set(valuesToKeep);
valuesToTest.filter((v) => valuesToKeepSet.has(v)); // n
console.timeEnd("has");

And here are the results:

Length of values to keep:  1
Length of values to test:  10
includes: 0.207ms
has: 0.190ms

Length of values to keep:  10
Length of values to test:  100
includes: 0.020ms
has: 0.017ms

Length of values to keep:  100
Length of values to test:  1000
includes: 0.204ms
has: 0.071ms

Length of values to keep:  1000
Length of values to test:  10000
includes: 9.942ms
has: 1.307ms

Length of values to keep:  10000
Length of values to test:  100000
includes: 131.686ms
has: 8.016ms

Length of values to keep:  100000
Length of values to test:  1000000
includes: 1324.318ms
has: 71.495ms

So yes, I am right that with a small quantity of data, Array.includes and Set.has perform roughly the same, but we can see how quickly performance degrades, and the change is so small that it's hard to justify not making it, even for small data samples. Should the size of the data increase, especially the size of the valuesToKeep array, the code is future proof.

TLDR: when matching a value against a list of values, convert the Array to a Set first.

Top comments (9)

Collapse
 
harryjubb profile image
Harry Jubb

Hi, I'm not able to reproduce this in the latest Chrome (94.0.4606.61), running on OS X, even with a large value for n. I also tried in Node, including some older versions.

Was there a particular environment where this is an issue? Maybe Array.prototype.includes has been internally optimized since this was written?

Collapse
 
harryjubb profile image
Harry Jubb

Looks like my image didn't upload, I'm getting:

For n = 100000

includes: 45.72607421875 ms
has: 61.42919921875 ms

For n = 1000000

includes: 367.927978515625 ms
has: 707.369140625 ms

Collapse
 
iugo profile image
iugo
Collapse
 
iugo profile image
iugo

only for of (not .filter): typescriptlang.org/play?#code/MYew...

Collapse
 
miketalbot profile image
Mike Talbot ⭐

Nicely done with the examples, also one of my pet peeves n^2 gets pretty nasty, pretty fast!

Collapse
 
arnaud profile image
Arnaud

Thanks Mike!

Collapse
 
tgrashawatre profile image
Timothy Grashaw

Hey just wondering, do you think there is there a time associated with converting an array to a set, and if so is it less than the time we are saving by running has()?

Collapse
 
noon3 profile image
Rami Dridi

did you get to test this? thanks!

Collapse
 
rmorse profile image
Ross Morsali

Just what I was looking for - looks like I'll be switching over to using Sets a bit more often!