Managing and removing duplicated values with javascript Sets

Mateus Amorim (rokuem) ・7 min read

Sets

Set is the constructor for a JavaScript collection of unique values.
It makes it easy to manage lists of ids and other primitive values.

It can be used to write more semantic code, remove duplicates, or record state based on object ids, for example.

Creating a Set

You can create a set by calling the constructor with the new keyword, either with no arguments or with an iterable (such as an array) of initial values.

const shoppingList = new Set(); // JavaScript: an empty Set
const typedShoppingList = new Set<string>(); // TypeScript: an empty Set<string>
const letters = new Set<string>(['a', 'a', 'b']); // Set { 'a', 'b' }
const uniqueLetters = new Set<string>('aab'); // Set { 'a', 'b' }, since a string is iterated character by character

Adding values to the set

To add a value to the set you just need to call the .add method. It will not add the item if it is already in the set.

const shoppingList = new Set(['pizza']);
shoppingList.add('meat');

// you can also chain it, but unfortunately you can only pass one value each time.
shoppingList
  .add('meat')
  .add('coke')

If you were using arrays, you would need to do something like this each time:

// Using arrays this would be equivalent to
const shoppingList = ['pizza'];

if (!shoppingList.includes('meat')) {
  shoppingList.push('meat');
}

So with Set you can make this process a little easier.
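
Since .add takes a single value per call, adding several values at once needs a small loop. A minimal sketch, assuming a hypothetical helper named addAll (it is not part of the Set API):

```javascript
// Hypothetical helper: adds every value from an iterable to the set.
function addAll(set, values) {
  for (const value of values) {
    set.add(value); // values already in the set are ignored by the set itself
  }
  return set; // returned so calls can be chained, like .add
}

const shoppingList = new Set(['pizza']);
addAll(shoppingList, ['meat', 'coke', 'pizza']);

console.log([...shoppingList]); // [ 'pizza', 'meat', 'coke' ]
```

The set keeps insertion order, so the repeated 'pizza' neither moves nor duplicates the existing entry.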

Removing values from the Set

To remove a value from the set, call the .delete method. The advantage over the array approach is that it works the same way for any value in the set, and the set size is updated after the removal, whereas using the delete operator on an array leaves an empty slot.

const shoppingList = new Set(['pizza']);
shoppingList.delete('meat'); // returns false since 'meat' was not in the list. Set stays the same.

shoppingList.delete('pizza'); // Returns true since the element was in the set. The set size is now 0.

This is easier and more semantic than dealing with arrays when the value sits somewhere in the middle of the array.

// Given a base array (declared with let because it is reassigned further down)
let shoppingList = ['pizza', 'coke', 'chocolate'];

// If you wanted to remove the last element it would be simple
shoppingList.pop();

// The first element too
shoppingList.shift();

// But for an element somewhere in the middle it gets a little more complicated.

// You could do this.
delete shoppingList[1]; // But it would create an empty slot in the array :(

// So instead you could do something like this
if (shoppingList.includes('meat')) {
  // Which can be bad, as it replaces the array with a new one,
  // so other references to the old array are not updated.
  shoppingList = shoppingList.filter(item => item !== 'meat');
}
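
When keeping the same array reference matters, Array.prototype.splice is an alternative to filter, at the cost of a little more code than Set's .delete. A minimal sketch, assuming a hypothetical helper named removeInPlace:

```javascript
// Hypothetical helper: removes the first occurrence of `item` from `array`
// in place, so every other reference to the array sees the change.
function removeInPlace(array, item) {
  const index = array.indexOf(item);
  if (index === -1) {
    return false; // nothing to remove, mirrors what Set's .delete returns
  }
  array.splice(index, 1); // shifts later elements left, leaving no empty slot
  return true;
}

const shoppingList = ['pizza', 'coke', 'chocolate'];
const sameList = shoppingList;

removeInPlace(shoppingList, 'coke');

console.log(shoppingList); // [ 'pizza', 'chocolate' ]
console.log(sameList === shoppingList); // true
```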

Checking the number of items in the Set

Unlike arrays, where you read the length property, with Sets you read the size property instead.

const shoppingList = new Set(['pizza']);
shoppingList.size // 1

Checking if an item is in the set

To see if an item is in the set you use the .has method.

const shoppingList = new Set(['pizza']);
shoppingList.has('pizza') // true

With arrays it is also pretty simple:

const myArray = ['one', 'two'];

myArray.includes('two') // true

Resetting the Set

You can reset the set by calling the .clear method :)

const shoppingList = new Set(['pizza']);
shoppingList.size // 1
shoppingList.clear();

shoppingList.size // 0
shoppingList.has('pizza') // false

With arrays you could just assign a new one, but if you wanted to keep the reference intact you would need to empty the existing array in place, for example by calling .pop repeatedly, so with Sets it is easier.

const x = { a: [1,2,3] }
const myArray = x.a;

x.a = [];

console.log(x.a); // []
console.log(myArray) // [1,2,3] :(

x.a = myArray;

myArray.pop();
myArray.pop();
myArray.pop();

console.log(x.a); // [] :)
console.log(myArray) // [] :)
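
For comparison with .clear, another way to empty an array while keeping its reference is to set its length to 0. A minimal sketch, where clearArray is a hypothetical helper name:

```javascript
// Hypothetical helper: empties an array in place and returns the same array.
function clearArray(array) {
  array.length = 0; // truncates the array without creating a new one
  return array;
}

const x = { a: [1, 2, 3] };
const myArray = x.a;

clearArray(myArray);

console.log(x.a); // []
console.log(x.a === myArray); // true
```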

Looping through the set values

For sets you can use either the .forEach method or a for...of loop.

  const mySet = new Set([1,1,2,3,4,5]);

  mySet.forEach(item => console.log(item)); // logs 1, 2, 3, 4, 5

  for (const item of mySet) {  // only "of" works. "in" will not iterate over the values.
    //... 
  }

Converting set to Array

Converting an array to a set, then converting the set back to an array, is a simple trick to remove duplicated values from it :)

To convert an array to a set, pass it as the argument to the Set constructor.

To convert a Set to an array, you can use Array.from() or the spread syntax inside a new array.

const thingsIWant = ['cake', 'pizza', 'pizza', 'chocolate'];
const shoppingList = Array.from(new Set(thingsIWant)); // ['cake', 'pizza', 'chocolate']
const sameShoppingList = [...new Set(thingsIWant)]; // Same as above, but shorter
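
The round trip can be wrapped in a small function. The sketch below assumes a hypothetical helper named unique and shows two details of the trick: the first occurrence of each value keeps its position, and NaN is treated as equal to itself.

```javascript
// Hypothetical helper: returns a new array without duplicated values.
function unique(values) {
  return [...new Set(values)];
}

console.log(unique(['cake', 'pizza', 'pizza', 'chocolate'])); // [ 'cake', 'pizza', 'chocolate' ]
console.log(unique([3, 1, 3, 2, 1])); // [ 3, 1, 2 ]
console.log(unique([NaN, NaN, 'a'])); // [ NaN, 'a' ]
```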

Removing duplicated objects and arrays

Objects and arrays are reference types, which means a Set will only remove duplicated references, not duplicated structures.

Example:

const x = { a: 1 };
[...new Set([x, x])] // Will result in [x]
[...new Set([x, { a: 1 }])] // Will result in [x, {a: 1}]
// same for arrays...

A simple workaround for that is to combine JSON.stringify, JSON.parse and .map

Example:

const x = { a: 1 };
[...new Set([x, { a: 1 }].map(JSON.stringify))].map(JSON.parse); // [{ a: 1 }]

There are some downsides:

  • it will not work if you have the same structures but with different property order (ex: {a: 1, b: 2} and {b: 2, a: 1})
  • JSON.stringify will convert functions to undefined
  • JSON.stringify converts NaN to "null"
  • JSON.stringify returns undefined for undefined, but JSON.parse can't handle that.
  • JSON.stringify will not work properly with class instances (such as Set or Date) and other non-plain values

The JSON.stringify problem

Ex:

const x = [undefined, null, NaN, true, 'asd', {a: 5}, () => {
  console.log('a')
}, new Set(['asd', 'bbb'])].map(JSON.stringify);

console.log(x) // [ undefined, "null", "null", "true", "\"asd\"", "{\"a\":5}", undefined, "{}" ]

x.map(JSON.parse) // will throw an error parsing the first value

One possible solution here would be to remove those undefined values and add them back later, after parsing everything:

  const x = [undefined, 'asd', true, false, { a: 1 }, { a: 1 }];

  // map to json so we don't remove valid falsy values
  const jsonX = x.map(JSON.stringify); // [ undefined, "\"asd\"", "true", "false", "{\"a\":1}", "{\"a\":1}" ]

  // Create the set to remove duplicates
  const uniqueJsonX = [...new Set(jsonX)] // [ undefined, "\"asd\"", "true", "false", "{\"a\":1}" ]

  // Now we remove the values that cannot be parsed. Since we converted false to "false" before, this will only remove non-parseable values.
  const parseableJsonX = uniqueJsonX.filter(v => v); // [ "\"asd\"", "true", "false", "{\"a\":1}" ]

  // Now we can parse the array with JSON.parse to get our "original" values back :)
  const parsed = parseableJsonX.map(JSON.parse); // [ "asd", true, false, {…} ]

  // And finally, if you also want to add the undefined values back to the result.
  const falsyValues = x.filter(v => !v); // [ undefined, false ]

  // Or, if you want to add back functions and other values that were removed too
  const nonSerializableValues = x.filter(v => JSON.stringify(v) === undefined); // [ undefined ]

  // The Set drops the extra false that falsyValues brings back
  const uniqueX = [...new Set([...parsed, ...falsyValues])]; // [ "asd", true, false, {…}, undefined ]
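
The steps above can be collected into one function. This is only a sketch under the same assumptions as the walkthrough: uniqueByJson is a hypothetical name, values that JSON.stringify cannot serialize are deduplicated by reference and moved to the end, and the NaN and key-order caveats listed earlier still apply.

```javascript
// Hypothetical helper combining the stringify / Set / parse steps.
function uniqueByJson(values) {
  const serialized = [];
  const unserializable = [];

  for (const value of values) {
    const json = JSON.stringify(value);
    if (json === undefined) {
      unserializable.push(value); // undefined, functions and symbols
    } else {
      serialized.push(json);
    }
  }

  // Deduplicate the JSON strings, then turn them back into values.
  const parsed = [...new Set(serialized)].map(json => JSON.parse(json));

  // Values that could not be serialized are deduplicated by reference.
  return [...parsed, ...new Set(unserializable)];
}

const result = uniqueByJson([undefined, 'asd', true, false, { a: 1 }, { a: 1 }, undefined]);
console.log(result); // [ 'asd', true, false, { a: 1 }, undefined ]
```

Wrapping JSON.parse in an arrow function avoids passing the extra index and array arguments that .map supplies.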

Well, this solves most of the problems mentioned. But what about objects with a different key order, functions and class instances?

Dealing with objects with same values but different key order

To solve this problem we need to add a new step to the solution above. To quickly sort the object keys, we can split the object with Object.entries, sort the entries, then join them back with Object.fromEntries. Note that this only sorts the top-level keys.

const myObject = {c: '3', b: '2', a: '1'};
const myObject2 = {a: '1', b: '2', c: '3'};

const myArr = [myObject, myObject2].map(item => {
  return Object.fromEntries(Object.entries(item).sort());
}).map(JSON.stringify);

console.log([...new Set(myArr)].map(JSON.parse)); // [{ a: '1', b: '2', c: '3'}]
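
For nested objects, the same idea can be applied recursively. The sketch below assumes plain JSON-like data (plain objects, arrays and primitives) and uses a hypothetical helper named sortKeysDeep; class instances such as Date are not handled.

```javascript
// Hypothetical helper: returns a copy of `value` with object keys sorted
// at every nesting level. Arrays keep their element order.
function sortKeysDeep(value) {
  if (Array.isArray(value)) {
    return value.map(item => sortKeysDeep(item));
  }
  if (value !== null && typeof value === 'object') {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = sortKeysDeep(value[key]);
    }
    return sorted;
  }
  return value; // primitives are returned unchanged
}

const first = { b: { y: 2, x: 1 }, a: [{ d: 4, c: 3 }] };
const second = { a: [{ c: 3, d: 4 }], b: { x: 1, y: 2 } };

console.log(JSON.stringify(sortKeysDeep(first))); // {"a":[{"c":3,"d":4}],"b":{"x":1,"y":2}}
console.log(JSON.stringify(sortKeysDeep(first)) === JSON.stringify(sortKeysDeep(second))); // true
```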

Dealing with class instances

Class instances may behave in an unexpected manner when going through JSON.stringify(), like:

const x = new Date();
console.log(JSON.stringify(x)); // will output date string instead of [object Date]

const y = new Set([1,2,3,4]);

console.log(JSON.stringify(y)); // {} 🤔

It may work, however, if you have a simple object-like class, but in general it is not safe to include those in the set to remove duplicates.

I would recommend separating them out at the start of the approach mentioned before, then creating a new set for them (in case you want to remove duplicated instances) and joining it to the result at the end.

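  const base = [undefined, 'asd', true, false, { a: 1 }, { a: 1 }, new Set([1,2,3]), new Date()];

  const state = {
    notParseable: [],
    parseable: []
  };

  for (const value of base) {
    // typeof null is also 'object', so null is excluded explicitly
    const isObject = typeof value === 'object' && value !== null;
    // Built-in class instances such as Set or Date do not report '[object Object]'
    const isClassInstance = isObject && Object.prototype.toString.call(value) !== '[object Object]';

    if (!value || isClassInstance) {
      state.notParseable.push(value);
      continue;
    }

    state.parseable.push(value);
  }

  // ...

  // "result" stands for the deduplicated state.parseable values from the elided steps
  const unique = [...result, ...new Set(state.notParseable)];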
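
A complete version of this split could look like the sketch below. The helper name uniqueMixed and the exact rules for what counts as parseable are assumptions made for illustration: only strings, booleans, finite numbers and plain objects go through JSON, and everything else is deduplicated by reference. Instances of custom classes still report '[object Object]' and would be treated as plain objects here.

```javascript
// Hypothetical helper: deduplicates plain values by structure (via JSON)
// and everything else by reference.
function uniqueMixed(values) {
  const parseable = [];
  const notParseable = [];

  for (const value of values) {
    const isObject = typeof value === 'object' && value !== null;
    const isPlainObject = isObject && Object.prototype.toString.call(value) === '[object Object]';
    const isJsonPrimitive =
      typeof value === 'string' || typeof value === 'boolean' || Number.isFinite(value);

    if (isPlainObject || isJsonPrimitive) {
      parseable.push(value);
    } else {
      // undefined, null, NaN, functions, arrays, Set, Date, ...
      notParseable.push(value);
    }
  }

  const result = [...new Set(parseable.map(value => JSON.stringify(value)))]
    .map(json => JSON.parse(json));

  return [...result, ...new Set(notParseable)];
}

const base = [undefined, 'asd', true, false, { a: 1 }, { a: 1 }, NaN, NaN, new Set([1, 2, 3]), new Date()];
console.log(uniqueMixed(base).length); // 8
```

The two { a: 1 } objects collapse into one through JSON, and the two NaN values collapse into one because a Set treats NaN as equal to itself.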

Dealing with NaN, null and undefined

To remove duplicates of those, the same approach as the solution above can be used :).

In this case we remove them from the values that will go through JSON.stringify, create a separate set for them, then join it to the result at the end.

Dealing with Functions

With functions you can also filter it beforehand and remove duplicated references.

const a = () => {};

new Set([a, a]) // Set [ a() ]

However, if you want to compare implementations, for whatever reason, it would probably be better to do that in an array, like this.

const x = [() => {}, () => {}];

const uniqueFunctions = [];
const stringifiedFunctions = [];

for (const f of x ) {
  if (!stringifiedFunctions.includes(f.toString())) {
    uniqueFunctions.push(f);
    stringifiedFunctions.push(f.toString());
  }
}
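
The same comparison can keep the seen sources in a Set instead of an array, which fits the theme of this post. A minimal sketch, assuming a hypothetical helper named uniqueBySource; it compares source text only, so functions that differ in whitespace or in the variables they close over are not handled.

```javascript
// Hypothetical helper: keeps the first function for each distinct source text.
function uniqueBySource(functions) {
  const seenSources = new Set();
  const uniqueFunctions = [];

  for (const fn of functions) {
    const source = fn.toString();
    if (!seenSources.has(source)) {
      seenSources.add(source);
      uniqueFunctions.push(fn);
    }
  }

  return uniqueFunctions;
}

const noopA = () => {};
const noopB = () => {};
const logA = () => { console.log('a'); };

console.log(uniqueBySource([noopA, noopB, logA]).length); // 2
```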

Gotchas

Vue reactivity

Vue 2 does not track changes made to a Set, so you need to manually update the component that uses it by calling $forceUpdate after modifying the set. Vue 3's reactive() does support Sets.

Proxy a Set

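A Set does not work with Proxy() out of the box: built-in methods such as .add throw a TypeError when called on the proxy, because they expect the real Set as this. You can work around it by binding the methods to the original set in a get trap, or you can still use Object.defineProperty on the set.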
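
Calling a built-in method such as .add directly on a proxied Set throws a TypeError, because those methods expect the real Set as this. One possible way around it is a get trap that binds methods to the original set. The sketch below assumes a hypothetical helper named observeSet that reports every value passed to .add:

```javascript
// Hypothetical helper: wraps a Set in a Proxy and reports every added value.
function observeSet(target, onAdd) {
  const proxy = new Proxy(target, {
    get(set, property) {
      // Read from the real set so getters such as size receive it as `this`.
      const value = Reflect.get(set, property, set);
      if (typeof value !== 'function') {
        return value;
      }
      if (property === 'add') {
        return item => {
          onAdd(item);
          set.add(item);
          return proxy; // keep chaining on the proxy, like Set.prototype.add
        };
      }
      // Built-in methods need the real set as `this`, not the proxy.
      return value.bind(set);
    },
  });
  return proxy;
}

const added = [];
const shoppingList = observeSet(new Set(['pizza']), item => added.push(item));

shoppingList.add('meat').add('coke');

console.log(shoppingList.size); // 3
console.log(shoppingList.has('meat')); // true
console.log(added); // [ 'meat', 'coke' ]
```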

Primitive and reference types

Sets work best with primitive types, like strings and numbers, but they can also be used with reference types, like objects and arrays, as long as the object reference is the same or you apply one of the transformations described above to the values.

Example:

 const list = [];
 const listItem1 = { foo: 'bar' };
 const listItem2 = { foo: 'bar' };

 // if you do
 new Set([listItem1, listItem1]) // you will get a set with just [listItem1]

 // But if you use 2 different references, even if the values are the same
 new Set([listItem1, listItem2]) // you will get a set with [listItem1, listItem2];
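
When the objects carry a primitive identifier, a Set of those identifiers can deduplicate them without any JSON round trip. A minimal sketch, assuming a hypothetical helper named uniqueBy and made-up sample data with an id property:

```javascript
// Hypothetical helper: keeps the first item for each distinct key.
function uniqueBy(items, getKey) {
  const seenKeys = new Set();

  return items.filter(item => {
    const key = getKey(item);
    if (seenKeys.has(key)) {
      return false; // an item with this key was already kept
    }
    seenKeys.add(key);
    return true;
  });
}

// Made-up sample data for illustration.
const users = [
  { id: 1, name: 'Ana' },
  { id: 2, name: 'Bob' },
  { id: 1, name: 'Ana (duplicate)' },
];

console.log(uniqueBy(users, user => user.id).map(user => user.name)); // [ 'Ana', 'Bob' ]
```

The returned array holds the original object references, so nothing is lost to serialization.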
