Ramu Narasinga

Tips from open-source: Use “Set” to remove duplicates from an array.

This tip is picked from the Next.js source. In this article, you will learn how to remove duplicates from an array using Set in JavaScript.

Set example

From the comment in the picture above, I noticed that you can remove duplicates from an array using Set.

You can execute the code below as an example to see this in action:

// Suppose exportPathMap is an object whose keys are paths
const exportPathMap = {
  "/home": true,
  "/about": true,
  "/contact": true,
  "/Home": true // Same page as "/home", written with different casing
};

// Sample normalizePagePath function
function normalizePagePath(path) {
  return path.toLowerCase(); // Convert path to lowercase
}

// Sample denormalizePagePath function
function denormalizePagePath(path) {
  return path.toUpperCase(); // Convert path to uppercase
}

// Object keys are already unique, so duplicates only appear after the
// paths are transformed: "/home" and "/Home" both become "/HOME".
const exportPaths = [
  ...new Set(
    Object.keys(exportPathMap).map((path) =>
      denormalizePagePath(normalizePagePath(path))
    )
  ),
];

console.log(exportPaths); // Output: ["/HOME", "/ABOUT", "/CONTACT"]

denormalizePagePath and normalizePagePath are functions from the Next.js source code. I have used simpler stand-in functions to demonstrate this example.

It is simple:

  1. Create a Set from an array that may contain duplicates.
  2. Spread the Set into a new array as shown above.
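
The two steps can also be tried in isolation. This is a minimal sketch with made-up path strings; the names paths, uniqueSet, uniquePaths and objects are illustrative and do not come from Next.js.

```javascript
// Hypothetical input: an array of path strings with repeats.
const paths = ["/home", "/about", "/home", "/contact", "/about"];

// Step 1: a Set keeps only the first occurrence of each value.
const uniqueSet = new Set(paths);

// Step 2: spread the Set back into a plain array.
const uniquePaths = [...uniqueSet];

console.log(uniquePaths); // ["/home", "/about", "/contact"]

// Set compares objects by reference, so two look-alike objects both survive.
const objects = [{ path: "/home" }, { path: "/home" }];
console.log([...new Set(objects)].length); // 2
```

The order of first appearance is preserved, which is why the result is predictable. Note that this only de-duplicates primitives such as strings and numbers by value; objects would need a key-based approach instead.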

denormalizePagePath(normalizePagePath(path))

I asked myself, “What? Why would you normalize and then denormalize?” The code for both functions is provided below. You first normalize so that every path meets certain requirements: in this case, a leading slash is ensured, / becomes /index, and a path that already starts with /index gets an extra /index prefix. You then denormalize by stripping that leading /index again. The round trip makes the page paths consistent, so two spellings of the same page end up as the same string. The takeaway: if you are unsure about the shape of certain values, normalize them (that is, bring them to one standard form) and then denormalize.
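
To see why the round trip matters for de-duplication, here is a rough sketch using simplified stand-ins for the two helpers. These stand-ins are assumptions for illustration only: they skip the dynamic-route check, the path-separator handling, and the posix validation that the real Next.js functions perform. The keys in exportPathMap are made up.

```javascript
// Simplified stand-ins for the Next.js helpers (not the real implementations).
function normalizePagePath(page) {
  if (/^\/index(\/|$)/.test(page)) return `/index${page}`;
  if (page === "/") return "/index";
  // Mirrors the idea of ensureLeadingSlash: add "/" when it is missing.
  return page.startsWith("/") ? page : `/${page}`;
}

function denormalizePagePath(page) {
  // Strip one leading "/index" segment, or map "/index" back to "/".
  if (page.startsWith("/index/")) return page.slice(6);
  return page === "/index" ? "/" : page;
}

// Hypothetical keys: "about" and "/about" refer to the same page.
const exportPathMap = { "/": true, "about": true, "/about": true, "/index": true };

const exportPaths = [
  ...new Set(
    Object.keys(exportPathMap).map((path) =>
      denormalizePagePath(normalizePagePath(path))
    )
  ),
];

console.log(exportPaths); // ["/", "/about", "/index"]
```

Here the four distinct keys collapse to three paths, because "about" and "/about" come out of the round trip as the same string and the Set drops the repeat.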

normalize-page-path.ts

// source: https://github.com/vercel/next.js/blob/canary/packages/next/src/shared/lib/page-path/normalize-page-path.ts#L14
import { ensureLeadingSlash } from './ensure-leading-slash'
import { isDynamicRoute } from '../router/utils'
import { NormalizeError } from '../utils'

/**
 * Takes a page and transforms it into its file counterpart ensuring that the
 * output is normalized. Note this function is not idempotent because a page
 * `/index` can be referencing `/index/index.js` and `/index/index` could be
 * referencing `/index/index/index.js`. Examples:
 *  - `/` -> `/index`
 *  - `/index/foo` -> `/index/index/foo`
 *  - `/index` -> `/index/index`
 */
export function normalizePagePath(page: string): string {
  const normalized =
    /^\/index(\/|$)/.test(page) && !isDynamicRoute(page)
      ? `/index${page}`
      : page === '/'
      ? '/index'
      : ensureLeadingSlash(page)

  if (process.env.NEXT_RUNTIME !== 'edge') {
    const { posix } = require('path')
    const resolvedPage = posix.normalize(normalized)
    if (resolvedPage !== normalized) {
      throw new NormalizeError(
        `Requested and resolved page mismatch: ${normalized} ${resolvedPage}`
      )
    }
  }

  return normalized
}

denormalize-page-path.ts

import { isDynamicRoute } from '../router/utils'
import { normalizePathSep } from './normalize-path-sep'

/**
 * Performs the opposite transformation of `normalizePagePath`. Note that
 * this function is not idempotent either in cases where there are multiple
 * leading `/index` for the page. Examples:
 *  - `/index` -> `/`
 *  - `/index/foo` -> `/foo`
 *  - `/index/index` -> `/index`
 */
export function denormalizePagePath(page: string) {
  let _page = normalizePathSep(page)
  return _page.startsWith('/index/') && !isDynamicRoute(_page)
    ? _page.slice(6)
    : _page !== '/index'
    ? _page
    : '/'
}

Conclusion:

I changed my article title to “Tips from open-source” from “Lessons from open-source” as these are short, concise tips extracted from open-source code to improve your coding abilities.

You can use Set to remove duplicates from an array and then convert the Set back to an array using the spread syntax. I also saw one function call passed directly as the argument to another: denormalizePagePath(normalizePagePath(path)).

I have written functions in this style before to improve readability. Condensed functions => fewer lines of spaghetti code to deal with => improved readability.

Top comments (2)

𒎏Wii 🏳️‍⚧️

Set is a very useful data structure, and whenever people use transient sets to de-dupe arrays, I end up just wondering... Wouldn't a set be the correct data structure at that point anyway?

Of course, there are many situations where one really just wants an array but without the duplicate elements.

Regardless, whenever you find yourself writing [...new Set(array)], that's a good moment to pause and ask: should I refactor this array into a set, or is it an array for a reason?

Remember: sets are more optimised for checking presence of random objects, while arrays are good at random access by index, but not by value.

Ramu Narasinga

Hey, that’s a good point. Using the right data structure matters, and picking the right one depends on context. 🙌