DEV Community

Amin

When to use Object.groupBy

Object.groupBy is a recent addition to the JavaScript language, enabling the grouping of data based on a specific key.

But what exactly does this mean? Let's delve into it by exploring a practical use-case scenario.

Searching for users

Let's say we have a collection of user data retrieved from a database:

const users = [
  {
    id: 1,
    email: "first@domain.com",
    language: "HTML"
  },
  {
    id: 2,
    email: "second@domain.com",
    language: "HTML"
  },
  {
    id: 3,
    email: "third@domain.com",
    language: "CSS"
  }
];

To search for a specific user, one would traditionally iterate through the array and compare each user's email with the target email:

const emailToSearch = "third@domain.com";

let foundUsers = [];

for (const user of users) {
  if (user.email === emailToSearch) {
    foundUsers.push(user);
  }
}

console.log(foundUsers);
// [{ id: 3, email: 'third@domain.com', language: 'CSS' }]

This code begins by defining a variable containing the email of the user to search for. It then iterates through every user in the array, because the list could be a database result and the user we are looking for may not be in it.

During each iteration, it checks whether the email of the current user matches the search email. If it does, the user is pushed onto the foundUsers array. That array starts out empty, so it simply stays empty when no user matches the search.

Finally, the found users are displayed. While this approach works, JavaScript's Object.groupBy can offer a more concise and, in the right situation, more efficient solution for such tasks.
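As a side note, the loop described above can also be condensed with Array.prototype.filter. This is only a sketch on the same sample data: it is shorter to write, but it still visits every user once.

```javascript
// Same sample data as above, written compactly.
const users = [
  { id: 1, email: "first@domain.com", language: "HTML" },
  { id: 2, email: "second@domain.com", language: "HTML" },
  { id: 3, email: "third@domain.com", language: "CSS" }
];

const emailToSearch = "third@domain.com";

// filter checks every user once, so this is the same linear scan as the loop.
const foundUsers = users.filter(user => user.email === emailToSearch);

console.log(foundUsers);
// [{ id: 3, email: 'third@domain.com', language: 'CSS' }]
```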

The issue

Even though this is code you may have written before, there is a significant issue with it.

The problem is that we don't know in advance whether, or where, our user is present. This is a substantial concern because each time we want to check whether a user corresponds to a specific email, we must iterate through every user in our database.

Now, consider a scenario with a billion rows. The search takes linear time: its cost grows in proportion to the number of rows, so a single lookup can mean up to a billion comparisons.

Not terrible, but there's room for improvement.
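To make "linear time" a bit more concrete, here is a small sketch that counts how many emails the loop compares when the user is missing. The makeUsers and searchAndCount helpers are made up for this illustration and are not part of any API.

```javascript
// Hypothetical helper: builds `count` fake users with unique emails.
const makeUsers = count =>
  Array.from({ length: count }, (_, index) => ({
    id: index + 1,
    email: `user${index + 1}@domain.com`
  }));

// Same loop as before, but it also counts how many emails were compared.
const searchAndCount = (users, emailToSearch) => {
  const foundUsers = [];
  let comparisons = 0;

  for (const user of users) {
    comparisons += 1;

    if (user.email === emailToSearch) {
      foundUsers.push(user);
    }
  }

  return { foundUsers, comparisons };
};

const small = searchAndCount(makeUsers(1000), "missing@domain.com");
const large = searchAndCount(makeUsers(10000), "missing@domain.com");

console.log(small.comparisons); // 1000
console.log(large.comparisons); // 10000
```

Ten times more users means ten times more comparisons for a single search, which is what linear time boils down to.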

Indexing

You might be wondering, why aren't we using an SQL database to handle this? If you thought about it, congratulations! That's the correct answer.

But not entirely, because a database isn't an intelligent creature that knows all our problems in advance and can optimize things for us (although this would be an interesting idea to explore).

Fortunately, databases offer a way to handle this type of operation swiftly by using indexing.

Indexing involves marking a column and telling our database that the next time we search for something in that column, it should do it quickly!

But what does "doing it quickly" mean? Simply put, it means organizing all data by a specific column ahead of time, so that a value can be found without scanning every row. Sounds familiar? It should, because that's the purpose of using Object.groupBy.

When you index a column in a database, you do it because you anticipate returning to it with a request, and you need to access it as rapidly as possible, ideally reaching the point where your request takes constant time. (In practice, many database indexes are tree-based and give logarithmic time, which is still far better than a full scan.)

This is also the objective when using Object.groupBy. You aim to access data more quickly because linear time is not fast enough for your use case, and you need faster access, ideally in constant time.

In action

So, what does it do in action? Well, first of all, you'll identify a column that requires fast access. In our case, this is the email column of our objects.

Secondly, you'll need to create this specially indexed object (or grouped object).

const usersGroupedByEmail = Object.groupBy(users, user => user.email);

const emailToSearch = "third@domain.com";

let foundUsers = usersGroupedByEmail[emailToSearch];

console.log(foundUsers);
// [{ id: 3, email: 'third@domain.com', language: 'CSS' }]

Great success! We obtained the same result as earlier, but without having to write a loop. This means we are now in constant time complexity, right? Right??

Not exactly. All we've done here is remove the loop and instead look up the email as a key on the grouped object. We managed to do that because Object.groupBy takes a list of objects (in this case) and a function that specifies how we want to group our data. Here, we want to group our users by email, so we returned the email. Each key of the result holds an array of all the users that share that key.
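To see what the callback controls, here is a sketch that groups the same users by language instead of email. The groupBy constant is an assumption added for this example: it uses the built-in Object.groupBy when the runtime has it and a small stand-in otherwise.

```javascript
// Use the built-in when available; otherwise fall back to a small stand-in
// (an assumption for runtimes that do not ship Object.groupBy yet).
const groupBy = Object.groupBy ?? ((items, getKey) => {
  const grouped = Object.create(null);
  for (const item of items) (grouped[getKey(item)] ??= []).push(item);
  return grouped;
});

const users = [
  { id: 1, email: "first@domain.com", language: "HTML" },
  { id: 2, email: "second@domain.com", language: "HTML" },
  { id: 3, email: "third@domain.com", language: "CSS" }
];

// The callback returns the key under which each user is stored.
const usersGroupedByLanguage = groupBy(users, user => user.language);

console.log(Object.keys(usersGroupedByLanguage)); // [ 'HTML', 'CSS' ]
console.log(usersGroupedByLanguage.HTML.length); // 2
console.log(usersGroupedByLanguage.CSS.length); // 1
```

Every value is an array, even when only one user has that key, which is why the email lookup above returns an array with a single user.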

However, in this case, we didn't improve the time complexity of our algorithm. If we take this code and benchmark it, we would find that it is no faster than the previous one.
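If you want to check this yourself, a rough timing sketch could look like the following. The data set, the time helper and the groupBy fallback are all assumptions made for the example, and Date.now() is coarse, so treat the numbers as an indication rather than a proper benchmark.

```javascript
// Built-in when available, small stand-in otherwise (assumption for older runtimes).
const groupBy = Object.groupBy ?? ((items, getKey) => {
  const grouped = Object.create(null);
  for (const item of items) (grouped[getKey(item)] ??= []).push(item);
  return grouped;
});

// Hypothetical data set: 200,000 users with unique emails.
const users = Array.from({ length: 200000 }, (_, index) => ({
  id: index + 1,
  email: `user${index + 1}@domain.com`
}));

// Searching for the last user, so the loop has to look at everyone.
const emailToSearch = "user200000@domain.com";

// Hypothetical helper: runs a function once and logs how long it took.
const time = (label, run) => {
  const start = Date.now();
  const result = run();
  console.log(`${label}: ${Date.now() - start} ms`);
  return result;
};

const withLoop = time("loop", () => {
  const foundUsers = [];
  for (const user of users) {
    if (user.email === emailToSearch) foundUsers.push(user);
  }
  return foundUsers;
});

// Grouping and looking up once: the grouping pass is part of the cost.
const withGroupBy = time("groupBy + lookup", () =>
  groupBy(users, user => user.email)[emailToSearch]
);
```

Both variables end up holding the same single user; only the time spent to get there is worth comparing.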

How does Object.groupBy work?

Simply put, it operates by looping through all the items in our array of users. Straight from there, you can start to guess what's wrong.

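Here is a simplified example implementation of it.

/**
 * Simplified stand-in for Object.groupBy (arrays only).
 *
 * @template Item
 * @param {Array<Item>} items
 * @param {(item: Item) => string | number | symbol} getKey
 * @return {{[key: string | number | symbol]: Array<Item>}}
 */
const groupBy = (items, getKey) => {
  // Like the built-in, start from an object without a prototype so that
  // keys such as "constructor" cannot collide with inherited properties.
  const grouped = Object.create(null);

  for (const item of items) {
    const key = getKey(item);

    if (key in grouped) {
      // The group already exists: append to it in place.
      grouped[key].push(item);
    } else {
      // First item for this key: start a new group.
      grouped[key] = [item];
    }
  }

  return grouped;
};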

Because it needs to loop through all our data to construct an object that can then be used to access our users by email, a single search takes at least as long with this solution as with the previous one: both have to visit every user once.

In this specific case (and I insist on that), using Object.groupBy is useless.

So why bother?

In reality, it all depends on the context. Like everything in software engineering, the goal is to find the best solution for a specific use case scenario.

You wouldn't use a Kubernetes cluster for deploying a simple HTML & CSS landing page, right?

This is roughly the same thing here. Because our grouped (or indexed) object is only used once in this specific case, grouping users by email in the first place is pointless. We could have done the same thing (with a few more lines) with a traditional loop instead.

However, if you now want to issue multiple searches, you'll start to notice that using a grouped object is way faster, because accessing usersGroupedByEmail[emailToSearch] takes constant time. In fact, you can see the result of Object.groupBy as an indexed table in a database: it gives constant-time access to your data and decreases the time complexity of an algorithm that needs to repeatedly access things like a user.

Once the grouped object is built, each of the next one hundred searches takes constant time. With the previous loop, each search is still linear, so a hundred searches means looping through all billion users a hundred times, and you'll definitely start to notice some slowdown in your algorithm.
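A minimal sketch of that "group once, search many times" pattern could look like this. The generated users, the list of emails and the groupBy fallback are assumptions made for the example.

```javascript
// Built-in when available, small stand-in otherwise (assumption for older runtimes).
const groupBy = Object.groupBy ?? ((items, getKey) => {
  const grouped = Object.create(null);
  for (const item of items) (grouped[getKey(item)] ??= []).push(item);
  return grouped;
});

// Hypothetical data set: 10,000 users with unique emails.
const users = Array.from({ length: 10000 }, (_, index) => ({
  id: index + 1,
  email: `user${index + 1}@domain.com`
}));

const emailsToSearch = [
  "user1@domain.com",
  "user5000@domain.com",
  "missing@domain.com"
];

// One linear pass to build the grouped object...
const usersGroupedByEmail = groupBy(users, user => user.email);

// ...then every search is a single property lookup.
// A missing email gives undefined, so fall back to an empty array.
const results = emailsToSearch.map(email => usersGroupedByEmail[email] ?? []);

console.log(results.map(foundUsers => foundUsers.length)); // [ 1, 1, 0 ]
```

Note the `?? []`: unlike the loop, which always produces an array, a lookup on the grouped object returns undefined when no user has that email.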

Take away

Object.groupBy is a wonderful addition to the JavaScript ecosystem. For this specific use case (repeated exact-match searches on one column of a large data set), you no longer need to pull in a library to do it (you have probably already used Ramda or Lodash for that in the past) or write your own version, which could be flawed and would need additional tests to ensure its correctness.

However, this is not a silver bullet, and for complex searches, you'll need more than just accessing raw data. For instance, you might want to allow full-text searches with case-insensitive matching.
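The case-insensitive part alone can be approximated by normalizing the key, as in the sketch below. It is still an exact match on the normalized email, not a full-text search, and normalizeEmail is a helper made up for this example.

```javascript
// Built-in when available, small stand-in otherwise (assumption for older runtimes).
const groupBy = Object.groupBy ?? ((items, getKey) => {
  const grouped = Object.create(null);
  for (const item of items) (grouped[getKey(item)] ??= []).push(item);
  return grouped;
});

const users = [
  { id: 1, email: "first@domain.com", language: "HTML" },
  { id: 2, email: "second@domain.com", language: "HTML" },
  { id: 3, email: "Third@Domain.com", language: "CSS" }
];

// Hypothetical helper: the same normalization is applied when grouping and when searching.
const normalizeEmail = email => email.trim().toLowerCase();

const usersGroupedByEmail = groupBy(users, user => normalizeEmail(user.email));

const foundUsers = usersGroupedByEmail[normalizeEmail("  THIRD@domain.com ")] ?? [];

console.log(foundUsers.length); // 1
console.log(foundUsers[0].id); // 3
```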

Also, the grouping operation is costly, because it takes linear time to index our data. Plus, it requires some extra space because the grouped users need to be referenced somewhere. So you are trading space for time. And with a billion rows, this is something to really consider, especially if the data needs re-indexing.
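One way to soften the re-indexing cost, sketched here under the assumption that users are only ever added, is to update the grouped object in place instead of regrouping everything. The addUser helper is hypothetical, as is the groupBy fallback.

```javascript
// Built-in when available, small stand-in otherwise (assumption for older runtimes).
const groupBy = Object.groupBy ?? ((items, getKey) => {
  const grouped = Object.create(null);
  for (const item of items) (grouped[getKey(item)] ??= []).push(item);
  return grouped;
});

const users = [
  { id: 1, email: "first@domain.com", language: "HTML" },
  { id: 2, email: "second@domain.com", language: "HTML" }
];

// Full grouping: one linear pass over the existing users.
const usersGroupedByEmail = groupBy(users, user => user.email);

// Hypothetical helper: keeps the array and the grouped object in sync
// without regrouping the users that were already indexed.
const addUser = user => {
  users.push(user);
  (usersGroupedByEmail[user.email] ??= []).push(user);
};

addUser({ id: 3, email: "third@domain.com", language: "CSS" });

console.log(users.length); // 3
console.log(usersGroupedByEmail["third@domain.com"].length); // 1
```

Removing or editing users would need the same kind of bookkeeping, which is part of the price of keeping an index around.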

For something like a fuzzy search, Object.groupBy won't be of any use because it is limited to exact matches. That limitation is also what makes it a good fit for index-style lookups, like the ones a database index serves, and for exact searches on the application side as well.

What about you? Have you come up with ideas for use cases where Object.groupBy could shine? Let me know in the comment section down below!
