Let's start with the description for Group Anagrams:
Given an array of strings
strs
, group the anagrams together. You can return the answer in any order.An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
For example:
groupAnagrams(['eat', 'tea', 'tan', 'ate', 'nat', 'bat']);
// -> [ ['bat'], ['nat', 'tan'], ['ate', 'eat', 'tea'] ]
groupAnagrams(['']);
// -> [ [''] ]
groupAnagrams(['a']);
// -> [ ['a'] ]
And, as one of the constraints says, each of the strings will consist of lowercase English letters.
One thing to remember from the previous Valid Anagram problem is that we can easily check if two strings are anagrams of each other by comparing their sorted versions.
So, we can use a hash table to store the sorted words. In that case, all words that are anagrams of each other will be grouped together in an array, and share the same key:
function groupAnagrams(strs: string[]): string[][] {
let words: { [word: string]: string[] } = {};
for (let s of strs) {
let sortedWord = [...s].sort().join('');
(sortedWord in words) ? words[sortedWord].push(s) : words[sortedWord] = [s];
}
return Object.values(words);
};
Time and space complexity
Since we're using the sorting operation, time complexity will be
as it is the best we can do with sorting. But we're doing the sorting operation for each element in strs
, so the loop itself has an
time complexity.
To not confuse ourselves, we'll use another variable,
, to denote the length of strs
, that is, the number of times we'll iterate for each element. Overall, the time complexity will be
.
We can say that space complexity is
where
is the length of strs
and
is the length of the longest string, because in the worst case where all strings are anagrams of each other, the value array can contain
strings, and the key's length will be
, so, words
will grow proportionally.
Using Python
To reflect the ternary operation in the TypeScript version:
(sortedWord in words) ? words[sortedWord].push(s) : words[sortedWord] = [s];
We could write it in Python like:words[sorted_word].append(s) if sorted_word in words else words[sorted_word] = [s]
But since it's a bit clunky, we can usesetdefault()
where we're setting the default value ofwords[sorted_word]
to[]
.
It might look like this in Python:
class Solution:
def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
words = {}
for s in strs:
sorted_word = ''.join(sorted(s))
words.setdefault(sorted_word, []).append(s)
return words.values()
Now, after taking a deep breath, we can look at NeetCode's solution.
And, voilà, a more efficient solution exists:
from collections import defaultdict
class Solution:
def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
res = defaultdict(list)
for s in strs:
count = [0] * 26 # a ... z
for c in s:
count[ord(c) - ord('a')] += 1
res[tuple(count)].append(s)
return res.values()
Here, the constraint we mentioned in the beginning gives some perspective to this solution:
For each string, we can count the number of characters from 'a'
to 'z'
, because the strings will be just lowercase English letters.
We can still use a hash table, and map each of the 26 letters to an index, and increase the value at that index every time we see that letter. The keys will be these arrays of length 26, and the values will be the arrays of strings themselves.
For example, if we have these strings:
['eat', 'tea', 'tan']
Then, res
will look like this:
{
(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0): ['tan'],
(1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0): ['eat', 'tea']
}
Note |
---|
count is converted into a tuple because list s cannot be keys as they are mutable. You can read more on this. |
Also note that in the code, we use the ASCII numbers of the characters to get their index, for example, the count of 'a'
will be at the 0th index, so it's basic offset arithmetic.
For instance, the ASCII number of 'z'
is 122
, and 'a'
is 97
, when you get the difference, it will be 25
, meaning that the 'z'
will be at the end of the array, that is, the 25th index.
After taking another deep breath, let's try converting it into TypeScript:
function groupAnagrams(strs: string[]): string[][] {
let result: { [count: string]: string[] } = {};
for (let s of strs) {
let count = new Array(26).fill(0);
for (let c of s) {
count[c.charCodeAt(0) - 'a'.charCodeAt(0)]++;
}
const key = count.toString();
!(key in result) ? result[key] = [s] : result[key].push(s);
}
return Object.values(result);
};
Note |
---|
While we couldn't use lists as keys in the Python version, we'll just convert the count array into string with toString() in this TypeScript version and use it as key. |
Time and space complexity
The time complexity will be where is the total number of strings and the is the length of a string.
For the space complexity, the dominant item will be the res
variable (the count
array won't matter much because it won't grow with the input size, it is constant, or
).
In the case where each key is unique, the space complexity will be
where
is the total number of strings, and
is the length of the longest string.
And, that's the end of Group Arrays. The next one will be Top K Frequent Elements, until then, happy coding.
Top comments (0)