Akshay Sharma

Posted on Jun 16, 2022

Alien Dictionary Problem

#alien #dictionary

Problem Statement

Determine the alien language's character order. Your task is to return any valid order of alien language. Given a sorted alien language dictionary with N words and k initial alphabets from a normal dictionary.

Note: Since many orders may be available for a given test case, you may return any valid order.

Constraints

0 <= N (size of arr) <= 10^5
0 <= ARR[i].size() <= 10^5
Time Limit: 1 sec

Example Case

Example 1
Input: N = 5, K = 4
dict = {"baa","abcd","abca","cab","cad"}

Output: 1
Explanation: Here order of characters is 'b', 'd', 'a', 'c' Note that words are sorted and in the given language "baa" comes before "abcd", therefore 'b' is before 'a' in output.

Example 2
Input: N = 3, K = 3
dict = {"caa","aaa","aab"}

Output: 1
Explanation: Here order of characters is 'c', 'a', 'b' Note that words are sorted and in the given language "caa" comes before "aaa", therefore 'c' is before 'a' in output.

Solution

We will solve the above problem using topological sort and permutations approach.

Approach 1

Permutations Approach (Brute Force)

Find all of the distinct characters that appear in each word.
Produce all possible combinations of these different characters.
Each permutation should be treated as a valid alphabetical sequence. Check to see if the given words are arranged in this order. To accomplish this, we will do the following:

a. Let the current word be 'currWord' and the next word be 'nextWord' for all words from 1 to n - 1.
b. Compare the characters of both words one by one to locate the first mismatching character. We move on to the following word if there was no mismatching character.
c. Let's imagine the mismatched characters in 'currWord' and 'nextWord' were 'ch1' and 'ch2', respectively.
d. Now, if these words(currWord and nextWord) follow the dictionary, ‘ch1’ will occur before ‘ch2’ in the sequence.

In case the words are sorted according to the current sequence, we will return this sequence.

Code

`bool checkOrder(string *dict, int n, vector &p)
{
unordered_map pos;

for (int i = 0; i < p.size(); i++)
{
    if (pos.find(p[i]) == pos.end())
    {
        pos[p[i]] = i;
    }
}

for (int i = 0; i < n - 1; i++)
{
    string currentWord = dict[i];
    string nextWord = dict[i + 1];

    for (int j = 0; j < min(currentWord.length(), nextWord.length()); j++)
    {
        if (currentWord[j] != nextWord[j])
        {
            if (pos.find(currentWord[j]) == pos.end() || pos.find(nextWord[j]) == pos.end() ||
                pos[nextWord[j]] < pos[currentWord[j]])
            {
                return false;
            }
            else
            {
                break;
            }
        }
    }

    if (currentWord.length() > nextWord.length())
    {
        return false;
    }
}

return true;

}

vector getAlienLanguage(string *dict, int n)
{
vector p;
unordered_set uniqChar;

for (int i = 0; i < n; ++i)
{
    for (char c : dict[i])
    {
        if (uniqChar.find(c) == uniqChar.end())
        {
            p.push_back(c);
        }

        uniqChar.insert(c);
    }
}

sort(p.begin(), p.end());

do
{
    if (checkOrder(dict, n, p))
    {
        return p;
    }

} while (next_permutation(p.begin(), p.end()));

return p;

Complexity Analysis

The above approach will take O(K! * N * L) time, where K is the number of distinct characters, N is the number of words in the dictionary, and L is the maximum length of a word in the dictionary. This will lead to TLE error.

Approach 2

If we consider the first two words of the alien lexicon ["wrt", "wrf",....], then looking at the first mismatch in the letters offers us crucial information about the order they occur!
That is, we may claim that 't' comes before 'f' in the above two terms! This relationship is denoted by the letters 't' and 'f'.
A directed graph can be used to describe this relationship!
Hence,

As a result, iterate over all of the words to create the graph that represents the aforementioned relationship. The graph's vertices will be all of the individual characters, while the directed edge will be the relation, mapping which character comes before another.
Return one of the potential orders for the created graph by doing a Topological Sort on it.

Code

class Solution{ public: void helper(vector<vector<int>> adj, string &ans, int x, vector<bool> &vis){ vis[x] = true; char c = x + 'a'; for(auto z : adj[x] ){ if(!vis[z]){ helper(adj, ans, z, vis); } } ans = c + ans; } string findOrder(string dict[], int N, int K) { vector<vector<int>> adj(K); for(int i=0; i<N-1; i++){ string a = dict[i], b = dict[i+1]; for(int j=0; j<min(a.size(),b.size()); j++){ if(a[j] != b[j]){ adj[a[j]-'a'].push_back(b[j]-'a'); break; } } } string s=""; vector<bool> vis(K, false); for(int i=0; i<K; i++){ if(!vis[i]){ helper(adj, s, i, vis); } } return s; } };

Complexity Analysis

Time complexity will be O(N+ K) N is total words present in dictionary and K is distinct characters in dictionary.
Topological Sort has a temporal complexity of O(V + E), where V is the number of vertices and E is the number of edges, whereas O(K + N) is the number of edges.
Space complexity will be O(K) to store the K distinct characters.

DEV Community

Alien Dictionary Problem

Problem Statement

Approach 1

Permutations Approach (Brute Force)

Approach 2

Top comments (0)

Read next

6 Emerging Web Animation Techniques for Dynamic User Interfaces

Why Cloud Monitoring is Non-Negotiable for Today’s Tech Giants

AI-Powered Collaboration: How Machine Learning Can Transform Team Dynamics

Automating Audio and Video Quality Testing: An Overview