DEV Community

Cover image for Bitsets, Bit Manipulations and Bitmasks Problems
Abhinandan Sharma
Abhinandan Sharma

Posted on • Updated on

Bitsets, Bit Manipulations and Bitmasks Problems

Introduction

If you have come to this post, I bet you are struggling with all those bit operations and tricks as there are two many of them! These bit operations are not only useful in competitive programming questions but also in Dynamic Programming (where we have to save the current state), and reducing time complexities in certain scenarios.

Before heading to discuss the questions let's just get a glimpse of why you should learn bitsets and type of questions which you can't do if you don't have enough knowledge of bit manipulation.

  1. Maximum AND/OR/XOR value of a pair in an array

  2. Queries for bitwise AND of the range [L, R] of the given array.

  3. Find Longest substring containing vowels in even counts, Find Longest awesome substring, Number of wonderful sub-strings, Maximum Product of Word Lengths These type of questions are typically solved using bitsets easily.

  4. Bitmasks on trees/graphs and with Dynamic Programming problems and some binomials.

We will be discussing each of the above questions and in the end, I would be sharing some of the questions from codeforces, Leetcode, etc. If you have any doubts in those questions, feel free to discuss in the comments.

First Type of Questions

Questions asking you to compute the maximum AND/XOR/OR of any pair in an array, directly or indirectly, are actually based on a rather simple trick. Let's First discuss the questions.

Q1. Find Maximum Binary AND pair from an array.

Codeforces Link of the problem:

Input: arr=[2,6,4,5,12,14,15]
Output: Max And pair: 14
Enter fullscreen mode Exit fullscreen mode

This is rather a difficult problem to solve but this will let you think much.

Hint 1: First convert all the numbers to their binary forms and see if you see a pattern in finding the answer

Hint 2: The size of the numbers can be at most 32 bits.

Solution: Read Hint 2 one more time. We know that total digits in binary representation can't be more than 32 for int datatype. Now, think this way: We have to see if a bit can be set or unset in our answer. That means we are actually constructing our answer bit by bit.

We would do a simple iteration here starting from leftmost bit and then we would search for the elements in array having this bit as set (And also the all the bits in our current answer). If the number of such elements is greater than or equal to 2, we would add this bit in our answer.

Dry Run:

arr[]={2,6,4,5,12,14,15}

Binary form of these numbers:

0010
0110
0100
0101
1100
1110
1111
Enter fullscreen mode Exit fullscreen mode

Now Our max answer can't be more than the maximum of these numbers(It's binary AND). So our max answer can be 15. We would start by setting the leftmost bit.

Initial Answer : 0000
First bit set: 1000 which is 8 in decimal. Fourth bit set in 12, 
15 and 14. (Greater than 2) So, we add this bit to our answer

Second Bit: 1100 which is 12 in decimal. The number of elements 
in which both first and second bit is set is 3(12,14 and 15). So,
we add this bit.

Third Bit: 1110 which is 14 in decimal. The number of elements in
which all of the set bits of 14 are present is 2(14 and 15). So,
we add this bit as well.

Fouth bit: 1111 which is 15 in decimal. the number of elements in
which all of the set bits of 15 are present is 1 only(15 
itself). So, we would not add this bit and reset our answer to 
1110.

Enter fullscreen mode Exit fullscreen mode

Why does this work?
A very frequent doubt which comes in mind is, how this algorithm differentiates between two numbers added in the answer. Like, I am first considering 12, 14 in for the second bit but there can be some number which actually satisfies our conditions but 12 don't. So the set bits of 12 would be removed! This would cause "some error". First of all, see the conditions clearly, the condition to add a number is that it should have bits added so far, so 12 won't be contributing any unique bit...

Code:

#include<bits/stdc++.h>
using namespace std;
#define int long long int
bool isValid(int x, vector<int> &arr, int n)
{
    int cnt = 0;
    for(int i = 0; i < n; i++)
    {
        // How do I check if all bits set in answer is set in the element as well?
        // Just do an AND operation!
        if((x & arr[i]) == x)
            cnt++;
    }
    return cnt >= 2;
}
int findMaxAndPair(vector<int> &arr, int n)
{
    int ans = 0;
    // Here I am starting from 63 as it is long long
    // You can actually compute the maximum number
    // and start from it's rightmost set bit
    for(int i = 31; i >= 0; i--)
    {
        // How do I set a specific bit? Just do an OR operation on shifted bit!
        if(isValid(ans | (1 << i), arr, n))
        {
            ans = ans | (1 << i);
        }
    }
    return ans;
}
int32_t main()
{
    int n;
    cin >> n;
    vector<int> arr (n);
    for(int i = 0; i < n; i++)
        cin >> arr[i];

    cout << findMaxAndPair(arr, n);

    return 0;
}
Enter fullscreen mode Exit fullscreen mode

Second Type

Querying inside a specific range. To be continued...

Third Type

Using bitsets for storing information effectively.

These problems may look sliding window technique problems at first! So be patient and see why we can't use sliding technique in such problems.

See this problem.
We have to find the size of the largest substring containing vowels in even counts.

So, how can it be solved using bitsets? Here we only have two options for any vowel, either it appears even number of times or odd number of times. So we can represent state of a vowel by a binary digit. And we have just 5 vowels, so the size of the bitset would not exceed 5. Now again stop reading and think!!!

I' m assuming you have given at least 10-15 min( if you haven't solved the problem yet). Here are the observations you should have with you by now:

  1. If we store each bit in form of string or bitset(Yep, that's a STL container in C++), we want state of the string/bitset as 00000 where each 0 represents a vowel and off state(0) of this vowel represents that it occurred even number of times.

  2. This observation is a bit trickier as it actually solves the problem. If we are at suppose ith character in string and bitset state till now is 01011, we need to have the exact bitset state at some other index to make the difference as 00000. For example, consider this string : eleetminicoworoep

At 1st iteration: Bitset: 10000
At 2nd iteration: Bitset: 10000
At 3rd iteration: Bitset: 00000( we reached perfect state here, 
so our current length would be i+1-0=3)
At 4th iteration: Bitset: 10000 ( Wait here! We have encountered 
the same state earlier as well, subtracting these two will make 
perfect state, i-prevIndex+1=2-0+1=3)
....

Enter fullscreen mode Exit fullscreen mode

So, long story short, just store the bitsets in an array or something where you store the index they are first found. If the bitset is repeating, the length formed by this bitset would be i-previousIndex+1. Note that we set perfect state's index to be -1.

Code:

// You may see this also: https://leetcode.com/problems/find-the-longest-substring-containing-vowels-in-even-counts/discuss/1357104/solution-using-map-of-bitset-and-position-c

class Solution {
public:
    int findTheLongestSubstring(string s) {
        int cnt[32];
        memset(cnt,-1,sizeof cnt);
        int mask=0;
        // Pos map stores indices for flipping certain vowel bits
        map<char,int>pos;
        pos.insert({'a',0});
        pos.insert({'e',1});
        pos.insert({'i',2});
        pos.insert({'o',3});
        pos.insert({'u',4});
        int ans=0;
        for(int i=0;i<s.size();i++)
        {   
            // Our character should be a vowel
            if(pos.find(s[i])!=pos.end())
                mask^=(1<<pos[s[i]]);
            if(cnt[mask]!=-1||!mask)
            {
                ans=max(ans,i-cnt[mask]);
            }
            else{
                cnt[mask]=i;
            }
        }
        return ans;

    }
};
Enter fullscreen mode Exit fullscreen mode

Next Problem

Now, let's do this one. Here we have to find the number of wonderful substrings where wonderful string is a string in which at most one element appears odd number of times.

First of all, let's discuss how it's different from previous problem? Here we have to find the number instead of the length of the string and second, we have at most 10 characters in the string(which actually gives a hint to solve this problem using bitsets).

We again start off by creating a 10 length bitset. 0 for even frequencies and 1 for odd frequencies. But here we don't have just one perfect bitset, instead we have these 10 perfect bitsets:

0000000001
0000000010
0000000100
0000001000
0000010000
0000100000
0001000000
0010000000
0100000000
1000000000
Enter fullscreen mode Exit fullscreen mode

Now if we have some bitset, say 1011011001, we need the exact same bitset along with all the bitsets which when subtracted from current bitset, gives you one of the said 10 "perfect" bitsets.
So, what we do is simple, for each ith character in string, calculate the bitset, and then iterate on the bitset itself and turn off one bit at a time and then search for the new bitset occurence in cnt array. If the occurence exists, you increase the answer by 1, else make a new entry in the cnt array. Also, here we do not store the indices in the cnt array, rather we store the number of occurences.

Code:


class Solution {
public:
    long long wonderfulSubstrings(string s) {
        long long cnt[1024]={1};
        long long  mask=0;
        long long  ans=0;
        for(int i=0;i<s.size();i++)
        {
            mask=mask^(1<<(s[i]-'a'));
            ans+=cnt[mask];
            for(int j=0;j<10;j++)
            {
                ans+=cnt[mask^(1<<j)];
            }
            cnt[mask]++;
        }
        return ans;

    }
};

Enter fullscreen mode Exit fullscreen mode

Type 4: Bitmasks with DP and Graphs, along with some binomial calculations.

This is perhaps the coolest form of bitmasks. Here the intuition is all it takes. You just need to think if you need the exact previous state for current dp or not.

Here are some good problems:

  1. Shortest Path Visiting All Nodes

This is a combination of Bitmasks used along with BFS and DP. This is a hard problem to start off with but I chose this as it includes a simple bitmask. Also, if you have no clue to solve this problem in about 30 min, please just look at the solutions and hints as it is not a very simple problem.

Hint 1: You need a BFS for sure, but if you keep track of visited nodes, you won't be able to visit that node again(See the example test case given in leetcode) So, you need to find a condition where you may get stuck in an infinite loop if you do not maintain a visited array. Also, as it can be from any node, you may try a multi-source BFS type of thing.

Hint 2: What will be unique for each path? Like in the below example, we have to go to 0 again from 1 but not again to 1 as it is already fully explored. When we should not visit a node? It's simple: when we have already explored all the nodes of the subgraph of that child and are entering again with the same child.

Solution: Okay, that's a heavy question. There are many ways to solve it, but let's first discuss the problems we have while doing a BFS:

  1. We can have our path started from any node.

  2. We can visit a node multiple times.

  3. We can have cycles, and we need to detect them without actually using the visited array.

Okay, we got our problems, let's start step by step to solve them.

  1. Starting from any node: It's simple to solve. What we do in a usual BFS? We push the source node first. But here as we can have many sources, we push all of them initially.

  2. Visiting a node multiple times: Here comes the most difficult part. Actually if we just remove the visited array from the standard BFS, it would do half the work. Now we can visit a node multiple times, but we still have to figure out some way to detect cycles.

  3. Detecting cycles without having a visited node: This was the crux of the problem. So, let's discuss this in detail.

For example in the figure below, our queue looks like this

0 1 2
Enter fullscreen mode Exit fullscreen mode

graph

But really, do we only have to store the nodes in the queue? How do we know that all nodes are now visited? Yeah, you guessed it right! We use bitmasks here! The first hint was actually in the constraints only. N can be upto 12 which is quite less. You can use a set here but that's a little slow than bitsets.

So, we store the bitsets along with the current node in the queue. Also, we have to just return the longest path, so we need a variable to store that as well. You can use a pair of pair of bitsets and int and int, but I would prefer using a user-defined data structure to store it.

struct info{
    int bitmask;
    int cnode;
    int path;
}
Enter fullscreen mode Exit fullscreen mode

Now our initial queue would look like:

{0001,0,1},{0010,1,1},{0100,2,1}
Enter fullscreen mode Exit fullscreen mode

Now, we have our visited nodes for each source, current nodes, and the length of the path as well. Now, we need to tackle the problem of cycles.

If a cycle occurs, we are sure that the exact same path with same current node would have occurred before as well. That means, if we the same bitmask and node are repeated again, there exists a cycle. In this example only, when we go from 0 to 1 and go back from 1 to 0, now we again can start with 1 but bitmask configuration looks like:

0 : 0001 and 0(current node)
0->1 : 0011 and 1(current node)
0->1->0: 0011 and 0(current node)
Now, if we go from 0 to 1 again
0->1->0->1: 0011 and 1(current node) : This is the exact configuration before, which is obvious, as we are storing the visited nodes and current node and they can't be same.
Enter fullscreen mode Exit fullscreen mode

So, we will store these 2 in a set of pair of bitmask(in form of int) and current node.

All in all, here is the brief: Just do a BFS with queue initially filled with every node and each queue element would store a user defined structure containing a bitmask, a current node, and length of the current path. We would be storing the bitmask and current nodes in a set of pair and skip the exact same configuration already in set. We will break when all nodes are visited at least once.

Here is the code:

struct info{
    int bitmask;
    int cnode;
    int path;
};
class Solution {
public:
    int shortestPathLength(vector<vector<int>>& adj) {
        int n=adj.size();
        queue<info>q;
        set<pair<int,int>> vis;
        // initializing queue and vis set
        for(int i=0;i<n;i++)
        {
            info node;
            node.bitmask=1<<i;
            node.cnode=i;
            node.path=1;
            q.push(node);
            vis.insert({node.bitmask,node.cnode});
        }
        while(!q.empty())
        {
            info curr=q.front();
            q.pop();
            int cnode=curr.cnode;
            if(curr.bitmask==(1<<n)-1)
                return curr.path-1;
            for(auto it:adj[cnode])
            {
                int bitmask=curr.bitmask|(1<<it);
                pair<int,int> x={bitmask,it};
                if(vis.find(x)==vis.end())
                {
                    info node;
                    node.bitmask=bitmask;
                    node.cnode=it;
                    node.path=curr.path+1;
                    q.push(node);
                    vis.insert(x);
                }
            }
        }
        return -1;

    }
};
Enter fullscreen mode Exit fullscreen mode

Now, I think you might have had the feel of solving these type of problems. So here is a list of problems solvable by the above knowledge you had till now:

  1. Find Longest Awesome Substring
  2. Contiguos Array
  3. Binary Subarrays with sum
  4. Subarray Sums Divisible by K
  5. Number of Sub-arrays With Odd Sum
  6. Bitwise AND of Numbers Range
  7. Count Pairs With XOR in a Range
  8. Count Subtrees With Max Distance Between Cities

Discussion (0)