Posted on

# Managing state with binary data & bitwise op

## Bitwise Op, What is it?

A bitwise operation that operates on a bit string, a bit array, or a binary numeral at the level of individual bits - Wikipedia

## Why save state in binary?

Encoding data in a serialized binary format is more efficient than a string format. That's why data encoders like Gob, Protobufs are more efficient than JSON & XML.

They reduce the amount of space required to save data. They don't contain extra rubbish that takes space like field names or whitespaces.

One disadvantage of saving state or serializing data in binary format is rigidness. Since they follow a strict format, both the encoder & decoder must update how they handle new changes to the data format.

Let's say we have aΒ `position`Β object that has only three type int8 fields (radius, longitude, and latitude). We can represent the fields in a binary format asΒ `{toBinary(rad)}{toBinary(lat)}{toBinary(long)}`

If the `radius` is 30(11110), `latitude` is 20(10100) and `longitude` is 50(110010) we save it in our serialized binary format as

``````{toBinary(30)}{toBinary(20)}{toBinary(50)} = {00011110}{00010100}{00110010} =Β 000111100001010000110010
``````

It can be represented in Hex as

``````1e1432
``````

If we are to save this in JSON it will be

``````{"rad":30,"lat":20, "long": 50}
``````

The JSON format takes more memory than the binary format and requires space for field names, while the binary format knows where the fields belong based on the position of the byte. This shows the space efficiency of storing states with binary.

Let's say we no longer need the radius field in the position object.

If we used a JSON format, removing the radius field won't affect the encoding & decoding rules.

If we used the binary serialized format from earlier, we have to make sure both the encoder & decoder are aware of the changes to the radius field.

When one expectsΒ `{toBinary(rad)}{toBinary(lat)}{toBinary(long)}`Β and the other is expectingΒ `{toBinary(lat)}{toBinary(long)}`Β ****there will be a mismatch of field vaules and data won't be accuratly encoded/decoded.

We have to pass 0 as the first byte, provided the encoder & decoder are not in sync. Doing this keeps the format of the previous position object and can be ignored by the side that is up to date.

## A simple use case

We have a notification service that groups notifications by categories and mediums(SMS, Email, and Push).

This notification service has two categories of notification which are deposits and withdrawals.

``````{
"deposit": {
"push": {
"state": true,
"can_edit": true
},
"email": {
"state": true,
"can_edit": false
},
"sms": {
"state": true,
"can_edit": false
}
},
"withdrawal": {
"push": {
"state": true,
"can_edit": true
},
"email": {
"state": true,
"can_edit": false
}
}
}
``````

So the goal is to come up with a way to save the JSON object in a binary serialized format. Something like this `0001011100000111`

### Going From JsonβBinary

Binary is made up of only 0 & 1, which can represent boolean values. Every bit position in binary is a power of 2, just like each position in decimal is a power of 10(1,10,100).

To format the JSON representation above to binary we first flatten the content of the notification category.

``````{
"deposit": {
"push_state": true,
"push_can_edit": true,
"email_state": true,
"email_can_edit": false,
"sms_state": true,
"sms_can_edit": false
},
"withdrawal": {
"push_state": true,
"push_can_edit": true,
"email_state": true,
"email_can_edit": false,
"sms_state": false, //false cus it's empty
"sms_can_edit": false//false cus it's empty
}
}
``````

Now that it's flattened we want to represent each field in a byte. A byte is made up of 8 bits so we can represent the fields as individual bits in the byte.

Since the field type is bool we use `1` to represent `true` and `0` to represent `false`.

So we map the fields to bit positions like:

``````push_state => 1st bit
push_can_edit => 2nd bit
email_state => 3rd bit
email_can_edit => 4th bit
sms_state => 5th bit
sms_can_edit => 6th bit
``````

We can now represent notification setting has

``````{
"deposit": "00010111",
"withdrawal": "00000111"
}
``````

We want to change the JSON above to an array of bytes ([]byte). To do this we have to set index positions on the byte array for the notification categories.

For our byte array of length 2(two notification categories), `deposit_vaue` will be the 1st value in the array and `withdrawal_value` the second.

``````[]byte{deposit_value,withdrawal_value}
[]byte{0b00010111,0b00000111} = 0001011100000111 = 0x1707
``````

Now we can finally represent our Notification Setting as

``````0001011100000111 #binary format
1707 #hexidecimal format
[]byte{23,7} #in code
``````

## The Set, Unset & Check Operators

Now that we have a binary serialized format to save the notification setting object, the next step is updating the binary representation of the notification setting. Let's call itΒ `bitNotifcationSetting`.

We make use of the Set, Unset & Check bit operators to manipulate the bits ofΒ `bitNotifcationSetting`.

Set Operator: This sets theΒ nthΒ bit to 1.

``````set_op(data, nth_flag) => data | nth_flag
``````

Unset Operator: This sets theΒ nthΒ bit to 0.

``````unset_op(data, nth_flag) => data & ^(nth_flag)
``````

Check Bit Operator: this checks if theΒ *n*th bit is set.

``````check_bit_op(data, nth_flag) => (data & (nth_flag) != 0
``````

golang playground example

## Updating A Notification Medium State

Let's say we want to update the SMS state.

• Select the byte we want to edit using the index of the category (deposit = 0)
• call `check_bit_op(*bitNotifcationSetting,sms_can_edit_flag)` to c*heck if the `sms_state` bit is editable.
• If the Β `check_bit_op(*bitNotifcationSetting,sms_can_edit) == true`, move to the next step else return the field is not editable.*
• call `set_op(bitNotifcationSetting, sms_state_flag)` to set the `sms_state` bit to 1/true, or call `unset_op(bitNotifcationSetting, sms_state_flag)` set `sms_state` bit to 0/false

Following the steps above we see how the operators are used in the manipulation of bits.

## Conclusion

The binary format used here can help reduce memory space used and allows the cache hold more information about the notification setting in memeory without the need to cleanup to accomodate more recent values.

The example I used here is easy because the field type is boolean, which takes up a bit. For more complex types like strings and types with variable lengths, using a binary format will be a little hard to manage on your own without the help of external libraries.

For something more complex look at the way Bitcoin encodes transaction object to binary format.