The bit-wise similarity between the character cases

#ascii #character #bits

The ASCII value of 'a' in binary is 1100001.
The ASCII value of 'A' in binary is 1000001.

Notice the similarity?

Bit 5 (the sixth least significant bit, worth 32) is set for lowercase letters and cleared for uppercase letters. All the other bits are the same.
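To see this for yourself, here is a small Python sketch that prints both bit patterns and their XOR. The variable names are just for illustration; the final check confirms that the same single-bit difference holds for every letter pair, not just 'a' and 'A'.

```python
import string

# Bit patterns of a lowercase/uppercase pair, padded to 7 bits.
lower, upper = ord("a"), ord("A")
print(format(lower, "07b"))          # 1100001
print(format(upper, "07b"))          # 1000001
print(format(lower ^ upper, "07b"))  # 0100000 -> only bit 5 (value 32) differs

# The same holds for all 26 letter pairs.
case_bit_only = all(
    ord(lo) ^ ord(up) == 32
    for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase)
)
print(case_bit_only)  # True
```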

This can be fun to play around with. For example, for any ASCII letter c, you can convert it to lowercase by simply doing

c | 32

And you can toggle the case (that is uppercase to lowercase and vice versa) by:

c ^ 32

Combining the above two operations converts a letter to uppercase, because setting the bit and then flipping it always leaves it cleared:

(c | 32) ^ 32
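Here are the three expressions wrapped up as a rough Python sketch. The function names and the CASE_BIT constant are made up for this example, and the functions assume the input is a single ASCII letter; for anything else the result is not a case change (for instance '@' | 32 gives '`').

```python
CASE_BIT = 32  # 0b0100000, the bit that separates the two cases

def to_lower(ch: str) -> str:
    """Set the case bit: c | 32."""
    return chr(ord(ch) | CASE_BIT)

def toggle_case(ch: str) -> str:
    """Flip the case bit: c ^ 32."""
    return chr(ord(ch) ^ CASE_BIT)

def to_upper(ch: str) -> str:
    """Set the case bit, then flip it, so it always ends up cleared: (c | 32) ^ 32."""
    return chr((ord(ch) | CASE_BIT) ^ CASE_BIT)

print(to_lower("A"), to_lower("a"))        # a a
print(toggle_case("q"), toggle_case("Q"))  # Q q
print(to_upper("a"), to_upper("A"))        # A A
```

Since (c | 32) ^ 32 just clears bit 5, it gives the same result as c & ~32.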

Another point to observe is that the difference between the ASCII values of any uppercase letter and its lowercase counterpart is 32.

Oh, and by the way, man ascii provides a nice table of ASCII values.

The investigation phase

While working on a crypto exercise, I had a string that was encrypted by XORing each character of the original string with a single (unknown) character to produce the corresponding character of the encrypted string.
The task was to decrypt it and recover the original string. XOR is its own inverse: XORing a second time with the same number (a character is just a number) undoes the first XOR.

(a ^ b) ^ b = a ^ (b ^ b) = a ^ 0 = a
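The identity is easy to check exhaustively for ASCII. A minimal Python sketch, where xor_with_key is a hypothetical helper written for this example and not the code from the exercise:

```python
def xor_with_key(text: str, key: str) -> str:
    """XOR every character of text with the single-character key."""
    k = ord(key)
    return "".join(chr(ord(ch) ^ k) for ch in text)

# (a ^ b) ^ b == a for every pair of ASCII values.
all_roundtrip = all((a ^ b) ^ b == a for a in range(128) for b in range(128))
print(all_roundtrip)  # True

# Applying the same key twice restores the original string.
secret = xor_with_key("Hello, World", "x")
restored = xor_with_key(secret, "x")
print(restored)  # Hello, World
```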

There are only 128 ASCII character values, so a simple brute force works: try every possible key and look at the output. As long as the original string made sense, the correct decryption is easy to spot.
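A brute force along these lines might look like the following Python sketch. The helper names and the sample string are assumptions for illustration; the real exercise input is not shown here.

```python
def xor_with_key(text: str, key: int) -> str:
    """XOR every character of text with the numeric key."""
    return "".join(chr(ord(ch) ^ key) for ch in text)

def brute_force(ciphertext: str) -> dict:
    """Decrypt with each of the 128 ASCII values; returns {key: candidate}."""
    return {key: xor_with_key(ciphertext, key) for key in range(128)}

# Hypothetical sample: encrypt a known string with 'x', then try every key.
ciphertext = xor_with_key("AttackAtDawn", ord("x"))
candidates = brute_force(ciphertext)
for key, text in candidates.items():
    # repr() keeps control characters visible; scan by eye for readable output.
    print(key, repr(chr(key)), repr(text))
```

Printing with repr keeps unprintable candidates from messing up the terminal while you look through the 128 lines.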

When I tried this with an arbitrary key character, 'x', I noticed that the decryption made sense twice: once with the key 'x' and a second time with 'X'. With 'X', though, the case of every letter had been toggled.
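The reason is that 'x' and 'X' differ only in bit 5, so decrypting with 'X' is the same as decrypting with 'x' and then XORing every character with 32. A small Python sketch with a made-up, letters-only sample string:

```python
def xor_with_key(text: str, key: str) -> str:
    """XOR every character of text with the single-character key."""
    k = ord(key)
    return "".join(chr(ord(ch) ^ k) for ch in text)

plaintext = "HelloWorld"  # hypothetical sample, letters only
ciphertext = xor_with_key(plaintext, "x")

with_x = xor_with_key(ciphertext, "x")
with_X = xor_with_key(ciphertext, "X")

print(ord("x") ^ ord("X"))  # 32
print(with_x)               # HelloWorld
print(with_X)               # hELLOwORLD
```

Non-letters do not survive as nicely: a space (value 32) XORed with 32 becomes the NUL character, which is why the sample above sticks to letters.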

This led to some further investigations, and I came to the above conclusions.
