DEV Community

Siphiwe
Siphiwe

Posted on • Originally published at theabstract.github.io

Understanding Why Floats Are Not Real Numbers

floats ≠ ℝ

Computers represent numbers in binary digits, the language of zeros and ones. Let's explore how this is done, and why floats can only be considered as approximately real numbers.
Our normal number system is represented in base ten, with the usual digits, which we can expand like this:

437 = 400    + 30     + 7
    = 4(100) + 3(10)  + 7(1)
    = 4(10²) + 3(10¹) + 7(10⁰)
Enter fullscreen mode Exit fullscreen mode

Binary numbers are represented in base two like this:

0 = 0
1 = 1
2 = 1 0
3 = 1 1
4 = 1 0 0
5 = 1 0 1
6 = 1 1 0
7 = 1 1 1
8 = 1 0 0 0
9 = 1 0 0 1
Enter fullscreen mode Exit fullscreen mode

Given a number x in decimal form, we can find its binary form by hand. We simply subtract from x the largest powers of two smaller x. By keeping track of what powers we used and did not, we will have the binary form.
For example, this is how we find the binary form of 237:

237 - 128 = 109
109 - 64  = 45
45  - 32  = 13
13  - 8   = 5
5   - 4   = 1
1   - 1   = 0
Enter fullscreen mode Exit fullscreen mode

and these are the powers of two that were used

256 128 64 32 16 8 4 2 1
      1  1  1  0 1 1 0 1
Enter fullscreen mode Exit fullscreen mode

The binary form of 237 is 11101101. Notice that 256 is excluded since we didn't use it, as a convention we are ignoring leading zeros that occur to the left of the first instance of 1 in the number.

A integer to binary function

Suppose we have to convert 19 to its binary representation. From what we've covered we know it should be:

x = 19 = 1(2⁴) + 0(2³) + 0(2²) + 1(2¹) + 1(2⁰) = 10011.
Enter fullscreen mode Exit fullscreen mode

This is what our function needs to do:
Take the remainder of x relative to 2 (x % 2), this gives us the last binary bit.
Then integer division by 2, which shifts all the bits to the right: 
x//2 = 1(2³) + 0(2²) + 0(2¹) + 1(2⁰) = 1001.

Repeating this process will give us the rest of the bits.

As expected, running to_binary with x = 19 matches the result we got by hand 😎.

❯ python to_binary.py
The binary representation of 19 is 10011.
Enter fullscreen mode Exit fullscreen mode

What about fractions?

Running to_binary with x = 3/8 results in:

❯ python to_binary.py
The binary representation of 0.375 is 0.375.
Enter fullscreen mode Exit fullscreen mode

Clearly this is wrong, 0.375 is what we want the binary representation of 🤦🏾! So we need a function that can handle fractions; lets stick to fractions that are at most 1 to keep things simple (see the resources at the end for a deeper dive into floating-point arithmetic).
For 𝑥 = 3/8 we know that; 3/8 = 0.375 = 3/10 + 7/10² + 5/10³. To get the binary form the procedure is this:

  1. Multiply 𝑥 by a power of 2 large enough to convert it into a whole number.
  2. Convert this whole number into binary.
  3. Divide the result by the power of 2 used in step 1. The integer part of this step tells you by how much to shift the decimal point.

For our example…

  1. 0.375(2³)=3
  2. 3 in binary is 11
  3. Dividing by 2³ gives 1.375 so we shift one place from 0.11 to get 0.011, which is 0.375 in binary form

Note: If there is no integer 𝑝 such that 𝑥(2ᵖ) is a whole number, then the internal representation will always be an approximation. This is why testing for the equality (==) of floats is not exact, and will lead to very bizarre behaviour.

>>> x = 0.1 + 0.2
>>> if x == 0.3: print('Duh!🙄')
... else: print('WTF...🧐?')
...
WTF...🧐?
Enter fullscreen mode Exit fullscreen mode

the decimal to binary function…

Running decimal_to_binary with x = 3/8 now results in:

❯ python decimal_to_binary.py
The binary representation of 0.375 is 0.011.
Enter fullscreen mode Exit fullscreen mode

Exactly what we expected 😎.


Learn more

Thanks for reading.

Top comments (0)