Prolonged Prologue
Recently I worked on a stateless server. In other words, if and when it sends a second response to a client, it has no context of the first one.
Such backend code is pretty great – simple, clean, and easily testable.
Auditing or debugging? Not so great. Within a certain timeframe the backend may serve many clients and though there's enough information to reconstruct which response originated from which client, we're lazy and just wanted to "color" them for quick identification.
We decided to make the frontend generate a (pseudo-)random string when starting-up and send it with each request. No need for a convoluted UUID or something similar. Just plain ol' Math.random().toString(36).slice(2)
.
Fat Chance
During code review I explained to my colleague that 99.9% of the time it will make our lives easier and the 0.1% chance of 2 clients generating the same string is no big deal, because as explained, we can still distinguish between them like we did before.
She approved by change, got up, and as she walked away she pondered aloud But it's not really 0.1%, is it? Would be nice to figure out the actual chance of collision
.
Down the Rabbit Hole
I knew Math.random
returns numbers between 0 (inclusive) and 1 (exclusive) with roughly equal probabilities across that range, I wasn't sure however how it does that.
First stop – MDN of course! Their documentation for Math.random informed me that
numbers in JavaScript are IEEE 754 floating point numbers
. IEEE 754? What's that about exactly?Second stop – ECMAScript Spec. The Number type section revealed that IEEE 754 is just a fancy way of saying double-precision 64-bit floating-point number. Frankly, this is still too fancy, so I headed elsewhere.
-
Third stop – Wikipedia. The double-precision article threw me back to my CS degree with terms such as endianness and more importantly significand (AKA mantissa).
- The last two sources taught me that even though this format has 264 possible values (duh!), reserving 3 of them to represent
-0
(which is different from0
, mind you),infinity
, and-infinity
, is pocket change relative to the 253NaN
values. Yep, you read that right: given arbitrary sets of 64 bits, if you askwhat number does this represent?
, 1 out of 2048 times the answer will beit's not a number
. 🤯 #MindBlown - Speaking of possible values, every non-zero number can take the shape of
± 2ⁿ × 1.something
. For example 2020, the current year, is2¹⁰ × 1.97265625
. The benefit here is that the "1." part can be assumed and doesn't need to be stored. Thus the significand has an extra bit – making the number a bit more precise. 😂 #DadJokes
- The last two sources taught me that even though this format has 264 possible values (duh!), reserving 3 of them to represent
At this stage I stopped reading in order to experiment what numbers can JavaScript store. The smallest one below 1
turns out to be 0.9999999999999999
. i.e. if you enter 0.99999999999999994
in the console, JS will omit the last digit, and if you enter 0.99999999999999995
it will just round up to 1
.
The next numbers are:
0.9999999999999998
0.9999999999999997
0.9999999999999996
0.9999999999999994
0.9999999999999993
0.9999999999999992
0.9999999999999991
0.999999999999999
0.9999999999999995
is not absent by mistake. Remember these numbers are displayed as decimals, but are binary under the hood. Expect rounding errors when "translating" back and forth.
Naïvely I thought, Well, if I continue with these small decrements, after about 10 quadrillion times, it will go down to
. I immediately realized this misses the whole point of floating-point! 😂 #DadJokes0.0000000000000001
It's not the end. From there, we can again decrement 10 quadrillion times
0.00000000000000009999999999999999
0.00000000000000009999999999999997
down to
0.00000000000000000000000000000001
and so on.
This revelation – the density increases towards 0 – is in fact tied to the subnormal numbers (that "1.something" from before? for them it's actually "0.something").
I'm dwelling on this because Math.random
is supposed to return values with approximately uniform distribution. This means the set of its possible results is smaller than the general set of numbers JS can represent in the range [0,1). Or looking at it the other way around: there are fractions that you can set yourself, but can never get from Math.random
.
Back on Track
Penultimate stop – The V8 blog explained how about 4 years ago
Math.random
switched to use thexorshift128+
algorithm (noting other major browsers switched to it as well). This is almost what I was looking for – as this algorithm deals with generatinginteger
numbers.Final stop – The glorious V8 source code unveiled how it's done (I believe other engines behave similarly):
ToDouble
takes the latest generated integer, right-shifts it into what would shortly become the significand, sets the exponent to0x3FF
(which means+2⁰
) and casts it todouble
. This yields2⁰ × 1.random
and the only thing left to do is subtract1
.
static inline double ToDouble(uint64_t state0) {
// Exponent for double values for [1.0 .. 2.0)
static const uint64_t kExponentBits = uint64_t{0x3FF0000000000000};
uint64_t random = (state0 >> 12) | kExponentBits;
return bit_cast<double>(random) - 1;
}
Answer to the Ultimate Question
There are 2⁵² (or 4,503,599,627,370,496) possible values for Math.random
to return. It is a few quadrillions after all!
I plugged this value into the general equation for the birthday paradox and was relieved to learn that even with thousands of simultaneously connected clients, we have better chance of winning the lottery than have a collision.
Post Trauma
I obviously called my colleague to share these findings with her. She nodded understandingly, then paused for a moment before remarking Good, now we also know the chance of an empty identifier
.
To which I replied, Say what now?
.
Yeah, there's 1 in 2⁵² chance to get any specific value from calling
.Math.random()
and one of those is 0. In that rare yet possible case slice(2)
will leave you with an empty string. Not that it matters too much – you said you wanted to "color" the traffic with a string. I guess transparent is as good color as any
I'm never asking for your code review again.
📟 Let me know if you ever use Math.random()
and get 0.
Top comments (2)
In an age where we're bombarded from all directions with articles like "3 Things You Didn't Know In ES2015!" and "Here's a useful Tip: Use [].filter to filter arrays!" this has been a breath of fresh air.
Thank you.
Here's a nice visualizer for the 64 bits of double-precision floating-point numbers: binaryconvert.com/result_double.ht...