Among the most influential programming books I've read are Programming Pearls and More Programming Pearls. In More Programming Pearls, there is a nugget of wisdom in chapter 7 called "The Envelope is Back". It discusses the need to have certain formulas memorized, and the ability to make quick calculations on the fly regarding scale.
As a Software Architect™, I've built systems that scale to nearly a billion transactions each day. And to reason about scale in my head there's just a very small trick (or two) that I use and it works well (for me).
I've always memorized what a million of something looks like, whatever it is I need to reason about, in two different ways:
- One million in scale
- One million in quantity
Developers would often come to me and shout: "We need to support 2 million transactions each day with this API... HELP!" (or something like it). It rarely rattled me, because I understood the math of scale. I'd just answer: "Great, so we need to support 24 calls per second. No problem!"
Back of the envelope calculations aren't meant to be precise. Only approximate. Something easily computed in our head. One million transactions per day is:
- ~42k per hour
- ~700 per minute
- ~12 per second
Bottlenecks rarely happen at the scale of a minute or an hour. So the only number I need to know is that 1 million of something per day is ~10-12 per second. That's it.
Once we know that, we can determine what scale we need to support easily enough:
- 1 million per day = ~12/second (12 * 1)
- 5 million per day = ~60/second (12 * 5)
- 30 million per day = ~360/second (12 * 30)
- 100 million per day = ~1200/second (12 * 100)
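The multiplication above can be checked against the exact figure. This is a minimal sketch; the function names `per_second` and `per_second_rule_of_thumb` are made up for illustration and do not come from the book or any library.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day: float) -> float:
    """Exact average rate per second for a given daily volume."""
    return per_day / SECONDS_PER_DAY


def per_second_rule_of_thumb(per_day: float, factor: int = 12) -> float:
    """Mental-math version: ~12 per second for every million per day."""
    return per_day / 1_000_000 * factor


for millions in (1, 5, 30, 100):
    per_day = millions * 1_000_000
    exact = per_second(per_day)
    approx = per_second_rule_of_thumb(per_day)
    print(f"{millions:>3}M/day: exact {exact:7.1f}/s, rule of thumb {approx:6.0f}/s")
```

The exact figure for 1 million per day is about 11.6 per second, so a factor of 12 errs slightly on the safe side.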
One million users per day is:
- 1 million x approximate number of transactions per user session
If a typical user generates 50 API calls during their session, then we can use our back of the envelope skills to reason that we must support:
- 12 * 50 = ~600 txn/second.
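The same reasoning can be written as a small helper. This is only a sketch: `required_tps` is a hypothetical name, and the factor of 12 is the rounded per-million figure from above.

```python
def required_tps(users_per_day: int, calls_per_session: int, factor: int = 12) -> float:
    """Approximate average transactions per second.

    Converts users/day into millions of API calls/day, then applies the
    ~12-per-second-per-million rule of thumb.
    """
    millions_of_calls = users_per_day * calls_per_session / 1_000_000
    return millions_of_calls * factor


print(required_tps(1_000_000, 50))             # 600.0
print(required_tps(1_000_000, 50, factor=10))  # 500.0, using the easier factor of 10
```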
It's also okay to use ~10 as a factor. We are not trying to be precise to engineering tolerances here, just approximate. For numbers that would take too much thinking, I'll use 10 instead of 12 as a factor.
If supporting only the average were enough, we'd be done with this part of the post. But we often have to think about Peak Times. Certain parts of the day see more traffic than usual. We also have to support that.
My personal view is that we always MUST support the average first. It's the usual condition. But we then need to support the expected peak times. For this, we need another metric to memorize. I call it the 10% per hour rule. Remember, we're thinking in terms of 1 million. If 10% of that traffic happened during 1 hour, (or 30% of it during a 3 hour window), how much traffic per second is that?
- 100k transactions in 1 hour = 100,000 / 3,600 = ~28/second. Call it 30.
So we can derive the following metrics:
- 1m/day app @ 10% peak for 1 hour: 100k rule = 30/second.
- 1m/day app @ 30% peak for 1 hour: 100k rule = 90/second.
- 1m/day app @ 30% peak for 3 hours: 100k rule = (30 x 3) / 3 = 30/second.
So a 1m/day app would need to sustain 12/second on average, and 10% peak for 1 hour would need to sustain 30/second for that hour. And so on.
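The peak figures can be reproduced with one division. A minimal sketch, assuming traffic is spread evenly inside the peak window; `peak_per_second` is a name invented for this example.

```python
def peak_per_second(per_day: int, peak_share: float, peak_hours: int) -> float:
    """Average rate inside a peak window carrying `peak_share` of the daily traffic."""
    return per_day * peak_share / (peak_hours * 3600)


# The three cases from the list above, for a 1m/day app.
for share, hours in ((0.10, 1), (0.30, 1), (0.30, 3)):
    rate = peak_per_second(1_000_000, share, hours)
    print(f"{share:.0%} of traffic in {hours}h: ~{rate:.0f}/second")
```

The exact values come out near 28 and 83 per second, so remembering them as 30 and 90 rounds up, which is the safer direction for capacity planning.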
Got 10 million txns/day? Multiply those numbers by 10. And so on.
- 1 million/day = ~12/sec
- Assuming 1 million users/day and 50 txns per user session, that's 50 million/day or ~600/sec
Always think in terms of per second
|How Many|Per User|Per sec.|1hr peak @ 10%|1hr peak @ 30%|
|---|---|---|---|---|
|1 million/day|1|~12|~30|~90|
Pretty much, that's all I need to remember. Everything else is a matter of multiplication in powers of 10.
It's rare that scale is sustained. It usually ebbs and flows. Sometimes, though, there isn't a peak. It's constant through the day. Examples of this would be:
- Monitoring an engine
- Monitoring a water filtration system
- The data on a flight control system
- Time of Day server
You might think that a Time of Day server only experiences its peak at 1 am when most computers query it. That's not the case. This kind of service becomes active at 1 am local time in each of the 24 time zones. It likely experiences its peak at the beginning of each hour as each new time zone makes its query. I would classify this as sustained scale. Meaning, we must treat each hour like a peak time.
Quantity is different. We're actually talking about capacity here. The following is a simple chart to help with computing different quantities. And again, we're thinking in terms of 1 million:
- An Int32 = 4 bytes. That's 4 million bytes
- An Int64 = 8 bytes. That's 8 million bytes
- A float = 4 bytes (single precision) or 8 bytes (double precision).
- An Object is the size of all its metadata for itself and all members, + the size of data for each member.
- A UTF-8 Char in English is usually 1 byte.
- A UTF-8 Char in other languages is 1-4 bytes. Most Chinese characters are 3 bytes. So if you support other languages, think of UTF-8 chars in terms of 2 or 3 bytes. I'd assume 3.
- @ 3 bytes per char, that's 3 million bytes.
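The sizes in this chart can be turned into a quick capacity estimate. This is a rough sketch: `estimate_bytes` is a hypothetical helper, and it counts raw data only, ignoring the object metadata mentioned above.

```python
def estimate_bytes(count: int, bytes_per_item: int) -> int:
    """Raw storage for `count` items, ignoring per-object metadata overhead."""
    return count * bytes_per_item


ONE_MILLION = 1_000_000

for label, size in (("Int32", 4), ("Int64", 8), ("UTF-8 char @ 3 bytes", 3)):
    total = estimate_bytes(ONE_MILLION, size)
    print(f"1 million x {label}: {total:,} bytes (~{total / 1_000_000:.0f} MB)")

# Actual UTF-8 widths for a few sample characters (written as escapes so the
# script prints cleanly on any terminal encoding).
samples = {"Latin 'a'": "a", "accented 'e' (U+00E9)": "\u00e9", "Chinese U+4E2D": "\u4e2d"}
for label, ch in samples.items():
    print(f"{label}: {len(ch.encode('utf-8'))} byte(s)")
```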
Just because you've determined your bandwidth can support 120/sec doesn't mean your entire system can. Parts of the system that might not be able to keep up:
- Database connections or throughput
- Hard disk reading/writing
- Utility that encrypts/decrypts data at high volumes
- API call to a 3rd-party service that's rate limited
Every part of the system needs to be able to support the expected workload.
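One way to sanity-check this is to compare the required rate against each component's capacity. The sketch below uses the 120/sec figure from above, but every capacity number in it is made up for illustration, and it assumes each request touches every component exactly once.

```python
required_per_second = 120  # the bandwidth figure used in the text

# Hypothetical capacities in requests/second. Replace with measured values.
capacity = {
    "bandwidth": 500,
    "database": 200,
    "disk": 150,
    "encryption": 80,
    "third_party_api": 100,
}

# The slowest component caps what the whole system can sustain.
system_limit = min(capacity.values())

# Any component below the required rate is a bottleneck.
bottlenecks = sorted(name for name, cap in capacity.items() if cap < required_per_second)

print(f"System can sustain ~{system_limit}/second")
print("Bottlenecks:", ", ".join(bottlenecks) or "none")
```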