Numbers everyone should know ... again.

#sre #devops #engineering #design

... but why bother writing about this again?

Well, because in my opinion, 10000000ns may be accurate, but it's also a bit hard to quickly convert into something useful.

I believe that these numbers should reflect something useful; for example, do you really need to remember that it takes 0.5ns[1] for a CPU L1 cache reference?

It's quite likely that your day-to-day work does not depend on knowing this by heart. Knowing where to quickly find that number when you need it is a different matter; see the references below [1].

To me personally, useful numbers are things you can see and use every day. For example, when I ping an IP address and it takes 234ms (default packet size is 56 bytes), I know that that's slow, because...

Example [1]	Latency (Human)	Latency (ns)
Mutex lock/unlock	0.0001ms	100
Main memory reference	0.0001ms	100
Read 2K bytes sequentially from memory	0.0005ms	488
Compress 2K bytes with Zippy	0.02ms	20 000
Send 2K bytes over 1 Gbps network	0.02ms	20 000
Read 1 MB sequentially from memory	0.25ms	250 000
Read 1 MB sequentially from network	10ms	10 000 000
Read 1 MB sequentially from disk (not SSD)	30ms	30 000 000
Disk seek (not SSD)	10ms	10 000 000
Round trip within same datacenter	0.5ms	500 000
Send packet CA->Netherlands->CA	150ms	150 000 000

Of course, the numbers above are useful, but I find them much more useful in ms than in ns, as I can more easily reason about how slow or fast something is.
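The conversion itself is just a division by one million. A minimal sketch in Python, where the function name `ns_to_ms` and the selection of rows are my own choices for illustration:

```python
def ns_to_ms(ns: int) -> float:
    """Convert nanoseconds to milliseconds (1 ms = 1 000 000 ns)."""
    return ns / 1_000_000


# A few rows taken from the ns column of the table above.
LATENCIES_NS = {
    "Mutex lock/unlock": 100,
    "Read 1 MB sequentially from memory": 250_000,
    "Disk seek (not SSD)": 10_000_000,
    "Send packet CA->Netherlands->CA": 150_000_000,
}

for name, ns in LATENCIES_NS.items():
    # ":g" drops trailing zeros, so 10.0 is printed as 10
    print(f"{name}: {ns_to_ms(ns):g} ms")
```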

What other numbers may be useful? I tend to find the following quite helpful too...

Example	Value
Read 1MB from loopback (~10Gbps)	1ms
DNS recursive lookup to 1.1.1.1	20ms
1 Gbit/s network throughput	125 MByte/s
... -> 100 Mbit/s	12.5 MByte/s
... -> 10 Mbit/s	1.25 MByte/s
1 MByte/s network throughput	8 Mbit/s
GCP inter-zone transfer 1 MB from network (*)	10ms
GCP inter-zone transfer 10 MB from network (*)	70ms
GCP inter-zone transfer 100 MB from network (*)	530ms

(*) instance type: n1.standard, zone: us-central1-{a,b}, measured with iperf3 on 17 Nov 2019
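The throughput rows all come from the same bits-to-bytes arithmetic, which can be sketched in a few lines of Python. The helper names are hypothetical, and I assume 1 MB = 8 Mbit (decimal units) and ignore latency and protocol overhead in the ideal case:

```python
def mbit_to_mbyte_per_s(mbit_per_s: float) -> float:
    """Convert a link speed from Mbit/s to MByte/s (8 bits per byte)."""
    return mbit_per_s / 8


def ideal_transfer_ms(size_mbyte: float, link_mbit_per_s: float) -> float:
    """Best-case transfer time in ms; ignores latency and protocol overhead."""
    return size_mbyte * 8 * 1000 / link_mbit_per_s


def effective_mbit_per_s(size_mbyte: float, elapsed_ms: float) -> float:
    """Throughput actually achieved, given a measured transfer time."""
    return size_mbyte * 8 * 1000 / elapsed_ms


print(mbit_to_mbyte_per_s(1000))      # 1 Gbit/s expressed in MByte/s
print(ideal_transfer_ms(1, 10_000))   # 1 MB over a ~10 Gbit/s loopback

# Effective throughput implied by the inter-zone measurements above.
for size_mb, ms in [(1, 10), (10, 70), (100, 530)]:
    rate = effective_mbit_per_s(size_mb, ms)
    print(f"{size_mb} MB in {ms} ms -> {rate:.0f} Mbit/s")
```

Run against the inter-zone measurements, this puts the 1 MB transfer at 800 Mbit/s and the 100 MB transfer at roughly 1.5 Gbit/s.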

When thinking about engineering a new system, try to keep the numbers above in mind and do some quick calculations. Rough estimates will go a long way.
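As an illustration of such a quick calculation, here is a rough sketch for a made-up scenario: fetching 30 objects of 1 MB each from spinning disks, either one after another or all in parallel. The scenario, the object count, and the assumption that every parallel read hits its own independent disk are mine; the latencies are the disk rows from the first table.

```python
# Latencies in ns, taken from the first table.
DISK_SEEK_NS = 10_000_000          # Disk seek (not SSD)
READ_1MB_DISK_NS = 30_000_000      # Read 1 MB sequentially from disk (not SSD)

OBJECTS = 30                       # hypothetical: 30 objects of 1 MB each


def ns_to_ms(ns: int) -> float:
    """Convert nanoseconds to milliseconds."""
    return ns / 1_000_000


# Design 1: read the objects one after another from a single disk.
serial_ns = OBJECTS * (DISK_SEEK_NS + READ_1MB_DISK_NS)

# Design 2: issue all reads in parallel, assuming each object sits on its
# own disk, so the total is roughly the cost of one seek plus one read.
parallel_ns = DISK_SEEK_NS + READ_1MB_DISK_NS

print(f"serial:   {ns_to_ms(serial_ns):.0f} ms")    # serial:   1200 ms
print(f"parallel: {ns_to_ms(parallel_ns):.0f} ms")  # parallel: 40 ms
```

Even with numbers this rough, the estimate is enough to tell which design is worth prototyping first.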

Obviously, the numbers above do not take CPU and memory requirements into account. Some of those can be calculated from the software vendor's information, and at least some performance testing is needed to establish the system's real boundaries and baseline performance.

All this ultimately leads to the idea of Non-Abstract Large System Design referred to in Google's SRE Workbook [2].

Please let me know if you have any feedback or find any errors; I'll be happy to correct them.

References:

[1] - Google Pro Tip: Use Back-Of-The-Envelope-Calculations To Choose The Best Design
[2] - SRE Workbook - NALSD
