Eugene Cheah for Uilicious

Posted on Mar 22, 2019 • Edited on Aug 23, 2019 • Originally published at uilicious.com

🤔 Explain Hot & Cold Storage (like S3 and glacier)

#explainlikeimfive #webpef #cloud #beginners

With so many distributed systems disrupted over the past few days, I thought it would be a good idea to expand on the topic as a series.

#hugops to all those #sysadmin who work on these things.

🤔 Explain Distributed Storage - and how it goes down for github / uilicious / cloud / etc

Eugene Cheah for Uilicious ・ Oct 23 '18


Background Context

If you go around distributed storage talks, conferences, or articles, you may commonly hear the following two terms:

hot
cold

What makes it extremely confusing is that these terms are used inconsistently across technology platforms.

And due to the complete absence of an official definition, an SSD, for example, can be considered "hot" on one platform and "cold" on another.

At this point, even experienced sysadmins who are unfamiliar with the terms would go...

So what the heck is Hot & Cold ???

With the lack of official definition, I would suggest the following ...

Data temperature terms (hot / warm / cold) are colourful terms used to describe data or storage hardware relative to each other, along a chosen set of performance criteria (latency, IOPS, and/or bandwidth) on the same platform (S3, Gluster, etc.).

In general, the "hotter" it is, the "faster" it is

~ Eugene Cheah

Important to note: these temperature metrics are used as relative terms within a platform. The colorful analogy of red hot and cold blue is not interchangeable across technology platforms.

The following table describes, in very rough terms, what would be considered hot or cold for a specific platform technology, along with the common key criteria (besides cost, as that is always a criterion).

platform	"hot"	"cold"	common key criteria
High-Speed Caching (For HPC computing)	Computer RAM	SSD	Data Latency (<1ms)
File Servers	SSD	HDD	IOPS
Backup Archive	HDD	Tape	Bandwidth

Hence, due to the relative difference in terms and criteria, I re-emphasize: do not mix these terms across different platforms.
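To make the "relative to the platform" point concrete, here is a minimal sketch that encodes the table above as plain data. The `platforms` object and the `temperatureOf` helper are hypothetical names made up for this illustration, not part of any storage platform's API.

```javascript
// Rough encoding of the table above: each platform lists its media from hottest to coldest.
// The platform keys and media names are illustrative labels, not real configuration values.
const platforms = {
  "high-speed caching": ["RAM", "SSD"],
  "file servers": ["SSD", "HDD"],
  "backup archive": ["HDD", "Tape"],
};

// Returns "hot", "cold", or null when the platform does not use that medium at all.
function temperatureOf(platform, medium) {
  const tiers = platforms[platform];
  if (!tiers) throw new Error(`Unknown platform: ${platform}`);
  const index = tiers.indexOf(medium);
  if (index === -1) return null;
  return index === 0 ? "hot" : "cold";
}

console.log(temperatureOf("high-speed caching", "SSD")); // cold
console.log(temperatureOf("file servers", "SSD")); // hot
console.log(temperatureOf("backup archive", "HDD")); // hot
```

The same SSD comes out "cold" for caching and "hot" for file servers, which is exactly why the label means nothing without the platform next to it.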

For example, a very confusing oddball: Intel Optane memory, which is designed specifically for high-speed, low-latency data transfer, and is slower in total bandwidth when compared to SSD.

This makes it a good candidate for a high-capacity caching server, but "in most workloads" ill-suited for file servers, due to its lower bandwidth and high price per gig when compared to SSD.

If you're confused by Optane, it's alright, move on. Almost everyone, including me, is.

Why not put everything into RAM or SSD?

After all, ain't you a huge "keep it simple stupid" supporter?

For datasets below 1 TB, you could, and probably should for simplicity, choose a single storage medium and stick with it. As mentioned in the previous article in the series, distributed systems do have performance overheads that only make them worth it at "scale" (especially ones with hybrid storage).

However, once you cross that line, hot and cold storage slowly becomes a serious consideration. After all...

In general, the faster the storage is in 2019, the more expensive it gets. The following table shows the approximate raw material cost for the storage components, without their required overheads (like motherboards, networking, etc.).

Technology	Reference Type	Price per GB	Price per TB	Price per PB
ECC RAM	16 GB DDR3	$2.625	$2,625	$2.6 million
SATA SSD	512 GB SSD	$0.10	$100	$100,000
Hard Drive	4TB 5900RPM	$0.025	$25	$25,000
Tape Storage	6TB Uncompressed LTO 7	$0.01	$10	$10,000

"Reference Type" pricing for each technology is already intentionally chosen for lowest price per GB, during 2017-2018

When it comes to terabytes or even petabytes of data, the raw cost in RAM sticks alone would be $2.6k per terabyte, or $2.6 million per petabyte.

This is even before an easy 10x+ multiplier for any needed equipment (e.g. motherboards), infrastructure, redundancy, replacements, electricity, and finally the manpower that cloud providers give "as a service".
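The arithmetic behind those numbers can be sketched as follows. The per-GB prices are copied from the table above; the `storageCost` helper is a hypothetical name, and the flat 10x overhead multiplier is only the rough figure mentioned in the text, not a measured value.

```javascript
// Approximate raw price per GB, copied from the table above (2017-2018 figures).
const pricePerGB = {
  "ECC RAM": 2.625,
  "SATA SSD": 0.10,
  "Hard Drive": 0.025,
  "Tape Storage": 0.01,
};

// The table uses decimal units: 1 TB = 1,000 GB and 1 PB = 1,000,000 GB.
const GB_PER_TB = 1000;
const GB_PER_PB = 1000 * 1000;

// Raw media cost in dollars for a given capacity, optionally scaled by an
// overhead multiplier (the rough "10x+" for equipment, redundancy, power, and people).
function storageCost(technology, capacityGB, overheadMultiplier = 1) {
  const price = pricePerGB[technology];
  if (price === undefined) throw new Error(`Unknown technology: ${technology}`);
  return price * capacityGB * overheadMultiplier;
}

for (const technology of Object.keys(pricePerGB)) {
  const perTB = storageCost(technology, GB_PER_TB).toFixed(0);
  const perPB = storageCost(technology, GB_PER_PB).toFixed(0);
  const perPBWithOverhead = storageCost(technology, GB_PER_PB, 10).toFixed(0);
  console.log(`${technology}: $${perTB}/TB, $${perPB}/PB raw, $${perPBWithOverhead}/PB with 10x overhead`);
}
```

With the 10x multiplier applied, a petabyte of RAM lands at roughly $26 million, against roughly $250,000 for hard drives.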

And since I'm neither Bill nor Mark, it's a price out of reach for me. Or most of us, for that matter.

As a result, for practical reasons, once the data set hits a certain scale, most sysadmins of distributed systems will use some combination of "hot" storage, specced out to the required performance workload (which is extremely case by case), with a mix of colder storage to cut down on storage cost.

In such a setup, the most actively used files, which need high performance, would be in fast hot storage, while the least actively used files would be in slow, cheap cold storage.
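As a rough illustration of why the mix pays off, the sketch below compares the raw media cost of an all-SSD setup against a blended one. The prices come from the earlier table; the `blendedCost` helper and the 10% hot share are made-up example values, since the right split is entirely workload specific.

```javascript
// Raw price per GB from the earlier table.
const SSD_PRICE_PER_GB = 0.10;
const HDD_PRICE_PER_GB = 0.025;

// Raw media cost when `hotFraction` of the data lives on SSD and the rest on HDD.
function blendedCost(totalGB, hotFraction) {
  if (hotFraction < 0 || hotFraction > 1) {
    throw new RangeError("hotFraction must be between 0 and 1");
  }
  const hotGB = totalGB * hotFraction;
  const coldGB = totalGB - hotGB;
  return hotGB * SSD_PRICE_PER_GB + coldGB * HDD_PRICE_PER_GB;
}

const onePB = 1000 * 1000; // one petabyte, expressed in GB
console.log(Math.round(blendedCost(onePB, 1)));   // all SSD: 100000
console.log(Math.round(blendedCost(onePB, 0.1))); // 10% SSD, 90% HDD: 32500
console.log(Math.round(blendedCost(onePB, 0)));   // all HDD: 25000
```

Under these assumed numbers, keeping only a tenth of the data hot costs about a third of the all-SSD price, before any of the overhead multipliers are applied.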

Such a mix of hot and cold storage can be done on a single server, or (as per this series) on a much larger scale across multiple servers, or even multiple nations, where one can take advantage of much cheaper servers in Germany or Finland, where electricity and cooling are cheaper.

Hetzner Helsinki Datacenter: it is pretty cool that Uilicious cold storage is literally being cooled by snow

How is the data split across fast and slow then?

Distribution of data can be performed by the chosen storage technology itself, if supported, such as GlusterFS or Elasticsearch, where it is done automatically based on a predefined configuration.

Cloud storage technologies, such as S3, similarly have configuration to automatically migrate such storage into a "colder" state, according to a predefined age rule.
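For S3, such an age rule is expressed as a bucket lifecycle configuration. The sketch below only builds the configuration object in the shape used by S3's PutBucketLifecycleConfiguration API (`Rules`, `Filter`, `Transitions`, `StorageClass`); it does not call AWS. The helper name, the rule ID, the `results/` prefix, and the 31-day value are hypothetical, chosen purely for illustration.

```javascript
// Builds a lifecycle configuration object for an age-based "move to cold" rule.
// Nothing is sent to AWS here; the rule ID and prefix are hypothetical examples.
function buildAgeBasedLifecycle(prefix, daysUntilCold) {
  // This sketch only accepts whole days of 1 or more.
  if (!Number.isInteger(daysUntilCold) || daysUntilCold < 1) {
    throw new RangeError("daysUntilCold must be a positive integer");
  }
  return {
    Rules: [
      {
        ID: "age-based-cold-storage",
        Status: "Enabled",
        // Only objects whose key starts with this prefix are affected.
        Filter: { Prefix: prefix },
        // Objects older than `daysUntilCold` days move to the Glacier storage class.
        Transitions: [{ Days: daysUntilCold, StorageClass: "GLACIER" }],
      },
    ],
  };
}

const lifecycle = buildAgeBasedLifecycle("results/", 31);
console.log(JSON.stringify(lifecycle, null, 2));
```

An object like this would typically be passed as the lifecycle configuration when updating the bucket through an AWS SDK or the CLI; check the current AWS documentation for the exact call before relying on it.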

Alternatively, the distribution of "hot & cold" storage could also be done within the application, instead of within a single platform technology. This gives the application very fine-grained control as it moves its storage workload from one system to another, according to its expected use case.

In such a case, the term "hot or cold" would refer to how the application developers plan this out with their devops and sysadmins. A common example would be to store hot data in a network file server or even an SQL database, with cold data inside S3-like blob storage.
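A minimal sketch of what application-level tiering can look like is shown below. The `TieredStore` class is a hypothetical illustration: its two `Map` objects are in-memory stand-ins for a real hot store (a file server or SQL database) and a real cold store (S3-like blob storage), and a production version would need error handling and asynchronous I/O.

```javascript
// Application-level tiering: the application itself decides where each record lives.
class TieredStore {
  constructor() {
    this.hot = new Map();  // stand-in for a file server or SQL database
    this.cold = new Map(); // stand-in for S3-like blob storage
  }

  // New or rewritten data starts out hot.
  put(key, value) {
    this.cold.delete(key);
    this.hot.set(key, value);
  }

  // Reads check the hot tier first and fall back to the cold tier.
  get(key) {
    if (this.hot.has(key)) return this.hot.get(key);
    return this.cold.get(key);
  }

  // Explicit moves, called by whatever policy the application chooses.
  demote(key) {
    if (!this.hot.has(key)) return false;
    this.cold.set(key, this.hot.get(key));
    this.hot.delete(key);
    return true;
  }

  promote(key) {
    if (!this.cold.has(key)) return false;
    this.hot.set(key, this.cold.get(key));
    this.cold.delete(key);
    return true;
  }

  tierOf(key) {
    if (this.hot.has(key)) return "hot";
    if (this.cold.has(key)) return "cold";
    return null;
  }
}

const store = new TieredStore();
store.put("result-1", "test passed");
store.demote("result-1");
console.log(store.tierOf("result-1")); // cold
console.log(store.get("result-1")); // test passed
```

Keeping `promote` and `demote` as explicit calls leaves the policy (age, view counts, or anything else) outside the storage wrapper, which is where the fine-grained control comes from.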

Why do some articles define it (confusingly) by the age of data then?

Because this represents a common, over-generalized use case. Nothing more, nothing less.

Statistically speaking, and especially over a long trend line, your users are a lot more likely to read (or write) a recently created file than an old 5+ year file.

xkcd 1360 : Old Files

You can easily see this in practice if you have a really good internet connection, by randomly (for the first time in a long time) jumping back into Facebook or Google to open up images that are 5+ years old, and comparing the load times to your immediate news feed.

Though if you are unlucky, while everyone's "hot" working files are loading blazing fast, you can find that your year-old "cold" files are sadly gone for a few hours.

Which is unfortunately what happened to me recently during the Google outage ...

Btw it has long been fixed - #hugops to the hardworking Google SREs

While a rule of thumb like "if older than 31 days, put into cold storage" is convenient...

Do not fall for an easy rule of thumb (for distributed storage, or any technology)

The flaw is that it completely ignores a lot of the nuances that one should be looking into for one's use case.

Such as situations where you would want to migrate cold data to hot storage, or keep data in hot storage way longer than a simple 31-day period. This is very workload specific.

Small nuances, if not properly understood and planned for, can come back to either bring services down to a crawl, when heavy "hot data"-like traffic hits the "cold" data storage layer, or worse, for cloud blob storage, produce an unexpected giant bill.

And perhaps the worst damage is done when starting new learners down a confusing, wrong path of learning.

I have met administrators who obsess over configuring their hot/cold structure purely by days, without understanding their workload in detail.

Ok, how about showing some example workloads then?

For example, uilicious.com's heaviest file workload consists of test scripts and results, where users write and run 1000's of test scripts like these ...

// Let's go to dev.to
I.goTo("https://dev.to")

// Fill up search
I.fill("Search", "uilicious")
I.pressEnter()

// I should see myself or my co-founder
I.see("Shi Ling")
I.see("Eugene Cheah")

which churn out sharable test results like these ...

... which then get shared publicly on the internet, or team issue tracker 😎

The resulting workload ends up being the following :

When a test result is viewed only once (when executed), it is very likely to never be viewed again.
When a test result is shared and viewed multiple times (>10) in a short period, it means it's being "shared" within a company or even on social media, and will get several hundred or even thousands more requests soon after. In such a case, we would want to immediately upgrade it to "hot" data storage, even if the result was originally over a month old.
When used with our monitoring scheduling feature, which automates the execution of scripts, most people view such "monitoring" test results only within the first 2 weeks, and never again after that.

As a result, we make tweaks to glusterfs (our distributed file store) specifically to support such a workload, in particular to the logic for moving data from hot to cold, or vice versa.
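The three access patterns above can be sketched as a small tiering policy. This is purely an illustration and not the actual glusterfs configuration: the `chooseTier` function, the field names, the one-hour "short period", the one-day grace period for single-view results, and the 31-day default are all assumptions made for this example.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Assumed thresholds, loosely based on the workload described above.
const SHARE_VIEW_THRESHOLD = 10;        // more than 10 views in a short period
const SHARE_WINDOW_MS = 60 * 60 * 1000; // "short period" assumed to be one hour
const SINGLE_VIEW_GRACE_DAYS = 1;       // assumed grace period for results viewed once
const MONITORING_HOT_DAYS = 14;         // monitoring results are viewed in the first 2 weeks
const DEFAULT_HOT_DAYS = 31;            // the simple age rule, used only as a fallback

// result: { createdAt: number, viewTimestamps: number[], fromMonitoring: boolean }
// All times are in milliseconds since the epoch.
function chooseTier(result, now) {
  const recentViews = result.viewTimestamps.filter(
    (viewedAt) => now - viewedAt <= SHARE_WINDOW_MS
  ).length;

  // A burst of views means the result is being shared: go hot regardless of age.
  if (recentViews > SHARE_VIEW_THRESHOLD) return "hot";

  const ageDays = (now - result.createdAt) / DAY_MS;

  // Viewed at most once and past the grace period: unlikely to be opened again.
  if (result.viewTimestamps.length <= 1 && ageDays > SINGLE_VIEW_GRACE_DAYS) {
    return "cold";
  }

  // Otherwise fall back to an age rule that depends on the kind of result.
  const hotDays = result.fromMonitoring ? MONITORING_HOT_DAYS : DEFAULT_HOT_DAYS;
  return ageDays <= hotDays ? "hot" : "cold";
}

const now = Date.now();
const oldButShared = {
  createdAt: now - 60 * DAY_MS,
  viewTimestamps: Array.from({ length: 12 }, (_, i) => now - i * 1000),
  fromMonitoring: false,
};
console.log(chooseTier(oldButShared, now)); // hot
```

Note how the age rule is only the last resort here: a two-month-old result still goes hot the moment it starts getting shared.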

Other notable examples of workloads are ...

Facebook like "On This Day", showcasing images from the archive in the past
Accounting and audit of company data, at the start of the year, for previous year data
Seasonal promotion content (New Year, Christmas, etc)

A good half of the workloads I have seen so far would benefit from unique configurations that would not make sense for other workloads, and at times even fly against the simple rule of thumb.

My personal advice is to stick to the concept of hot and cold being fast data or slow data accordingly, allowing you to better visualize your workload, instead of file creation date, which oversimplifies the concept.

A quick reminder of one of my favorite quotes of all time :

Alternatively : There is no silver bullet

In the end, it is important to understand your data workload, especially as you grow into the petabyte scale.

So why didn't we just call them Fast N Slow?

And don't get me started on how much I hate the term "warm" 😡

Oh damn, I so wish they did. Then we would never have needed this whole article.

However, while I could not trace the true history of the term (this would be an interesting topic on its own), rewind back to before the era of SSD, to the predecessor of SSD storage in the early 2000s.

I present, one, hot HDD

These high-speed 15,000 RPM HDDs are so blazing hot, they needed their own dedicated fans once you stacked them side by side.

These hot drives would run multiple laps around the slower cheaper drives spinning at 5,400 RPM, which served as cold storage.

Hot data was literally, blazing hot spinning wheels of data.

one of the most expensive hot wheels — btw: these collectables go for $2,000 each

For random context, your car wheels go at around 1,400 RPM when speeding along at 100 mph (160 km/h). So imagine 10x that: it's a speed that you wouldn't want your fingers to be touching (not that you can; all the hard-drive images you see online are lies, these things come shipped with covers).
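For the curious, the wheel figure can be sanity-checked with a few lines of arithmetic. The `wheelRpm` helper is a hypothetical name, and the 0.63 m wheel diameter is an assumed typical passenger-car value rather than a figure from this post.

```javascript
// Wheel revolutions per minute for a given road speed.
// The default 0.63 m wheel diameter is an assumption for a typical passenger car.
function wheelRpm(speedKmh, wheelDiameterM = 0.63) {
  const metresPerMinute = (speedKmh * 1000) / 60;
  const circumferenceM = Math.PI * wheelDiameterM;
  return metresPerMinute / circumferenceM;
}

const rpm = wheelRpm(160);
console.log(Math.round(rpm)); // a little under 1,400 RPM
console.log((15000 / rpm).toFixed(1)); // how many times faster a 15,000 RPM drive spins
```

Under that assumption the wheel turns at a bit under 1,400 RPM, so a 15,000 RPM drive is spinning roughly eleven times faster.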

The temperature terms are also used to describe the different levels of backup sites a business would run their server infrastructure with: from hot backup sites, having every machine live and running, waiting to take over when the main servers fail, to cold backup sites, which would have the power unplugged, or may even be empty room space for servers to be placed later.

So yea, sadly there are some good reasons why the term caught on 😂 (as much as I hate it)

So what hardware should I be using for hot and cold, for platform X?

Look at the platform guide, or alternatively, wait for part 3 of this series 😝

In the meantime: if you are stuck deciding, take a step outside the developer realm, and play yourself a pop song as you think about it.

Happy Shipping 🖖🏼🚀

Top comments (5)

vorsprung • Mar 23 '19 • Edited

Great article for beginners, with some good comparisons, tables and examples. However...

I work in storage ops and I've never heard of the "hot" and "cold" applied to latency in the way you describe

"hot" and "cold" I've only heard applied to

a) DR sites in varying states of readiness
b) storage that is basically online or offline
c) cache hit rates in a database table or network proxy

(which I think you mention in passing in the article)

I also don't quite follow why the article is entitled "Explain Distributed Storage" when it exclusively talks about single partition storage. Sounds like the storage is all in one place - so not distributed!

btw that Katy Perry vid is ridiculous :)

Eugene Cheah Uilicious • Mar 24 '19 • Edited

Thanks for the feedback. You're right that it wasn't made clear, and the concept of hot & cold is applicable in a single server setup. I have renamed the title accordingly and added the following lines in the article

Such mix of hot and cold storage can be done on both a single server, or (as per this series) on a much larger scale across multiple servers, or even multiple nations. Where one can take advantage of much cheaper servers in Germany or Frankfurt, where electricity and cooling are cheaper.

As for latency, admittedly it's rarer, as it happens mainly in HPC (High Performance Computing) on time-sensitive data, where there is a mix of both data sets too large to fit into RAM and fast CPUs crunching data in a non-sequential format.

Especially those weather prediction supercomputers, which only guess it right half the time.

Though these days it becomes less of an issue with GPU based workloads (which is a topic on its own)

Shi Ling Uilicious • Mar 22 '19

Hm... We should have more specific terms then! Let's use temperature so it's less ambiguous! Which metric system do you wanna use? Celsius, Fahrenheit, Kelvin?

Eugene Cheah Uilicious • Mar 22 '19

Haha, I might do so in part 3, where I compare specific hardware with each other.

The downside though: it gets messy whenever a new technology appears and slots itself in the middle of two existing ones.... Do we start adding .5 degrees? haha 🤣
