When considering an on-premises or hybrid IT infrastructure, you will no doubt come into contact with the hyperconvergence hype train, promising reduced costs, greater flexibility, and free puppies for everyone. So what is a hyperconverged infrastructure, what are its benefits, and does it deserve a place in your budget?
Traditional 3-Tier Infrastructure
To understand what hyperconvergence offers we need to compare it to a traditional datacentre infrastructure. A traditional datacentre infrastructure is sometimes called a three-tier infrastructure. Those three tiers are the network tier, the server tier, and the storage tier. Let’s start simple, though — here’s a standalone server.
It’s a single box that contains computational resources (processors and memory) and storage resources (hard drives). This is no different to your desktop PC or even your laptop or phone. Everything you need is together in one package. When you scale out your infrastructure this approach becomes inefficient.
You have to manage lots of separate pieces of storage across lots of discrete units. Expansion is limited by the physical constraints of the individual boxes themselves, and you end up wasting space by over-provisioning storage in each of your little silos to make sure you don't run out.
To solve this, organisations started separating their storage from their computational resources into a dedicated Storage Area Network, or SAN.
Your servers now do the computation and connect back to a centrally managed pool of storage. As well as being much more flexible and scalable, this makes it easy to do things like replicate your data to other locations and back it up more efficiently, because you can replicate or back up the SAN itself rather than having to manage backup and replication on every server individually.
This infrastructure tiering, once the preserve of large datacentres, got pushed to the masses when virtualisation arrived. Virtualisation decouples your computation from the underlying hardware, allowing the logical servers that run your business to move around between physical host servers that simply provide processor and memory capacity for them.
This flexibility to roam across hosts has huge efficiency and resilience benefits, but it also requires a separate storage tier so these logical virtual servers can access the storage from any host. There’s no point being able to seamlessly glide your computation between different host servers if you’re going to be tethered to one of them anyway for access to your storage. So you arrive at the traditional three-tier datacentre infrastructure that many of us are familiar with.
The network, server, and storage tiers talk to each other, but they are managed independently as three separate entities. Often, by three separate teams.
Converged Infrastructure

A converged infrastructure is basically the same thing but packaged up and productised by a single vendor. Cisco, for example, offer a FlexPod converged infrastructure that comprises Cisco network switches, Cisco servers running something like vSphere as the hypervisor, and NetApp as the storage layer. It's all bundled up and sold as a pre-validated and pre-configured unit.
The idea is you buy your rack of converged infrastructure, plug it in, and off you go. If you need more capacity you buy another rack from the same vendor. The attractive thing about this approach is that you’ve got a single vendor to go to if anything goes wrong. You’re not bouncing between hardware, hypervisor, storage, and network vendors all blaming each other; and you avoid any arguments about whether a particular component is compatible or supported.
There are some downsides as well. In theory, a converged infrastructure is very scalable, but in practice scaling it can be prohibitively expensive. The entire thing is vendor-locked, and if you just need to add a little bit of capacity here or there you might find yourself over a barrel. The vendor may insist that you can only add capacity by purchasing an entire rack full of kit, which would be massive overkill; and of course there's only one vendor you can buy it from, so it's not going to be cheap! If you feel tempted to add a little kit of your own, that may invalidate support for your entire datacentre stack, negating the benefit of going for a converged infrastructure in the first place.
Hyperconverged Infrastructure

So what is hyperconverged and how is it different? Traditionally, the server and storage tiers of your three-tier infrastructure used physically different hardware. The storage used dedicated arrays with storage controllers and a Fibre Channel network linking it all together.
Over time, we've seen a move away from specialised storage hardware to more generalised server and network hardware. Expensive Fibre Channel storage networks are in many cases being replaced by the iSCSI protocol that runs across a standard IP network. Specialised storage controllers with dedicated hardware for things like RAID have seen their logic move into software that runs on commodity server hardware.
Handling storage in software instead of hardware allows for a lot of flexibility, and new features can be downloaded rather than having to buy and replace physical kit. You may have heard this described as Software Defined Storage, or SDS.
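To make "RAID logic in software" a little more concrete, here's a toy single-parity sketch in Python, the same XOR trick that RAID 4 and RAID 5 controllers implement in silicon. This is purely illustrative; the block contents are invented and real SDS products are vastly more sophisticated:

```python
# Toy sketch of parity-based redundancy done in software.
# All blocks in a stripe are assumed to be the same length.

def parity(blocks: list[bytes]) -> bytes:
    """XOR all data blocks together to produce the parity block."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def rebuild(surviving: list[bytes], parity_block: bytes) -> bytes:
    """Reconstruct a single lost data block from the survivors plus parity."""
    return parity(surviving + [parity_block])

data = [b"AAAA", b"BBBB", b"CCCC"]  # one stripe of data blocks
p = parity(data)

# Simulate losing the middle block, then rebuild it in software.
recovered = rebuild([data[0], data[2]], p)
assert recovered == data[1]
```

Because XOR is its own inverse, XORing the surviving blocks with the parity block yields the missing block; that property is the whole trick, whether it runs in a dedicated controller or on a commodity x86 box.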
So if your server layer is running on commodity x86 hardware, connected to IP network switches… and your storage layer is running on commodity x86 hardware, connected to IP network switches… why not put them together?
This, in a nutshell, is a hyperconverged infrastructure or HCI. At first glance it may look like we've simply gone back to having multiple standalone servers that contain both computation and storage.
Physically, that's exactly right, but logically speaking it behaves more like a three-tier infrastructure. The computation uses virtualisation, so logical servers can migrate between physical hosts at will. Unlike with a standalone server, there is no hard link between the computation resources and the storage resources sitting in the same metal box. The actual data is distributed and replicated between servers, just like nodes in a SAN.
A virtual machine could feasibly be running on one physical server and using storage from another.
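A minimal sketch of how that separation can work, with an invented placement policy and host names; real HCI products use far more sophisticated placement, data locality, and rebalancing logic:

```python
# Toy sketch of HCI-style distributed storage: each host contributes its
# local disks to one logical pool, and every block is replicated to
# REPLICATION_FACTOR hosts chosen by a deterministic placement function.
import hashlib

HOSTS = ["host-a", "host-b", "host-c", "host-d"]  # hypothetical cluster
REPLICATION_FACTOR = 2

def placement(block_id: str) -> list[str]:
    """Deterministically pick which hosts hold copies of a given block."""
    start = int(hashlib.md5(block_id.encode()).hexdigest(), 16) % len(HOSTS)
    return [HOSTS[(start + i) % len(HOSTS)] for i in range(REPLICATION_FACTOR)]

# A VM running on host-a may find its disk blocks live entirely elsewhere:
for block in ["vm1-disk0-blk0", "vm1-disk0-blk1"]:
    print(block, "->", placement(block))
```

Because placement depends only on the block, not on where the VM happens to be running, the VM can migrate between hosts while its data stays put, exactly the property the three-tier design achieved with a separate SAN.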
To all intents and purposes it’s the same as a three-tier infrastructure, but the hypervisor and SAN nodes are sharing the same physical hardware. The first obvious benefit of this approach is that by collapsing your server and storage tiers you save on hardware. That’s fewer metal boxes to buy, power, cool, and fit somewhere. Scaling out your infrastructure is in theory a nice and simple affair. Your capacity now comes in discrete units that include all of your computation and your storage.
So if you have five hyperconverged host servers and you need 20% more capacity, you buy another hyperconverged host and everything increases by 20%. You’ll often get simplified administration as well. This varies quite a bit between vendors, of course. Some wrap everything up inside a single management interface and abstract away a lot of the underlying detail. Others are a bit more of a DIY affair.
For the more packaged products you end up with simplified administration. You can now manage the infrastructure from a single point rather than managing servers and storage separately. Maybe you only need one team now, instead of two. So: less kit to buy, fewer things to manage, all the performance and resilience of a three-tier infrastructure… Is there a catch? Yeah. There’s always a catch.
Linear scalability sounds like a good idea because it's easy to understand and easy to purchase. The problem, of course, is that a lot of applications don't scale linearly. Virtual desktop infrastructure, or VDI, is generally a good fit for this sort of scaling. Each new user means a new desktop, which means a repeatable chunk of processor, memory, and storage. If you double the number of users, you double your processor, memory, and storage requirements; so you double the number of hyperconverged hosts and it all works out quite nicely.
But what about a file server? Usually with file servers you’ll tend to see storage growth over time, but processor and memory utilisation tends to remain fairly static by comparison. That’s something that doesn’t lend itself very nicely to hyperconverged scaling. If you want to add more storage capacity you could be forced to buy additional computation resources as well, because it all comes in a single box. Quite how inflexible this is will depend on the vendor and the units they offer, but you can easily find yourself spending more on hardware because you can’t simply tack on a bit of storage. Instead you’re buying processors and memory you don’t actually need, because “hyperconverged”.
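A back-of-envelope sketch makes the mismatch concrete. The per-node specification and the workload figures below are invented for illustration, but the shape of the problem is real: because resources come bundled, you must buy enough nodes to satisfy your *largest* requirement, whichever resource that is:

```python
# Sketch: how many bundled HCI nodes does a workload need?
# Node spec and workload requirements are hypothetical figures.
import math

NODE = {"cpu_cores": 32, "ram_gb": 512, "storage_tb": 20}  # one HCI node

def nodes_needed(req: dict) -> int:
    """Enough nodes to cover *every* resource, since they come bundled."""
    return max(math.ceil(req[k] / NODE[k]) for k in NODE)

# VDI: each user adds a repeatable slice of CPU, RAM, and storage,
# so all three resources fill up at roughly the same rate.
vdi = {"cpu_cores": 125, "ram_gb": 2000, "storage_tb": 25}

# File server after years of growth: lots of storage, little compute.
file_server = {"cpu_cores": 8, "ram_gb": 64, "storage_tb": 120}

print("VDI nodes:", nodes_needed(vdi))                 # sized by CPU/RAM and storage together
print("File server nodes:", nodes_needed(file_server)) # sized by storage alone
```

In this made-up scenario the file server needs six nodes purely for its storage footprint, leaving well over a hundred processor cores idle; that stranded compute is exactly the cost of "because hyperconverged".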
Another downside is complexity. Reduced complexity is supposed to be a benefit of hyperconverged infrastructures, but that only really counts when it's working. There are fewer things to manage, yes; but each of those individual things is now individually more complex, because it combines the computation and the storage in every unit. You can no longer take a virtualisation host down for maintenance without also taking a storage node down for maintenance. If a Hyper-V host bluescreens, so does part of your SAN.
Typically, you should be able to tolerate such a failure, because the cluster will be deployed in a highly available topology where other nodes pick up the slack; but the point remains. When it's all working well there's less to manage. When you have an issue to deal with, though, it can start to feel a bit like a house of cards.
So ultimately, whether hyperconvergence is right for you is going to depend on a number of factors. If your infrastructure scales linearly, it could save you money on hardware. If your scalability needs are non-linear, hyperconvergence could cost you more in hardware. You might also want to mix and match: you might be better served by the flexibility to design separate server and storage tiers in your main datacentre, but the plug-and-play simplicity of hyperconvergence in a space-constrained branch site. There is no one-size-fits-all answer, and you need to consider the performance characteristics of your workload as well. Whilst I can't give you a simple "yes" or "no" answer on that, hopefully this has helped you figure out where to start.
Let me know in the comments if you’re going hyperconverged and in what scenarios you find it works best for you.