DEV Community

Head in The Clouds Blog


Considering Data Redundancy Solutions as Hospital Computers Melt Down During Heatwave!

There are many things we take for granted in life, including the reliability of our hospitals and, of course, their IT systems! In the West, and especially in London, we have become accustomed to fast, reliable, accurate and attentive service from our hospitals, not least from the renowned Guy’s and St Thomas’ Hospital, a centre of excellence and one of the UK’s leading providers of hospital and community-based healthcare, research and education.

Imagine my shock and dismay, then, when I took a loved one to an outpatient appointment on the Monday and was informed that we would not receive the results of the medical tests that day because the hospital’s computer systems had been down since Friday! Not for the past 10 or 30 minutes, an hour or a day, which would have been bad enough, but for an entire weekend!

We were some of the lucky ones, though, as we were only awaiting routine test results. Usually highly competent hospital staff were rendered helpless and frustrated, unable to access or read any of their patients’ medical records. As a result, the hospital fell into complete chaos. At least one cancer patient, and possibly several more, was almost given the wrong medication, which could have had hazardous and possibly fatal results:

_“On two occasions I was confused for being another patient. Once in the middle of the night a nurse woke me up to try and give me medication I had never heard of.

“After quite a long argument with this nurse it transpired she was standing with a different patient’s paper record.

“Another occasion a surgeon burst into my room to tell me he had been in my operation and that my appendix had ruptured.

“Again after another slightly long argument it transpired he had got the wrong person and was very apologetic.” - Jennifer, Cancer Patient, Guy’s and St Thomas’ Hospital (Source: BBC News).
_

Mind Blown!

Many other unlucky patients had their operations and procedures cancelled because the hospital’s IT servers failed in the heat. It’s common knowledge that waiting lists for such operations and procedures are climbing. So imagine the frustration, and the disastrous impact on personal wellbeing, of waiting several months, possibly years, for an operation, only for it to be cancelled at the last minute because of the heat! And if this is how one of the leading hospitals was coping, what of other, less well-resourced hospitals?


That something as seemingly innocuous as the weather has the power to take down the entire IT systems of one of the city’s most vital services is a sure sign of global warming, as discussed in my last blog post. It also emphasises the need to consider more effective data and storage redundancy solutions.

Upon learning of the hospital’s failing systems, my mind instantly went to a cloud concept I had learnt while studying for and passing the Microsoft Azure Fundamentals AZ-900 exam: data and storage redundancy.


Azure is Microsoft’s cloud computing platform. Cloud computing refers to the delivery of computing services over the internet (the cloud). These services include databases, storage, servers, software, networking, analytics and intelligence. Cloud computing offers fast, reliable, agile, elastic and scalable services, often at lower cost thanks to its pay-as-you-go model. Reliability is achieved through data and storage redundancy.

So what is data and storage redundancy, and how could it have helped in this situation?

Let’s start with redundancy. Redundancy means duplicating your system so that a backup system exists in a different location. If your primary system fails, just as the IT servers failed at Guy’s and St Thomas’ Hospital in the example above, the backup system can kick into gear.

How effective this redundancy is depends on the geographical location of your backup system. Redundancy options include Availability Zones and Region Pairs.

An Azure Availability Zone is a physically separate location within an Azure region, as illustrated by the diagram below. The diagram shows three separate availability zones (the minimum in any region that supports them) in one shared geographical region, for example three separate sites in the same city or borough. Each availability zone comprises one or more datacenters equipped with independent power, cooling and networking. If an incident knocks out the power, cooling and/or networking at one availability zone, the backup systems at either of the other two zones are ready to take over, which minimises outages and downtime. An example of an incident taking down one availability zone may be a virus on a local network or lightning striking the building that houses it.

Each availability zone is set up to be an isolation boundary: if one zone goes down, the others continue working.

The availability zones are connected via high-speed, private and diverse fiber-optic networks.

[Diagram: three availability zones within a single Azure region]
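To make the failover idea concrete, here is a minimal Python sketch. It is only a toy model, not anything Azure exposes: the `Zone` class, the `pick_serving_zone` function and the zone names are all hypothetical, and real failover is handled by the platform and by how your services are deployed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Zone:
    """Hypothetical model of one availability zone and its utilities."""
    name: str
    power_ok: bool = True
    cooling_ok: bool = True
    network_ok: bool = True

    @property
    def healthy(self) -> bool:
        # A zone can serve traffic only if power, cooling and networking all work.
        return self.power_ok and self.cooling_ok and self.network_ok


def pick_serving_zone(zones: list[Zone]) -> Optional[Zone]:
    """Return the first healthy zone, or None if every zone is down."""
    for zone in zones:
        if zone.healthy:
            return zone
    return None


zones = [Zone("zone-1"), Zone("zone-2"), Zone("zone-3")]
zones[0].cooling_ok = False  # assumption: cooling fails in zone-1 during a heatwave
serving = pick_serving_zone(zones)
print(serving.name if serving else "no healthy zone")
```

Because each zone has its own power, cooling and networking, losing the cooling in one zone leaves the other two able to carry on.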

In terms of storage for items such as patient records, you have two main options within a single region: Locally Redundant Storage (LRS) and Zone-Redundant Storage (ZRS), as depicted in the illustration below.

Locally Redundant Storage (LRS): your data is copied synchronously three times within a single physical location in the primary region. LRS is the least expensive option, as it only requires a single physical location. However, it offers the lowest durability of the options described here and is not suitable for applications requiring high availability or durability, such as patient records.

Zone-Redundant Storage (ZRS): your data is copied synchronously across three Azure availability zones in the primary region, as in the illustrated example above.

Availability zones illustration
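The difference between LRS and ZRS can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `PLACEMENT` table and the zone labels are hypothetical, and LRS is modelled as three copies sitting in one zone to stand in for "a single physical location".

```python
# Hypothetical placement of the three synchronous copies for each option.
PLACEMENT = {
    "LRS": ["zone-1", "zone-1", "zone-1"],  # all copies in one physical location
    "ZRS": ["zone-1", "zone-2", "zone-3"],  # one copy per availability zone
}


def surviving_copies(option: str, failed_zone: str) -> int:
    """Count the copies still reachable after a single zone is lost."""
    return sum(1 for zone in PLACEMENT[option] if zone != failed_zone)


for option in PLACEMENT:
    print(option, surviving_copies(option, "zone-1"))
```

With `zone-1` lost, the LRS copies are all unreachable while ZRS still has two copies available, which is why ZRS suits data that must stay available.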

Yet what if a disaster (such as the heatwave in the real-life example above) affects the entire region? This is where a region pair provides even greater redundancy. It is recommended by Microsoft and is an option that many organisations adopt. Organisations can combine LRS or ZRS within each region with replication to the paired region. So if the hospital has adopted cloud technology in the first instance, it may well also have had a region pair.

A Region Pair is where an Azure region is paired with another region within the same geography, at least 300 miles away where possible. If one region of the pair is afflicted by a natural disaster, services can fail over to the other region in the pair, as illustrated in the diagram below, although not every service does so automatically and some must be configured for it. The paired regions are directly connected yet far enough apart to be isolated from regional disasters such as heatwaves, floods, storms and terrorist attacks. This ensures reliable services and data redundancy.


Example of Azure Region Pairs. Source: Microsoft

Planned Azure updates are rolled out to paired regions one region at a time to minimize downtime and mitigate the risk of application outage. With the exception of “Brazil South”, data continues to reside within the same geographic boundary as its pair, in line with tax and law enforcement jurisdiction requirements.
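As a rough illustration of region pairs and the Brazil South exception, the sketch below uses a small lookup table. The pairs shown are only an illustrative subset and pairings can change, so check Microsoft's current list before relying on them. The function names and the geography labels are my own and are not part of any Azure SDK.

```python
# Illustrative subset of Azure region pairs (verify against Microsoft's list).
REGION_PAIRS = {
    "UK South": "UK West",
    "UK West": "UK South",
    "North Europe": "West Europe",
    "West Europe": "North Europe",
    "Brazil South": "South Central US",
}

# Simplified geography labels, used only for this sketch.
GEOGRAPHY = {
    "UK South": "United Kingdom",
    "UK West": "United Kingdom",
    "North Europe": "Europe",
    "West Europe": "Europe",
    "Brazil South": "Brazil",
    "South Central US": "United States",
}


def failover_target(region: str) -> str:
    """Return the paired region that would take over for `region`."""
    return REGION_PAIRS[region]


def stays_in_geography(region: str) -> bool:
    """True if the paired region sits in the same geography as `region`."""
    return GEOGRAPHY[region] == GEOGRAPHY[failover_target(region)]


for region in ("UK South", "Brazil South"):
    print(region, "->", failover_target(region), stays_in_geography(region))
```

For a London hospital, the relevant pair would be UK South and UK West, so the data would stay within the United Kingdom.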

In terms of region pair storage, Azure offers two solutions: geo-redundant storage (GRS) and geo-zone-redundant storage (GZRS).

Geo-redundant storage (GRS): your data is copied synchronously three times within a single physical location in the primary region, using LRS. The data is then copied asynchronously to a single physical location in the secondary region. Within that secondary region, your data is again copied synchronously three times using LRS, just as in the primary region.

Geo-zone-redundant storage (GZRS): your data is copied synchronously across three Azure availability zones in the primary region, using ZRS. Your data is then copied asynchronously to a single physical location in the secondary region. Within that secondary region, your data is copied synchronously three times using LRS. The GZRS solution provides the highest availability and durability, as illustrated below:

Redundancy Computing images source: Microsoft & Microsoft Tech Communities.
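The practical difference between synchronous and asynchronous copying is easiest to see in a toy model. The Python sketch below is a simplification of the GRS behaviour described above, not Azure's implementation: the `GeoRedundantStore` class and its methods are hypothetical, and the "patient record" strings are made up for the example.

```python
from collections import deque


class GeoRedundantStore:
    """Toy model of GRS: synchronous copies in the primary region,
    asynchronous shipping of the same writes to the secondary region."""

    def __init__(self) -> None:
        self.primary: list[list[str]] = [[], [], []]    # three LRS copies
        self.secondary: list[list[str]] = [[], [], []]  # three LRS copies
        self.pending: deque[str] = deque()  # writes not yet sent to the secondary

    def write(self, record: str) -> None:
        # Synchronous step: the write lands on all three primary copies at once.
        for copy in self.primary:
            copy.append(record)
        # Asynchronous step: the secondary is updated later.
        self.pending.append(record)

    def replicate(self) -> None:
        """Ship every pending write to all three secondary copies."""
        while self.pending:
            record = self.pending.popleft()
            for copy in self.secondary:
                copy.append(record)

    def fail_over(self) -> list[str]:
        """Primary region lost: return what the secondary actually holds."""
        return list(self.secondary[0])


store = GeoRedundantStore()
store.write("patient record A")
store.replicate()
store.write("patient record B")  # primary fails before this one is shipped
print(store.fail_over())
```

In this sketch, record B exists only in the primary when the failure hits, so the secondary comes up without it. That lag is the trade-off of asynchronous replication: the second region protects against a regional disaster, but the most recent writes may not have arrived yet.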

When selecting the appropriate cloud service for their needs, organisations also need to choose between the public, private or hybrid cloud. The private cloud is where organisations purchase and maintain their own hardware, retaining complete control over their resources and security. If the hospital has adopted cloud services, this could be the model chosen, to avoid the governance issues associated with the public cloud.

However, the drawbacks of the private cloud include high capital costs and TCO (Total Cost of Ownership), including IT overheads and wastage. This is because you are paying for hardware that becomes outdated, and that can run out of compute capacity or memory as demand grows. Nor can you scale out (adding resources as demand for capacity increases) and scale in (offloading resources and their associated costs as demand falls) as quickly and efficiently as you can with the public cloud.

A solution may be to move towards a hybrid cloud to obtain the best of both worlds, including manageable security and greater resilience to outages (as compute maintenance is outsourced and managed using the latest security solutions, such as Distributed Denial-of-Service (DDoS) protection). However, this brings with it its own compatibility issues and complexities.
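Scaling out and in can be pictured as a simple rule applied to current demand. The Python sketch below is a minimal illustration only: the function name, the CPU thresholds and the instance limits are assumptions chosen for the example, not Azure defaults.

```python
def desired_instances(
    current: int,
    cpu_percent: float,
    scale_out_at: float = 70.0,  # assumed threshold for adding capacity
    scale_in_at: float = 30.0,   # assumed threshold for removing capacity
    minimum: int = 1,
    maximum: int = 10,
) -> int:
    """Return how many instances to run given the average CPU load."""
    if cpu_percent > scale_out_at:
        return min(current + 1, maximum)  # scale out as demand rises
    if cpu_percent < scale_in_at:
        return max(current - 1, minimum)  # scale in and stop paying for idle capacity
    return current


print(desired_instances(current=3, cpu_percent=85.0))
print(desired_instances(current=3, cpu_percent=10.0))
```

On a public cloud the extra instance is rented for as long as it is needed, whereas on a private cloud it would have to be bought up front and would then sit idle once demand falls.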


In conclusion, the solution would be for entities such as Guy’s and St Thomas’ Hospital to move towards adopting hybrid cloud technologies such as Azure, if they have not done so already. Among the Azure data and storage options, GZRS, with its high levels of availability and durability, would be the ideal solution for clients such as Guy’s and St Thomas’ Hospital, for whom reliance on such systems is often a matter of life and death. However, such solutions are often the most expensive, and in an economy of public sector cuts, primary services are seeing their budgets slashed, often with catastrophic consequences. Plus, with heatwaves becoming more global rather than regional, such solutions are also being pushed to their limits.

Perhaps, then, future solutions will have to consider not only how we distribute and back up power, compute and networking services, but also alternative forms of (renewable) power itself, using sustainable technologies which work with the earth’s changing temperatures rather than succumbing to them.
