DevGraph

Posted on Aug 5, 2021

Move Your Database to the Cloud With Zero Downtime

#database #cloud #webdev #programming

By Ravi Duddukuru

A downtime-free migration with ScaleArc: decouple your database from the application layer and move it safely to the cloud with our load balancing tool.

The average cost of database downtime has risen from $300,000 to almost $1,000,000 worldwide between 2014 and 2020, say Gartner and Statista, respectively.

Downtimes are one of the main risks that accompany database migrations to the cloud. But with the right tools in your hands, you can eliminate them and arrive safely at the advantages of cloud computing.

Cloud Databases Are Key to Your Business Success

By the end of 2021, 59% of all companies will be using the cloud for their main workloads, according to IDG. This shift is mostly driven by the three main factors:

• The simplicity of the maintenance: you do not need to deal with purchasing and upgrading the hardware.

• Better scalability: if you need more computing power, you can get it in minutes.

• Cost savings: as a result of the previous two, you have fewer personnel costs and far fewer hardware expenditures.

On your way to a cloud database, you should be aware of the obstacles.

Database Downtimes

Database downtimes can be unnerving even for your permanent customers and a decisive disappointment for the new ones. When your application is not showing up-to-date information, users cannot work with it properly. Downtimes may have various reasons, for instance, application errors.

Application Code Change

Applications remain bound to their databases through the SQL queries that run in the background. Once you’ve renamed an entity in the application source code, you have to reflect them in the database or the queries that the application sends over to the database.

Such changes may be quickly overlooked, and even if they are not, their implementation takes time.

Ideally, no adjustment in one layer should ever influence the other, especially during the cloud transition, to prevent database downtimes.

The Solution

Thus, the first step on the way to a seamless database migration to the cloud is decoupling the application from the database. It can be done by creating an additional tier between the database and the application, making them independent from each other and more robust against failures.

But before we proceed with this, let’s take some time to go through the database migration issue in detail and outline possible solutions for the pitfalls that may be waiting for you during the transition period. As a bonus, we’ll show you a shortcut.

Cloud Migration: Database-Related Obstacles

If you decide to move your application and/or the underlying database to the cloud, the whole thing may get tricky. First of all, by moving, we obviously mean creating a copy of the original database and moving the data into the new one.

Basically, there are three widely used scenarios for doing this. Their main differences are the duration of a possible downtime and the number of business risks you have to accept in exchange for reducing the database downtime.

Strategies for Migrating a Database to the Cloud

Offline Copy Migration

As we mentioned before, a database receives many queries that may be updating and reading the same records with a minimum time lag. During the migration, the challenge is to create a complete copy of your data catching the most recent updates.

Offline copy migration solves this issue by simply shutting down the database — and, consequently, the application — for the time a copy has to be generated by means of a simple export and import of the data. Within this downtime, the original database will be disconnected from the application and the new copy connected to it.

In reality, this may mean a few hours of the database and application unavailability. It may work if you migrate an internal application used only by your employees and you do it on a weekend or during the night hours. For commercial applications, however, this may lead to revenue reduction and a poor customer experience.

Master/Read Replica Switch Migration

In this case, you create a read copy of your original database. The master copy gets updated and sends updates to the replica. Once you are ready to turn on the switch, you swap them, granting write access to the former replica.

Indeed, in this scenario, you still have some short downtime. You also run a risk that the original database failed to handle a few last-minute queries and they won’t show up in the new database. This may be acceptable for small businesses, internally-used applications, or applications that demonstrate a clear top and bottom in the amount of traffic during the day.

Master/Master, or Dual-Write, or Online Migration

Contrary to the previous method, you can write data to both databases and “turn off” one of them at any time. The databases need to be synched to keep identical records.

Here, a lot can get lost in transition. You never know which one is the most up-to-date. Thus, the synchronization process requires continuous observation. Together with the complexity of the maintenance, it can lead you to postpone the final switch.

The downtime is close to zero but the risks associated with this strategy are the highest among the three scenarios.

Still, the temptation for a quick switch is high, and there is a solution for this: incremental batch reading. This method adds a new column to each table in the database that identifies which records were already synced and which not. This allows you to at least spot the gaps, eliminating them later with a manual one-time synchronization.

Strategy-Independent Obstacles and Solutions

Apart from the difficult strategic choice, there are other factors that can influence your application downtime and, therefore, your approach to database migration. We gathered the three most common challenges and a few life hacks for dealing with them.

Database Schema Changes

Preparing a database migration is often a good point to do some inventory. For instance, to re-think the way your data is organized in the database and to re-design the latter accordingly. That manifests in the database schema changes.

As a result, your source and target databases have different schemes. The initial setup of a replica, as well as a routine synchronization between the two, will require a schema translation tool. Alternatively, you can use an eject-transform-load (ETL) tool, provided that you selected the first or second scenario.

Data Discrepancies

When you move the entire application to the cloud, you may decide to do some re-factoring that leads to changes in how your application works. Consequently, the application data will change. For instance, instead of using UTC in a time column, you start using your local time format. Again, you’ll need a synchronization and ETL solution to transfer the records correctly.

Other Database Functionality Differs

Indeed, cloud migration means a functional upgrade. For instance, your old database may have lacked one-to-many relationships between tables, or did not allow triggers or materialized views, or was not so good in partitioning.

The new cloud one may have this. The question is how to match the data, especially, if you choose the master/master scenario where you have to perform two-way updates all the time.

In very rare cases, it can be the other way around, and you perform a functional downgrade. For instance, if due to some cost factors you decided to adopt a cloud database that offers only essentials, then you have to cover discrepancies before the final switch happens. For instance, solve the many-to-many issue by adding a few more tables and establishing connections between them, or by getting rid of the old data instead of partitioning the tables.

A Smooth Migration: Managing Your Replicas Successfully

The use case with the master and a replica of a database is not only limited to database migration. You can use the master (the primary database) and one or more replicas, or secondaries or slaves, for other purposes. For instance, you might want to split the traffic between them and write the data only to the master and read the data only from the slaves to enhance your database performance.

When you manage your master and secondary databases in a clever way, this can help you not only to speed up the routine work of your application but also to secure your cloud migration.

The magic cure is a database load balancer. It can be applied to the databases without a master-slave relationship and to all databases that reside on more than one server.

What Is a Database Load Balancer?

Obviously enough, it balances the load of your database. Load balancers belong to middleware. They are placed between applications and database server farms.

With a load balancer, you have a universal endpoint for your application and enable the resource optimization of your database. It accelerates query throughput and lowers latency.

The load balancer distributes the queries more efficiently using one of the following methods:

• Round robin sends queries to the servers in a linear manner: the first query to the first server, the second to the second, and so on. It ensures primitive queue management of the incoming queries.

• Weight-based balancing means that the balancer keeps in mind the actual capacities of each server specified as a proportion of the network traffic and the amount of its current load. When a new query comes in, it re-directs it to the server that still has some free capacity.

• The least connection method only takes into consideration how many connections each server has already established, and re-directs a newly arrived request to the least burdened one.

When you want to migrate your database into the cloud, a load balancer can be used in the following scenario. But, indeed, not every load balancer is made for this. We explain to you how ScaleArc performs this task.

How ScaleArc Works During a Migration

Usually, your database is tightly coupled to the application. Instead, you can make the coupling looser and more flexible by putting ScaleArc between the database and the application, and then proceed with the migration.

Inside ScaleArc, you need to create a cluster or use an existing one for the source and target databases. Then, you can add a read-and-write target database to that cluster. It will be put into standby mode and won’t receive any traffic until the cutover.

ScaleArc will move all the data to the target database and keep synching the new data. Once you are ready to make the target your primary database, switch the connection between ScaleArc and the new database. The old one will still be there but not receiving any traffic. Your application will read and write data from the former replica.

Since ScaleArc remains connected to your application during the whole migration period, this method has no downtime.

A Short Step-by-Step Tutorial

The ScaleArc console has an intuitive cockpit where you can configure your database clusters before migration. We reduced the number of configuration parameters to a sufficient minimum. You need just a couple of minutes to fill in the form and start the migration process.

In the ScaleArc console, do the following to begin your migration:

• Click on “Set Up ScaleArc Cluster”
• Specify the database type
• Enter user credentials to allow ScaleArc to log in into the database
• Specify source and target database server names
• Wait till ScaleArc validates these host names and create a new cluster
• Click “Start Migration”
• Specify roles for each database server
• Enter an email address to receive notifications

While your migration is happening, you will get notified by email about important status changes and the completion of the migration. You can also check the progress manually in the console. You can monitor how many tables have already been written to the new database.

Last but not least, you can schedule the cutover time.

ScaleArc: More Than Just Balancing

With ScaleArc, you have a full grip on your migration risks. A seamless switchover is made possible thanks to the integration with Microsoft SQL Server™ AlwaysOn technology and MySQL™ automatic failover.

As we mentioned before, ScaleArc’s main purpose is routine database load balancing. We would like to highlight a few of its advantages to show you why this can be of great help to you even after you’ve finished your migration.

Load balancing works in the three main directions:

• traffic re-routing
• transaction queueing
• queries throttling

It prevents downtime maintenance, doubles your website performance, and boosts your revenue. ScaleArc stands out among other tools since it offers a few particular features on top of the basic functionality.

Automatic Failover

If you have data-heavy applications, such as webshops, or booking or fintech platforms, you may keep one or a few secondary standby databases that can support the main one when it is down. Outside of the database migration, a switch from the primary to the secondary database is called failover.

ScaleArc offers an automated failover process. It means that not only do you get notified when a database is interrupted, but ScaleArc automatically re-directs the traffic to the secondary databases. The users won’t even notice the difference, and can keep working with your application normally.

Surge Queuing

During a failover, the primary and secondary databases are not synchronized. This leads to a replication lag. Once the failover ends, the primary database gets a lot of new requests, resulting in service outages.

It is important to queue the incoming requests to prevent new troubles. The issue is often addressed by surge queuing, which is a ScaleArc feature.

Splitting Read and Write

Write requests are very important since they can change the returned results of a read request that follows afterward. That’s why it is often better to separate these two types of requests completely and only perform write operations on the primary database, while using the secondary one for the read requests.

In this case, the information gets updated quicker. This feature boosts the overall performance of your application.

Connection Pooling and Multiplexing

ScaleArc will collect inactive connections and store them temporarily in a pool instead of closing them immediately. This works faster than establishing a completely new connection every time.

Multiplexing goes beyond that and allows the re-use of a connection for multiple clients.

Query caching

ScaleArc caches — saves — responses to the most common queries. This allows for quicker delivery of user content since the query does not need to be processed anew when it arrives again. This reduces waiting time for your applications, making them more user-friendly.

Analytics and Logging

Indeed, ScaleArc is a highly transparent tool that allows you to collect data about your database performance and all failovers and downtimes.

At ScaleArc, we have thought about your need for a real-time view of all processes. We provide a live monitor for your database load. You can break down query traffic, handpick problematic queries and candidates for caching, and perform bucketing on your databases.

You have all your log data in one place. You do not have to generate logs for each database and then bring them together in a third-party analytics tool.

Moreover, you can retrieve historical data at any time and create forecasts based on it.

Integrations with RESTful API

If you still want to use your tool for analyzing logs, ScaleArc supports REST API integrations with monitoring and management tools. You can customize your log analytics to catch weak points.

Browse to www.scalearc.com to learn more about ScaleArc Database Migrations!

DEV Community