In this post, I will explain how to use AWS Database Migration Service (DMS) to replicate data from AWS to an environment outside AWS.
1. The requirement
In my country, Vietnam, every company must store client information, such as personal data, locally for security reasons. Please check this link for more information. Under the government regulation, it does not matter whether you are a local company or an international one: your clients' personal data must be stored in Vietnam. If your application runs on AWS or another public cloud that currently has no region in Vietnam, how can your company comply with the regulation?
In this blog, I will show an approach that lets you keep using AWS services and still comply with the government regulation when your company invests in the Vietnamese market.
2. Solution
Let's say you are running your application on AWS infrastructure. Your client data is stored in an RDS instance. Under the government regulation, you have to keep a copy of the client data in a local environment as well as in AWS. You rent a VM from a local provider to store the data, and you replicate the data from AWS to that VM.
The diagram below illustrates the solution:
The components are:
- AWS RDS for MySQL: stores the production database, including the clients' personal data.
- AWS Database Migration Service (DMS): a managed service that helps migrate commercial and open-source databases to AWS quickly and securely. In this case, I use DMS to continuously replicate data from a source (the RDS instance) to a destination (a VM at a local cloud provider).
- Virtual Private Gateway and firewall: establish an IPsec site-to-site VPN between AWS and the local cloud environment, which provides a secure communication channel.
- Virtual machine: runs the MySQL engine and stores the data replicated from AWS RDS.
Let's go into the details:
a) AWS RDS instance: I assume there is an RDS instance running the MySQL engine that stores the private client data.
b) Virtual machine at the local provider:
A virtual machine running the MySQL engine.
c) DMS:
For step-by-step DMS configuration and best practices, you can refer to these links: Get started and best practices.
Step 1: Creating a replication instance, which replicates the data from RDS to the VM according to the migration task you define.
Note: In a production environment, a Multi-AZ replication instance is highly recommended for high availability. To achieve high replication performance, you should use a compute-optimized or memory-optimized instance type instead of a burstable one. I use a burstable instance here for demonstration only.
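As a rough sketch of Step 1 in code, the snippet below builds the arguments for the DMS CreateReplicationInstance call through boto3. The instance classes (`dms.c5.large`, `dms.t3.medium`), the storage size, and the function names are my own illustrative assumptions, not values taken from the demo above.

```python
def replication_instance_params(identifier, production=True):
    """Build keyword arguments for the DMS create_replication_instance call."""
    return {
        "ReplicationInstanceIdentifier": identifier,
        # Assumed classes: compute optimized for production, burstable for a demo.
        "ReplicationInstanceClass": "dms.c5.large" if production else "dms.t3.medium",
        "AllocatedStorage": 50,  # GiB, illustrative value
        "MultiAZ": production,  # Multi-AZ for high availability in production
        "PubliclyAccessible": False,  # traffic is expected to go through the VPN
    }


def create_replication_instance(params):
    """Send the request to AWS (requires boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    return boto3.client("dms").create_replication_instance(**params)
```

Calling `replication_instance_params("demo", production=False)` gives the single-AZ, burstable settings that suit a demonstration.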
Step 2: Creating endpoints for the source (RDS instance) and the destination (virtual machine). You should run the connection test for each endpoint to make sure the replication instance can reach both the RDS instance and the VM.
Step 3: Creating a migration task:
There are a number of parameters you should consider. The right values depend on your requirements and on the state of your source database. Refer to this for more information.
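To make the main task parameters concrete, here is a minimal sketch that builds the arguments for the DMS create_replication_task call. I assume the `full-load-and-cdc` migration type (full load followed by ongoing replication) and a single selection rule. The function names and the rule name are hypothetical.

```python
import json


def build_table_mappings(schema="%", table="%"):
    """Selection rule that includes every table matching schema/table ('%' is a wildcard)."""
    return {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-selected-tables",
                "object-locator": {"schema-name": schema, "table-name": table},
                "rule-action": "include",
            }
        ]
    }


def replication_task_params(identifier, source_arn, target_arn, instance_arn, schema="%"):
    """Build keyword arguments for the DMS create_replication_task call."""
    return {
        "ReplicationTaskIdentifier": identifier,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # Full load first, then continuous replication of changes (CDC).
        "MigrationType": "full-load-and-cdc",
        # The API expects the table mappings as a JSON string.
        "TableMappings": json.dumps(build_table_mappings(schema)),
    }
```

The resulting dictionary can be passed as `boto3.client("dms").create_replication_task(**params)` once the endpoints and the replication instance exist.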
In the migration task, you can enable validation if you want DMS to compare the data between source and destination. Please keep in mind that the validation process makes the replication task take longer. Alternatively, you can compare the source and destination data manually, for example by the number of databases, the number of tables, and the size of each database.
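For the manual comparison, one simple option is to collect a row count per table on both sides (for example with `SELECT COUNT(*)` on each table) and compare the two results. The helper below is a hypothetical sketch of that comparison only. Collecting the counts is left out, and the `VALIDATION_SETTINGS` fragment shows how validation is switched on in the task settings JSON.

```python
# Task settings fragment that turns on DMS data validation.
VALIDATION_SETTINGS = {"ValidationSettings": {"EnableValidation": True}}


def diff_row_counts(source_counts, target_counts):
    """Return {table: (source_rows, target_rows)} for tables that differ.

    A table missing on one side is reported with None for that side.
    """
    mismatches = {}
    for table in sorted(set(source_counts) | set(target_counts)):
        src = source_counts.get(table)
        dst = target_counts.get(table)
        if src != dst:
            mismatches[table] = (src, dst)
    return mismatches
```

An empty result means the counts match. Keep in mind that while CDC is running, the counts can differ briefly because the destination lags behind the source.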
A migration task in the running status is replicating data according to the pre-defined parameters. For the other statuses, refer to this link for more information.
To monitor the migration task, you can check CloudWatch Logs and the Table statistics tab, which shows up-to-date information about the state of your tables during replication. The possible table states are:
- Table does not exist: AWS DMS can't find the table on the source endpoint.
- Before load: The full load process is enabled, but it hasn't started yet.
- Full load: The full load process is in progress.
- Table completed: Full load is completed.
- Table cancelled: Loading of the table is canceled.
- Table error: An error occurred when loading the table.
The following shows that the full load is completed, which means all tables were replicated to the destination (VM).
Other parameters in table statistics such as Inserts, Deletes, Updates, and DDLs show the number of these statements that were replicated during the change data capture (CDC) phase.
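The same table statistics can also be read programmatically. The sketch below assumes the response shape of the boto3 `describe_table_statistics` call (a `TableStatistics` list whose items carry `TableState`, `Inserts`, `Deletes`, `Updates`, and `Ddls`). The function names are my own.

```python
from collections import Counter


def summarize_table_statistics(table_statistics):
    """Summarize table states and CDC counters from a list of table statistics."""
    states = Counter(t["TableState"] for t in table_statistics)
    cdc = {
        key: sum(t.get(key, 0) for t in table_statistics)
        for key in ("Inserts", "Deletes", "Updates", "Ddls")
    }
    full_load_done = bool(table_statistics) and all(
        t["TableState"] == "Table completed" for t in table_statistics
    )
    return {"states": dict(states), "cdc": cdc, "full_load_done": full_load_done}


def fetch_table_statistics(task_arn):
    """Fetch all table statistics for a task (requires boto3 and credentials)."""
    import boto3  # imported lazily so the summary above works without boto3

    dms = boto3.client("dms")
    stats, marker = [], None
    while True:
        kwargs = {"ReplicationTaskArn": task_arn}
        if marker:
            kwargs["Marker"] = marker
        page = dms.describe_table_statistics(**kwargs)
        stats.extend(page["TableStatistics"])
        marker = page.get("Marker")
        if not marker:
            return stats
```

In this sketch, `full_load_done` is true only when every table is in the Table completed state, and the `cdc` totals correspond to the Inserts, Deletes, Updates, and DDLs columns described above.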
3. Conclusion
In this blog, I demonstrated how to use AWS Database Migration Service (DMS) to replicate data from a source (RDS instance) to a destination (VM at a local provider) to comply with the government regulation. Should you have any questions, feel free to leave a comment. Thank you for reading!