DEV Community

Migrating MSSQL to AWS using Data Migration Service with CDC Migration

Overview and Objectives

The objective of this guide is to outline a comprehensive approach to migrating a Microsoft SQL Server database that is hosted on-premises to Amazon Web Services (AWS) using the AWS Database Migration Service (DMS). This document provides a step-by-step guide covering all important aspects of the migration process, including the setup of the DMS infrastructure, creation of endpoints, replication tasks, monitoring, and configuration of the source MSSQL database, and the destination database.
In line with AWS recommended practices, the DMS instance will be deployed within a private subnet. The connection between the on-premises source and AWS will be established through a VPN connection, which will not be addressed in this article. As an alternative, in instances where VPN is not set up, the DMS can be deployed within a public subnet with restricted access through a security group attached to it. Access will be granted only from a designated source IP address.

The business benefits of this migration include:

Zero Downtime: With this migration process, you can move any MSSQL servers from on-premises or from other hosting providers, such as Azure or GCP, to AWS RDS with no interruption to your operations.
Reproducible Setup: By using the same setup for all customers, you can reduce the variations in deployment and improve knowledge management, reducing the risk of errors.

High-Level Solution Design

The following sections will outline the components included in the offering, provide a high-level overview of the architecture, and present a detailed plan for constructing the solution.

Migration Diagram

Networking Setup and DMS Replication instance setup

VPC Networking Setup
The following prerequisites for setting up a DMS migration must include all essential network components to ensure the product's viability as a turn-key solution for the migration process. These components are:

  • Virtual Private Cloud (VPC) to host the destination MSSQL RDS instance.
  • Three-tiered Subnets - frontend (public), Database and DMS replication instance (private network with internet access through a NAT gateway), Data layer (internal network without internet access for managed services such as RDS, VPC endpoints, etc.)
  • Setting up VPC Endpoints for RDS, S3, and Cloudwatch,with the capability to add or remove endpoints per customer deployment, is a recommended best practice, especially when using the DMS instance version 3.4.7 This allows for flexibility in case additional services are required in the future.

DMS and RDS Replication Instance Setup

Before starting the migration, it's important to set up the MSSQL RDS instances properly, including:

  • MSSQL RDS Instance
  • Security Groups (one for the Database Listener for the RDS and another for the DMS security group)

DMS Replication Instance using console or automation option.

  • AWS RDS target instance nodes - with pre-set recommendations for the appropriate instance type based on the workload.

When selecting the instance type, several factors should be taken into consideration, such as:

  • The environment's lifecycle (development, QA, production)
  • The workloads running on the instance, including memory, CPU, and network requirements
  • Storage allocation to ensure the ability to migrate and host the production databases.

DMS Instance

Pre-Migration configuration steps on the MSSQL source instance

Pre-requirements:
To enable Transactional Replication and CDC for tables without primary keys via the DMS source endpoint, sysadmin permission must be granted to the dms_user. This can be accomplished by adding the "setUpMsCdcForTables=true" flag as an extra connection attribute to the source connection endpoint in DMS version 3.4.7, which is a new and available feature that has been released for version 3.4.7.

If you would like to perform the setup manually the steps below:

  • The dms_user will be created for use in the DMS source endpoint and will be granted the necessary permissions.
  • Enable Transactional Replication on the instance for tables with primary keys
  • Enable CDC on the source database use:
use [source_db_name] EXEC sys.sp_cdc_enable_db;
Enter fullscreen mode Exit fullscreen mode
  • Next, set-up MS-CDC for each of the source tables. For each table with unique keys but no primary key, run the following query to set up MS-CDC.
exec sys.sp_cdc_enable_tabl
@source_schema = N'schema_name',
@source_name = N'table_name',
@index_name = N'unique_index_name',
@role_name = NULL,
@supports_net_changes = 1
GO
Enter fullscreen mode Exit fullscreen mode
  • For each table with no primary key or no unique keys, run the following query to set up MS-CDC.
exec sys.sp_cdc_enable_tabl
@source_schema = N'schema_name',
@source_name = N'table_name',
@role_name = NULL
GO
Enter fullscreen mode Exit fullscreen mode

Pre-migration process steps
To minimize migration downtime, the following task needs to be performed on the source database:

  • A full backup must be taken of the source database, which must be using either the Bulk-Logged or Full recovery model.
BACKUP DATABASE <database_name> TO DISK = 'D:\Backups\database_name_datetime.bak
WITH NOFORMAT, NOINIT, NAME = 'database_name_datetime',
SKIP, NOREWIND, NOUNLOAD, STATS = 10 GO
Enter fullscreen mode Exit fullscreen mode
  • For the ongoing replication we need to capture the LSN number that will be used for the DMS replication task, take a record of the LSN
select top 1 [Current LSN
from sys.fn_dump_dblog (NULL, NULL,NULL, 1,'C:\Backups\BackupDB\databasename_datetime.bak',NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL)
where operation='LOP_BEGIN_XACT'
and [Begin Time]>= cast('yyyy-mm-ddTMM:SS:ms' as datetime); –-< Datetime of the backup completed
Enter fullscreen mode Exit fullscreen mode

Here is the example of the output of the query above:

LSN

  • Upload the backup files of the database to an S3 bucket for restoring the RDS instance at the destination. Before uploading, it's necessary to have access to the S3 via the AWS Console or CLI. This requires that the AWS CLI and the IAM user's credentials have access to the S3 bucket.
 aws s3 cp database_name_datetime.bak s3://<s3-bucket-name>/database_name_datetime.bak
Enter fullscreen mode Exit fullscreen mode

Restore backup from S3 bucket on RDS instance

Create IAM Role and manage or inline policy:
Trusted entities:

{
    "Version": "2012-10-17",

    "Statement": [

        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Service": "rds.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
Enter fullscreen mode Exit fullscreen mode

Custom managed or inline policy:

    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kms:DescribeKey",
                "kms:GenerateDataKey",
                "kms:Encrypt",
                "kms:Decrypt"
            ],
            "Resource": "arn:aws:kms:us-east-1:<aws_account_id>:key/<kms_key_id>"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::s3-example-dmsupload",
                "arn:aws:s3:::s3-example-mssql-audit-logs"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:ListMultipartUploadParts",
                "s3:AbortMultipartUpload"
            ],
            "Resource": [
                "arn:aws:s3:::s3-example-dmsupload/*",
                "arn:aws:s3:::s3-example-mssql-audit-logs/*"
            ]
        }
    ]
}
Enter fullscreen mode Exit fullscreen mode
  • To restore the database from an S3 bucket on RDS, it is necessary to have the SQLSERVER_BACKUP_RESTORE option pre-configured in the option group and an IAM role was created, as shown in the screenshots below:

To access the MSSQL option group created during RDS setup in the AWS console, go to RDS and click on "Option Groups" on the left-hand side. Then, add the value and IAM role, and apply the changes to the RDS instance.

RDS Option Groups
Example of the ARN:arn:aws:iam::<aws_account_id>: role/<role_name>
ARN
The SQLSERVER_BACKUP_RESTORE option should have the correct IAM role and policy that grants access to the S3 bucket containing the database backup files.

  • Restore the database on the RDS instance
exec  msdb.dbo.rds_restore_database
    @restore_db_name='<DatabaseName>', 
    @s3_arn_to_restore_from='arn:aws:s3:::<s3_bucket_name>/databasename_datetime.bak',
    @with_norecovery=0,
    @type='FULL';
Enter fullscreen mode Exit fullscreen mode
  • Check the status of the restore process exec msdb.dbo.rds_task_status;

  • Run the following command on the source database to enable CDC.
    use [DatabaseName]
    EXEC sys.sp_cdc_enable_db

  • Disable Triggers on the destination RDS database schema
    DISABLE TRIGGER ALL ON [destanation_db_name]
    EXEC sp_MSforeachtable "ALTER TABLE ? DISABLE TRIGGER all"

  • Disable constraints on the target database
    EXEC sp_MSforeachtable "ALTER TABLE ? NOCHECK CONSTRAINT all"

    At last, we are prepared to set up the replication task. Ensure that the necessary DMS instance has been deployed and that all the security access required between the source and destination has been configured

Create Target and Source Endpoints

Create Source and Target endpoints used by the migration task per database in the AWS account.
This step is done with the following pre-requisite:

  • The DMS instance has been deployed.
  • The network connectivity/security for the DMS and RDS instance is configured between the source and destination.

The task can be created as a part of the DMS creation via console or automation using Terraform.

Example of the source endpoint

Source Endpoint
Example of the target endpoint

Target Endpoint

Create DMS migration type ongoing replication

The task can be created manually or via automation

Ongoing Replication Task
When you create a task please use the following options:

  • Select "Enable custom CDC start mode"
  • Specify a log sequence number ( The LSN was recorded during the performance of the source backup.)
  • Target table preparation mode "Do nothing"
  • Include LOB columns in replication "Full LOB mode"
  • LOB chunk size (kb) = 64 This is for the LOB
  • Task logs = True

Task Settings LSN

Full LOB

Task Setting Lob value

Task Settings Logs

Task Advance Settings

Task Setting Mapping

Once the task is created, it's time to kick it off! The DMS task will automatically enable transactional replication and change data capture (CDC) on the source database. To keep an eye on its progress, take a look at the stats tab within the task and monitor the CloudWatch logs for any error messages.
Tip: In order to minimize IO and high CPU usage on the source instance and reduce the delta, it is advisable to use the latest backup from the source for CDC replication, which can help to lower the amount of transaction log reads required from the source database.
Tip: For managing large tables, it is advisable to create individual DMS tasks for each table.
Tip: To monitor DMS tasks, you can use CloudWatch to check the DMS instance and task statistics.

References

Top comments (0)