SQL server replication best practices with AWS DMS
How to set up Amazon RDS SQL server ongoing replication
For several years, Capital One has been going through a cloud transformation, a journey which culminated in our data center exit last year. In 2018 my team at Capital One embarked on our own cloud journey, breaking down our application into multiple services built on cloud native solutions. At the time, our product journey had reached a point where it made sense to migrate our database from running on EC2 instances to Amazon RDS.
For high availability and disaster recovery we wanted a read replica and cross region replication, but Amazon RDS does not support same region read replicas nor cross region replication in MS SQL Server 2012, which we were using at the time. The lack of an out-of-the box replication solution increased design complexity. Brainstorming for a solution we landed on AWS Database Migration Service (Amazon DMS).
Why tell this story now? Not everyone is in the same place with their cloud journey. Many companies are still migrating away from older tech and legacy platforms, including this server type. We solved this problem three years ago but thought it might be helpful to share it for companies at an earlier stage in the migration curve. In this article I will be discussing some best practices for Amazon Database Migration Service for SQL server cross region replication that my team and I developed, and which may be useful for your own organization.
Basic concepts for getting started with Amazon Database Migration Service
AWS Database Migration Service is a web service designed to migrate data from one data store to another. Amazon DMS can be used to move data between homogenous (for example Oracle to Oracle) and heterogenous (for example Oracle to SQL Server) endpoints. Amazon DMS also works with several data sources and is not restricted to databases. At a fundamental level, DMS is made of the below components:
- Replication Instance
- Endpoints
- Task
Let’s go over these components one by one.
AWS DMS replication instances
Replication instance is a managed Amazon Elastic Compute Cloud on which the replication task runs. Multiple replication tasks can run on a single instance. Instance type is chosen based on the number of tables in the migration and the number of concurrent replication tasks planned to run on a single instance.
AWS DMS endpoints
As illustrated in the diagram above, we have a source and target endpoint. The endpoints establish connections between datastore and replication instances. Connection information in general includes endpoint type, DB Engine, server name, encryption and credentials among others.
Amazon DMS tasks
Replication task setting defines the actual migration process. The migration type options are full load, full load + Change Data Capture, and Change Data Capture only.
Solution for ongoing replication for Amazon RDS SQL servers
For our 2018 use case, we created four Amazon RDS instances running on SQL Server 2012 spread equally across two regions. Amazon DMS replication tasks were set up to replicate from the primary instance in one region to instances in another region to achieve cross region replication/read replication. We started using DMS and all was well.
And everyone lived happily, though maybe not completely honestly, ever after. –Jon Scieszka
The end
Not really. :)
Outside of fairy tales, nothing worth achieving is ever so easy. Once our infrastructure was up and running, we started observing that instances in the same region were 100% replicated (tables and rows). However, on cross region instances, we observed a mismatch in the number of rows. Our Log and Transaction databases were off by ~ 100K rows. As we tried to get past this latency issue we learned a few things. The following are a few best practices we learned during this process which I would like to share.
Best practice #1 - Test where you fly
Build a test environment to perfect your migration strategy. Our test environment was valuable to simulate production-like loads, test server configurations, and migration strategies before deciding on a final approach. High volume, large databases need thorough performance and latency tests before production rollout.
Best practice #2 - Play your server cards right
Choosing the right infrastructure helps avoid performance bottlenecks in migration tasks. A replication server with the right compute and memory options is essential. Database/table size plays a key role in deciding the number of parallel tasks to run on a single replication server. Isolate large database replication tasks to a server running on multi-AZ configuration.
Best practice #3 - Prepare source and target database for migration
Amazon DMS uses Change Data Capture for ongoing replication in Microsoft SQL Server. During a full load migration, tasks perform a full scan on the source database. Reducing the load on the source database can enhance performance. Setting your source database to autoscale with a considerably larger max size limit helps avoid database space issues. Remove processes that compete for write space on the target database. Turn off backups, Change Data Capture, triggers, and foreign key constraints before the full load on the target database. For ongoing replication, they can be turned on later.
Best practice #4 - Monitoring migration tasks
Metrics for host, replication server, and table are available. Table Statistics provides information on the number of rows loaded. Queries to verify rows between source and target in conjunction with Amazon CloudWatch metrics helps perfect a migration strategy.
Best practices #5 - Adjust LOB mode settings
LOB stands for Large Binary Object. Our database had large stores of images. Managing LOB settings helped us enhance performance. Limited LOB mode has a user specified limit. Full LOB mode has no specified limit but can cause considerable performance issues. A suitable solution is to opt for Inline LOB which combines the best of two. An example of that can be seen here.
Conclusion
If you’re migrating Microsoft SQL Server (earlier than 2016) to Amazon RDS, AWS Database Migration Service is a solution worth considering for replication across nodes. I hope these five best practices for using AWS DMS for SQL Server replication help with your server migration journey!