Azure SQL Data Sync can help a business solve many problems with syncing data to Azure. This may seem self-explanatory but there are some limitations you should be aware of, so let’s get right to it.
What is Azure SQL Data Sync?
Azure SQL Data Sync is a sync framework that allows businesses or individuals to sync data from one or many disparate SQL data sources – either on-premises or other Azure SQL Databases – to a single Azure SQL Database (the Hub). The data can be synced manually or set up on a schedule to sync automatically anywhere from once every five minutes to once a month. They sync job can also sync data from the Hub to the other on-premises or Azure SQL databases.
One example may be a central office that collects data that needs to be distributed to different regional databases. The central office can sync to the Azure SQL database which in turn can sync the data to the regional databases, keeping them all as close to real-time sync as possible.
The Set Up Process
The sync job is able to copy data to or from one or more “Member Databases” using a hub/spoke topology. An Azure database is the hub, and the one or more member databases are the spokes, so to speak. These member databases can be either other Azure databases or local SQL databases. Here are the basic steps to get successful sync set up.
- Navigate to the Azure portal and select your hub database. Select the “Sync to other databases” option on the left-hand menu.
- This screen will show the Sync Groups that have been set up. These groups define the Hub and Member databases and the configuration settings of the sync. Select the New Sync Group button.
- The Create Data Sync Group screen is straightforward. First, give your group a name. There will be a good amount of metadata stored on the sync, so you must also choose to either store this in a new or existing database. You can also set up automatic syncs here. Currently, 5 minutes is the more frequent allowed. Finally, select a rule for conflict resolution. This will determine which records take precedence, either the Hub or Member when a conflict occurs.
- After the Resource Group has been created you will be able to add Member databases to the sync. For Azure, it is again straightforward, with dropdowns to find the specific database you want to sync with. You will, of course, need to enter credentials for both Hub and Member databases as well. The sync direction option is key. You can select from either direction or bi-directional
- Setting up a local SQL database as a sync member is a bit more involved. First, install the SQL Azure Data Sync Agent on the machine or machines that have access to the reference databases that need to be synced. During the installation, you will give this tool the username and password that it will use to register your reference database with Azure. This account must have “log on as a service” rights in Azure and read/write access to your reference database(s). It will create a few tables of its own in the reference database(s) that are used during the sync process.
- Setup a Sync Agent in Azure.
- Get the key created here from the Azure Sync Agent to paste into the Sync Client Agent so that the client can register your reference database(s) with Azure.
- Register your reference database(s) through the Sync Client Agent interface. Once this completes you will be able to select the on-premises database as a member in the Azure interface.
- The configuration of the sync can now be set up in Azure. Select either the Hub or Member database and then select the tables/fields to be synced.
- The data sync is now set up and ready to run. Select the new Sync Group and you will see the management screen. The logs for successful setup/sync and any errors are listed at the bottom. You can view/edit the properties of the sync group itself. Additionally, by clicking the Databases or Tables icon buttons you can make any additional changes as necessary.
So, overall it is a fairly simple process; however, there are some limitations you need to be aware of.
The Azure SQL Sync team has published several requirements and limitations that you need to be aware of before setting up a sync job. There are many, but I’ve done the work for you and listed the most common issues:
- Each table in the sync needs to have a primary key.
- No table can have an identity column unless it is the primary key.
- Many special characters are not allowed in names of tables, columns, and even data fields. Microsoft lists some specifically, but we have experienced issues with others.
- The maximum number of endpoints across all sync groups is 30.
- The maximum character length of any database, table, schema, or column is 50 characters.
- There can be no more than 500 tables across all databases included in the sync group.
- There can be no more than 1000 columns in a single table in a sync group.
- The max data row size on any table in the sync group is 24Mb.
- If more than one database in the sync group already contains data before the initial sync, then every row will be treated as a conflict during the original sync and will require resolution, which on very large data sets could take from hours to months according to Microsoft’s documentation.
So, suffice it to say that Azure SQL Data Sync is not for really big data. You can, of course, use multiple sync groups to get around some of these limitations.
So, Azure SQL Data Sync can be a great way to sync data – even from multiple data centers – into a single spot in Azure if you have a business need to do so. It is fairly simple to set up; however, there are some limitations to be aware of before going through the process.
CloudFit exists to plug gaps and connect dots for clients, enabling them to move quickly and run effectively in the Cloud. Our mission is to improve the lives of our customers and team members by providing selfless, accountable human and technology services that simplify the complex digital world. Come see what we do and how we do it.
By Jason Graham, Sr. Software Development Engineer, and Sam Ellison, Software Development Engineer, CloudFit Software