SQL Serverlink
Updated 6 days ago
SQL Server is Microsoft's SQL database. Fivetran's integration service replicates data from your SQL Server source database and loads it into your destination at regular intervals.
NOTE: Fivetran supports SQL Server as both a database connector and a destination.
Row-based relational databases, like SQL Server, are optimized for high-volume, high-frequency transactional applications. While very performant as production databases, they are not optimized for analytical querying. Your analytical queries will be very slow if you build your BI stack directly on top of your transactional SQL Server database, and you run the risk of slowing down your application layer.
Column-based databases are optimized for performing analytical queries on large volumes of data at speeds far exceeding those of SQL Server. While these databases are not good for high-frequency transactional applications, they are highly efficient in data storage. They permit more data compression (10x-100x) than row-based databases, which makes them a cost-effective way to store and access data for analytical purposes.
Fivetran offers the following solutions for replicating data from your Oracle source database:
- Fivetran native connector
- High-Volume Agent (HVA) connector
- Local Data Processing - SQL Server
Supported serviceslink
Fivetran supports the following SQL Server database services:
- Generic SQL Server
- Amazon SQL Server RDS
- Azure SQL Database
- Azure SQL Managed Instance
- Google Cloud SQL for SQL Server
Supported configurationslink
Fivetran supports the following SQL Server configurations:
Supportability Category | Supported Values |
---|---|
Database versions | SQL Server 2012 - 2019 |
Maximum table count per connector | 15,000 |
Maximum column count per table | 1,019 |
Connector limit per database | No limit |
Transport Layer Security (TLS) | TLS 1.1 - 1.3 |
IMPORTANT: We do not support single-user mode.
Which SQL Server instance types we can connect to depend on whether you use change tracking or change data capture as your incremental update mechanism. Read our Updating data documentation for more information.
Instance Types | Change Tracking | Change Data Capture |
---|---|---|
Generic SQL Server | ||
     Primary instance | check | check |
     Availability group replica | check | |
Amazon SQL Server RDS | ||
     Primary instance | check | check |
     Availability group replica | check | |
Azure SQL Database | ||
     Primary instance | check | check |
     Availability group replica | check | |
Azure SQL Managed Instance | ||
     Primary instance | check | check |
     Availability group replica | check | |
Google Cloud SQL for SQL Server | ||
     Primary instance | check | check |
     Availability group replica | check |
Known limitationslink
For connectors using change data capture (CDC) as the incremental update mechanism, we don't sync computed columns because CDC doesn't support these columns. Learn more in Microsoft's Change data capture - Known limitations and issues documentation.
Featureslink
Feature Name | Supported | Notes |
---|---|---|
Capture deletes | check | All tables and fields |
Custom data | check | All tables and fields |
Data blocking | check | Column level, table level, and schema level |
Column hashing | check | |
Re-sync | check | Table level |
History | check | Supports history mode. |
API configurable | check | API configuration |
Priority-first sync | ||
Fivetran data models | ||
Private networking | check | AWS PrivateLink: Generic SQL Server on EC2, SQL Server RDS Azure Private Link: Azure SQL Database, Azure SQL Managed Instance |
Setup guidelink
Follow our step-by-step setup guides for specific instructions on how to set up your SQL Server database type:
- Generic SQL Server
- Amazon SQL Server RDS
- Azure SQL Database
- Azure SQL Managed Instance
- Google Cloud SQL for SQL Server Private Preview
Sync overviewlink
Once Fivetran is connected to your database, we pull a full dump of all selected data from your database. We then use one of SQL Server's two built-in tracking mechanisms, change tracking and change data capture, to pull all your new and changed data at regular intervals. If data in your database changes (for example, you add new tables or change a data type), Fivetran automatically detects and persists these changes into your destination.
Syncing empty tables and columnslink
Fivetran can sync empty tables for your SQL Server connector.
We can also sync empty columns in most cases. However, if you don't add rows after you create a new column, we cannot sync that new column. We need at least one row to see a new column because we learn of changes to a table's column cardinality when we see a row with a new or removed column during an update.
For more information, see our Features documentation.
Schema informationlink
Fivetran tries to replicate the exact schema and tables from your database to your destination.
Fivetran-generated columnslink
Fivetran adds the following columns to every table in your destination:
_fivetran_deleted
(BOOLEAN) marks rows that were deleted in the source table_fivetran_synced
(UTC TIMESTAMP) indicates the time when Fivetran last successfully synced the row
We add these columns to give you insight into the state of your data and the progress of your data syncs.
Type transformation and mappinglink
As we extract your data, we match SQL Server data types to data types that Fivetran supports. If we don't support a certain data type, we automatically change that type to the closest supported type or, for some types, don't load that data at all. Our system automatically skips columns with data types that we don't accept or transform.
The following table illustrates how we transform your SQL Server data types into Fivetran supported types:
SQL Server Type | Fivetran Type | Fivetran Supported | Notes |
---|---|---|---|
BIGINT | LONG | True | |
BINARY | BINARY | True | |
BIT | BOOLEAN | True | Not supported as primary keys for Teleport |
CHAR | STRING | True | |
DATE | LOCALDATE | True | |
DATETIME | LOCALDATETIME | True | |
DATETIME2 | LOCALDATETIME | True | |
DATETIMEOFFSET | TIMESTAMP | True | Not supported as primary keys for Teleport |
DECIMAL | BIGDECIMAL | True | |
FLOAT | DOUBLE | True | |
GEOMETRY | JSON | True | Not supported as primary keys for Teleport |
GEOGRAPHY | JSON | True | Not supported as primary keys for Teleport |
HIERARCHYID | STRING | True | |
IMAGE | BINARY | True | |
INTEGER | INTEGER | True | |
MONEY | BIGDECIMAL | True | |
NCHAR | STRING | True | |
NTEXT | STRING | True | |
NUMERIC | BIGDECIMAL | True | |
NVARCHAR | STRING | True | |
REAL | FLOAT | True | |
ROWVERSION | BINARY | True | |
SMALLDATETIME | LOCALDATETIME | True | |
SMALLMONEY | BIGDECIMAL | True | |
SMALLINT | SHORT | True | |
TEXT | STRING | True | |
TIME | STRING | True | |
TIMESTAMP | BINARY | True | |
TINYINT | SHORT | True | |
UNIQUEIDENTIFIER | STRING | True | |
VARCHAR | STRING | True | |
VARBINARY | BINARY | True | |
XML | STRING | True |
We also support syncing user-defined data types. When you create a user-defined type in SQL Server, you are required to choose a base type. If Fivetran supports that base type, we automatically transform your user-defined type to its corresponding Fivetran type. For more information, see Microsoft's user-defined types documentation.
If we are missing an important type that you need, please reach out to support.
In some cases, when loading data into your destination, we may need to convert Fivetran data types into data types that are supported by the destination. This can change either the data type itself or the parameterization on a data type. For more information, see the individual data destination pages under the Destinations documentation section.
Excluding source datalink
If you don’t want to sync all the data from your database, you can exclude schemas, tables, or columns from your syncs on your Fivetran dashboard. To do so, go to your connector details page and uncheck the objects you would like to omit from syncing. For more information, see our Column Blocking documentation.
Alternatively, you can change the permissions of the Fivetran user you created and restrict its access to certain tables or columns.
How to allow only a subset of tableslink
In your primary database, you can grant SELECT permissions to the Fivetran user on all tables in a given schema:
GRANT SELECT on SCHEMA::<schema> to fivetran;
content_copy
or only grant SELECT permissions for a specific table:
GRANT SELECT ON [<schema>].[<table>] TO fivetran;
content_copy
How to allow only a subset of columnslink
You can restrict the column access of your database's Fivetran user in two ways:
Grant SELECT permissions only on certain columns:
GRANT SELECT ON [<schema>].[<table>] ([<column 1>], [<column 2>], ...) TO fivetran;
content_copyDeny SELECT permissions only on certain columns:
GRANT SELECT ON [<schema>].[<table>] TO fivetran; DENY SELECT ON [<schema>].[<table>] ([<column X>], [<column Y>], ...) TO fivetran;
content_copy
Initial Synclink
Once Fivetran is connected to your database or read replica, we first copy all rows from every table in every schema for which we have SELECT permission (except for those you have excluded in your Fivetran dashboard) and add Fivetran-generated columns. We copy rows by performing a SELECT statement on each table. For large tables, we copy a limited number of rows at a time so that we don't have to start the sync over from the beginning if our connection is lost midway. How many rows we copy at a time depends on whether your tables have clustered indices or not. For tables with clustered indices, we copy 500,000 rows at a time. For tables without clustered indices, we copy 5,000,000 rows at a time.
Connecting to a read replicalink
You can only connect Fivetran to a read replica if Change-Data Capture is enabled on the primary database. We need this in order to perform our incremental updates.
Updating datalink
Fivetran performs incremental updates of any new or modified data from your source database. During incremental updates, we request only the data that has changed since our last sync. We use one of the following incremental sync methods for incremental updates: change tracking (CT), change data capture (CDC), and Fivetran Teleport Sync.
NOTE: Change tracking and change data capture are SQL Server's built-in tracking mechanisms.
CT, CDC, and Fivetran Teleport Sync create change records that Fivetran accesses on a per-table basis during incremental updates. To guarantee data integrity, we check for changes on every table with CT, CDC, or Fivetran Teleport Sync enabled during each update, which can add to the sync time. Customers who sync with many thousands of tables can therefore expect longer syncs.
Tables that do not have CT, CDC, or Fivetran Teleport Sync enabled still appear on your Fivetran dashboard, but they are disabled. They also have a message indicating that you need to enable either CT, CDC, or Fivetran Teleport Sync.
If CT and CDC are both enabled on a table, we use CDC as the incremental update mechanism. Use Fivetran Teleport Sync when enabling the tracking mechanisms is not an option.
CT vs CDC vs Fivetran Teleport Synclink
There are several key differences between change tracking (CT), change data capture (CDC), and Fivetran Teleport Sync:
Feature | Change tracking | Change data capture | Fivetran Teleport Sync |
---|---|---|---|
Syncs tables with primary keys | Yes | Yes | Yes* |
Syncs tables without primary keys | No | Yes | No |
Can read from a replica | No | Yes | Yes |
Tracks DML changes | Yes | Yes | Yes |
Provides full record of changed data | No | Yes | Yes |
Tracks previous changes | No | Yes | No |
* IMPORTANT: Fivetran Teleport Sync supports a limited number of primary keys. See which primary keys are supported in our Teleport Sync limitations documentation.
NOTE: CDC has heavier processing and storage overhead than CT.
To learn more about CDC and CT, read on below or see Microsoft's Track Data Changes documentation.
Change trackinglink
Change tracking (CT) records when a row in a table has changed, but does not capture the data that was changed. CT also does not capture how many times the row changed or record any previous changes. SQL Server records changes from all tables that have CT enabled in a single internal change table.
You cannot sync tables without a primary key if you choose CT as your incremental update mechanism. CT needs primary keys to identify rows that have changed.
You must enable CT on the primary database, as well as on each individual table that you want to sync. When you enable CT on your primary database, you can select a window size. Window size determines how long your change records are kept in the change table before they are deleted. If not explicitly specified, the default value is 3 days. We recommend changing the window size to 7 days. A longer window gives us more time to resolve any potential sync issues before change records are deleted. Our system detects when we were unable to process changes to a table before they were deleted from the change table. When we detect this situation, we trigger a re-sync for that table.
Change tracking is a lightweight background process that should not impact your production workload. CT takes up minimal storage space on your hard drive because its change table only records the primary keys of changed rows.
Change data capturelink
NOTE: Fivetran can only sync tables that have a single CDC capture instance. Our syncs only include tables and columns that are present in a CDC instance. If you add new tables or columns, you must create a new CDC instance that includes them and delete the old instance.
NOTE: To use a Clustered Columnstore Index (CCI) for CDC, you must enable CDC before adding the CCI.
Change data capture (CDC) tracks every change that is applied to a table and records those changes in a shadow history table. Unlike CT, CDC captures what data was changed and when, so you can see how many times a row has changed and view past changes.
CDC can track changes on any kind of table, with or without a primary key. Due to SQL Server limitations, a SQL Server connector with CDC enabled automatically propagates column type changes to your destination but does not propagate any other schema changes, such as renamed columns, new tables, and new columns.
You must enable CDC on the primary database, as well as on each individual table that you want to sync. When you enable CDC on your primary database, you can select a window size (also known as a retention period). Window size determines how long your change records are kept in the shadow history table before they are deleted. Unless specified, the default window size is 3 days. We recommend increasing the window size to 7 days. A longer window gives us more time to resolve any potential sync issues before change records are deleted.
CDC is a heavier process than CT. CDC takes up more storage space in your database because it captures entire changed records, not just the primary keys of changed rows. CDC also uses more compute resources than CT because it writes each table's changes to its own shadow history table. If you want to reduce some of the load on your production database, you can configure CDC to read from a replica, though you will need to enable it on your primary database first.
Fivetran Teleport Synclink
Fivetran Teleport Sync is a proprietary database replication method that offers the completeness of snapshots while approaching the speed of log-based systems. With this sync mechanism, Fivetran can incrementally replicate your database with no additional setup other than a read-only SQL connection.
Use Fivetran Teleport Sync if enabling the tracking mechanisms is not an option or if there are other reasons that would prohibit the Fivetran user from using CT or CDC.
NOTE: Fivetran Teleport Sync supports syncing tables with composite primary keys.
Fivetran Teleport Sync's queries perform the following operations on your database:
- Do a full table scan of each synced table
- Perform calculations on all values in each synced table's rows
- Aggregate a compressed table snapshot in the database's memory
For optimum Fivetran Teleport Sync performance, we recommend that you make the following resources available in your database:
- 1 GB Free RAM
- 1 Free CPU Core
- IOPS (Teleport Sync times decrease linearly with an increase of available IOPS).
Limitationslink
Fivetran Teleport Sync cannot sync tables without a primary key. Teleport does not support all column types as primary keys. See which columns we support as primary keys in the Type transformation and mapping table.
DROP COLUMN
DDL operations prompt a table re-sync. ADD COLUMN
operations do not prompt a table re-sync because we are able to process the schema change without it.
Teleport supports syncing tables with primary keys up to 4GB in size. Tables exceeding this limit cannot be synced.
For SQL Server 2014 (12.x) and earlier, Teleport can capture a maximum row size of 8000 bytes. Rows larger than 8000 bytes cause a connector failure. Later versions do not have this limit.
Tables with a primary keylink
We merge changes to tables with primary keys into the corresponding tables in your destination:
- An INSERT in the source table generates a new row in the destination with
_fivetran_deleted = FALSE
- A DELETE in the source table updates the corresponding row in the destination with
_fivetran_deleted = TRUE
- An UPDATE in the source table updates the corresponding row in the destination
Tables without a primary keylink
NOTE: Fivetran cannot sync tables without a primary key using CT or Teleport. You must have CDC enabled to sync tables without a primary key.
We handle changes to tables without a primary key differently:
- An INSERT in the source table generates a new row in the destination with
_fivetran_deleted = FALSE
. - The
_fivetran_id
column helps us handle DELETE operations:- If there is a row in the destination that has a corresponding
_fivetran_id
value, that row will be updated with_fivetran_deleted = TRUE
. - If there is not a row in the destination that has a corresponding
_fivetran_id
value, a new row will be added with_fivetran_deleted = TRUE
.
- If there is a row in the destination that has a corresponding
- An UPDATE in the source table is treated as a DELETE followed by an INSERT, so it results in two rows in the destination:
- A row containing the old values with
_fivetran_deleted = TRUE
- A row containing the new values with
_fivetran_deleted = FALSE
- A row containing the old values with
As a result, one record in your source database may have several corresponding rows in your destination. For example, suppose you have a products
table in your source database with no primary key:
description | quantity |
---|---|
Shrink-ray gun | 1 |
Boogie robot | 2 |
Cookie robot | 3 |
You load this table into your destination during your initial sync, creating this destination table:
description | quantity | _fivetran_synced | _fivetran_index | _fivetran_deleted | _fivetran_id |
---|---|---|---|---|---|
Shrink-ray gun | 1 | '2000-01-01 00:00:00' | 0 | FALSE | asdf |
Cookie robot | 2 | '2000-01-01 00:00:00' | 1 | FALSE | dfdf |
Boogie robot | 3 | '2000-01-01 00:00:00' | 2 | FALSE | ewra |
You then update a row:
UPDATE products SET quantity = 4 WHERE description = 'Cookie robot';
content_copy
After your UPDATE operation, your destination table will look like this:
description | quantity | _fivetran_synced | _fivetran_index | _fivetran_deleted | _fivetran_id |
---|---|---|---|---|---|
Shrink-ray gun | 1 | '2000-01-01 00:00:00' | 0 | FALSE | asdf |
Cookie robot | 2 | '2000-01-01 00:00:00' | 3 | TRUE | dfdf |
Boogie robot | 3 | '2000-01-01 00:00:00' | 2 | FALSE | ewra |
Cookie robot | 4 | '2000-01-01 00:00:00' | 4 | FALSE | zxfd |
You then delete a row:
DELETE FROM products WHERE description = 'Boogie robot';
content_copy
After your DELETE operation, your destination table will look like this:
description | quantity | _fivetran_synced | _fivetran_index | _fivetran_deleted | _fivetran_id |
---|---|---|---|---|---|
Shrink-ray gun | 1 | '2000-01-01 00:00:00' | 0 | FALSE | asdf |
Cookie robot | 2 | '2000-01-01 00:00:02' | 3 | TRUE | dfdf |
Cookie robot | 4 | '2000-01-01 00:00:02' | 4 | FALSE | zxfd |
Boogie robot | 3 | '2000-01-01 00:00:02' | 5 | TRUE | ewra |
So, while there may be just one record in your source database where description = Cookie robot
, there are two in your destination - an old version where _fivetran_deleted = TRUE
, and a new version where _fivetran_deleted = FALSE
.
We also de-duplicate rows before we load them into your destination. We use the _fivetran_id
field, which is the hash of the non-Fivetran values in every row, to avoid creating multiple rows with identical contents. If, for example, you have the following table in your source:
description | quantity |
---|---|
Shrink-ray gun | 1 |
Shrink-ray gun | 1 |
Shrink-ray gun | 1 |
Then your destination table will look like this:
description | quantity | _fivetran_synced | _fivetran_index | _fivetran_deleted | _fivetran_id |
---|---|---|---|---|---|
Shrink-ray gun | 1 | '2000-01-01 00:00:00' | 0 | FALSE | asdf |
Deleted rowslink
We don't delete rows from the destination, though the way for how we process deletes differs for tables with primary keys and tables without a primary key.
Change trackinglink
You cannot sync tables without a primary key because CT requires primary keys to record changes. Tables without a primary key are excluded from your syncs. They are still listed in your Fivetran dashboard, but appear disabled.
If you want to add a primary key to a table, you can run the following query in your primary database:
ALTER TABLE <schema>.<pkless_table> ADD _id int NOT NULL IDENTITY (1,1) PRIMARY KEY
content_copy
The new _id
primary key column...
- Auto-generates a sequential
_id
value when a new record is added - Cannot be changed or overwritten with new values
- Automatically populates on all records when added to an existing table
Change data capturelink
We merge changes to tables without a primary key into the corresponding tables in your destination:
- An INSERT in the source table generates a new row in the destination with
_fivetran_deleted = FALSE
- A DELETE in the source table updates the corresponding row in the destination with
_fivetran_deleted = TRUE
- An UPDATE in the source table soft deletes the existing row in the destination by setting
_fivetran_deleted = TRUE
, then inserts a new row with the updated values
Migrating Service Providerslink
If you want to migrate service providers, we will need to do a full re-sync of your data because the new service provider won't retain the same change tracking data as your original SQL Server database.