Databases

Your production database contains your most important data, and Fivetran's database connectors are our most important and battle-tested connectors. Our database connectors are designed for correctness and safety. Correctness means that your destination should always be an accurate replica of your database source. Safety means that no matter what, Fivetran should never cause problems in your source database.

Supported databases

Fivetran supports replicating data from the following databases:

If you'd like to integrate with a database that's not listed here, reach out to Fivetran support.

Database integration overview

Connect: Connect us to your production database or read-replica through a specific Fivetran user that you create in your database. For instructions on creating a Fivetran user, see the setup guide for your database type.
Initial sync: Fivetran discovers all selected tables and automatically pulls all historical data for all objects to which it has access. The duration of initial syncs varies by customer, but most syncs finish within a day.
Transform and map schema: Fivetran parses through all your data, typecasting and mapping every column in the source object to a column in the corresponding table in the destination. We transform any data types that are not natively supported in the destination into data types that are accepted. Fivetran does not do any aggregations at this stage.
Load: Fivetran automatically creates schemas and tables for each mapped source object in the destination. We populate these tables with the data from the initial sync.
Incremental updates: Fivetran incrementally updates the data in your destination in batches, using a merge operation (upsert & insert) to update only new or changed data. You can choose how often these batches run, ranging from every 1 minute to every 24 hours. Fivetran's unique system automatically detects schema changes in your data source and applies these changes to the destination.

Connecting your database

Connection options

Because our database connectors are pull connectors, Fivetran must have a way to connect to your source database. We support the following connection methods:

Direct connection
SSH tunnel
Reverse SSH tunnel
AWS PrivateLink, Azure Private Link, or Google Cloud Private Service Connect
VPN tunnel
Proxy Agent

For more information about these connection methods, see our Database Connection Options documentation.

Safety

Fivetran is a good citizen of your database. You can connect Fivetran to a read-replica of your database if you want to isolate us from your production workloads, but you can also connect us directly to your production instance. Our queries are designed to minimize contention with your production queries in two major ways:

Our queries generate simple execution plans that can run concurrently with many other queries. They don't take higher priority over your other workloads.
We use parallel processing to sync data from your database to your destination. Using multiple threads makes your syncs faster than using a single thread, since we can run multiple queries simultaneously. However, multiple threads also compete for resources slightly more than a single thread does. We limit the number of threads that we use to ensure that we don't compete too much with your concurrent queries.

Initial sync

During the initial data sync, Fivetran connects to your source database and copies the entire contents of every table that you’ve selected to sync. The way we perform that initial sync depends on whether your source database has a B-tree (for example, MySQL or SQL Server) or a heap (for example, PostgreSQL or Oracle) data structure.

The SAP HANA database connector has a different initial sync strategy from our other database connectors. For more information, see our SAP ERP on HANA initial sync documentation.

B-Tree databases

For B-Tree tables, we perform the initial sync using a series of queries that look like this:

select *
from {table}
where {primary key} > ?
order by {primary key}
limit {page size}

This is an efficient strategy because each table in a B-tree database is intrinsically ordered on the {primary key}. There is no need for a sort by clause in this query, because each table is already in {primary key} order.

Heap databases

For heap databases, the initial sync is more complicated. For small-to-medium tables, we sync the entire table with a single select * query. For large tables, we break the contents into more manageable blocks based on the order in which the data was added to the database. We then sync the blocks in chronological order.

Excluding source data

If you don’t want to sync all the data from your source database into your destination, you can easily exclude schemas, tables, or columns from your syncs on your Fivetran dashboard. Alternatively, you can change the permissions of the Fivetran user in your source database and restrict its access to certain tables or schemas.

Check initial sync progress

To monitor the progress of your initial sync, you can use either of the following:

Our free Fivetran Platform Connector, which we automatically add to every destination you create.
External log services.

Using either of these options, you can:

Check the most recent log record to track the progress of your table imports.
- All tables marked NOT STARTED have not been imported yet.
- All tables marked COMPLETED have been imported.
Check older log records to track your sync progress.

Transformation and mapping overview

Similar to Cloud API connections, Fivetran's database replication product pursues a "sync-all" strategy. This means we sync as many source schemas and tables as possible with as little setup and configuration as possible.

However, unlike Cloud API connections, our database connectors do one-to-one mapping of source schemas and tables to destination schemas and tables. Columns are also mapped one-for-one wherever possible, except when no compatible data type exists in the destination. In that case, we will transform data types that are not supported in the destination into data types that are supported.

Loading schemas and tables

Schemas and tables

A single database connector can replicate multiple schemas, as well as all of their underlying tables. These schemas are stored in the destination in the same logical schema or table hierarchy as exists in your source database. Depending on the naming convention you choose, each replicated schema name is either prepended with the connection name or uses the source schema name without any changes.

Mapping name from source to destination

We use one of the following naming conventions to name your schemas, tables, and columns in the destination:

Fivetran naming
Source naming

You cannot modify your selection once the connector completes the initial sync.

Fivetran naming

For database connectors, we do minimal name-changing between the source and destination. This allows SQL queries written for the source database to be easily ported to the destination, without having to change the field names in the queries. There are only a few exceptions to this rule:

Schema names that have the connection name prepended, and
Schema, table, or column names that violate Fivetran’s naming conventions. In these cases, we apply the following rules in the order in which they are listed:
- Non-ASCII characters (such as Chinese characters) are replaced with a transliteration.
- Remove accent marks from Latin characters
- We only allow letters, numbers, and underscores. Any other character is replaced with an underscore.
- Upper-case letters are converted to lower-case.
- The character limit for names is 127 characters. Any name with more than 127 characters is truncated to the limit.

home fivetran database replication overview new

Source naming

Source naming does not follow Fivetran’s naming conventions and is available only for the database connectors, syncing data to a destination. When enabled, Fivetran preserves the original or source names of the schemas, tables, and columns with UTF-8 values without any modifications and replicates them to your destination.

Some destinations may have reserved keywords or restrictions that prevent certain names from being used. We display a warning message in such instances and exclude the schemas, tables, or columns from the sync.

fivetran database replication source schema

Supported connectors and destinations

Fivetran supports Source naming for a subset of its connectors and destinations.

Supported connectors

Fivetran supports Source naming for the following connectors:

Supported destination

Fivetran supports Source naming for the following destinations:

Changing the destination schema name

We do not support changing the connection name. It is therefore not possible to change the name of the destination schemas.

If you want to change the name of your destination schema, you can set up a new connection and remove your old connection once the new one has finished syncing. You can also create a SQL view on top of the existing schema with a new name that works better for your BI or SQL layer.

Incremental updates

Fivetran performs incremental updates by syncing only new, modified, or deleted records from your source database. During incremental updates (also known as incremental sync), Fivetran captures changes that occurred since the last successful sync using either the source database’s native change data capture mechanism or a Fivetran proprietary method. The method of capturing changes depends on the source database.

The following native change data capture mechanisms are supported by our database connectors:

Binary Log – MariaDB, MySQL
Change Data Capture (CDC) – SQL Server
Change Data Feed – Databricks
Change Feed – Azure Cosmos DB
Change Streams – DocumentDB, MongoDB
Change Tracking (CT) – SQL Server
Changefeed – CockroachDB
DynamoDB Streams – Amazon DynamoDB
Logical Replication – PostgreSQL
LogMiner – Oracle
Oplogs – MongoDB
Time Travel – BigQuery, Snowflake
XMIN – PostgreSQL

The following Fivetran proprietary data capture mechanisms are also supported:

Archive Log Only – Oracle (HVA), SAP (HVA)
Binary Log Reader – Oracle, SQL Server
Direct Capture – Oracle (HVA), SAP (HVA)
Fivetran Teleport Sync – BigQuery, MariaDB, MySQL, Oracle, PostgreSQL, Snowflake, SQL Server

During incremental syncs, Fivetran maintains an internal set of progress cursors that allows us to track the exact point where the last successful sync ended. We record the time of the last sync for each row in a column called _fivetran_synced (UTC TIMESTAMP). This enables a seamless transition between syncs and ensures that no data is missed.

These progress cursors make Fivetran highly resilient to service interruptions. If a disruption occurs, such as the destination becoming unavailable, Fivetran automatically resumes syncing from where it left off once the issue is resolved, even after several days or weeks, as long as the necessary log data is still available.

Deleted data

All database connectors support syncing deleted data. For tables with a primary key, we mark deleted rows as deleted in your destination, but do not actually remove them. We create an extra column in the destination to mark deleted rows, "_fivetran_deleted" (boolean). Deleted rows will have a true value in this column.

The following databases support tables without a primary key:

The way we treat deleted rows in these tables varies by connector type. See the relevant database connector's documentation page for more information.

Frequency of updates

By default, all database connectors sync new and modified data every 15 minutes. Depending on the size of each update or the number of tables, syncs may take longer than that. In that case, the next data sync will start at the next 15-minute interval. For example:

Update X Start: 9:00am
Update X Finish: 9:18am
Update Y Start: 9:30am

You can change the update interval in your Fivetran dashboard. When database syncs encounter repeated errors, our system automatically retries the sync after the update interval or one hour, whichever is shorter.

Migrating service providers

If you migrate from one database service provider to another, we will need to do a full re-sync. That's because when you switch to another provider, Fivetran can't keep track of where our last sync left off, as the new service provider won't retain the same progress cursor and changelog information as the previous provider. Examples of switching service providers include:

MySQL (Self-Hosted) <--> Amazon RDS for MySQL
PostgreSQL (Self-Hosted) <--> Amazon RDS for PostgreSQL
SQL Server (Self-Hosted) <--> Amazon RDS for SQL Server

See the individual database guides for specific instructions on how to migrate.

Fivetran Teleport Sync

Fivetran Teleport Sync is a proprietary incremental sync method that can incrementally replicate your database with no additional setup other than a read-only SQL connection.

Fivetran Teleport Sync is an alternative to change log replication for incremental syncs. We recommend using it in the following cases:

You cannot or do not want to provide Fivetran access to your database's change logs
You need to limit the tables or columns that Fivetran can read from
Change log syncing is slow or resource intensive due to the number of updates
You load data into the source database using a TRUNCATE/LOAD or BACKUP/RESTORE process that invalidates change logs
You currently use snapshot syncing and want a faster method

Fivetran Teleport Sync works by running SQL queries in your database that return a compressed version of your data in a form that can be used to find the changed data.

Teleport sync general limitations

Review the limitations specific to each connector to verify whether tables without a primary key are supported.
Tables without a primary key (if the database supports it) may experience slower performance compared to tables equipped with primary keys.
The speed of data imports and incremental synchronizations can vary considerably across different connectors. Reference the specific databases teleport sections for further suggestions.
- Performance can also be influenced by the type of hardware used and the specific deployment location of the database. Look at specific databases teleport docs on limitations for further guidance.
Certain Data Definition Language (DDL) changes that may require a table re-sync are listed in Automatic table re-sync for Teleport sync method.
If you update large portions of a table between syncs, Fivetran Teleport Sync may be unable to determine the changes since the previous sync. When this occurs, the connection will re-sync the entire table to capture the latest data. To avoid these table re-syncs, we recommend that you increase the sync frequency of the connection.
If you are trying to sync large, rapidly changing tables (>50,000 rows changed per sync) and fall back to a full table scan regularly, we may disable your high volume tables until the issue is resolved.
- We warn you when tables exceed 15,000 changes in a single sync.
- When a table exceeds 50,000 changes in a sync, we flag it as unsupported. We also flag a table as unsupported if it encounters more than three Teleport failures in a seven day period. We continue to sync unsupported tables until we encounter an error, after which the table is automatically excluded from your connection syncs.
- You can re-include tables that have been automatically excluded, but they retain their unsupported flag and will be excluded again if they encounter another failure.
- In Hybrid Deployment, Teleport sync requires additional disk space to store the table hashes it creates to identify row changes between syncs. We will notify you about the additional disk space requirement through an error message in the Fivetran dashboard. To resolve it, continue adding disk space to the machine hosting the Hybrid Deployment Agent until the error clears.

Automatic re-sync

Under certain circumstances, Fivetran automatically performs a full table re-sync or a full source re-sync of your source database. The specific circumstances vary by database type.

The full table or full source re-sync automatically occurs at your next scheduled sync time. For example, let's say that your connection syncs every day at 2pm and you make database updates that necessitate a re-sync at 4pm. We schedule your automatic re-sync for 2pm the next day.

Automatic full table re-sync

Schema changes in the source database are the most common trigger for a full table re-sync. When the schema changes in the source, a full table re-sync lets us propagate these changes to your destination. You don’t have to take any action to initiate the re-sync, and the only difference you might notice is a longer than usual sync. See the chart below for specific causes for each database.

Note that we only perform automatic table re-syncs for SQL databases. Schema changes to NoSQL databases don’t require re-syncs, because non-relational databases do not have fixed schemas.

Automatic table re-sync for SQL databases

We schedule automatic full table re-syncs for SQL databases in the following circumstances:

Database	What schedules a full table re-sync
BigQuery	If you enable syncs for a table that you had previously excluded from syncs
CockroachDB	Expired GC threshold
MariaDB	A DDL (Data Definition Language) statement on a table without a primary key A DDL statement that re-orders columns A DDL statement that changes the column definition for SET or ENUM type columns A primary key change If you enable syncs for a table that you had previously excluded from syncs For Teleport syncs: if we encounter an error retrieving or processing a snapshot For Teleport syncs: if a DDL change contains a default value with a type other than Instant, LocalDate, certain LocalDateTime types, Boolean, String, Short, Int, Long, Float, Double, BigDecimal, or Json Any TRUNCATE DDL statements for tables that you have included in syncs
MySQL	A DDL (Data Definition Language) statement on a table without a primary key A DDL statement that re-orders columns A DDL statement that changes the column definition for SET or ENUM type columns A primary key change If you enable syncs for a table that you had previously excluded from syncs For Teleport syncs: if we encounter an error retrieving or processing a snapshot For Teleport syncs: if a DDL change contains a default value with a type other than Instant, LocalDate, certain LocalDateTime types, Boolean, String, Short, Int, Long, Float, Double, BigDecimal, or Json Any TRUNCATE DDL statements for tables that you have included in syncs
Oracle	A statement that changes a table’s schema If you enable syncs for a table that you had previously excluded from syncs A DDL statement that creates, deletes, or changes the position of a column If the table's SCN is no longer available to us Non-zero status for a LogMiner event If we've rescheduled a re-sync more than a few times, we stop the sync and generate a Task in your dashboard
PostgreSQL	If you enable syncs for a table that you had previously excluded from syncs If XMIN values get frozen for a table Schema changes in PostgreSQL don’t trigger a re-sync
SQL Server	If you enable syncs for a table that you had previously excluded from syncs If you change incremental sync methods If the table's LSN is no longer available to us Any TRUNCATE DDL statements for tables that you have included in syncs
SAP HANA	If you enable syncs for a table that you had previously excluded from syncs
Snowflake	If you enable syncs for a table that you had previously excluded from syncs Any DDL changes for tables that you have included in syncs, such as removing columns and adding columns with non-static default values

Automatic table re-sync for NoSQL databases:

We schedule automatic full table re-syncs for NoSQL databases in the following circumstances:

Database	What schedules a full table re-sync
Amazon DynamoDB	Change in pack mode Expired DynamoDB streams cursor
MongoDB	Change in pack mode Expired change streams cursor
DocumentDB	Expired change streams cursor

Automatic full source re-sync

Expired change logs require a full source re-sync, because we cannot perform our usual incremental syncs without them. Change logs can expire either in the sense of time or storage space. For example, logs might expire if you set your log retention period to 48 hours, and Fivetran isn’t able to sync for more than 48 hours. Logs can also expire if the storage space you allocated to them fills up. A full source re-sync ensures that the data in your destination matches your source data, and once it is complete, we resume our usual incremental sync schedule.

Automatic full source re-sync for SQL databases

We schedule automatic full source re-syncs for SQL databases in the following circumstances:

Database	What schedules a full source re-sync
MariaDB	Expired logs (Fivetran generates a task in your dashboard when we detect expired logs and need to perform a full source re-sync)
MySQL	Expired logs
Oracle	Expired logs
PostgreSQL	Expired logs
SQL Server	Expired logs

Automatic full source re-sync for NoSQL databases

We schedule automatic full source re-syncs for NoSQL databases in the following circumstances:

Database	What schedules a full source re-sync
Amazon DynamoDB	Expired DynamoDB streams cursor
MongoDB	Expired oplogs

Automatic table re-sync for Teleport sync method:

We schedule automatic full table re-syncs for Teleport sync method in the following circumstances:

Renaming a table/schema
Dropping columns
Unsupported column type changes
Adding a column with a non-static default value
All DDL changes for pk-less tables will trigger a re-sync

Unsupported operations

Fivetran’s database connectors deliberately do not support some operations.

Non-primary replicas in SQL Server

In SQL Server, you cannot connect Fivetran to a non-primary replica if you use Change Tracking (CT) as your incremental sync method. We can only sync with the primary replica in an availability group, because Microsoft does not allow CT to be enabled on any replica that is not designated as primary.

Repeated table truncations in SQL Server

If you use CT as your incremental sync method, repeated table truncations prompt table re-syncs because they continuously invalidate the table's CT system version. We recommend you avoid truncating or use a different syncing mechanism like Fivetran Teleport Sync if you expect truncating in your current source workflow.

Retention periods under 24 hours

A log retention period must be at least as large as your sync period, or else the logs will expire in between Fivetran connection syncs. We recommend setting your retention period to at least 48 hours to ensure there are sufficient logs for proper replication.

Data warehouses as sources

Fivetran supports the following data warehouses as sources:

You cannot connect Fivetran to other data warehouses like Teradata, Netezza, or Vertica as sources, because they have no change capture mechanism. Instead, connect us to whatever source databases are feeding these warehouses.

If you use a regular OLTP database like PostgreSQL or Oracle as a destination, you can connect it to Fivetran as a source. However, your Fivetran syncs may be unacceptably slow if you are populating this destination by periodically truncating and re-creating tables. Be sure to test this issue during your evaluation of Fivetran to make sure you get acceptable performance.

Must allow initial syncs to run without interruption

You cannot run initial syncs within a limited time window (for example, only allow Fivetran to sync from midnight to 8 a.m.). Time limits increase the likelihood that our connection will be interrupted and the sync will need to restart. There is no need to restrict our syncs to off-times; our initial sync queries are designed to be easy for your query planner to schedule around, so the impact on your production workload should be minimal.

Guaranteed initial sync completion dates

Fivetran cannot guarantee a completion date for your initial sync. While Fivetran works hard to make your initial sync as fast as possible, the duration of this process is different for every customer.

Skipping historical syncs

You cannot skip your historical sync.

One-time production data migrations

Fivetran does not support one-time migrations of production data.