MariaDB
MariaDB is an open-source fork of the popular MySQL relational database. Its development is led by some of the original developers of MySQL.
Fivetran offers the following solutions for replicating data from your MariaDB source database:
Supported services
Fivetran supports the following MariaDB database services:
Supportability Category | Supported Values |
---|---|
Supported versions | 10.1.2 - 10.11 |
MariaDB versions earlier than 10.1.2 may function correctly with the exception of fractional seconds in TIME, TIMESTAMP, and DATETIME columns. Prior to version 10.1.2, fractional seconds were stored differently, which causes problems with our incremental updates.
Maximum Speed (MBps) measures are based on measured end-to-end update speeds across Fivetran connectors.
Two major factors can cause disparities between our estimates and the exact replication speed for your Fivetran-connected databases: network latency and discrepancies in the format of the data we receive versus how the data is stored at rest in the data destination.
The ability to sync changes quickly also depends on configurable sync frequency. The risk of the sync falling behind, or being unable to keep up with data changes, decreases as the sync frequency increases. Fivetran recommends a higher sync frequency for data sources with a high rate of data changes.
To measure the rate of new data in your database, check the disk space usage metrics over time for databases hosted on cloud providers. For self-hosted databases, you can run the following query to determine disk space usage:
SELECT
table_schema,
table_name,
round(((data_length + index_length) / 1024 / 1024), 2) as size_in_mib
FROM INFORMATION_SCHEMA.TABLES
WHERE table_schema = 'DB_NAME'
ORDER BY (data_length + index_length) DESC;
Features
Feature Name | Supported | Notes |
---|---|---|
Capture deletes | check | All tables and fields |
History mode | check | Selectable for all tables |
Custom data | check | All tables and fields |
Data blocking | check | Column level, table level, and schema level |
Column hashing | check | |
Re-sync | check | Table level |
API configurable | check | API configuration |
Priority-first sync | ||
Fivetran data models | ||
Private networking | check | AWS PrivateLink: Generic MariaDB on EC2, Amazon RDS for MariaDB Azure Private Link: Azure Database for MariaDB |
Setup guide
In your primary database, you need to do the following:
- Enable ROW format binary log replication
- Set the
binlog_row_image
binary logging system variable toFULL
in global scope - Set binary log expiration to a minimum of one day (24 hours). We recommend setting the log expiration to seven days.
- Allow access to your MariaDB database using Fivetran's IP
- (Optional) Allow access to a read-replica of your MariaDB database if you do not want to connect Fivetran to your production instance
- Make sure that the
mysql56_temporal_format
system variable is set to ON (this is the default). - Create a Fivetran-specific MariaDB user with read-level and replication permissions to binary logs
WARNING: This user must be reserved for Fivetran use and unique to your connector.
- Reserved: Create a user for Fivetran's exclusive use. At the beginning of each sync, a Fivetran user attempts to kill any zombie processes it left behind on previous syncs. If you try to run any operations as this user, they may be killed.
- Unique: If you are creating multiple connectors targeting the same source database, create a unique user for each connector. If you set up multiple connectors with the same Fivetran user, they can kill one another's connections, slowing down your sync unnecessarily. NOTE: The Fivetran MariaDB connector only supports the authentication plugin
caching_sha2_password
over TLS.
For specific instructions on how to set up your database, see the setup guide for your MariaDB database type:
Since MariaDB is a fork of the MySQL code base (up to version 5.5), the setup guides are nearly identical to the MySQL guides.
Sync overview
Once Fivetran is connected to your MariaDB primary database or read replica, we pull a full dump of all selected data from your database. We then connect to your binary log to pull all your new and changed data at regular intervals. The binary log is the same change tracking mechanism that MariaDB uses to perform its own replication for backups. If data in the source changes (for example, you add new tables or change a data type), Fivetran automatically detects and persists these changes into your destination.
Schema information
Fivetran tries to replicate the exact schema and tables from your MariaDB source database to your destination according to our standard database update strategies. For every schema in the MariaDB database that you connect, we create a schema in your destination that maps directly to its native schema. This ensures that the data in your destination is in a familiar format to work with.
Fivetran-generated columns
Fivetran adds the following columns to every table in your destination:
_fivetran_deleted
(BOOLEAN) marks rows that were deleted in the source database._fivetran_synced
(UTC TIMESTAMP) indicates the time when Fivetran last successfully synced the row._fivetran_index
(INTEGER) shows the order of updates for tables that do not have a primary key._fivetran_id
(STRING) is the hash of the non-Fivetran values of each row. It's a unique ID that Fivetran uses to avoid duplicate rows in tables that do not have a primary key.
We add these columns to give you insight into the state of your data and the progress of your data syncs. For more information about these columns, see our System Columns and Tables documentation.
Type transformations and mapping
As we extract your data, we match MariaDB data types to types that Fivetran supports. If we don't support a certain data type, we automatically change that type to the closest supported type or, for some types, don't load that data at all. Our system automatically skips columns with data types that we don't accept or transform.
The following table illustrates how we transform your MariaDB data types into Fivetran supported types:
MariaDB Type | Fivetran Data Type | Fivetran Supported | Notes |
---|---|---|---|
BINARY | BINARY | True | |
BIGINT | LONG | True | |
BIT | BOOLEAN | True | BIT type with a single digit is supported |
BLOB | BINARY | True | |
CHAR | STRING | True | |
DATE | DATE | True | Invalid values will be loaded as NULL |
DATETIME | TIMESTAMP_NTZ | True | Invalid values will be loaded as NULL. Versions prior to 10.1.2 will break if any precision greater than 0 is used. |
DECIMAL/ NUMERIC | BIGDECIMAL | True | |
DOUBLE | DOUBLE | True | |
ENUM | STRING | True | |
FLOAT | DOUBLE | True | |
GEOMETRY | JSON | True | |
GEOMETRYCOLLECTION | JSON | True | |
JSON | STRING | True | MySQL supports JSON as a native type, however in MariaDB, JSON is an alias of LONGTEXT |
INT | INTEGER | True | |
LINESTRING | JSON | True | |
LONGBLOB | BINARY | True | |
LONGTEXT | STRING | True | |
MEDIUMBLOB | BINARY | True | |
MEDIUMINT | INTEGER | True | |
MEDIUMTEXT | STRING | True | |
MULTILINESTRING | JSON | True | |
MULTIPOINT | JSON | True | |
MULTIPOLYGON | JSON | True | |
POINT | JSON | True | |
POLYGON | JSON | True | |
SET | STRING | True | |
SMALLINT | INTEGER | True | |
TIME | STRING | True | Versions prior to 10.1.2 will break if any precision greater than 0 is used. |
TIMESTAMP | TIMESTAMP | True | MariaDB always stores timestamps in UTC Invalid values will be loaded as NULL. Versions prior to 10.1.2 will break if any precision greater than 0 is used. |
TINYBLOB | BINARY | True | |
TINYINT | BOOLEAN | True | If the source TINYINT column has a specified width of exactly 1 (i.e., TINYINT(1) ) and contains values of only 0s and 1s, the destination type will be BOOLEAN. If it contains values other than 0 or 1, the destination type will be INTEGER. |
TINYINT | INTEGER | True | In all other cases, the destination type for TINYINT columns will be INTEGER. If the width isn't specified to be exactly 1 (either no specification or a value other than 1), the destination type will be INTEGER, even if the column contains only 1s or 0s. |
TINYTEXT | STRING | True | |
UNSIGNED BIGINT | BIGDECIMAL | True | |
UNSIGNED INT | LONG | True | |
UNSIGNED SMALLINT | INTEGER | True | |
VARCHAR | STRING | True | |
VARBINARY | BINARY | True | |
YEAR | INTEGER | True |
If we are missing an important data type that you need, reach out to support.
In some cases, when loading data into your destination, we may need to convert Fivetran data types into data types that are supported by the destination. For more information, see the individual destination pages.
Excluding source data
If you don’t want to sync all the data from your primary database, you can exclude schemas, tables, or columns from your syncs on your Fivetran dashboard. To do so, go to your connector details page and uncheck the objects you would like to omit from syncing. For more information, see our Data Blocking documentation.
Alternatively, you can change the permissions of the Fivetran user you created and restrict its access to certain tables or columns.
Note that column permissions will not shield confidential data, such as PII (Personally Identifiable Information), from Fivetran because they do not apply to binary logs (binlogs). Through your binlogs, Fivetran has access to the full contents of any changed rows, including columns that have no SELECT permissions. However, we filter out the prohibited columns at the earliest possible stage of our syncs and do not load them into your destination.
Initial sync
When Fivetran connects to a new database, we first copy all rows from every table in every schema for which we have SELECT permission (except those you have excluded in your Fivetran dashboard) and add Fivetran-generated columns. Tables are copied in ascending size order (from smallest to largest). We copy rows by performing a SELECT statement on each table. For large tables, we copy a limited number of rows at a time so that we don't have to start the sync over from the beginning if our connection is lost midway.
The duration of initial syncs can vary depending on the number and size of tables to be imported. We therefore interleave incremental updates with the table imports during the initial sync.
Updating data
Fivetran performs incremental updates of any new or modified data from your source database. We use one of the following incremental sync methods for incremental updates: binary log and Fivetran Teleport Sync.
How we load UPDATE events into your destination depends on whether or not the table has a primary key. To find out which of your tables have primary keys, run this query in your source database:
SELECT
max(tc.CONSTRAINT_TYPE = 'PRIMARY KEY') IS NOT NULL
AND max(tc.CONSTRAINT_TYPE = 'PRIMARY KEY') = TRUE AS has_primary_key,
c.TABLE_SCHEMA AS table_schema,
c.TABLE_NAME AS table_name
FROM information_schema.COLUMNS c
LEFT JOIN information_schema.KEY_COLUMN_USAGE k ON k.TABLE_SCHEMA = c.TABLE_SCHEMA AND
k.TABLE_NAME = c.TABLE_NAME AND
k.COLUMN_NAME = c.COLUMN_NAME
LEFT JOIN information_schema.TABLE_CONSTRAINTS tc ON tc.TABLE_SCHEMA = k.TABLE_SCHEMA AND
tc.TABLE_NAME = k.TABLE_NAME AND
tc.CONSTRAINT_NAME = k.CONSTRAINT_NAME
LEFT JOIN information_schema.TABLES t ON t.TABLE_SCHEMA = c.TABLE_SCHEMA AND
t.TABLE_NAME = c.TABLE_NAME
WHERE t.TABLE_TYPE = 'BASE TABLE'
AND c.TABLE_SCHEMA NOT IN ('performanceschema', 'informationschema', 'mariadb', 'innodb', 'lookerscratch', 'tmp')
GROUP BY c.TABLE_SCHEMA, c.TABLE_NAME ORDER BY has_primary_key, c.TABLE_SCHEMA, c.TABLE_NAME;
Binary log
Fivetran performs incremental updates of any new or modified data from your source database. We use your binlogs to request only the data that has changed since our last sync. We require the binlogs to be in row format, so that they contain a separate event in each row. We pull INSERT, UPDATE, and DELETE events from your binlogs at regular intervals and upload them to your destination.
Fivetran Teleport Sync Beta
Fivetran Teleport Sync is a proprietary incremental sync method that can incrementally replicate your database with no additional setup other than a read-only SQL connection.
Use Fivetran Teleport Sync when the binary log method is not an option for the following reasons:
- You cannot or do not want to provide Fivetran access to your database's change logs
- You need to limit the tables or columns that Fivetran can read from
- Binary log syncing is slow or resource intensive due to the number of updates
- You load data into the source database using a TRUNCATE/LOAD or BACKUP/RESTORE process that invalidates change logs
- You currently use snapshot syncing without incremental updates and want a faster method
Fivetran Teleport Sync's queries perform the following operations on your database:
- Do a full table scan of each synced table
- Perform calculations on all values in each synced table's rows
- Aggregate a compressed table snapshot in the database's memory
For optimum Fivetran Teleport Sync performance, we recommend that you make the following resources available in your database:
- 1 GB Free RAM
- 1 Free CPU Core
- IOPS (Teleport Sync times decrease linearly with an increase of available IOPS).
Private Preview Fivetran Teleport Sync supports syncing tables without primary keys in Private Preview for existing connectors. All new connectors have this ability enabled automatically. Reach out to Fivetran Support if you're interested in trying this functionality for existing connectors.
IMPORTANT: Tables without a primary key can have many exact duplicates of a row, but duplicate rows create data discrepancy issues in Fivetran Teleport Sync. We generate a
_fivetran_id
column for tables without a primary key, so many identical source rows may share a_fivetran_id
. In that case, only one copy of the duplicate rows is synced to the destination.
Tables with a primary key
We merge changes to tables with primary keys into the corresponding tables in your destination:
- An INSERT in the source table generates a new row in the destination with
_fivetran_deleted = FALSE
. - A DELETE in the source table updates the corresponding row in the destination with
_fivetran_deleted = TRUE
. - An UPDATE in the source updates the data in the corresponding row in the destination.
Tables without a primary key
NOTE: Fivetran Teleport Sync support for tables without a primary key is currently in Private Preview for existing connectors. All new connectors have this ability enabled automatically.
Fivetran Teleport Sync performs the following steps to sync tables without a primary key:
- Copy the table to a
TEMPORARY TABLE
on the MySQL source server. This table includes a generated primary key column,_fivetran_id
. Because this value depends on the values of the other columns in the row, any duplicate rows in the original table are removed. - Use the
TEMPORARY TABLE
to sync the data to the destination. - Once the sync is complete, drop the
TEMPORARY TABLE
.
NOTE: To sync tables without a primary key, Fivetran Teleport Sync requires the Fivetran database user to have the
CREATE TEMPORARY TABLES
permission.
For the binary log method, we handle changes to tables without a primary key differently:
- An INSERT in the source table generates a new row in the destination with
_fivetran_deleted = FALSE
. - The
_fivetran_id
column helps us handle DELETE operations:- If there is a row in the destination that has a corresponding
_fivetran_id
value, that row will be updated with_fivetran_deleted = TRUE
. - If there is not a row in the destination that has a corresponding
_fivetran_id
value, a new row will be added with_fivetran_deleted = TRUE
.
- If there is a row in the destination that has a corresponding
- An UPDATE in the source table is treated as a DELETE followed by an INSERT, so it results in two rows in the destination:
- A row containing the old values with
_fivetran_deleted = TRUE
- A row containing the new values with
_fivetran_deleted = FALSE
- A row containing the old values with
As a result, one record in your source database may have several corresponding rows in your destination. For example, suppose you have a products
table in your source database with no primary key:
description | quantity |
---|---|
Shrink-ray gun | 1 |
Boogie robot | 2 |
Cookie robot | 3 |
You load this table into your destination during your initial sync, creating this destination table:
description | quantity | _fivetran_synced | _fivetran_index | _fivetran_deleted | _fivetran_id |
---|---|---|---|---|---|
Shrink-ray gun | 1 | '2000-01-01 00:00:00' | 0 | FALSE | asdf |
Cookie robot | 2 | '2000-01-01 00:00:00' | 1 | FALSE | dfdf |
Boogie robot | 3 | '2000-01-01 00:00:00' | 2 | FALSE | ewra |
You then update a row:
UPDATE products SET quantity = 4 WHERE description = 'Cookie robot';
After your UPDATE operation, your destination table will look like this:
description | quantity | _fivetran_synced | _fivetran_index | _fivetran_deleted | _fivetran_id |
---|---|---|---|---|---|
Shrink-ray gun | 1 | '2000-01-01 00:00:00' | 0 | FALSE | asdf |
Cookie robot | 2 | '2000-01-01 00:00:00' | 3 | TRUE | dfdf |
Boogie robot | 3 | '2000-01-01 00:00:00' | 2 | FALSE | ewra |
Cookie robot | 4 | '2000-01-01 00:00:00' | 4 | FALSE | zxfd |
You then delete a row:
DELETE FROM products WHERE description = 'Boogie robot';
After your DELETE operation, your destination table will look like this:
description | quantity | _fivetran_synced | _fivetran_index | _fivetran_deleted | _fivetran_id |
---|---|---|---|---|---|
Shrink-ray gun | 1 | '2000-01-01 00:00:00' | 0 | FALSE | asdf |
Cookie robot | 2 | '2000-01-01 00:00:02' | 3 | TRUE | dfdf |
Cookie robot | 4 | '2000-01-01 00:00:02' | 4 | FALSE | zxfd |
Boogie robot | 3 | '2000-01-01 00:00:02' | 5 | TRUE | ewra |
So, while there may be just one record in your source database where description = Cookie robot
, there are two in your destination - an old version where _fivetran_deleted = TRUE
, and a new version where _fivetran_deleted = FALSE
.
We also de-duplicate rows before we load them into your destination. We use the _fivetran_id
field, which is the hash of the non-Fivetran values in every row, to avoid creating multiple rows with identical contents. If, for example, you have the following table in your source:
description | quantity |
---|---|
Shrink-ray gun | 1 |
Shrink-ray gun | 1 |
Shrink-ray gun | 1 |
Then your destination table will look like this:
description | quantity | _fivetran_synced | _fivetran_index | _fivetran_deleted | _fivetran_id |
---|---|---|---|---|---|
Shrink-ray gun | 1 | '2000-01-01 00:00:00' | 0 | FALSE | asdf |
Deleted rows
We don't delete rows from the destination, though the way for how we process deletes differs for tables with primary keys and tables without a primary key. We handle deletes as part of streaming changes from the binlog. Note that we only process DELETE events from the binlog.
Unsupported MariaDB commands
Fivetran does not support several MariaDB commands:
DROP
(and re-creating the dropped table withCREATE
)LOAD
RENAME
ON DELETE CASCADE
ON UPDATE CASCADE
If you use any of these unsupported commands to delete or update the contents of a table, your changes will not be recorded in the binlogs. As a result, those records won't be replicated correctly in your destination.
Adding and dropping columns
When you add or drop a column, we attempt to automatically migrate your destination schema to the new table structure. In some cases, we won't be able to do this and instead perform an automatic re-sync of the changed table.
In the following scenarios, Fivetran will re-sync your table instead of automatically migrating it:
- Adding or dropping a column on a table without a primary key
- Changing column order
- Changing primary keys
- Modifying
ENUM
orSET
columns - Adding or dropping a column using a DDL statement that we're unable to parse properly
Renaming tables or columns
Renaming involves following three queries:
ALTER TABLE tbl_name RENAME [TO | AS] new_tbl_name
ALTER TABLE tbl_name CHANGE [COLUMN] old_col_name new_col_name column_definition
[FIRST | AFTER col_name]
ALTER TABLE tbl_name RENAME COLUMN old_col_name TO new_col_name
When you rename a table, we reimport the table as a new table, but we will not soft delete the old table in the destination. When you rename a column, both the old column and new column will appear in the table. We only update data in the new column moving forward.
Adding a column in Block All mode
When our connector detects a new column in a source table, it automatically performs a schema migration on that table unless you've chosen to block all newly-added tables and columns from syncing. If you excluded newly-added tables and columns from your syncs, you must manually re-sync your table whenever you add a new column.
Migrating service providers
If you want to migrate service providers, we will need to do a full re-sync of your data because the new service provider won't retain the same change tracking data as your original MariaDB database.