PostgreSQL
PostgreSQL is an open-source database. Fivetran's integration platform replicates data from your PostgreSQL source database and loads it into your destination.
Fivetran supports PostgreSQL as both a database connector and a destination.
Fivetran offers the following solutions for replicating data from your PostgreSQL source database:
- Fivetran native connector
- HVR - PostgreSQL
This document is about the Fivetran's native connector for PostgreSQL.
Supported services
Fivetran supports the following PostgreSQL database services:
- Generic PostgreSQL
- Amazon Aurora PostgreSQL
- Amazon Aurora Serverless V2
- Amazon RDS PostgreSQL
- Azure Database for PostgreSQL
- Google Cloud PostgreSQL
- Heroku PostgreSQL
Supported configurations
Fivetran supports the following PostgreSQL configurations:
| Supportability Category | Supported Values |
|---|---|
| Database versions | 9.4 - 17 |
| Maximum row size | 400MB per row |
| Connection limit per database | 3 |
| Transport Layer Security (TLS) | TLS 1.1 - 1.3 |
| Server encoding | UTF-8 |
Which PostgreSQL instance types we can connect to depends on whether you use logical replication or Query-Based as your incremental sync method. Read our Updating data documentation for more information.
| Instance Types | Logical Replication | Query-Based |
|---|---|---|
| Generic PostgreSQL | ||
| Primary instance | (version 10 or later) | (version 10 or later) |
| Standby instance | (version 16 or later) | (version 10 or later) |
| Amazon Aurora PostgreSQL | ||
| Primary instance | ||
| Standby instance | ||
| Amazon Aurora Serverless V2 | ||
| Primary instance | ||
| Standby instance | ||
| Amazon RDS PostgreSQL | ||
| Primary instance | ||
| Standby instance | (version 16 or later) | |
| Azure Database for PostgreSQL | ||
| Primary instance | ||
| Standby instance | (version 16 or later) | |
| Google Cloud PostgreSQL | ||
| Primary instance | ||
| Standby instance | (version 16 or later) | |
| Heroku PostgreSQL | ||
| Primary instance | (without extensions) | |
| Standby instance | (without extensions) |
Features
| Feature Name | Supported | Notes |
|---|---|---|
| Capture deletes | ||
| History mode | ||
| Custom data | ||
| Data blocking | ||
| Column hashing | ||
| Re-sync | ||
| Row filtering | ||
| API configurable | API configuration | |
| Priority-first sync | ||
| Fivetran data models | ||
| Private networking |
| |
| Authorization via API |
*Query-Based does not capture data that exists for an amount of time smaller than the sync interval.
Supported deployment models
We support the SaaS and Hybrid deployment models for all PostgreSQL implementations.
You must have an Enterprise or Business Critical plan to use the Hybrid Deployment model.
Setup guide
In your primary database, you need to do the following:
- Allow access to your PostgreSQL database using Fivetran's IP
- Create a Fivetran-specific PostgreSQL user with read-level permissions
- (Optional) Allow access to a read-replica of your PostgreSQL database. Using a read-replica can help to avoid unnecessary strain on your primary database.
- (Logical replication Only) Allow access to a logical replication slot
For specific instructions on how to set up your database, see the guide for your PostgreSQL database type:
- Generic PostgreSQL
- Amazon Aurora PostgreSQL
- Amazon RDS PostgreSQL
- Azure Database for PostgreSQL
- Google Cloud PostgreSQL
- Heroku PostgreSQL
Sync overview
Once Fivetran is connected to your PostgreSQL database or read replica, we pull a full dump of all selected data from your database. Using either logical replication or Query-Based change data capture process, we pull all your new and changed data at regular intervals.
If data in the source changes (for example, you add new tables or change a data type), Fivetran automatically detects and persists these changes into your destination. For every schema in your PostgreSQL source database, we create a schema in your destination that maps directly to its native schema. This ensures that the data in your destination is in a familiar format to work with.
Syncing empty tables and columns
Fivetran will sync empty tables and columns for your PostgreSQL connection. We only sync empty columns to your destination once you make another data alteration to that table. Simply adding a new column will not be reflected in your destination. For more information, see our Features documentation.
Schema information
Fivetran tries to replicate the exact schema and tables from your database to your destination. We name the source schema with the database schema name and replicate the source's tables to the destination schema.
We do not sync foreign tables, views, or materialized views.
We name the destination schema with the connection name appended by the database schema name converted as per our schema naming rules. For example, if the connection name is postgres_db and the database schema name is fivetran_schema, the destination schema name is postgres_db_fivetran_schema.
Fivetran allows you to control how source names for tables and columns are replicated to the destination. You can choose between preserving the original source names with UTF-8 values or standardizing the naming conventions for schemas, tables, and columns in the destination.
- To standardize the naming conventions in the destination, select Fivetran naming.
- To preserve the original source names in the destination, select Source naming.
You can modify this selection at any time before the initial sync completes, offering flexibility in how your data is structured in the destination. For more information, see our naming conventions documentation.
Fivetran-generated columns
Fivetran adds the following columns to every table that is added to your destination:
_fivetran_synced(UTC TIMESTAMP) indicates the time when Fivetran last successfully synced the row._fivetran_id(STRING) is the hash of the non-Fivetran values of each row. It's a unique ID that Fivetran uses to identify rows in tables that do not have a primary key._fivetran_deleted(BOOLEAN) marks rows that were deleted in the source database.If you use Query-Based without deletes as your incremental sync method, your
_fivetran_deletedcolumn will have null values.
We add these columns to give you insight into the state of your data and the progress of your data syncs. For more information about these columns, see our System Columns and Tables documentation.
Type transformation and mapping
As we extract your data, we match PostgreSQL data types to types that Fivetran supports. If we don't support a certain data type, we automatically change that type to the closest supported type or, in some cases, don't load that data at all. Our system automatically skips columns of data types that we don't accept or transform.
For primary keys, all supported data types are valid.
The following table illustrates how we transform your PostgreSQL data types into Fivetran-supported types:
| PostgreSQL Type | Fivetran Type | Fivetran Supported | Notes |
|---|---|---|---|
| bigint / bigserial | LONG | True | |
| bit | STRING | True | |
| bit varying | STRING | True | |
| boolean | BOOLEAN | True | |
| box | JSON | True | For more information, see our How Does Fivetran Sync Native PostgreSQL Geometric Data Types documentation. |
| bytea | BINARY | True | |
| character | STRING | True | |
| character varying | STRING | True | |
| cidr | STRING | True | |
| circle | JSON | True | For more information, see our How Does Fivetran Sync Native PostgreSQL Geometric Data Types documentation. |
| citext | STRING | True | For more information, see our How Does Fivetran Sync PostgreSQL Extension Data Types documentation. |
| composite | STRING | True | |
| date | LOCALDATE | True | |
| daterange | JSON | True | For more information, see our How Does Fivetran Sync PostgreSQL Range Data Types documentation. |
| double precision | DOUBLE | True | |
| geography | JSON | True | For more information, see our How Does Fivetran Sync PostGIS Data Types documentation. |
| geometry | JSON | True | For more details, see our How Fivetran Syncs PostGIS Data Types documentation. |
| hstore | JSON | True | For more information, see our How Does Fivetran Sync PostgreSQL Extension Data Types documentation. |
| inet | STRING | True | |
| int4range | JSON | True | For more information, see our How Does Fivetran Sync PostgreSQL Range Data Types documentation. |
| int8range | JSON | True | For more information, see our How Does Fivetran Sync PostgreSQL Range Data Types documentation. |
| integer / serial | INTEGER | True | |
| interval | DOUBLE | True | |
| json | JSON | True | JSON columns with duplicate keys are handled by retaining the last key’s value. |
| jsonb | JSON | True | |
| line | JSON | True | For more information, see our How Does Fivetran Sync Native PostgreSQL Geometric Data Types documentation. |
| line segment | JSON | True | For more information, see our How Does Fivetran Sync Native PostgreSQL Geometric Data Types documentation. |
| ltree | STRING | True | For more information, see our How Does Fivetran Sync PostgreSQL Extension Data Types documentation. |
| macaddr | STRING | True | |
| macaddr8 | STRING | True | |
| money | BIGDECIMAL | True | |
| numeric / decimal | BIGDECIMAL | True | |
| numrange | JSON | True | For more information, see our How Does Fivetran Sync PostgreSQL Range Data Types documentation. |
| oid | LONG | True | |
| path | JSON | True | For more information, see our How Does Fivetran Sync Native PostgreSQL Geometric Data Types documentation. |
| pg_lsn | STRING | True | |
| point | JSON | True | For more information, see our How Does Fivetran Sync Native PostgreSQL Geometric Data Types documentation. |
| polygon | JSON | True | For more information, see our How Does Fivetran Sync Native PostgreSQL Geometric Data Types documentation. |
| real | FLOAT | True | |
| smallint / smallserial | SHORT | True | |
| text | STRING | True | |
| time with time zone | STRING | True | |
| time without time zone | STRING | True | |
| timestamp with time zone | INSTANT | True | |
| timestamp without time zone | LOCALDATETIME | True | |
| tsquery | STRING | True | |
| tsrange | JSON | True | For more information, see our How Does Fivetran Sync PostgreSQL Range Data Types documentation. |
| tstzrange | JSON | True | For more information, see our How Does Fivetran Sync PostgreSQL Range Data Types documentation. |
| tsvector | STRING | True | |
| uuid | STRING | True | |
| xml | XML | True |
We sync unknown data types to the destination as strings. If you need a data type we don't support, reach out to Fivetran support.
When loading data into your destination, we sometimes convert Fivetran data types into types supported by the destination. For more information, see the individual destination pages.
We support all array types (for example, integer[], text[], uuid[], etc.), including arrays of user-defined types. Array values are synced as JSON arrays in the destination.
Enumerated types
Fivetran supports enumerated types (ENUM), which are user-defined data types. Learn more in PostgreSQL's Enumerated Types documentation.
We transform and store enumerated types as STRING in the destination.
Domain data types
Fivetran supports domains, which are user-defined data types. When you create a domain, you choose an underlying data type and have the option of specifying additional constraints. Learn more in PostgreSQL's domains documentation.
We transform and store domain data based on their underlying data types.
Excluding source data
If you don’t want to sync all the data from your source database, you can exclude schemas, tables, or columns from your syncs on your Fivetran dashboard. To do so, go to your connection details page and uncheck the objects you would like to omit from syncing. For more information, see our Data Blocking documentation.
Alternatively, you can restrict the Fivetran user's access to certain tables or columns in your source database.
You cannot exclude columns which are primary keys or foreign keys.
How to allow only a subset of tables:
To grant access only to some tables in a schema, first make sure that the Fivetran user has access to the schema itself:
GRANT USAGE ON SCHEMA "some_schema" TO fivetran;
Next, remove any previously granted permissions to all tables in that schema:
ALTER DEFAULT PRIVILEGES IN SCHEMA "some_schema" REVOKE SELECT ON TABLES FROM fivetran;
REVOKE SELECT ON ALL TABLES IN SCHEMA "some_schema" FROM fivetran;
Repeat this command for each table you wish to include:
GRANT SELECT ON "some_schema"."some_table" TO fivetran;
GRANT SELECT ON "other_schema".* TO fivetran; /* all tables in schema */
Any tables created in the future will be excluded from the Fivetran user by default. To include them, run:
ALTER DEFAULT PRIVILEGES IN SCHEMA "some_schema" GRANT SELECT ON TABLES TO fivetran;
There is no way to grant access to all tables at once, so you need to individually select all the tables you do want. It is not possible to achieve exclusion by granting Fivetran access to all tables and then revoking access for a subset of tables.
How to allow only a subset of columns:
To grant access only to some columns in a table, first remove any previously granted permission to all columns in the table:
REVOKE SELECT ON "some_schema"."some_table" FROM fivetran;
Then grant permission to only specific columns (for example, some_column and other_column:)
GRANT SELECT (xmin, "some_column", "other_column") ON "some_schema"."some_table" TO fivetran;
Any new columns added to that table in the future will be excluded from access by default. To include them, re-run the command above with the new column included.
To grant access to all columns except one, you must individually grant access to all other columns. If you are using Query-Based as your change data capture method, you must also grant permissions to the hidden system column xmin. We need access to the xmin column to perform our incremental updates.
You can automate this process by scripting in your favorite language. Here is an example of a fixed set of SQL commands being executed via a BASH script:
#!/bin/sh
#Fill in the values without quotes or a space after the equals
host= # ex: 10.10.135.135
port= # ex: 30054
user= # ex: user
password= # ex: asdf235235asfsdf212
#List all of your SQL commands wrapped in single quotes, for example
sql=(
'REVOKE ALL PRIVILEGES, GRANT OPTION FROM fivetran;'
'GRANT REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO fivetran;'
'GRANT SELECT ON schema1.* TO fivetran;'
'GRANT SELECT ON schema1.* TO fivetran;'
)
#Connecting to Redshift, hence the PGPASSWORD
do
PGPASSWORD=$password psql --host=$host --port=$port --user=$user $db -c "${sql[$i]}"
done
Initial sync
When Fivetran connects to a new database, we first copy all rows from every table in every schema for which we have SELECT permissions (except for those that you excluded on your Fivetran dashboard) and add Fivetran-generated columns. We copy the tables in ascending size order (from smallest to largest). We copy rows by performing a SELECT statement on each table. For large tables, we copy a limited number of rows at a time so that we don't have to start the sync over from the beginning if the connection to your source is lost midway.
Updating data
Fivetran performs incremental updates of any new or modified data from your source database. We use one of the following methods to perform incremental updates:
The XMIN system column and Fivetran Teleport Sync methods are replaced by the Query-Based method, while the sync behavior remains the same.
Logical replication vs Query-Based
We recommend using logical replication as the incremental sync method if you have a large database.
There are several key differences between logical replication and Query-Based:
| Feature | Logical replication | Query-Based |
|---|---|---|
| Tracks inserts and updates | Yes | Yes |
| Tracks deletes | Yes | Optionally1 |
| Tracks DDL changes | Partially2 | Partially2 |
| Syncs tables with primary keys | Yes | Yes |
| Syncs tables without primary keys | Yes | Yes |
| Provides full record of changed data | Yes | Yes3 |
| Can read from a replica | Yes4 | Yes |
| Replicates partitioned tables | Yes5 | Partially6 |
| Replicates generated columns | Partially7 | Yes |
1 Query-Based with delete capture enabled retains only active records in the destination and does not preserve details about deleted rows. Use history mode to retain information about deleted rows.
2 Learn more in our Support for DDL changes documentation.
3 The Query-Based incremental sync method can capture entire tables as new.
4 Logical replication supports reading from a replica for PostgreSQL version 16 or later.
5 Logical replication supports replicating partitioned tables for PostgreSQL version 13 or later.
6 Query-Based with delete capture enabled only support child-to-child sync for partitioned tables.
7 Logical replication supports replicating GENERATED STORED columns with PUBLICATION option publish_generated_columns = stored for PostgreSQL version 18 or later.
Logical replication
Logical replication is based on logical decoding of the PostgreSQL write-ahead log (WAL). We recommend this method for incremental updates because it:
- Minimizes processing overhead on your PostgreSQL server
- Replicates row deletions for tables with primary keys
Fivetran supports the pgoutput plugin for logical replication. Your plugin must have its own replication slot. Ensure that you are connected to the correct database when you create your replication slot, or your connection will not be able to find the slot.
However, there are reasons why you might not want to or be able to use logical replication:
Logical replication does not replicate all DDL changes. Learn which DDL changes are supported in the Support for DDL changes table.
Logical replication does not replicate data from generated columns.
For PostgreSQL versions 13 and later, logical replication can replicate partitioned tables. Learn more in PostgreSQL's logical replication restrictions documentation.
Creating a column with the
ALTER TABLE ADD COLUMN SET DEFAULTcommand does not trigger an update to the existing table that the logical replication mechanism can recognize. Consequently, those changes are not part of our incremental updates. You need to do a full table re-sync to update the existing table's data in the destination. Learn more in PostgreSQL's Modifying Tables documentation.Heroku PostgreSQL does not support logical replication.
Fivetran does not support logical replication if you use the "swap and drop" method of replicating data. In the "swap and drop" method, you create a temporary table and load in data from your current table. You then drop the current table and rename the temporary table so it has the same name as your current table.
If logical replication is not an option for you, Fivetran can fall back on using the Query-Based method.
See our FAQ documentation for help in migrating your test_decoding connection to pgoutput.
Logical decoding output plugin
pgoutput is the standard logical decoding plugin for PostgreSQL version 10+. This plugin replicates from your custom publication without needing additional libraries.
A publication is a group of change events for a table or group of tables that you choose. To learn more, read PostgreSQL's publication documentation.
pgoutput allows you to use PostgreSQL's publications feature. This feature only replicates change events from tables in your publication, so you control which changes are replicated to Fivetran.
Query-Based
The XMIN system column method is replaced with the Query-Based method.
With Query-Based sync, Fivetran detects changes using PostgreSQL system columns:
xmin(always): Fivetran readsxminon each sync to identify rows that were inserted or updated since the last sync.ctid(only when capturing deletes is enabled): Fivetran scans currentctidvalues and compares them to the previously stored helper columnctid_fivetran_id. If a previously seenctidis missing, Fivetran treats the row as deleted.
Use the Query-Based method when you can’t use PostgreSQL logical replication, for example, if your PostgreSQL version is below 16 and you want to capture deletes from a read replica.
Query-Based sync may scan full tables to detect changes, which can increase load on your PostgreSQL source database and slow syncs, especially for large tables. Use Logical replication when possible.
DDL handled without a re-sync
The Query-Based method handles these DDL changes without triggering a full re-sync:
- Adding a new column to the table (
ALTER TABLE ... ADD COLUMN) - Changing an existing column's data type (
ALTER TABLE ... ALTER COLUMN TYPE) - Changing data types that do not affect existing values (for example, INT to BIGINT)
Rows that existed before you add a column will appear as NULL in the new column in the destination until those rows are updated in the source or you re-sync the table.
Query-Based limitations
- Performance overhead: Query-Based change detection can require scanning full tables to find updated rows, which can increase load on your PostgreSQL source database and slow syncs.
- Syncs rows on any column update:
xminchanges whenever any column in a row changes. As a result, Fivetran syncs the entire row on any update, even if the changed column isn't selected in your connection schema; Fivetran can't determine which updates to ignore. - Freezing and wraparound in high-volume databases: In high-volume databases,
xminfreezing and wraparound events can cause Fivetran to trigger automatic re-syncs, which may increase sync time. - Changes between syncs: Because Query-Based sync doesn’t continuously read transaction logs, it can’t guarantee accurate detection of rows that are deleted and reinserted between syncs. Learn more in Changes to data between syncs.
- Fast-changing rows: Rows that change multiple times within a single sync interval may be missed in the destination.
To reduce scan overhead and mitigate freezing/wraparound effects on PostgreSQL 15 or older, enable the pageinspect and pg_visibility PostgreSQL extensions. These extensions speed up incremental syncs and full imports, and help prevent older frozen rows from being re-synced during xmin wraparound events. For some accounts, these re-syncs may also increase MAR. For more information, see our FAQ page.
Capturing deletes
By default, Query-Based sync detects new and updated rows using the PostgreSQL xmin system column. To also detect deleted rows, enable the Capture Deletes option in the connection setup form.
When capturing deletes is enabled, Fivetran adds an internal helper column ctid_fivetran_id to each destination table that it creates and maintains. At the end of each sync, Fivetran stores the current ctid value for every source row in ctid_fivetran_id. On the next sync, Fivetran scans the table’s current set of ctid values and compares it to the previously stored ctid_fivetran_id values. If a previously stored ctid is missing, Fivetran treats the row as deleted and removes it from the destination to keep it in sync with the source.
Capturing deletes does not modify your source database. The helper column exists only in the destination tables that Fivetran builds.
If you enable capturing deletes before the initial sync, Fivetran creates the destination tables with ctid_fivetran_id during the initial sync. If you enable it after data is already synced, Fivetran performs a one-time re-sync of the selected tables to add ctid_fivetran_id. Once enabled, capturing deletes can’t be disabled for the connection.
Limitations
- If an
UPDATEchanges a row'sctid, Fivetran soft-deletes the old row version in the destination and writes a new row version. - Capturing deletes with Query-Based method works differently from Logical replication. It keeps only active records in the destination and doesn’t preserve details about deleted rows. To keep a record of deleted rows, use History Mode.
Fivetran Teleport Sync Sunset
The Fivetran Teleport Sync method has been sunset and replaced with the Query-Based replication method. Existing connections using the Teleport Sync will continue to function.
Fivetran Teleport Sync is a proprietary incremental sync method that can add delete capture with no additional setup other than a read-only SQL connection. Teleport captures updates using the PostgreSQL system columns xmin and ctid.
Use Fivetran Teleport Sync if enabling logical replication is not an option or if your Postgres version is below 16, and you want to capture deletes from a read replica.
Fivetran Teleport Sync supports syncing tables with composite primary keys.
Fivetran Teleport Sync's queries perform the following operations on your database:
- Do an index scan of each synced table's primary keys
- Aggregate a compressed table snapshot in the database's memory
For optimum Fivetran Teleport Sync performance, we recommend that you make the following resources available in your database:
- 1 GB Free RAM
- 1 Free CPU Core
- IOPS (Teleport Sync times decrease linearly with an increase of available IOPS)
- A minimum of 10% of your largest table size as the temporary table space
Teleport captures deletes differently than log-based replication mechanisms. Learn more in the Changes to data between syncs section of our history mode documentation.
Fivetran Teleport Sync can sync the following DDL commands without needing to re-sync the table:
- Adding a new column
- Changing an existing column's data type
- Data type changes that do not affect existing values (For example, INT to BIGINT)
We only recommend the use of the Teleport Sync method if WAL or XMIN is not an option for you. We recommend using WAL as the primary incremental sync method if you have a high-volume transactional database.
Limitations
We may perform a table resync if the connection is paused for a long duration of time
If you update large portions of a table between syncs, Fivetran Teleport Sync may be unable to determine the changes since the previous sync. When this occurs, the connection will re-sync the entire table to capture the latest data. To avoid these table re-syncs, we recommend that you increase the sync frequency of the connection.
Fast-changing rows may not be marked as deleted in the destination. We consider a row fast changing if more than one of the following operations are applied to it within the connection's sync frequency:
INSERTUPDATEDELETE
The Fivetran Teleport Sync method uses the Query-Based delete capture mechanism (which relies on the PostgreSQL system column
ctid) instead of its default mechanism in the following cases:- Tables without primary keys
- Tables whose primary key includes a column that is not one of the following data types:
- smallint
- integer
- bigint
- varchar(n) (variable-length with length limit or fixed-length)
- text (variable unlimited length)
- uuid
- numeric(precision, scale)
- date
- tid
- If a table fails to sync three times within a seven-day rolling window
Fivetran Teleport Sync for PostgreSQL does not support syncing more than 400 tables or tables with more than 50 million rows.
Tables with a primary key
We merge changes to tables with primary keys into the corresponding tables in your destination:
- An
INSERTin the source table generates a new row in the destination. - An
UPDATEin the source table updates the data in the corresponding row in the destination. - (For logical replication) A
DELETEin the source table updates the corresponding row in the destination with_fivetran_deleted = TRUE. - (For Query-Based sync with capturing deletes enabled) Fivetran matches rows in the destination using
ctid_fivetran_id(not the source primary key). If anUPDATEchanges a row’sctid, Fivetran soft-deletes the old row version and writes a new row version. This can create multiple row versions in the destination.
Fivetran requires SELECT permissions for all primary key columns in every table with a primary key that you want to sync. If you use logical replication as your incremental sync method, all primary key columns must be part of the publication.
Tables without a primary key
For the Query-Based method without capture deletes, we handle changes to tables without a primary key as follows:
- An
INSERTin the source generates a new row in the destination. - An
UPDATEin the source generates a new row in the destination and leaves the old version of that row in the destination untouched. As a result, one record in your source database may have several corresponding rows in your destination. - (For Query-Based sync with capturing deletes enabled) Fivetran matches rows in the destination using
ctid_fivetran_id. If anUPDATEchanges a row’sctid, Fivetran soft-deletes the old row version and writes a new row version. This can create multiple row versions in the destination.
For logical replication connections that use pgoutput, we can soft delete the old version of a row when it is updated if the table's replica identity is set to FULL.
For logical replication, we can track deletes if the table's replica identity is set to FULL or INDEX. Otherwise, we can only track inserts. Learn more in the Deleted Rows section.
Identify tables with primary keys
To find out which of your tables have (or don't have) primary keys, run this query:
SELECT table_schema, table_name, (table_schema, table_name) IN ( SELECT _schema.nspname AS table_schema, _table.relname AS table_name FROM pg_catalog.pg_constraint c LEFT JOIN pg_catalog.pg_class _table ON c.conrelid = _table.oid LEFT JOIN pg_catalog.pg_namespace _schema ON _table.relnamespace = _schema.oid WHERE c.contype = 'p' ) AS has_primary_key FROM information_schema.tables WHERE table_type = 'BASE TABLE' AND table_schema NOT IN ('information_schema', 'looker_scratch') AND NOT table_schema ~ '^pg_';
The third column in the query results, has_primary_key, is a binary value that shows whether or not each table has a primary key.
Partitioned tables
Fivetran supports replicating PostgreSQL tables with partitioning. We support both declarative partitioning and partitioning using inheritance. For more information about how partitioning works, see PostgreSQL’s Table Partitioning documentation.
Declarative partitioning
Newly-added partitions are not automatically included in your syncs unless your schema configuration is set to Allow All.
Logical replication in PostgreSQL versions 10-12 does not support adding partitioned tables directly to the publication. If your workflow involves regularly creating new partitions, you must add each new partition to the publication, either manually or automatically through setting up a trigger. Otherwise, your syncs may be rescheduled because of the missing partitions in the publication.
We sync incremental data from partitioned tables into the parent table. If you'd like to sync your partitions individually, reach out to Fivetran Support.
Partitioning using inheritance
Because a child table can inherit from multiple root tables, we treat each table as a separate entity. Therefore, the data physically residing in a parent table is synced into the corresponding parent table in the destination. Similarly, the data physically residing in a child table is synced into the corresponding child table in the destination. You can use the ONLY keyword in a select statement to query data that physically resides in a specific table, excluding data in its child tables.
If you want to see the result of SELECT * FROM base_table in the destination instead, set up a trigger to propagate changes from child tables to the base table in the source. If this is not an option, consider creating a view in the destination to aggregate data from child tables to the base table.
Deleted rows
For logical replication, we can capture deletes for tables without primary keys if the replica identity set to FULL or INDEX.
The Query-Based method captures deletes differently from log-based replication mechanisms. It retains only active records in the destination and does not preserve details about deleted rows. Because Query-Based sync does not continuously read transaction logs, we cannot guarantee accurate detection of deleted rows that are removed and reinserted between syncs. Learn more in our Changes to data between syncs documentation.
If capturing deletes is disabled when choosing Query-Based incremental sync method, we will not recognize deleted rows at all (for more information, see our Query-Based documentation). Rows that are deleted in the source database are still present in the destination. If you want to recognize deleted rows, use logical replication, or turn on capturing deletes. If neither of those is an option, use one of these strategies to capture deletions in the source database:
Add a trigger to each table to log the primary keys of deleted rows in a separate table. Once this table has been replicated to the destination, you can use it to generate a view that excludes the deleted rows.
Instead of removing rows, add an 'is_deleted' column to each table and change the business logic to set this column to
TRUEwhen you delete a row. To avoid build up of obsolete rows in the source database, you can delete them after a certain period. Make sure you leave enough time for Fivetran replicate the deleted rows to the destination before you remove them (we recommend seven days).
Support for DDL changes
We support different DDL changes based on which incremental sync method your connection uses. The following is a non-exhaustive list of supported DDL changes:
- Yes: The DDL change is reflected in the destination. You do not need to perform a re-sync.
- Partially: The DDL change is partially reflected in the destination.
- No: The DDL change is not reflected in the destination.
| DDL Statement | Logical Replication | Query-Based | Notes |
|---|---|---|---|
| CREATE TABLE | Yes | Yes | All methods: Creates a new table in the destination. |
| CREATE SCHEMA | Yes | Yes | |
| ADD COLUMN | Yes | Yes | All methods: Rows synced before the ADD COLUMN operation have null values in the newly added destination column. Re-sync the table to update the existing rows with the default value. |
| ADD COLUMN ... DEFAULT | Partially | Partially | All methods: Rows synced before the ADD COLUMN operation have null values in the newly added destination column, and won’t automatically have the default value. Re-sync the table to update the existing rows with the default value. |
| RENAME TO new_table_name | No | No | All methods: The subsequent sync imports the table as a new table. |
| RENAME TO new_schema_name | No | No | All methods: The subsequent sync imports the schema as a new schema. |
| RENAME COLUMN | Yes | Yes | All methods: Adds a new column to the destination. The old column still exists in the destination. We sync new rows inserted after this DDL operation to the new column. If you need data in the new column from before the DDL operation, trigger a table re-sync. |
| DROP COLUMN | No | No | |
| ALTER COLUMN ... TYPE ... | Yes | Yes | Logical Replication: We support any data types changing to string-based data types. If your data type change is not backward compatible (for example, TEXT to BOOLEAN), drop the replication slot and perform a historical re-sync. Query-Based: Certain destinations support data type changes, such as BigQuery and PostgreSQL. |
| ALTER COLUMN ... TYPE ... USING | No | Yes | |
| ALTER COLUMN ... SET NOT NULL | No | No | |
| DROP SCHEMA | No | No | |
| DROP TABLE | No | No | |
| TRUNCATE TABLE | Yes | No | Logical Replication: If you truncate a partition, that change is not reflected in the destination. Truncating one table in a transaction is supported. Truncating a table and its dependent tables using TRUNCATE CASCADE is not supported. If you want to truncate multiple tables, truncate one table per transaction. Query-Based: Re-sync the table to capture the truncate change. |
Historical re-sync scenarios
- If you want to migrate to another service provider, we will need to do a full re-sync of your data because the new service provider will not retain the same change tracking data as your original PostgreSQL database.
- If you want to switch the incremental sync method from Logical replication to Query-Based.
- If you want to enable capture deletes while using the Query-Based method.
Specific conditions for rescheduled syncs
In addition to the general conditions for rescheduled syncs, we reschedule syncs for our PostgreSQL connections under the following conditions:
- When re-sync takes longer than 24 hours
- When Query-Based method is not able to capture all changes.
- When certain unrecoverable failures occur.