Generic PostgreSQL Setup Guidelink
Follow these instructions to replicate your generic PostgreSQL database to your destination using Fivetran.
To connect your generic PostgreSQL database to Fivetran, you need:
- PostgreSQL version 7.3 or above
- Your database host's IP (e.g.,
22.214.171.124) or domain (
- Your database's port (usually
- TLS enabled on your database
Choose incremental update mechanismlink
To keep your data up to date after the initial sync, we use one of two incremental update methods: logical replication and XMIN. Both methods keep a record of recent data changes, which allows Fivetran to update only the data that has changed since our last sync.
TIP: We recommend using logical replication as your incremental update mechanism. Learn more in our Updating data documentation.
IMPORTANT: You can only enable logical replication if your PostgreSQL server version is 9.4.15+, 9.5.10+, 9.6.6+, or 10.1+. Prior minor versions have breaking bugs.
Logical replication is based on logical decoding of the PostgreSQL write-ahead log (WAL). Fivetran reads the WAL to detect any new or changed data. We recommend using logical replication because it is faster than XMIN replication and allows Fivetran to detect deleted rows for tables with primary keys.
Learn more in our logical replication documentation.
The XMIN method is based on the hidden
xmin system column that is present in all PostgreSQL tables. With XMIN, Fivetran must scan every table in full to detect updated data. We do not recommend XMIN for near real-time data needs because XMIN replication is slower than logical replication and doesn't allow Fivetran to detect deleted rows.
Learn more in our XMIN documentation.
Choose connection method (TLS required)link
IMPORTANT: You must have TLS enabled on your database to connect to Fivetran.
Decide whether to connect your generic PostgreSQL database directly or using an SSH tunnel.
Fivetran connects directly to your database instance. This is the simplest and most secure method.
To connect directly, do the following to configure your firewall and/or other access control systems:
- Allow incoming connections to your PostgreSQL port (usually
5432) from Fivetran's IPs for your database's region.
- Allow outgoing connections from all ports (
65535) to Fivetran's IPs for your database's region.
How you do this will vary based on how your PostgreSQL database is hosted (cloud platform, on-premises, etc.).
Connect using SSH
Fivetran connects to a separate server in your network that provides an SSH tunnel to your database. You must connect through SSH if your database is in an inaccessible subnet.
To connect using SSH, follow our SSH connection instructions to give Fivetran access to your SSH tunnel.
Create a database user for Fivetran's exclusive use.
Open a connection to your PostgreSQL database in a PostgreSQL console (such as a SQL workbench or psql).
Create a user for Fivetran by executing the following SQL command. Choose a memorable name (for example,
some-passwordwith a password of your choice.
CREATE USER fivetran PASSWORD 'some-password';
Grant read-only accesslink
Grant the Fivetran user read-only access to all tables by running the following commands. To grant access to a schema other than PostgreSQL's default
public schema, replace
public with the schema name.
GRANT USAGE ON SCHEMA "public" TO fivetran; GRANT SELECT ON ALL TABLES IN SCHEMA "public" TO fivetran; ALTER DEFAULT PRIVILEGES IN SCHEMA "public" GRANT SELECT ON TABLES TO fivetran;
NOTE: The last command makes sure that any future tables will be accessible to Fivetran.
If you want to grant access to multiple schemas, you must run these three commands for each schema.
Restrict access to tables (optional)
If you want to limit Fivetran's access to your tables, grant the Fivetran user access to only the tables that you would like to sync. You must individually grant access for each table that you want to sync. You cannot grant access to all tables and then revoke access for a subset of tables.
Ensure that the Fivetran user has access to the schema that contains your table(s).
GRANT USAGE ON SCHEMA "some_schema" TO fivetran;
Revoke any previously granted permission to all tables in that schema.
ALTER DEFAULT PRIVILEGES IN SCHEMA "some_schema" REVOKE SELECT ON TABLES FROM fivetran; REVOKE SELECT ON ALL TABLES IN SCHEMA "some_schema" FROM fivetran;
Repeat the following command for each table you want Fivetran to sync.
GRANT SELECT ON "some_schema"."some_table" TO fivetran;
By default, any tables that you create in the future will be excluded from the Fivetran user's access. To grant access to new tables, run the following command.
ALTER DEFAULT PRIVILEGES IN SCHEMA "some_schema" GRANT SELECT ON TABLES TO fivetran;
Restrict access to columns (optional)
If you want to limit Fivetran's access to the columns in your tables, grant the Fivetran user access to only certain columns. You must individually grant access for each column that you want to sync.
Revoke any previously granted permission to read all columns in the table.
REVOKE SELECT ON "some_schema"."some_table" FROM fivetran;
Grant permission to the specific columns you want to sync (for example,
NOTE: If you chose XMIN as your incremental update mechanism, you must grant us access to the hidden system column
GRANT SELECT (xmin, "some_column", "other_column") ON "some_schema"."some_table" TO fivetran;
Once you restrict access to columns within a table, the Fivetran user will not have access to any new columns added to that table in the future. To grant access to new columns, you must rerun the command above.
Configure incremental update mechanismlink
Configure your chosen incremental update mechanism.
To enable logical replication, follow these steps:
Go to your PostgreSQL database.
Ensure that your server has ample free space for the logs. As soon as Fivetran processes a log, we delete it. However, we don't delete logs if the sync is interrupted (for example, if we lose access to your database). In this case, logs may accumulate on your server and consume additional storage. The amount of additional disk space that these logs consume is proportional to the number of changes committed on the server. If we can't resume a lost connection quickly enough and you need more disk space, you can drop the replication slot, which deletes its unconsumed logs.
Ensure that the
statement_timeoutsetting on your server is either
0(the default value to disable the timeout) or greater than
wal_levelparameter in your database configuration to
logical. For a standard PostgreSQL database, do this by adding a
wal_level = logicalline to the
postgresql.conf1file. Restart the server for this change to take effect.
Ensure that your
max_replication_slotsvalue is equal to or higher than the number of PostgreSQL connectors that use WAL plus the number of other replication slots your database uses.
wal_sender_timeoutparameter in your database configuration to
0to disable the timeout.
Add a record to your
pg_hba.conffile that allows your database to authenticate Fivetran's connection to the WAL.
Go to the
postgresql.conffile and ensure that the
max_wal_sendersparameter, which specifies the maximum number of concurrent connections to the WAL, is at least twice the total number of logical replication slots. For example, if your database uses 11 replication slots in total, then the
max_wal_sendersvalue must be 22 or greater.
Log into a PostgreSQL console (such as a SQL workbench or psql) as a superuser.
Create a logical replication slot named
fivetran_replication_slotfor the database you want to sync. If you use a custom slot name instead, it cannot start with a number. You must use the output plugin
test_decodingplugin is in the
SELECT pg_create_logical_replication_slot('fivetran_replication_slot', 'test_decoding');
Grant permission to the Fivetran user for reading the replication slot.
ALTER ROLE fivetran WITH REPLICATION;
Log in as the Fivetran user.
Verify that the Fivetran user can read the replication slot by running the following command:
SELECT count(*) FROM pg_logical_slot_peek_changes('fivetran_replication_slot', null, null);
If the query succeeds, then permissions are sufficient.
IMPORTANT: You must periodically tune the
max_wal_sizeparameters based on your PostgreSQL database operations. If you do not, you may experience replication failures. To learn how to tune, read this tuning checkpoints documentation.
The XMIN method is based on the hidden
xmin system column that is present in all PostgreSQL tables. To enable XMIN, ensure that the
statement_timeout setting on your server is either
0 (the default value to disable the timeout) or greater than
Finish Fivetran configurationlink
In your connector setup form, enter a destination schema prefix. This prefix applies to each replicated schema and cannot be changed once your connector is created.
In the Host field, enter your database host's IP (for example,
126.96.36.199) or domain (for example,
Enter your database instance's port number. The port number is usually
Enter the Fivetran-specific user that you created in Step 3 (for example,
Enter the password for the Fivetran-specific user that you created in Step 3.
Enter the name of your database (for example,
Choose your connection method. If you selected Connect via an SSH tunnel, provide the following information:
- SSH hostname (do not use a load balancer's IP address/hostname)
- SSH port
- SSH user
Choose your update method. If you selected Logical replication of the WAL, enter the name of your database's replication slot.
Click Save & Test. Fivetran will take it from here and sync your data from your PostgreSQL database.