Azure PostgreSQL Database Setup Guidelink
Follow these instructions to replicate your Azure PostgreSQL database to your destination using Fivetran.
Prerequisiteslink
To connect your PostgreSQL database to Fivetran, you need:
- PostgreSQL version 7.3 or above
- Your database host's IP (e.g.,
1.2.3.4
) or domain (e.g.,your.server.com
) - Your database's port (usually
5432
) - TLS enabled on your database
Choose a connection method (TLS required)link
IMPORTANT: You must have TLS enabled on your database to connect to Fivetran.
Decide whether to connect your Azure PostgreSQL database directly or using an SSH tunnel. How you configure your security groups will differ depending on this decision.
Connect directly
Fivetran connects directly to your database instance. This is the simplest and most secure method.
If you connect directly, you must create a rule in a security group that allows Fivetran access to your database instance.
Connect using SSH
Fivetran connects to a separate server in your network that provides an SSH tunnel to your database. You must connect through SSH if your database is in an inaccessible subnet.
If you connect using SSH, you must follow our SSH tunnel instructions before proceeding to the next step. You must then configure your tunnel server's security group to allow Fivetran access and configure the database instance's security to allow access from the tunnel.
Enable accesslink
You must configure the firewall to grant Fivetran's data processing servers access to your database server.
-
In the Azure console, open the SQL database firewall settings.
-
Select the Azure PostgreSQL database that you want to connect to Fivetran.
-
In Settings, click Connection security.
-
Add a new firewall rule. Enter Fivetran's IP in both the Start IP and End IP fields to define the firewall rule's range.
-
Click Save.
Create userlink
-
Open a connection to your master PostgreSQL database.
-
Create a user for Fivetran by executing the following SQL command. Replace
some-password
with a password of your choice.
CREATE USER fivetran PASSWORD 'some-password';
Grant read-only accesslink
Grant the fivetran
user read-only access to all tables by running the following commands. To grant access to a schema other than PostgreSQL's default public
schema, replace public
with the schema name.
GRANT USAGE ON SCHEMA "public" TO fivetran;
GRANT SELECT ON ALL TABLES IN SCHEMA "public" TO fivetran;
ALTER DEFAULT PRIVILEGES IN SCHEMA "public" GRANT SELECT ON TABLES TO fivetran;
NOTE: The last command makes sure that any future tables will be accessible to Fivetran.
If you want to grant access to multiple schemas, you must run these three commands for each schema.
Restrict access to tables (optional)link
If you want to limit Fivetran's access to your data, grant the fivetran
user access to only the tables that you would like to sync. You need to individually grant access for each table that you want to sync. It is not possible to achieve exclusion by granting access to all tables and then revoking access for a subset of tables.
-
Ensure that the
fivetran
user has access to the schema that contains your table(s).GRANT USAGE ON SCHEMA "some_schema" TO fivetran;
-
Revoke any previously granted permission to all tables in that schema.
ALTER DEFAULT PRIVILEGES IN SCHEMA "some_schema" REVOKE SELECT ON TABLES FROM fivetran; REVOKE SELECT ON ALL TABLES IN SCHEMA "some_schema" FROM fivetran;
-
Repeat the following command for each table you want Fivetran to sync.
GRANT SELECT ON "some_schema"."some_table" TO fivetran;
-
By default, any tables that you create in the future will be excluded from the
fivetran
user's access. To grant access to new tables, run the following command.ALTER DEFAULT PRIVILEGES IN SCHEMA "some_schema" GRANT SELECT ON TABLES TO fivetran;
Restrict access to columns (optional)link
You can also grant the fivetran
user access to only certain columns within a table. You need to individually grant access for each column that you want to sync.
NOTE: We need access to the hidden system column
xmin
for incremental updates.
-
Ensure that you have revoked any previously granted permission to read all columns in the table.
REVOKE SELECT ON "some_schema"."some_table" FROM fivetran;
-
Grant permission to the specific columns you want to sync (for example,
some_column
andother_column
).GRANT SELECT (xmin, "some_column", "other_column") ON "some_schema"."some_table" TO fivetran;
Once you restrict access to columns within a table, the fivetran
user will not have access to any new columns added to that table in the future. To grant access to new columns, you must rerun the command above.
Choose incremental update mechanismlink
To keep your data up to date after the initial sync, we use one of two incremental update methods: logical replication and XMIN. Both methods keep a record of recent data changes, which allows Fivetran to update only the data that has changed since our last sync. We support logical replication and the XMIN method on Azure PostgreSQL versions 10 or later. For earlier versions, we only support the XMIN method. As a result, our Azure PostgreSQL connector does not support replicating deleted data for versions before 10.
TIP: We recommend using logical replication as your incremental update mechanism. Learn more in our Updating data documentation.
Choose either logical replication or XMIN as your incremental update mechanism.
- To enable logical replication, proceed to the next section.
- To enable XMIN, skip ahead to the XMIN section.
Logical replicationlink
IMPORTANT: You can only enable logical replication if your Azure PostgreSQL version is 10 or later.
Logical replication is based on logical decoding of the PostgreSQL write-ahead log (WAL). To enable logical replication, follow these steps:
-
Connect to your master database. You cannot enable logical replication on a read replica.
-
Ensure that your server has ample free space for the logs. Logs that Fivetran has already processed are released. However, logs are not released if replication stops (for example, if we lose access). In this case, logs may accumulate on your server and consume additional storage. The amount of additional disk space consumed by these logs is proportional to the amount of changes committed on the server. If a lost connection can't be resumed quickly enough, you can drop the replication slot, which releases the storage of unconsumed logs. You would then need to do a full re-sync of your connector to reset the cursor in the replication slot.
-
In your Azure portal, do the following:
i. Set Azure replication support to logical, then click Save.
ii. Click Yes to restart the server to apply the change.
-
Log into a PostgreSQL console as a superuser (one that has the
rds_superuser
role). -
Run the following command to create a logical replication slot named
fivetran_replication_slot
for the database you wish to sync. You must use the output plugintest_decoding
.SELECT pg_create_logical_replication_slot('fivetran_replication_slot', 'test_decoding');
-
Grant permission to the
fivetran
user for reading the replication slot.GRANT rds_replication TO fivetran;
The
fivetran
user does not need therds_superuser
role. -
Log in as the
fivetran
user. -
Verify that the
fivetran
user can read the replication slot by running the following command.SELECT count(*) FROM pg_logical_slot_peek_changes('fivetran_replication_slot', null, null);
If the query succeeds, then permissions are sufficient.
XMINlink
You do not need to do any additional configuration for the XMIN method. You must use the XMIN method if your Azure PostgreSQL version is earlier than 10.
Enter user, password, and database in setup formlink
In your Fivetran setup form, enter your user, password, and database name.
- For the User, enter
fivetran@<servername>
, where<servername>
is part of your Azure host URL:<servername>.database.windows.net
. - For the Password, enter the password you set when you created the Fivetran user.
- For the Database, enter the database you want to replicate from.
Choose schema prefixlink
This is the last step of the integration. Each schema from the source database will be mapped to a schema in the destination by adding a prefix to the original schema name. For example, if your original database contains schemas "foo" and "bar" and if you choose the prefix "pre", then you will get schemas "pre_foo" and "pre_bar" in the output.
Related articleslink
description Connector Overview
account_tree Schema Information
settings API Connector Configuration