Azure Database for PostgreSQL Setup Guide
Follow these instructions to replicate your Azure Database for PostgreSQL to your destination using Fivetran.
Prerequisites
To connect your PostgreSQL database to Fivetran, you need:
- PostgreSQL version 9.5 - 16
- Your database host's IP (e.g.,
1.2.3.4
) or domain (e.g.,your.server.com
) - Your database's port (usually
5432
) - TLS enabled on your database
Setup instructions
IMPORTANT: Do not perform the Choose connection method step if you want to use Hybrid Deployment for your data pipeline.
Choose connection method (TLS required)
IMPORTANT: You must have TLS enabled on your database to connect to Fivetran.
Decide whether to connect your Azure Database for PostgreSQL directly, using an SSH tunnel, using Azure Private Link, or using Proxy Agent.
Connect directly
Connect directly
Fivetran connects directly to your database instance. This is the simplest method.
If you connect directly, you will create a rule in a security group that allows Fivetran access to your database instance.
Connect using SSH
Connect using SSH
Fivetran connects to a separate server in your network that provides an SSH tunnel to your database. You must connect through SSH if your database is in an inaccessible subnet.
Before you proceed to the next step, you must follow our SSH connection instructions.
Connect using Azure Private Link
IMPORTANT: You must have a Business Critical plan to use Azure Private Link.
Azure Private Link allows Virtual Networks (VNets) and Azure-hosted or on-premises services to communicate with one another without exposing traffic to the public internet. Learn more in Microsoft's Azure Private Link documentation.
Follow our Azure PrivateLink setup guide to configure Private Link for your database.
Connect using Proxy Agent
Fivetran connects to your database through the Proxy Agent, providing secure communication between Fivetran processes and your database host. The Proxy Agent is installed in your network and creates an outbound network connection to the Fivetran-managed SaaS.
To learn more about the Proxy Agent, how to install it, and how to configure it, see our Proxy Agent documentation.
(Optional) Authenticate using Microsoft Entra ID Private Preview
Fivetran supports authenticating using Microsoft Entra ID in the following configurations:
- Azure Database for PostgreSQL flexible server
- Connection methods:
- Direct connection
- Azure Private Link
- Incremental sync methods:
- XMIN
- Fivetran Teleport Sync
NOTE: You cannot authenticate using Microsoft Entra ID if you use the logical replication incremental sync method.
Expand for instructions
Follow the instructions in the Register an application with the Microsoft identity platform documentation register a new application.
Follow the instructions in the Microsoft Add credentials documentation to add a credential with a client secret. Store the generated client secret in a secure place, because you cannot access this value later.
In the Azure Database for PostgreSQL flexible server, navigate to the Authentication section and ensure that one of the following options is selected:
- Microsoft Entra authentication only
- PostgreSQL and Microsoft Entra authentication
Add a Microsoft Entra Admin if an admin is not already present.
Connect to the
postgres
database within your Azure Database for PostgreSQL flexible server using the Microsoft Entra Admin's credentials.Follow the instructions in the Manage Microsoft Entra roles in Azure Database for PostgreSQL - Flexible Server documentation to add the service principal as user in PostgreSQL. Once you run the command
SELECT * FROM pgaadauth_create_principal('fivetran_user', false, false);
, you can now use the service principalfivetran_user
as a user.Fill in the following details in the corresponding fields in the connector setup form:
- User: Registered App Display Name
- Password: Client Secret Value
- Entra App ID: Registered App (client) ID
- Azure Tenant ID: Directory (tenant) ID
Enable access
You must configure the firewall to grant Fivetran's data processing servers access to your database server.
In the Azure console, open the SQL database firewall settings.
Select the Azure Database for PostgreSQL that you want to connect to Fivetran.
In Settings, click Connection security.
Add a new firewall rule. Enter Fivetran's IP in both the Start IP and End IP fields to define the firewall rule's range.
Click Save.
Create user
Open a connection to your primary PostgreSQL database.
Create a user for Fivetran by executing the following SQL command. Replace
<username>
andsome-password
with a username and password of your choice.
CREATE USER <username> PASSWORD 'some-password';
Grant read-only access
Grant the Fivetran user read-only access to all tables by running the following commands. To grant access to a schema other than PostgreSQL's default public
schema, replace public
with the schema name.
GRANT USAGE ON SCHEMA "public" TO <username>;
GRANT SELECT ON ALL TABLES IN SCHEMA "public" TO <username>;
ALTER DEFAULT PRIVILEGES IN SCHEMA "public" GRANT SELECT ON TABLES TO <username>;
NOTE: The last command makes sure that any future tables will be accessible to Fivetran.
If you want to grant access to multiple schemas, you must run these three commands for each schema.
Restrict access to tables (optional)
If you want to limit Fivetran's access to your data, grant the Fivetran user access to only the tables that you would like to sync. You need to individually grant access for each table that you want to sync. It is not possible to achieve exclusion by granting access to all tables and then revoking access for a subset of tables.
Ensure that the Fivetran user has access to the schema that contains your table(s).
GRANT USAGE ON SCHEMA "some_schema" TO <username>;
Revoke any previously granted permission to all tables in that schema.
ALTER DEFAULT PRIVILEGES IN SCHEMA "some_schema" REVOKE SELECT ON TABLES FROM <username>; REVOKE SELECT ON ALL TABLES IN SCHEMA "some_schema" FROM <username>;
Repeat the following command for each table you want Fivetran to sync.
GRANT SELECT ON "some_schema"."some_table" TO <username>;
By default, any tables that you create in the future will be excluded from the Fivetran user's access. To grant access to new tables, run the following command.
ALTER DEFAULT PRIVILEGES IN SCHEMA "some_schema" GRANT SELECT ON TABLES TO <username>;
Restrict access to columns (optional)
You can also grant the Fivetran user access to only certain columns within a table. You need to individually grant access for each column that you want to sync.
NOTE: We need access to the hidden system column
xmin
for incremental updates.
Ensure that you have revoked any previously granted permission to read all columns in the table.
REVOKE SELECT ON "some_schema"."some_table" FROM <username>;
Grant permission to the specific columns you want to sync (for example,
some_column
andother_column
).IMPORTANT: If you chose XMIN as your incremental sync method, you must grant us access to the hidden system column
xmin
. While it's not mandatory, you can also grant us access to the hidden system columnctid
to make your initial sync faster.GRANT SELECT (xmin, ctid, some_column, other_column) ON "some_schema"."some_table" TO <username>;
Once you restrict access to columns within a table, the Fivetran user will not have access to any new columns added to that table in the future. To grant access to new columns, you must rerun the command above.
Configure incremental sync method
To keep your data up to date after the initial sync, we use one of the following incremental sync methods. The logical replication and XMIN methods keep a record of recent data changes, which allows Fivetran to update only the data that has changed since our last sync. Fivetran Teleport Sync instead takes snapshots of tables to calculate differences.
Configure your chosen incremental sync method:
TIP: We recommend using logical replication as your incremental sync method because it is faster than XMIN replication and allows Fivetran to detect deleted rows for tables with primary keys. Learn more in our Updating data documentation.
Logical replication
IMPORTANT: You can only enable logical replication if your Azure Database for PostgreSQL flexible server's version is 11 or later.
We do not support logical replication for Azure Database for PostgreSQL single server.
Logical replication relies on the logical decoding of the PostgreSQL write-ahead log (WAL). Fivetran reads the WAL using the pgoutput
plugin to detect any new or changed data. This plugin replicates data from your custom publication without needing additional libraries. Learn more in our logical replication documentation.
We support logical replication of read replicas only if you are using PostgreSQL version 16 or higher. If your PostgreSQL version is below 16, you must connect to your primary database.
To enable logical replication, follow these steps:
Ensure that your server has ample free space for the logs. Logs that Fivetran has already processed are released. However, logs are not released if replication stops (for example, if we lose access). In this case, logs may accumulate on your server and consume additional storage. The amount of additional disk space consumed by these logs is proportional to the amount of changes committed on the server. If a lost connection can't be resumed quickly enough, you can drop the replication slot, which releases the storage of unconsumed logs. You would then need to do a full re-sync of your connector to reset the cursor in the replication slot.
In your Azure portal, do the following:
i. Set the
wal_level
server parameter to LOGICAL, then click Save.ii. Click Yes to restart the server to apply the change.
Use a PostgreSQL console to log in to your primary database as an admin user.
Create a publication for your tables. If you want, you can create a publication for only certain tables so that you add or remove tables from the publication later on. Only changes from tables in the publication are replicated to Fivetran. Each database can have multiple distinct publications. You must have
CREATE
privileges or above to run this command.IMPORTANT: The publication name
fivetran_pub
quoted throughout this guide is used purely as an example. The actual publication name should be unique for every database and cannot start with a number.CREATE PUBLICATION fivetran_pub FOR TABLE table2, table4, table8;
To add or remove a table from a publication, run the following command. You must have ownership rights over the table(s).
ALTER PUBLICATION fivetran_pub ADD/DROP TABLE table_name;
Alternatively, you can create a publication for all of your tables. However, you cannot remove any table from this publication later on. You must have superuser privileges to run this command.
CREATE PUBLICATION fivetran_pub FOR ALL TABLES;
(Optional) You can choose which operations to include in the publication. For example, the following publication includes only
INSERT
andUPDATE
operations.CREATE PUBLICATION insert_only_pub FOR TABLE table1 WITH (publish = 'INSERT, UPDATE');
To add partitioned tables for PostgreSQL version 13 or later, run the following command to enable publish_via_partition_root.
CREATE PUBLICATION fivetran_pub FOR ALL TABLES WITH (publish_via_partition_root=true);
Create a logical replication slot for the database you want to sync by running the following command. You must use the standard output plugin
pgoutput
. Ensure that you are connected to the correct database when you create your replication slot, or your connector will not be able to find the slot.To create replication slot, the admin user has to have
REPLICATION
permission:ALTER ROLE <admin_role> WITH REPLICATION;
IMPORTANT: You must create a unique replication slot for every connector that uses the same PostgreSQL cluster. Replication slot names cannot start with a number. (The replication slot name
fivetran_pgoutput_slot
quoted throughout this guide is used purely as an example.)IMPORTANT: You need to create the replication slot after you have created the publication.
SELECT pg_create_logical_replication_slot('fivetran_pgoutput_slot', 'pgoutput');
NOTE: If your PostgreSQL server version is 16 or later and you want to sync from a standby, create the replication slot in the read replica.
Verify that your chosen tables are in the publication.
SELECT * FROM pg_publication_tables;
Grant the Fivetran user permission to read the replication slot.
ALTER ROLE fivetran WITH REPLICATION;
Log in as the Fivetran user.
Verify that the Fivetran user can read the replication slot by running the following command. Replace
fivetran_pgoutput_slot
with your replication slot name andfivetran_pub
with the publication name.SELECT count(*) FROM pg_logical_slot_peek_binary_changes('fivetran_pgoutput_slot', null, null, 'proto_version', '1', 'publication_names', 'fivetran_pub');
If the query succeeds, then permissions are sufficient.
XMIN
IMPORTANT: You must use the XMIN method if your Azure Database for PostgreSQL version is earlier than 10.
The XMIN method relies on the hidden xmin
system column that is present in all PostgreSQL tables. With XMIN, Fivetran must scan every table in full to detect updated data. The need for a full table scan can lead to significant processing overhead, especially for large tables. Learn more in our XMIN documentation.
You do not need to do any additional configuration for the XMIN method. However, we recommend that you filter frozen pages from incremental syncs. Filtering makes incremental syncs significantly shorter, which means that we use fewer resources in your source database.
(Optional) To filter frozen pages, run the following commands on your primary database as a superuser:
CREATE EXTENSION pg_visibility;
CREATE SCHEMA fivetran;
CREATE OR REPLACE FUNCTION fivetran.get_all_pages(v_table_name character varying)
RETURNS TABLE (
pagenumber integer,
all_visible_yn boolean,
all_frozen_yn boolean)
LANGUAGE plpgsql
SECURITY definer
AS $function$
declare
begin
RETURN QUERY
SELECT blkno::int as pageNumber,
all_visible as all_visible_yn,
all_frozen as all_frozen_yn
FROM pg_visibility_map($1::regclass);
END;
$function$;
GRANT USAGE ON SCHEMA fivetran TO <username>;
GRANT EXECUTE ON FUNCTION fivetran.get_all_pages TO <username>;
Fivetran Teleport Sync
Fivetran Teleport Sync is a proprietary incremental sync method that can capture deletes without requiring additional setup beyond a read-only SQL connection. Updates are captured using the XMIN system column. Learn more in our Fivetran Teleport Sync documentation.
If you are trying to connect with a standby or read replica, run the following SQL command on your primary database as the Fivetran user:
CREATE AGGREGATE BIT_XOR(IN v bigint) (SFUNC = int8xor, STYPE = bigint);
If you are not connecting with a read replica, you do not need to do any additional configuration. The aggregate that the Teleport mechanism will later use is automatically created for you.
Finish Fivetran configuration
In your connector setup form, enter a Destination schema prefix. This is used as the connector name and cannot be modified once the connector is created.
NOTE: Depending on your selection in the Destination schema names field, we will either prefix the connector name to each replicated schema or use the source schema names instead.
In the Host field, enter your database host's IP (for example,
1.2.3.4
) or domain (for example,your-database.cp0rdhwjbsae.us-east-1.rds.amazonaws.com
)Enter your database instance's port number. The port should be
5432
, unless you changed the default.For the User, enter the username:
- If you are connecting with Flexible Server, enter
<username>
. - If you are connecting with Single Server, enter
<username>@<servername>
, where<servername>
is part of your Azure host URL:<servername>.database.windows.net
.
- If you are connecting with Flexible Server, enter
Enter the password for the Fivetran-specific user that you created in Step 3.
Enter the name of the database that you want to replicate from (for example,
your_database
).(Hybrid Deployment only) If your destination is configured for Hybrid Deployment, the Hybrid Deployment Agent associated with your destination is pre-selected in the Select an existing agent drop-down menu. To use a different agent, select the agent of your choice, and then select the same agent for your destination.
(Not applicable to Hybrid Deployment) Choose your connection method. If you selected Connect via an SSH tunnel, copy or make a note of the Public Key and add it to the
authorized_keys
file while configuring the SSH tunnel, and provide the following information:- SSH host (do not use a load balancer's IP address/hostname)
- SSH port
- SSH user
Choose your Update Method. If you selected Logical Replication, enter both the name of your database's replication slot and publication name accordingly.
Private PreviewIn the Destination schema names field, select Fivetran naming or Source naming to determine how the schema names appear in your destination. For more information, see the Schema information section.
Click Save & Test. Fivetran tests and validates our connection to your Azure Database for PostgreSQL. Upon successful completion of the setup tests, you can sync your data using Fivetran.
Setup tests
Fivetran performs the following tests to ensure that we can connect to your Azure Database for PostgreSQL and that it is properly configured:
- The Connecting to SSH Tunnel Test validates the SSH tunnel details you provided in the setup form. It then checks that we can connect to your database using the SSH Tunnel. (We skip this test if you aren't connecting using SSH.)
- The Connecting to Host Test validates the database credentials you provided in the setup form. The test verifies that the host is not private and then checks the connectivity to the host.
- The Validating Certificate Test generates a pop-up window where you must choose which certificate you want Fivetran to use. It then validates that certificate and checks that we can connect to your database using TLS. (We skip this test if you selected an indirect connection method and then disabled the Require TLS through Tunnel toggle.)
- The Connecting to Database Test checks that we can access your database.
- The Connecting to WAL Replication Slot Test confirms that the database associated with the replication slot matches the name you supplied in the setup form. It then verifies that the replication slot uses the
pgoutput
plugin if you selected the logical replication update method. Lastly, it makes sure that the Fivetran user has replication privileges. (We skip this test if you selected XMIN or Teleport as your incremental sync method) - The Checking Configuration Values Test checks a set of WAL-configured values against the recommended settings and detects if they are below the recommended range. (We skip this test if you selected XMIN as your incremental sync method.)
- The Publication Test verifies that the supplied publication name exists in your database. (We skip this test if you selected XMIN as your incremental sync method.)
- The Validating Speed Setup test validates Fivetran can fetch data from your source database quickly enough. During this test, we measure our ability to download sample data from your source database to Fivetran, but we do not perform a historical sync. We start a timer, then download the sample data in memory. We then calculate the connector speed based on how much data we downloaded and how long it took to download. The test shows a warning if the download speed is less than 5MB/sec.
- The XMIN Extensions test checks that the correct extensions are enabled for XMIN. (This test is skipped if you select Logical Replication as your incremental sync method.)
NOTE: The tests may take a few minutes to finish running.
Related articles
description Connector Overview
account_tree Schema Information
settings API Connector Configuration