Hybrid Deployment with Docker and Podman Setup Guide

Follow this guide to set up the Hybrid Deployment model with Docker or Podman.

Prerequisites

To use Hybrid Deployment with Docker or Podman, you need a server or virtual machine (VM) in your local environment with the following:

Operating system: A modern and up-to-date Linux distribution with Docker (v20.10.17 or above) or Podman (v4.6.1 or above) container runtime.
CPU and memory: Minimum 8 vCPUs with x86-64 processors and 32 GB of RAM. The CPU and memory requirements depend on the number of jobs running concurrently. The minimum requirement for each job is 2 vCPUs and 4 GB of RAM. Therefore, we recommend adjusting the CPU and memory resources according to the number of concurrent jobs and the volume of your dataset.
Storage:
- Minimum 50 GB of allocated disk space for the Docker or Podman storage location. For more information, see our FAQ documentation.
- Persistent local storage must be sufficient to accommodate the total dataset volume for all the connections you plan to deploy. The default location is $HOME/fivetran/data.
User: A non-root Linux user to run the containers (for example, fivetran). The user must have the permissions necessary to run Docker or Podman. For more information about creating a new Linux user, see our FAQ documentation.
Network:
- Reliable connectivity to both the source and destination.
- Outbound connectivity to the following external IP addresses and services:
  - mTLS connection to the Fivetran Orchestration Service: 35.188.225.82 - ldp.orchestrator.fivetran.com
  - HTTPS with secure token to Fivetran Public API: 35.236.237.87 - api.fivetran.com
  - Google Artifact Registry - us-docker.pkg.dev (Google address range: 142.250.0.0 - 142.251.255.255)
  - GitHub repository hosting the automated installation script: raw.githubusercontent.com/fivetran/hybrid_deployment (GitHub address range: 185.199.108.0 - 185.199.111.255)
  - Logs used by the Fivetran Platform Connector - storage.googleapis.com/fivetran-metrics-log-sr
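
The runtime version requirement in the list above can be checked before you continue. The following is a minimal sketch, not part of the official tooling: `version_ge` and `check_runtime` are hypothetical helper names, and the comparison assumes a `sort` that supports version ordering (`-V`, as in GNU coreutils).

```shell
# Minimum runtime versions taken from the prerequisites above.
MIN_DOCKER="20.10.17"
MIN_PODMAN="4.6.1"

# version_ge A B: succeeds when version A is greater than or equal to B.
# Assumption: sort supports -V (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_runtime NAME MIN: reports whether the installed client meets MIN.
check_runtime() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "$1: not installed"
    return 1
  fi
  # Both CLIs accept this Go template for the client version.
  installed=$("$1" version --format '{{.Client.Version}}' 2>/dev/null)
  if version_ge "$installed" "$2"; then
    echo "$1 $installed: OK (minimum $2)"
  else
    echo "$1 $installed: too old (minimum $2)"
    return 1
  fi
}
```

To use it, call `check_runtime docker "$MIN_DOCKER"` or `check_runtime podman "$MIN_PODMAN"` as the user that will run the containers.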

We recommend running Docker or Podman in rootless mode for improved security. Before using rootless mode, ensure the user's $HOME location has at least 50 GB of allocated disk space. For more information about configuring Docker and Podman in rootless mode, see our FAQ documentation.
Amazon Linux 2 does not support Docker in rootless mode. If you want to use Amazon Linux as your operating system, we recommend using the latest Amazon Linux 2023 (AL2023) x86_64 AMI.
If you want to run multiple pipeline processes concurrently on the same host, make sure you adjust the underlying resource capacity accordingly.
If you want to use Docker Desktop, which already includes Docker Engine, confirm that your Docker subscription plan supports it.
We recommend using an encrypted filesystem to secure your files and directories.
If your firewall supports it, we recommend using domain hostnames instead of IP ranges to restrict outbound connections.
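
As a rough illustration of the sizing guidance above, the sketch below multiplies the per-job minimum (2 vCPUs, 4 GB of RAM) by the number of concurrent jobs and never goes below the host minimum (8 vCPUs, 32 GB). The linear scaling and the `estimate_resources` helper name are assumptions made for this example; the result is a lower bound and does not include headroom for the operating system or the agent itself.

```shell
# Per-job and host minimums from the prerequisites above.
VCPUS_PER_JOB=2
RAM_GB_PER_JOB=4
MIN_VCPUS=8
MIN_RAM_GB=32

# max_of A B: prints the larger of two integers.
max_of() {
  if [ "$1" -gt "$2" ]; then echo "$1"; else echo "$2"; fi
}

# estimate_resources JOBS: prints "<vcpus> <ram_gb>" for JOBS concurrent jobs.
# Assumption: requirements scale linearly with the number of jobs.
estimate_resources() {
  vcpus=$(( $1 * VCPUS_PER_JOB ))
  ram=$(( $1 * RAM_GB_PER_JOB ))
  echo "$(max_of "$vcpus" "$MIN_VCPUS") $(max_of "$ram" "$MIN_RAM_GB")"
}
```

For example, `estimate_resources 10` prints `20 40`, while `estimate_resources 2` stays at the host minimum and prints `8 32`.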

Setup instructions

If your Hybrid Deployment Agent was configured before September 18, 2024 using the auth.json and config.json files, you must re-configure it according to the new workflow documented in this setup guide. For more information about re-configuring your agent, see our FAQ documentation.

Create agent

Log in to your Fivetran account.
Go to the Destinations page and click Add destination.
Select your destination type.
Enter a Destination name of your choice.
Click Add.
In the destination setup form, choose Hybrid Deployment as your deployment model.
Click + Configure new agent.
In the Configure a new agent pane, read the Fivetran On-Prem Software License Addendum, and select the I have read and agree to the terms of the License Addendum and the Software Specific Requirements checkbox.
Click Next.
Choose the environment you want to use for deployment: Docker or Podman.
Click Next.
Enter an Agent name and click Generate agent token.
Make a note of the agent token and installation command. You will need the agent token for manual installation and the installation command for automated installation of the agent.
Each Hybrid Deployment Agent has a unique token and installation command. Also, the installation command varies based on the deployment type (Docker or Podman) you select.
Click Save.

You must install and start the agent before completing the destination setup.

(Optional) Configure proxy settings for local environment and container runtime

Perform this step only if you want the Hybrid Deployment Agent and its jobs to route traffic through a proxy server (for example, Squid). If you are not using a proxy server, skip to the next step.

Configure local Linux environment

Log in to your local machine using the Fivetran user.
Go to the /etc/profile.d/ directory and add a new file named proxy.sh.
In the proxy.sh file, set appropriate values for the following system-wide environment variables:
- http_proxy: Specifies the proxy server to use for HTTP requests.
- https_proxy: Specifies the proxy server to use for HTTPS requests.
- no_proxy: Excludes specific domains or IPs from using the proxy.
For example:
```
export http_proxy="http://my-squid-proxy.example.com:3128"
export https_proxy="http://my-squid-proxy.example.com:3128"
export no_proxy="localhost,127.0.0.1"
```
We recommend using the Fully Qualified Domain Name (FQDN) or IP address of the proxy host.
The no_proxy value can include a custom list of exclusions. In most environments, this includes localhost,127.0.0.1.
If the host is in an AWS environment, make sure the no_proxy variable includes the instance metadata address 169.254.169.254.
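
If you maintain the exclusion list in more than one place, a small helper can append an entry only when it is missing. This is a sketch under the assumption that no_proxy is a plain comma-separated list; `add_no_proxy` is a hypothetical name, not a standard command.

```shell
# add_no_proxy LIST ENTRY: prints LIST with ENTRY appended, unless ENTRY is
# already one of the comma-separated items in LIST.
add_no_proxy() {
  case ",$1," in
    *",$2,"*) echo "$1" ;;      # already present, keep the list unchanged
    ",,")     echo "$2" ;;      # empty list, the entry becomes the list
    *)        echo "$1,$2" ;;   # append the new entry
  esac
}
```

In proxy.sh on an AWS host, this could be used as `export no_proxy="$(add_no_proxy "$no_proxy" 169.254.169.254)"`.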

Configure container runtime

Configuration instructions for Docker

To ensure Docker correctly uses proxy settings when you run containers from the command line, you must configure the appropriate proxy values. This involves updating Docker’s configuration files and, if necessary, the Docker service definition, depending on whether you are using Docker in rootless mode or with root privileges.

Set proxy values for Docker CLI

To apply the proxy settings for a specific user, add the proxy configuration to ~/.docker/config.json.

To apply the settings system-wide, add them to /etc/docker/config.json.

Do not use the system-wide option when using Docker in rootless mode; instead, use the user-specific file.

For example:

{
  "proxies": {
    "default": {
      "httpProxy": "http://my-squid-proxy.example.com:3128",
      "httpsProxy": "http://my-squid-proxy.example.com:3128",
      "noProxy": "localhost,127.0.0.1"
    }
  }
}

Update service configuration for Docker running in rootless mode

Perform this step only if you are using Docker in rootless mode. If you are running Docker with root privileges, skip to the next step.

If you are running Docker in rootless mode, modify the Docker systemd service file to include the required environment variables.

Go to ~/.config/systemd/user/ and open the docker.service file.

Add the environment variables to the [Service] section. For example:

[Service]
...
Environment="HTTP_PROXY=http://my-squid-proxy.example.com:3128"
Environment="HTTPS_PROXY=http://my-squid-proxy.example.com:3128"
Environment="NO_PROXY=localhost,127.0.0.1"
...

Reload the user-level systemd configuration and restart the service.

systemctl --user daemon-reload
systemctl --user restart docker.service
systemctl --user status docker.service

Perform the following steps to verify the settings:
i. Get the main PID of the rootless Docker daemon.
```
systemctl --user show --property=MainPID docker.service
```
ii. Inspect the process environment.
```
cat /proc/<pid>/environ
```
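
The environ file separates entries with NUL characters, so its raw output is hard to read. The sketch below prints only the proxy-related variables, one per line. `proxy_vars` is a hypothetical helper name, and the match on variable names is case-insensitive by choice.

```shell
# proxy_vars FILE: prints the proxy-related variables found in a
# NUL-separated environment file such as /proc/<pid>/environ.
proxy_vars() {
  tr '\0' '\n' < "$1" | grep -i -E '^(http_proxy|https_proxy|no_proxy)='
}
```

For rootless Docker, this could be combined with the previous step as `proxy_vars "/proc/$(systemctl --user show --property=MainPID --value docker.service)/environ"`. No output means that none of the proxy variables is set for the daemon process.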

Update service configuration for Docker running with root privileges

Perform this step only if you are using Docker with root privileges.

Go to /usr/lib/systemd/system/ and open docker.service.

Add the environment variables to the [Service] section. For example:

[Service]
...
Environment="HTTP_PROXY=http://my-squid-proxy.example.com:3128"
Environment="HTTPS_PROXY=http://my-squid-proxy.example.com:3128"
Environment="NO_PROXY=localhost,127.0.0.1"
...

Reload the systemd configuration and restart the Docker service.

sudo systemctl daemon-reload
sudo systemctl restart docker.service
sudo systemctl status docker.service

Perform the following steps to verify that the proxy variables are applied:
i. Get the main PID of the Docker daemon.
```
systemctl show --property=MainPID docker.service
```
ii. Inspect the environment variables of the process.
```
cat /proc/<pid>/environ
```

Configuration instructions for Podman

Podman primarily operates in rootless mode, meaning it runs under a regular user account without needing root privileges. To ensure Podman uses the appropriate proxy settings when pulling images or running containers, you must configure both the podman.service and podman.socket user-level systemd units.

Update Podman service configuration

Open the override file (~/.config/systemd/user/podman.service.d/override.conf) for the Podman service.
```
systemctl --user edit podman.service
```

Add the environment variables to the [Service] section. For example:

[Service]
Environment="HTTP_PROXY=http://my-squid-proxy.example.com:3128"
Environment="HTTPS_PROXY=http://my-squid-proxy.example.com:3128"
Environment="NO_PROXY=localhost,127.0.0.1"

Update Podman socket configuration

Open the override file for the Podman socket.
```
systemctl --user edit podman.socket
```

Add the environment variables to the [Service] section. For example:

[Service]
Environment="HTTP_PROXY=http://my-squid-proxy.example.com:3128"
Environment="HTTPS_PROXY=http://my-squid-proxy.example.com:3128"
Environment="NO_PROXY=localhost,127.0.0.1"

Reload and restart Podman services

After updating the configuration, reload the systemd manager and restart the Podman services to apply the changes.

systemctl --user daemon-reexec
systemctl --user daemon-reload
systemctl --user restart podman.service
systemctl --user restart podman.socket
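
If you prefer not to use the interactive editor, the same override files can be written directly. The sketch below assumes the standard systemd drop-in layout (`<unit>.d/override.conf` under ~/.config/systemd/user) and reuses the example proxy values from this guide. `write_proxy_dropin` is a hypothetical helper, and it overwrites any existing override.conf for the unit.

```shell
# Example values from this guide; replace them with your own proxy settings.
PROXY_URL="http://my-squid-proxy.example.com:3128"
NO_PROXY_LIST="localhost,127.0.0.1"

# write_proxy_dropin UNIT [BASE_DIR]: writes UNIT.d/override.conf with the
# proxy variables. BASE_DIR defaults to the user-level systemd directory.
# Caution: an existing override.conf for the unit is replaced.
write_proxy_dropin() {
  base="${2:-$HOME/.config/systemd/user}"
  mkdir -p "$base/$1.d"
  cat > "$base/$1.d/override.conf" <<EOF
[Service]
Environment="HTTP_PROXY=$PROXY_URL"
Environment="HTTPS_PROXY=$PROXY_URL"
Environment="NO_PROXY=$NO_PROXY_LIST"
EOF
}
```

Calling `write_proxy_dropin podman.service` and `write_proxy_dropin podman.socket` produces the same content as the two edit steps. You still need to run the reload and restart commands afterwards.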

Install agent

You can install the agent using one of the following methods:

Automated installation (recommended): Install and start the agent by running a single command.
Manual installation: Create the agent directories and config.json file, and then start the agent manually.

Automated installation

Open a terminal and run the installation command Fivetran generated for your agent.

Example:

Before you run the command, you must set the value of TOKEN to your agent token and the value of RUNTIME to the container runtime (Docker or Podman) you selected on the Fivetran dashboard.

TOKEN="YOUR_TOKEN_HERE" RUNTIME=docker bash -c "$(curl -sL "https://raw.githubusercontent.com/fivetran/hybrid_deployment/main/install.sh")"

The installation command does the following:

Creates the agent directories in $HOME/fivetran using the install.sh script.
Creates the default config.json file with the agent token.
Starts the agent container image with the container runtime you selected.

The installation command creates the agent directories in the following structure:

$HOME/fivetran              --> Agent home directory  
      ├── hdagent.sh        --> Helper script to start/stop the agent container            
      ├── conf              --> Configuration file location   
      │   └── config.json   --> Default configuration file             
      ├── data              --> Persistent storage used during data pipeline processing   
      │   └── _samples      --> Hashed source sample files used during active row calculations                  
      ├── logs              --> Log file location       
      └── tmp               --> Local temporary storage used during data pipeline processing
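
Because the installation command downloads and runs a script, it can help to confirm the two inputs first. The following sketch is one possible pre-flight check. `validate_install_args` is a hypothetical helper, and the lowercase value `podman` is an assumption based on the `RUNTIME=docker` example above.

```shell
# validate_install_args TOKEN RUNTIME: succeeds only when TOKEN is set to a
# real value and RUNTIME is one of the two supported runtimes.
validate_install_args() {
  if [ -z "$1" ] || [ "$1" = "YOUR_TOKEN_HERE" ]; then
    echo "TOKEN is not set to your agent token" >&2
    return 1
  fi
  case "$2" in
    docker|podman) return 0 ;;
    *)
      echo "RUNTIME must be docker or podman, got '$2'" >&2
      return 1
      ;;
  esac
}
```

For example, run `validate_install_args "$TOKEN" "$RUNTIME"` and continue with the installation command only if it succeeds.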

Manual installation

Configure local environment for agent

Run the following commands to create the agent directories:

mkdir -p $HOME/fivetran
cd $HOME/fivetran
mkdir -p data conf logs tmp

These commands create the agent directories in the following structure:

$HOME/fivetran        --> Agent home directory              
      ├── conf        --> Configuration file location               
      ├── data        --> Persistent storage used during data pipeline processing           
      ├── logs        --> Log file location       
      └── tmp         --> Local temporary storage used during data pipeline processing

Create a configuration file, config.json, in $HOME/fivetran/conf directory.
In the config.json file, add the agent token Fivetran generated for your agent.
```
{
  "token": "YOUR_AGENT_TOKEN"
}
```
We recommend that you add the agent token to the config.json file. However, you can skip this step and pass the token as an environment variable when starting the agent container.
The token is the only value the config.json file needs by default. You can add more values based on your requirements. For more information about the configuration parameters you can add, see the Agent configuration parameters section of this topic.
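
The configuration file from the steps above can also be created from the command line. This is a minimal sketch: `write_agent_config` is a hypothetical helper, and refusing to overwrite an existing file is a design choice of this example, not a requirement of the agent.

```shell
# write_agent_config TOKEN [CONF_DIR]: creates CONF_DIR/config.json with the
# token as its only value. CONF_DIR defaults to $HOME/fivetran/conf.
write_agent_config() {
  conf_dir="${2:-$HOME/fivetran/conf}"
  # Keep any existing configuration, which may hold extra parameters.
  if [ -e "$conf_dir/config.json" ]; then
    echo "refusing to overwrite $conf_dir/config.json" >&2
    return 1
  fi
  mkdir -p "$conf_dir"
  printf '{\n  "token": "%s"\n}\n' "$1" > "$conf_dir/config.json"
}
```

For example, `write_agent_config "YOUR_AGENT_TOKEN"` writes the same three-line file shown above.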

Start agent

Using Docker

Log in to your local machine using the Fivetran user.
Go to the base folder you created.

Create a Docker network and start the container.

#!/bin/bash

# Config file is expected in the conf/ sub folder
CONFIG_FILE=conf/config.json

# Token will be extracted from config file
TOKEN=$(grep -o '"token": *"[^"]*"' "$CONFIG_FILE" | sed 's/.*"token": *"\([^"]*\)".*/\1/')

# Extract controller id from token
CONTROLLER_ID=$(echo $TOKEN | base64 -d | cut -f1 -d":")

# Docker socket 
SOCKET=/var/run/docker.sock

# Create docker network for agent container
docker network create -d bridge fivetran_ldp > /dev/null 2>&1

# Start agent container
docker run \
 -d \
 --restart "on-failure:3" \
 --pull "always" \
 --security-opt label=disable \
 --label fivetran=ldp \
 --label ldp_process_id=default-controller-process-id \
 --label ldp_controller_id=$CONTROLLER_ID \
 --name controller \
 --network fivetran_ldp \
 --env HOST_USER_HOME_DIR=$HOME \
 --env CONTAINER_ENV_TYPE="docker" \
 -v $HOME/fivetran/conf:/conf \
 -v $HOME/fivetran/logs:/logs \
 -v $SOCKET:/var/run/docker.sock \
 us-docker.pkg.dev/prod-eng-fivetran-ldp/public-docker-us/ldp-agent:production -f /conf/config.json

You can find your agent token in conf/config.json.
If you are running Docker in rootless mode, set the SOCKET value to reflect the rootless socket. For example, SOCKET=/var/run/user/$uid/docker.sock. You can use $(id -u) to get the $uid value.
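
One way to choose the SOCKET value automatically is to test for the rootless socket first. The sketch below is an assumption-laden convenience, not part of the official script: `docker_socket` is a hypothetical helper, and the presence of a rootless socket does not prove that your Docker CLI is configured to use it.

```shell
# docker_socket [UID]: prints the rootless socket path if that socket exists
# for the given user (default: the current user), otherwise the default
# rootful socket path.
docker_socket() {
  uid="${1:-$(id -u)}"
  rootless="/var/run/user/$uid/docker.sock"
  if [ -S "$rootless" ]; then
    echo "$rootless"
  else
    echo "/var/run/docker.sock"
  fi
}
```

In the start script, the fixed assignment could then be replaced with `SOCKET=$(docker_socket)`.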

Stop agent

The following script identifies the agent container, and then stops and removes it:

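#!/bin/bash
# Find the agent container by name and label
CONTAINER_ID=$(docker ps -a -q -f name="^/controller" -f label=fivetran=ldp)
# Stop and remove the container only if one was found
if [ -n "$CONTAINER_ID" ]; then
  docker stop $CONTAINER_ID
  docker rm $CONTAINER_ID
fi
docker network rm fivetran_ldp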

(Optional) Using docker-compose

We recommend using docker run to start your agent. However, if you want to use docker-compose instead of docker run, create a YAML file similar to the example below and then use it to start the agent.

Make sure your agent token is specified in your config.json file.

services:
   controller:
      container_name: controller
      image: us-docker.pkg.dev/prod-eng-fivetran-ldp/public-docker-us/ldp-agent:production
      pull_policy: always
      restart: "no"
      labels:
         fivetran: ldp
         ldp_process_id: default-controller-process-id
         ldp_controller_id: <controller-id>
      security_opt:
         - label:disable
      environment:
         FIVETRAN_ENV: "prod"
         HOST_USER_HOME_DIR: $HOME
      volumes:
         - $HOME/fivetran/conf:/conf
         - $HOME/fivetran/logs:/logs
         - /var/run/docker.sock:/var/run/docker.sock
      command: -f /conf/config.json
networks:
  default:
    name: fivetran_ldp
    driver: bridge

In your YAML file, replace <controller-id> with your controller ID. You can get the ID using echo $TOKEN | base64 -d | cut -f1 -d":".
If you are running Docker in rootless mode, adjust the volume mount to reflect the rootless socket. For example, - /var/run/user/1000/docker.sock:/var/run/docker.sock. You can use $(id -u) to get the uid value.
We recommend that you use Docker Compose (v2.17.x or above). To start the agent, you can either use the docker compose sub-command or the latest docker-compose standalone utility.
Before you start the agent again, stop and remove the existing container with docker stop <agent-container> and docker rm <agent-container>, because docker-compose down does not always detect the agent container.
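
The controller ID command above decodes the Base64 token and keeps the text before the first colon. The sketch below wraps that command in a function and can be tried with a made-up token. `controller_id_from_token` is a hypothetical name, and the sample value is not a real agent token.

```shell
# controller_id_from_token TOKEN: decodes the Base64 token and prints the
# part before the first colon, as in the command shown above.
controller_id_from_token() {
  printf '%s' "$1" | base64 -d | cut -f1 -d":"
}

# Made-up token for illustration only: Base64 of "my-controller:secret".
SAMPLE_TOKEN=$(printf 'my-controller:secret' | base64)
```

With the sample value, `controller_id_from_token "$SAMPLE_TOKEN"` prints `my-controller`. With a real token, use the result as the ldp_controller_id label.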

Using Podman

Log in to your local machine using the Fivetran user.
Go to the base folder you created.

Create a Podman network and start the container.

#!/bin/bash

# Config file is expected in the conf/ sub folder
CONFIG_FILE=conf/config.json

# Token will be extracted from config file
TOKEN=$(grep -o '"token": *"[^"]*"' "$CONFIG_FILE" | sed 's/.*"token": *"\([^"]*\)".*/\1/')

# Extract controller id from token
CONTROLLER_ID=$(echo $TOKEN | base64 -d | cut -f1 -d":")

# Podman socket 
XDG_RUNTIME_DIR=/run/user/$(id -u)
SOCKET=$XDG_RUNTIME_DIR/podman/podman.sock

# Create podman network for agent container
podman network create -d bridge fivetran_ldp > /dev/null 2>&1

# Start agent container
podman run \
 -d \
 --restart "on-failure:3" \
 --pull "always" \
 --security-opt label=disable \
 --label fivetran=ldp \
 --label ldp_process_id=default-controller-process-id \
 --label ldp_controller_id=$CONTROLLER_ID \
 --name controller \
 --network fivetran_ldp \
 --env HOST_USER_HOME_DIR=$HOME \
 --env CONTAINER_ENV_TYPE="podman" \
 -v $HOME/fivetran/conf:/conf \
 -v $HOME/fivetran/logs:/logs \
 -v $SOCKET:/run/user/1000/podman/podman.sock \
 us-docker.pkg.dev/prod-eng-fivetran-ldp/public-docker-us/ldp-agent:production -f /conf/config.json

You can find your agent token in conf/config.json.
If you are running Podman in rootless mode, set the SOCKET value to reflect the rootless socket.

Stop agent

The following script identifies the agent container, and then stops and removes it:

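#!/bin/bash
# Find the agent container by name and label
CONTAINER_ID=$(podman ps -a -q -f name="^/controller" -f label=fivetran=ldp)
# Stop and remove the container only if one was found
if [ -n "$CONTAINER_ID" ]; then
  podman stop $CONTAINER_ID
  podman rm $CONTAINER_ID
fi
podman network rm fivetran_ldp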

(Optional) Configure proxy settings for agent

Perform this step only if you want the Hybrid Deployment Agent and its jobs to route traffic through a proxy server (for example, Squid). If you are not using a proxy server, skip to the next step.

Open the config.json file for the Hybrid Deployment Agent.

Add the necessary proxy environment variables you specified for your local environment. For example:

...
"http_proxy": "http://my-squid-proxy.example.com:3128",
"https_proxy": "http://my-squid-proxy.example.com:3128",
"no_proxy": "localhost,127.0.0.1",
...

Restart the agent to apply the new settings.
```
./hdagent.sh stop
./hdagent.sh start
```

Verify agent status

Verify the agent status by doing any of the following:

Run docker ps -a or podman ps -a to verify whether the agent container is running.
Review the agent container logs.
On the Fivetran dashboard, go to Account Settings > General > Hybrid Deployment Agents and verify the agent status.

You can view and manage all the agents associated with your Fivetran account on the Fivetran dashboard (Account Settings > General > Hybrid Deployment Agents).
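
The first two checks can be wrapped in small helpers that work with either runtime. This is a sketch: `agent_status` and `agent_logs` are hypothetical names, and the container name `controller` and the label `fivetran=ldp` are the ones used by the start scripts in this guide.

```shell
# agent_status RUNTIME: lists the agent container with its current status.
# RUNTIME is the CLI to call: docker or podman.
agent_status() {
  "$1" ps -a --filter name=controller --filter label=fivetran=ldp \
    --format '{{.Names}}: {{.Status}}'
}

# agent_logs RUNTIME [LINES]: prints the last LINES (default 50) log lines
# of the agent container.
agent_logs() {
  "$1" logs --tail "${2:-50}" controller
}
```

For example, run `agent_status docker` followed by `agent_logs docker 100`. An empty result from `agent_status` means that no matching container exists.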

Agent configuration parameters

The only mandatory parameter in the config.json file is token. Fivetran displays the agent token on the dashboard when you create the agent. The token is unique to each agent, and we use it to establish a secure network connection to the Fivetran cloud. All other parameters are optional, and the default settings are sufficient for most use cases.

If you need any additional configuration specific to your environment, this section contains the list of supported configuration parameters and their descriptions. To apply these settings, create a config.json file with the required parameters and start the agent using this configuration file. We recommend that you store the configuration file in $HOME/fivetran/conf.

You can define all the configuration parameters either in the configuration file (recommended) or as environment variables when starting the agent. Environment variables always take precedence and override the values in the configuration file. If you do not update any parameter value, the default settings will be applied.

Example of a basic configuration:

config.json

{
  "token": "YOUR_TOKEN_HERE",
  "container_env_type": "docker",
  "host_persistent_storage_mount_path": "~/fivetran/data",
  "host_selinux_enabled": false,
  "save_controller_logs_to_file": true
}

The table below lists the supported configuration parameters and their default values.

| Parameter | Default Value | Description |
| --- | --- | --- |
| container_cpu_limit | 0 | Container CPU limit. The default value 0 indicates that the container does not have any CPU limit. |
| container_cpu_limit.integrations.your-integration-id (Discontinued) | 0 | Container CPU limit per integration. The default value 0 indicates that the container does not have any CPU limit. |
| container_cpu_limit_integrations_your-integration-id | 0 | Container CPU limit per integration. The default value 0 indicates that the container does not have any CPU limit. |
| container_env_type | docker | Container runtime used to run the agent containers. Default value: docker. Possible values: docker, podman. |
| container_memory_limit_gigabytes | 4 | Container memory limit in GB. The default value 4 indicates that the container has a 4 GB memory limit. |
| container_memory_limit_gigabytes.integrations.your-integration-id (Discontinued) | 4 | Container memory limit (in GB) per integration. The default value 4 indicates that the container has a 4 GB memory limit. |
| container_memory_limit_gigabytes_integrations_your-integration-id | 4 | Container memory limit (in GB) per integration. The default value 4 indicates that the container has a 4 GB memory limit. |
| container_podman_sock_file_mount_path | unix:///run/user/1000/podman/podman.sock | Default agent container Podman socket. |
| controller_disk_space_abort_threshold_bytes | 102400 | Maximum threshold for agent host disk space. The agent aborts its operations at this threshold. Default value: 100 MB. |
| controller_disk_space_threshold_bytes | 4294967296 | Maximum threshold for agent host disk space warning. A warning appears on your dashboard at this threshold. Default value: 4 GB. |
| docker_pull_retries | 10 | Retry count for pulling new container images. |
| docker_pull_timeout_seconds | 300 | Timeout value in seconds for pulling new container images. |
| host_persistent_storage_mount_path | ~/fivetran/data | Default persistent storage location used during pipeline data processing. This location should have sufficient disk space to hold your full dataset during the initial sync. |
| host_persistent_temp_storage_mount_path | null (no value) | This location is optional. Temporary storage location used during pipeline data processing. If set, this location should have sufficient disk space to hold your full dataset during the initial sync. Example: "~/fivetran/tmp" |
| host_selinux_enabled | false | If SELinux is enabled on the host running your agent containers, set this parameter to true. Default value: false. |
| log_clean_frequency_milliseconds | 1800000 | How often logs must be cleaned up. |
| log_folder_path | /logs | Default location inside agent container where logs will be stored. |
| log_retention_days | 3 | Log file retention in days. |
| poll_container_status_interval_seconds | 10 | Poll interval used for checking the container status. |
| port | 8090 | Agent health-check port number. |
| profile | docker | Agent profile to use. Default value: docker. Possible values: docker, podman. |
| save_controller_logs_to_file | true | Enables saving the agent container log to local file. Default value: true. |
| save_job_logs_to_file | false | Enables saving the job (container) logs to local files. Default value: false. |
| http_proxy | null (no value) | Enables proxy server connection with the proxy server host address for HTTP connections, e.g. http://proxy.local:3128 |
| https_proxy | null (no value) | Enables proxy server connection with the proxy server host address for HTTPS connections, e.g. http://proxy.local:3128 |
| no_proxy | null (no value) | Disables proxy server connection for specified hosts, e.g. localhost,gateway |
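
Several parameters take raw byte or millisecond values, which are easy to mistype. The helpers below are a small sketch for computing such values. `gib_to_bytes` and `minutes_to_ms` are hypothetical names, and the conversion assumes binary units (1 GB counted as 1024^3 bytes), which matches the 4294967296 default listed for the 4 GB warning threshold.

```shell
# gib_to_bytes N: prints N binary gigabytes (1024^3 bytes each) in bytes.
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# minutes_to_ms N: prints N minutes in milliseconds.
minutes_to_ms() {
  echo $(( $1 * 60 * 1000 ))
}
```

For example, `gib_to_bytes 4` prints `4294967296`, the default for controller_disk_space_threshold_bytes, and `minutes_to_ms 30` prints `1800000`, the default for log_clean_frequency_milliseconds.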
