Hybrid Deployment with Docker and Podman Setup Guide
Follow our setup guide to set up the Hybrid Deployment model with Docker or Podman.
Prerequisites
To use Hybrid Deployment with Docker or Podman, you need a server or virtual machine (VM) in your local environment with the following:
- Operating system: A Linux distribution with Docker (v20.10.17 or above) or Podman (v4.6.1 or above) container runtime.
- CPU: Minimum 4 vCPUs with x86-64 processors.
- Memory: Minimum 8 GB of RAM.
- Storage:
  - Minimum 50 GB of allocated disk space for the Docker or Podman storage location. For more information, see our FAQ documentation.
  - Persistent local storage must be sufficient to accommodate the total dataset volume for all the connectors you plan to deploy. The default location is $HOME/fivetran/data.
- A non-root Linux user to run the containers (for example, fivetran). The user must have the permissions necessary to run Docker or Podman. For more information about creating a new Linux user, see our FAQ documentation.
- Reliable connectivity to both the source and destination.
NOTE:
- We recommend running Docker or Podman in rootless mode. Before you run it in rootless mode, make sure the user's $HOME location has at least 50 GB of allocated disk space. For more information about configuring Docker and Podman in rootless mode, see our FAQ documentation.
- If you want to run multiple pipeline processes concurrently on the same host, make sure you adjust the underlying resource capacity accordingly.
- If you want to use Docker Desktop, which already includes Docker Engine, make sure your Docker subscription plan supports it.
- We recommend using an encrypted filesystem to secure your files and directories.
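The minimums above can be checked before you create an agent. The following is a minimal sketch rather than an official Fivetran tool: the helper names version_ge and at_least are hypothetical, the disk check looks at free space under $HOME (which assumes rootless mode), and the version comparison relies on GNU sort -V.

```shell
#!/bin/bash
# Preflight sketch: compare this host against the documented minimums.
# version_ge and at_least are hypothetical helper names, not Fivetran tooling.

# Succeeds if version $1 is greater than or equal to version $2 (uses GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeeds if the integer $1 is at least the integer $2.
at_least() {
  [ "$1" -ge "$2" ]
}

if [ -r /proc/meminfo ] && command -v nproc > /dev/null 2>&1; then
  vcpus=$(nproc)
  # MemTotal is reported in kB; round to the nearest GB because the kernel
  # reserves part of the installed memory.
  mem_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024 + 0.5)}' /proc/meminfo)
  # Free space, in GB, on the filesystem that holds $HOME (rootless assumption).
  disk_gb=$(df -Pk "$HOME" | awk 'NR == 2 {print int($4 / 1024 / 1024)}')

  at_least "$vcpus" 4 || echo "WARN: found $vcpus vCPUs, need at least 4"
  at_least "$mem_gb" 8 || echo "WARN: found $mem_gb GB of RAM, need at least 8"
  at_least "$disk_gb" 50 || echo "WARN: found $disk_gb GB free under $HOME, need at least 50"
fi

if command -v docker > /dev/null 2>&1; then
  # The server version is empty if the Docker daemon is not reachable.
  docker_version=$(docker version --format '{{.Server.Version}}' 2> /dev/null || true)
  if [ -n "$docker_version" ] && ! version_ge "$docker_version" "20.10.17"; then
    echo "WARN: Docker $docker_version is older than 20.10.17"
  fi
fi
```

The same version_ge helper can be used for Podman by comparing the installed Podman version against 4.6.1.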
Setup instructions
IMPORTANT: If your Hybrid Deployment Agent was configured before September 18, 2024 using the auth.json and config.json files, you must re-configure it according to the new workflow documented in this setup guide. For more information about re-configuring your agent, see our FAQ documentation.
Create agent
Log in to your Fivetran account.
Go to the Destinations page and click Add destination.
Select your destination type.
Enter a Destination name of your choice.
Click Add.
In the destination setup form, choose Hybrid Deployment as your deployment model.
Click + Configure new agent.
In the Configure a new agent pane, read the Fivetran On-Prem Software License Addendum, and select the I have read and agree to the terms of the License Addendum and the Software Specific Requirements checkbox.
Click Next.
Choose the environment you want to use for deployment: Docker or Podman.
Click Next.
Enter an Agent name and click Generate agent token.
Make a note of the agent token and installation command. You will need the agent token for manual installation and the installation command for automated installation of the agent.
NOTE: Each Hybrid Deployment Agent has a unique token and installation command. Also, the installation command varies based on the deployment type (Docker or Podman) you select.
Click Save.
IMPORTANT: You must install and start the agent before completing the destination setup.
Install agent
You can install the agent using one of the following methods:
- Automated installation (recommended): Install and start the agent by running a single command.
- Manual installation: Create the agent directories and config.json file, and then start the agent manually.
Automated installation
Log in to your local machine using the Fivetran user.
Open a terminal and run the installation command Fivetran generated for your agent.
Example:
Before you run the command, you must set the value of TOKEN to your agent token and the value of RUNTIME to the container runtime (Docker or Podman) you selected on the Fivetran dashboard.
TOKEN="YOUR_TOKEN_HERE" RUNTIME=docker bash -c "$(curl -sL "https://raw.githubusercontent.com/fivetran/hybrid_deployment/main/install.sh")"
The installation command does the following:
- Creates the agent directories in $HOME/fivetran using the install.sh script.
- Creates the default config.json file with the agent token.
- Starts the agent container image with the container runtime you selected.
The installation command creates the agent directories in the following structure:
$HOME/fivetran          --> Agent home directory
├── hdagent.sh          --> Helper script to start/stop the agent container
├── conf                --> Configuration file location
│   └── config.json     --> Default configuration file
├── data                --> Persistent storage used during data pipeline processing
├── logs                --> Log file location
└── tmp                 --> Local temporary storage used during data pipeline processing
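After installation, it can be useful to confirm that the layout matches the structure shown above. The following is a minimal sketch; check_agent_layout is a hypothetical helper name, and it only checks that the directories and the default configuration file exist, not that the agent is running.

```shell
#!/bin/bash
# Sketch: confirm the agent home directory has the documented layout.
# check_agent_layout is a hypothetical helper name.

check_agent_layout() {
  local agent_home=$1 missing=0 dir
  for dir in conf data logs tmp; do
    if [ ! -d "$agent_home/$dir" ]; then
      echo "missing directory: $agent_home/$dir"
      missing=1
    fi
  done
  if [ ! -f "$agent_home/conf/config.json" ]; then
    echo "missing file: $agent_home/conf/config.json"
    missing=1
  fi
  return "$missing"
}

if check_agent_layout "$HOME/fivetran"; then
  echo "Agent layout looks complete"
fi
```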
Manual installation
Configure local environment for agent
Log in to your local machine using the Fivetran user.
Run the following commands to create the agent directories:
mkdir -p $HOME/fivetran
cd $HOME/fivetran
mkdir -p data conf logs tmp
These commands create the agent directories in the following structure:
$HOME/fivetran   --> Agent home directory
├── conf         --> Configuration file location
├── data         --> Persistent storage used during data pipeline processing
├── logs         --> Log file location
└── tmp          --> Local temporary storage used during data pipeline processing
Create a configuration file, config.json, in the $HOME/fivetran/conf directory.
In the config.json file, add the agent token Fivetran generated for your agent.
{
  "token": "YOUR_AGENT_TOKEN"
}
IMPORTANT:
- We recommend that you add the agent token to the config.json file. However, you can skip this step and use the token as an environment variable when starting the agent container.
- By default, you do not have to add any additional configuration values to the config.json file. However, you can add additional values to the config.json file based on your requirements. For more information about the configuration parameters you can add, see the Agent configuration parameters section of this topic.
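The configuration step can also be scripted. The following is a minimal sketch, assuming the token contains no double quotes or backslashes. The helper names write_agent_config and read_agent_token are hypothetical. The read-back uses the same grep and sed extraction as the start scripts in this guide, so it confirms that those scripts will be able to find the token.

```shell
#!/bin/bash
# Sketch: create conf/config.json with only the token set, then read it back.
# write_agent_config and read_agent_token are hypothetical helper names.

write_agent_config() {
  local conf_dir=$1 token=$2
  mkdir -p "$conf_dir"
  printf '{\n  "token": "%s"\n}\n' "$token" > "$conf_dir/config.json"
}

# Same grep/sed extraction that the start scripts in this guide use.
read_agent_token() {
  grep -o '"token": *"[^"]*"' "$1" | sed 's/.*"token": *"\([^"]*\)".*/\1/'
}

# Example usage (replace the placeholder with your agent token):
#   write_agent_config "$HOME/fivetran/conf" "YOUR_AGENT_TOKEN"
#   read_agent_token "$HOME/fivetran/conf/config.json"
```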
Start agent
Using Docker
Log in to your local machine using the Fivetran user.
Go to the base folder you created.
Create a Docker network and start the container.
#!/bin/bash

# Config file is expected in the conf/ sub folder
CONFIG_FILE=conf/config.json

# Token will be extracted from config file
TOKEN=$(grep -o '"token": *"[^"]*"' "$CONFIG_FILE" | sed 's/.*"token": *"\([^"]*\)".*/\1/')

# Extract controller id from token
CONTROLLER_ID=$(echo $TOKEN | base64 -d | cut -f1 -d":")

# Docker socket
SOCKET=/var/run/docker.sock

# Create docker network for agent container
docker network create -d bridge fivetran_ldp > /dev/null 2>&1

# Start agent container
docker run \
  -d \
  --restart "on-failure:3" \
  --pull "always" \
  --security-opt label=disable \
  --label fivetran=ldp \
  --label ldp_process_id=default-controller-process-id \
  --label ldp_controller_id=$CONTROLLER_ID \
  --name controller \
  --network fivetran_ldp \
  --env HOST_USER_HOME_DIR=$HOME \
  --env CONTAINER_ENV_TYPE="docker" \
  -v $HOME/fivetran/conf:/conf \
  -v $HOME/fivetran/logs:/logs \
  -v $SOCKET:/var/run/docker.sock \
  us-docker.pkg.dev/prod-eng-fivetran-ldp/public-docker-us/ldp-agent:production -f /conf/config.json
NOTE:
- You can find your agent token in conf/config.json.
- If you are running Docker in rootless mode, set the SOCKET value to reflect the rootless socket. For example, SOCKET=/var/run/user/$uid/docker.sock. You can use $(id -u) to get the $uid value.
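To see what the controller ID extraction in the start script does, it may help to run it on a made-up token. The following is a minimal sketch: the sample token and the helper name controller_id_from_token are hypothetical, and the "id:secret" layout is only inferred from the cut -f1 -d":" step in the script.

```shell
#!/bin/bash
# Sketch: how the controller ID is derived from an agent token.
# The sample token is made up; the "id:secret" layout is inferred from the
# cut -f1 -d":" step in the start script.

controller_id_from_token() {
  # Decode the Base64 token and keep the part before the first colon.
  printf '%s' "$1" | base64 -d | cut -f1 -d":"
}

SAMPLE_TOKEN=$(printf '%s' 'sample_controller:sample_secret' | base64)
controller_id_from_token "$SAMPLE_TOKEN"   # prints sample_controller
```

If the command prints nothing or reports invalid input for your real token, the token was probably truncated when it was copied.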
Stop agent
The following script identifies the agent container, and then stops and removes it:
#!/bin/bash
CONTAINER_ID=$(docker ps -a -q -f name="^/controller" -f label=fivetran=ldp)
docker stop $CONTAINER_ID
docker rm $CONTAINER_ID
docker network rm fivetran_ldp
(Optional) Using docker-compose
We recommend using docker run to start your agent. However, if you want to use docker-compose instead of docker run, create a YAML file similar to the example below and then use it to start the agent.
IMPORTANT: Make sure your agent token is specified in your config.json file.
services:
  controller:
    container_name: controller
    image: us-docker.pkg.dev/prod-eng-fivetran-ldp/public-docker-us/ldp-agent:production
    pull_policy: always
    restart: "no"
    labels:
      fivetran: ldp
      ldp_process_id: default-controller-process-id
      ldp_controller_id: <controller-id>
    security_opt:
      - label:disable
    environment:
      FIVETRAN_ENV: "prod"
      HOST_USER_HOME_DIR: $HOME
    volumes:
      - $HOME/fivetran/conf:/conf
      - $HOME/fivetran/logs:/logs
      - /var/run/docker.sock:/var/run/docker.sock
    command: -f /conf/config.json
networks:
  default:
    name: fivetran_ldp
    driver: bridge
NOTE:
- In your YAML file, replace <controller-id> with your controller ID. You can get the ID using echo $TOKEN | base64 -d | cut -f1 -d":".
- If you are running Docker in rootless mode, adjust the volume mount to reflect the rootless socket. For example, - /var/run/user/1000/docker.sock:/var/run/docker.sock. You can use $(id -u) to get the uid value.
- We recommend that you use Docker Compose (v2.17.x or above). To start the agent, you can either use the docker compose sub-command or the latest docker-compose standalone utility.
- You must use docker stop <agent-container> and docker rm <agent-container> to stop and remove the container before starting it because docker-compose down does not always detect the agent container.
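Replacing the <controller-id> placeholder by hand is easy to get wrong, so the substitution can be scripted. The following is a minimal sketch: render_compose and the file names are hypothetical, and the controller ID is assumed to consist of letters, digits, underscores, and hyphens only.

```shell
#!/bin/bash
# Sketch: fill in the <controller-id> placeholder of a compose file.
# render_compose and the file names are hypothetical. The controller ID is
# assumed to contain only letters, digits, underscores, and hyphens, so it is
# safe to splice into the sed expression.

render_compose() {
  local template=$1 controller_id=$2
  sed "s|<controller-id>|$controller_id|g" "$template"
}

# Example usage (file names are placeholders):
#   render_compose docker-compose.template.yaml "$CONTROLLER_ID" > docker-compose.yaml
#   docker compose -f docker-compose.yaml up -d
```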
Using Podman
Log in to your local machine using the Fivetran user.
Go to the base folder you created.
Create a Podman network and start the container.
#!/bin/bash

# Config file is expected in the conf/ sub folder
CONFIG_FILE=conf/config.json

# Token will be extracted from config file
TOKEN=$(grep -o '"token": *"[^"]*"' "$CONFIG_FILE" | sed 's/.*"token": *"\([^"]*\)".*/\1/')

# Extract controller id from token
CONTROLLER_ID=$(echo $TOKEN | base64 -d | cut -f1 -d":")

# Podman socket
XDG_RUNTIME_DIR=/run/user/$(id -u)
SOCKET=$XDG_RUNTIME_DIR/podman/podman.sock

# Create podman network for agent container
podman network create -d bridge fivetran_ldp > /dev/null 2>&1

# Start agent container
podman run \
  -d \
  --restart "on-failure:3" \
  --pull "always" \
  --security-opt label=disable \
  --label fivetran=ldp \
  --label ldp_process_id=default-controller-process-id \
  --label ldp_controller_id=$CONTROLLER_ID \
  --name controller \
  --network fivetran_ldp \
  --env HOST_USER_HOME_DIR=$HOME \
  --env CONTAINER_ENV_TYPE="podman" \
  -v $HOME/fivetran/conf:/conf \
  -v $HOME/fivetran/logs:/logs \
  -v $SOCKET:/run/user/1000/podman/podman.sock \
  us-docker.pkg.dev/prod-eng-fivetran-ldp/public-docker-us/ldp-agent:production -f /conf/config.json
NOTE:
- You can find your agent token in conf/config.json.
- If you are running Podman in rootless mode, set the SOCKET value to reflect the rootless socket.
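Before starting the container, you may want to confirm that the Podman socket actually exists at the path the script mounts. The following is a minimal sketch: the helper names are hypothetical, and the systemctl hint assumes a systemd host where the podman.socket user unit is installed.

```shell
#!/bin/bash
# Sketch: build the rootless Podman socket path for a user and check that a
# socket exists there. Helper names are hypothetical.

# Prints the socket path for the given UID (defaults to the current user).
podman_socket_path() {
  local uid=${1:-$(id -u)}
  echo "/run/user/$uid/podman/podman.sock"
}

# Succeeds only if the path exists and is a Unix socket.
socket_exists() {
  [ -S "$1" ]
}

SOCKET=$(podman_socket_path)
if ! socket_exists "$SOCKET"; then
  echo "No socket at $SOCKET."
  echo "On systemd hosts it can usually be enabled with: systemctl --user enable --now podman.socket"
fi
```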
Stop agent
The following script identifies the agent container, and then stops and removes it:
#!/bin/bash
CONTAINER_ID=$(podman ps -a -q -f name="^/controller" -f label=fivetran=ldp)
podman stop $CONTAINER_ID
podman rm $CONTAINER_ID
podman network rm fivetran_ldp
Verify agent status
Verify the agent status by doing any of the following:
- Run docker ps -a or podman ps -a to verify whether the agent container is running.
- Review the agent container logs.
- Verify the agent health check endpoint (http://localhost:8090/healthz).
- On the Fivetran dashboard, go to Account Settings > General > Hybrid Deployment Agents and verify the agent status.
NOTE: You can view and manage all the agents associated with your Fivetran account on the Fivetran dashboard (Account Settings > General > Hybrid Deployment Agents).
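The health check endpoint can be polled from a script, for example right after starting the agent. The following is a minimal sketch: wait_for_health is a hypothetical helper name, and it assumes that a healthy agent answers with a successful HTTP status, which this guide does not state explicitly.

```shell
#!/bin/bash
# Sketch: poll the agent health check endpoint until it answers.
# wait_for_health is a hypothetical helper; it assumes a healthy agent
# returns a successful HTTP status (curl -f fails on HTTP errors).

wait_for_health() {
  local url=$1 attempts=$2 delay=$3 i
  for ((i = 1; i <= attempts; i++)); do
    if curl -fsS --max-time 5 "$url" > /dev/null 2>&1; then
      return 0
    fi
    # Wait before the next attempt, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example usage: up to 12 attempts, 5 seconds apart.
#   if wait_for_health "http://localhost:8090/healthz" 12 5; then
#     echo "agent is healthy"
#   else
#     echo "agent did not respond; check the container logs"
#   fi
```

If you changed the port parameter in config.json, adjust the URL accordingly.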
Agent configuration parameters
The only mandatory configuration parameter required in the config.json file to start the agent is token. You can find the agent token on the Fivetran dashboard while creating the agent. The token is unique to each agent and we use it to establish a secure connection to Fivetran cloud. All other configurations are optional, with the default settings being sufficient for most use cases.
If you need any additional configuration specific to your environment, this section contains the list of supported configuration parameters and their descriptions. To apply these settings, create a config.json file with the required parameters and start the agent using this configuration file. We recommend that you store the configuration file in $HOME/fivetran/conf.
You can define all the configuration parameters either in the configuration file (recommended) or as environment variables when starting the agent. Environment variables always take precedence and override the values in the configuration file. If you do not update any parameter value, the default settings will be applied.
Example of a basic configuration:
config.json
{
"token": "YOUR_TOKEN_HERE",
"container_env_type": "docker",
"host_persistent_storage_mount_path": "~/fivetran/data",
"host_selinux_enabled": false,
"save_controller_logs_to_file": true
}
The configuration options supported with their default values are listed in the table below.
Parameter | Default Value | Description |
---|---|---|
container_cpu_limit | 0 | Container CPU limit. The default value 0 indicates that the container does not have any CPU limit. |
container_env_type | docker | Container runtime used to run the agent containers. Default value: docker. Possible values: docker, podman. |
container_memory_limit_gigabytes | 4 | Container memory limit in GB. The default value 4 indicates that the container has a 4GB memory limit. |
container_podman_sock_file_mount_path | unix:///run/user/1000/podman/podman.sock | Default agent container Podman socket. |
controller_disk_space_abort_threshold_bytes | 102400 | Maximum threshold for agent host disk space. The agent aborts its operations at this threshold. Default value: 100 MB. |
controller_disk_space_threshold_bytes | 4294967296 | Maximum threshold for agent host disk space warning. A warning appears on your dashboard at this threshold. Default value: 4 GB. |
docker_pull_retries | 10 | Retry count for pulling new container images. |
docker_pull_timeout_seconds | 300 | Timeout value in seconds for pulling new container images. |
host_persistent_storage_mount_path | ~/fivetran/data | Default persistent storage location used during pipeline data processing. This location should have sufficient disk space to hold your full dataset during the initial sync. |
host_persistent_temp_storage_mount_path | null (no value) | This location is optional. Temporary storage location used during pipeline data processing. If set, this location should have sufficient disk space to hold your full dataset during the initial sync. Example: "~/fivetran/tmp" |
host_selinux_enabled | false | If SELinux is enabled on the host running your agent containers, set this parameter to true. Default value: false. |
log_clean_frequency_milliseconds | 1800000 | How often logs must be cleaned up. |
log_folder_path | /logs | Default location inside agent container where logs will be stored. |
log_retention_days | 3 | Log file retention in days. |
poll_container_status_interval_seconds | 10 | Poll interval used for checking the container status. |
port | 8090 | Agent health-check port number. |
profile | docker | Agent profile to use. Default value: docker. Possible values: docker, podman. |
save_controller_logs_to_file | true | Enables saving the agent container log to local file. Default value: true. |
save_job_logs_to_file | false | Enables saving the job (container) logs to local files. Default value: false. |
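Several parameters in the table take raw bytes or milliseconds, which are easy to mistype. The following is a minimal sketch of the conversions, with hypothetical helper names; it reproduces two of the defaults listed above.

```shell
#!/bin/bash
# Sketch: convert human-friendly sizes and intervals to the raw units that
# the byte and millisecond parameters expect. Helper names are hypothetical.

# Gibibytes to bytes (1 GiB = 1024 * 1024 * 1024 bytes).
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Minutes to milliseconds.
minutes_to_ms() {
  echo $(( $1 * 60 * 1000 ))
}

gib_to_bytes 4      # 4294967296, the default controller_disk_space_threshold_bytes
minutes_to_ms 30    # 1800000, the default log_clean_frequency_milliseconds
```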
Related articles
Hybrid Deployment Overview
Hybrid Deployment FAQ
API Hybrid Deployment Agent Management