Hybrid Deployment Setup Guide
Follow our setup guide to set up the Hybrid Deployment model for your data pipeline.
Prerequisites
To use Hybrid Deployment, make sure your local environment has the following prerequisites in place:
- Operating system: A Linux distribution with Docker (v20.10.17 or above) or Podman (v4.6.1 or above) container runtime installed and configured.
- CPU: Minimum 4 vCPUs with x86-64 processors.
- Memory: Minimum 8 GB of RAM.
- Storage:
  - Minimum 50 GB allocated disk space for the Docker or Podman storage location. For more information, see our FAQ documentation.
  - The persistent local storage must be sufficient to accommodate the dataset volume for each connector you plan to run. This location is specified with the host_persistent_storage_mount_path configuration parameter. The default location is $HOME/fivetran/data.
- A non-root Linux user to run the containers (for example, fivetran). The user must have the permissions necessary to run Docker or Podman. For more information about creating a new Linux user, see our FAQ documentation.
NOTES:
- We recommend that you always run Docker or Podman in rootless mode. If you use rootless mode, make sure the user's $HOME location has at least 50 GB of allocated disk space. For more information about configuring Docker and Podman in rootless mode, see our FAQ documentation.
- If you want to run multiple pipeline processes concurrently on the same host, make sure you adjust the underlying resource capacity accordingly.
- If you want to use Docker Desktop, which already includes Docker Engine, make sure your Docker subscription plan supports it.
- Using an encrypted filesystem is recommended.
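The minimums listed above can be checked before you install the agent. The following is a minimal sketch, not part of the Fivetran tooling: the function names report and preflight are hypothetical, it assumes a Linux host where nproc, /proc/meminfo, and df are available, and it measures free space on the filesystem that holds $HOME on the assumption that the container runtime runs in rootless mode and stores its data there.

```shell
#!/bin/bash
# Hypothetical preflight helpers (not shipped with the agent).

# Print OK or FAIL depending on whether an integer value meets a minimum.
report() {
  local label=$1 actual=$2 required=$3
  if [ "$actual" -ge "$required" ]; then
    echo "OK $label=$actual (min $required)"
  else
    echo "FAIL $label=$actual (min $required)"
  fi
}

# Collect host values and compare them with the documented minimums.
preflight() {
  local cpus mem_gb disk_gb
  cpus=$(nproc)
  # MemTotal is reported in kB and is slightly below the installed RAM,
  # so round to the nearest GB instead of truncating.
  mem_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024 + 0.5)}' /proc/meminfo)
  # Available space (GB, rounded down) on the filesystem that holds $HOME.
  disk_gb=$(df -Pk "$HOME" | awk 'NR==2 {print int($4 / 1024 / 1024)}')
  report "vcpus" "$cpus" 4
  report "ram_gb" "$mem_gb" 8
  report "home_free_gb" "$disk_gb" 50
}
```

Run preflight as the non-root user that will own the containers. If your container storage lives outside $HOME, point the df call at that location instead.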
Setup instructions
IMPORTANT: Hybrid Deployment Agents that were configured before September 18, 2024 using the auth.json and config.json files should be updated to use the new token-based workflow documented in this setup guide. The auth.json option is deprecated. For more information about updating your agent to use the token-based workflow, see our FAQ documentation.
Create agent
Log in to your Fivetran account.
Go to the Destinations page and click Add destination.
Select your destination type.
Enter a Destination name of your choice.
Click Add.
In the destination setup form, set the Enable hybrid deployment toggle to ON.
Click + Configure new agent.
In the Configure a new agent pane, read the Fivetran On-Prem Software License Addendum, and select the I have read and agree to the terms of the License Addendum and the Software Specific Requirements checkbox.
Click Next.
Select the container runtime you want to use for your containers: Docker or Podman.
Click Next.
Enter an Agent name and click Generate agent token.
Make a note of the agent token and installation command. You will need the agent token for manual installation and the installation command for automated installation of the agent.
NOTE: Each Hybrid Deployment Agent has a unique token and installation command.
Click Save.
IMPORTANT: You must install and start the agent before completing the destination setup.
Install agent
You can install the agent using one of the following methods:
- Automated installation (recommended): Install and start the agent by running a single command.
- Manual installation: Create the agent directories and config.json file, and then start the agent manually.
Automated installation
Log in to your local machine using the Fivetran user.
Open a terminal and run the installation command that was generated when you created the agent on the Fivetran dashboard.
Example:
Before you run the command, you must set the value of TOKEN to your agent token and the value of RUNTIME to the container runtime (Docker or Podman) you selected on the Fivetran dashboard.
TOKEN="YOUR_TOKEN_HERE" RUNTIME=docker bash -c "$(curl -sL "https://raw.githubusercontent.com/fivetran/hybrid_deployment/main/install.sh")"
The installation command does the following:
- Creates the agent directories in $HOME/fivetran using the install.sh script.
- Creates the default config.json file with the agent token.
- Starts the agent container image with the container runtime you chose.
The installation command creates the agent directories in the following structure:
$HOME/fivetran        --> Agent home directory
├── hdagent.sh        --> Helper script to start/stop the agent container
├── conf              --> Config file location
│   └── config.json   --> Default config file
├── data              --> Persistent storage used during data pipeline processing
├── logs              --> Log file location
└── tmp               --> Local temporary storage used during data pipeline processing
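After the installation command finishes, you can confirm that the directory structure described above exists. The following is a minimal sketch under that assumption; check_agent_layout is a hypothetical helper, not part of install.sh or hdagent.sh.

```shell
#!/bin/bash
# Hypothetical helper: list entries missing from the agent home directory.
# Returns 0 when every expected entry exists, 1 otherwise.
check_agent_layout() {
  local base=${1:-$HOME/fivetran}
  local missing=0 entry
  for entry in hdagent.sh conf/config.json data logs tmp; do
    if [ ! -e "$base/$entry" ]; then
      echo "missing: $entry"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_agent_layout with no argument inspects $HOME/fivetran. No output and a zero exit status mean that all expected entries were found.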
Managing the agent
NOTE: The agent container will be automatically started during the automated installation process.
Use the hdagent.sh script to manage (start/stop) the agent container.
Usage:
./hdagent.sh [-r docker|podman] start|stop|status
NOTE: The default runtime is docker. If you use Podman, pass -r podman.
Manual installation
Configure local environment for agent
Log in to your local machine using the Fivetran user.
Run the following commands to create the agent directories:
mkdir -p $HOME/fivetran
cd $HOME/fivetran
mkdir -p data conf logs tmp
These commands create the agent directories in the following structure:
$HOME/fivetran   --> Agent home directory
├── conf         --> Config file location
├── data         --> Persistent storage used during data pipeline processing
├── logs         --> Log file location
└── tmp          --> Local temporary storage used during data pipeline processing
Create a configuration file, config.json, in the $HOME/fivetran/conf directory.
In the config.json file, add the agent token that was generated when you created the agent on the Fivetran dashboard.
{
  "token": "YOUR_AGENT_TOKEN"
}
IMPORTANT:
- We recommend that you add the agent token to the config.json file. However, you can skip this step and use the token as an environment variable when starting the agent container.
- By default, you do not have to add any additional configuration values to the config.json file. However, you can add additional values to the config.json file based on your requirements. For more information on the configuration parameters that you can add, see the Agent configuration parameters section of this topic.
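The directory and configuration steps above can be combined into one function. The following is a minimal sketch: write_agent_config is a hypothetical helper, and restricting the file to its owner with chmod 600 is an assumption made because the token is a credential, not a documented requirement.

```shell
#!/bin/bash
# Hypothetical helper: create the agent directories and a minimal config.json.
# Arguments: agent home directory, agent token.
write_agent_config() {
  local base=$1 token=$2
  mkdir -p "$base/conf" "$base/data" "$base/logs" "$base/tmp"
  # Write the only mandatory parameter, the agent token.
  printf '{\n  "token": "%s"\n}\n' "$token" > "$base/conf/config.json"
  # Assumption: limit access to the file owner because the file holds a secret.
  chmod 600 "$base/conf/config.json"
}
```

For example, write_agent_config "$HOME/fivetran" "YOUR_AGENT_TOKEN" produces the same structure as the commands above, with the token already in place.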
Start agent
Using Docker
Log in to your local machine using the Fivetran user.
Go to the base folder you created.
Create a Docker network and start the container.
#!/bin/bash

# Config file is expected in the conf/ sub folder
CONFIG_FILE=conf/config.json

# Token will be extracted from config file
TOKEN=$(grep -o '"token": *"[^"]*"' "$CONFIG_FILE" | sed 's/.*"token": *"\([^"]*\)".*/\1/')

# Extract controller id from token
CONTROLLER_ID=$(echo $TOKEN | base64 -d | cut -f1 -d":")

# Docker socket
SOCKET=/var/run/docker.sock

# Create docker network for agent container
docker network create -d bridge fivetran_ldp > /dev/null 2>&1

# Start agent container
docker run \
  -d \
  --restart "on-failure:3" \
  --pull "always" \
  --security-opt label=disable \
  --label fivetran=ldp \
  --label ldp_process_id=default-controller-process-id \
  --label ldp_controller_id=$CONTROLLER_ID \
  --name controller \
  --network fivetran_ldp \
  --env HOST_USER_HOME_DIR=$HOME \
  --env CONTAINER_ENV_TYPE="docker" \
  -v $HOME/fivetran/conf:/conf \
  -v $HOME/fivetran/logs:/logs \
  -v $SOCKET:/var/run/docker.sock \
  us-docker.pkg.dev/prod-eng-fivetran-ldp/public-docker-us/ldp-agent:production -f /conf/config.json
NOTES:
- The config file is expected to be in conf/config.json and should contain the token.
- If you are running Docker in rootless mode, set the SOCKET value to reflect the rootless socket. For example, SOCKET=/var/run/user/$uid/docker.sock. You can use $(id -u) to get and set the $uid value.
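The choice of SOCKET value described in the notes can be scripted. The following is a minimal sketch: docker_socket_path is a hypothetical helper, and treating the presence of a per-user socket as the signal for rootless mode is an assumption, so verify the result against your own Docker configuration.

```shell
#!/bin/bash
# Hypothetical helper: pick the Docker socket path for the start script.
# Arguments: numeric user ID (defaults to the current user) and the runtime
# directory root (defaults to /var/run/user, as in the rootless example).
docker_socket_path() {
  local uid=${1:-$(id -u)}
  local run_root=${2:-/var/run/user}
  local rootless="$run_root/$uid/docker.sock"
  # Assumption: an existing per-user socket means Docker runs in rootless mode.
  if [ -S "$rootless" ]; then
    echo "$rootless"
  else
    echo "/var/run/docker.sock"
  fi
}
```

In the start script, SOCKET=$(docker_socket_path) would then replace the hard-coded socket path.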
Stop agent
The following script identifies the agent container, and then stops and removes it:
#!/bin/bash
CONTAINER_ID=$(docker ps -a -q -f name="^/controller" -f label=fivetran=ldp)
docker stop $CONTAINER_ID
docker rm $CONTAINER_ID
docker network rm fivetran_ldp
(Optional) Using docker-compose
We recommend that you use docker run to start your agent. However, if you want to use docker-compose instead, create a YAML file similar to the following example and use it to start the agent.
IMPORTANT: Make sure the token is specified in your conf/config.json file and that your Docker socket is specified correctly.
services:
  controller:
    container_name: controller
    image: us-docker.pkg.dev/prod-eng-fivetran-ldp/public-docker-us/ldp-agent:production
    pull_policy: always
    restart: "no"
    labels:
      fivetran: ldp
      ldp_process_id: default-controller-process-id
      ldp_controller_id: <controller-id>
    security_opt:
      - label:disable
    environment:
      FIVETRAN_ENV: "prod"
      HOST_USER_HOME_DIR: $HOME
    volumes:
      - $HOME/fivetran/conf:/conf
      - $HOME/fivetran/logs:/logs
      - /var/run/docker.sock:/var/run/docker.sock
    command: -f /conf/config.json
networks:
  default:
    name: fivetran_ldp
    driver: bridge
NOTES:
- In your YAML file, replace <controller-id> with your controller ID. You can get the ID using echo $TOKEN | base64 -d | cut -f1 -d":".
- If you are running Docker in rootless mode, adjust the volume mount to reflect the rootless socket. For example, - /var/run/user/1000/docker.sock:/var/run/docker.sock. You can use $(id -u) to get the correct uid value (1000 in the example).
- We recommend that you use Docker Compose (v2.17.x or above). To start the agent, you can either use the docker compose sub-command or the latest docker-compose standalone utility.
- You must use docker stop <agent-container> and docker rm <agent-container> to stop and remove the container before starting it because docker-compose down does not always detect the agent container.
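The controller ID lookup mentioned in the notes can be wrapped in two small functions. The following is a minimal sketch that reuses the grep, sed, base64, and cut commands from this guide; the function names extract_token and controller_id_from_token are hypothetical.

```shell
#!/bin/bash
# Print the token value stored in a config.json file
# (same grep/sed approach as the start script).
extract_token() {
  grep -o '"token": *"[^"]*"' "$1" | sed 's/.*"token": *"\([^"]*\)".*/\1/'
}

# Decode the token and print the part before the first colon,
# which the start script uses as the controller ID.
controller_id_from_token() {
  printf '%s' "$1" | base64 -d | cut -f1 -d":"
}
```

For example, controller_id_from_token "$(extract_token conf/config.json)" prints the value to use in place of <controller-id>.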
Using Podman
Log in to your local machine using the Fivetran user.
Go to the base folder you created.
Create a Podman network and start the container.
#!/bin/bash

# Config file is expected in the conf/ sub folder
CONFIG_FILE=conf/config.json

# Token will be extracted from config file
TOKEN=$(grep -o '"token": *"[^"]*"' "$CONFIG_FILE" | sed 's/.*"token": *"\([^"]*\)".*/\1/')

# Extract controller id from token
CONTROLLER_ID=$(echo $TOKEN | base64 -d | cut -f1 -d":")

# Podman socket
XDG_RUNTIME_DIR=/run/user/$(id -u)
SOCKET=$XDG_RUNTIME_DIR/podman/podman.sock

# Create podman network for agent container
podman network create -d bridge fivetran_ldp > /dev/null 2>&1

# Start agent container
podman run \
  -d \
  --restart "on-failure:3" \
  --pull "always" \
  --security-opt label=disable \
  --label fivetran=ldp \
  --label ldp_process_id=default-controller-process-id \
  --label ldp_controller_id=$CONTROLLER_ID \
  --name controller \
  --network fivetran_ldp \
  --env HOST_USER_HOME_DIR=$HOME \
  --env CONTAINER_ENV_TYPE="podman" \
  -v $HOME/fivetran/conf:/conf \
  -v $HOME/fivetran/logs:/logs \
  -v $SOCKET:/run/user/1000/podman/podman.sock \
  us-docker.pkg.dev/prod-eng-fivetran-ldp/public-docker-us/ldp-agent:production -f /conf/config.json
NOTES:
- The config file is expected to be in conf/config.json and should contain the token.
- If you are running Podman in rootless mode, set the SOCKET value to reflect the rootless socket.
Stop agent
The following script identifies the agent container, and then stops and removes it:
#!/bin/bash
CONTAINER_ID=$(podman ps -a -q -f name="^/controller" -f label=fivetran=ldp)
podman stop $CONTAINER_ID
podman rm $CONTAINER_ID
podman network rm fivetran_ldp
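The stop scripts for Docker and Podman differ only in the CLI name, and both pass an empty container ID to the stop and rm commands when no agent container exists. The following is a minimal sketch of a shared wrapper that handles that case; stop_agent is a hypothetical function, and it takes the runtime CLI (docker or podman) as its first argument.

```shell
#!/bin/bash
# Hypothetical wrapper around the stop scripts in this guide.
# Argument: runtime CLI to call (docker or podman; defaults to docker).
stop_agent() {
  local runtime=${1:-docker}
  local container_id
  # Same filters as the stop scripts: container name and the fivetran=ldp label.
  container_id=$("$runtime" ps -a -q -f name="^/controller" -f label=fivetran=ldp)
  if [ -z "$container_id" ]; then
    echo "no agent container found"
    return 0
  fi
  "$runtime" stop "$container_id"
  "$runtime" rm "$container_id"
  "$runtime" network rm fivetran_ldp
}
```

Call stop_agent docker or stop_agent podman, depending on the runtime you selected when you created the agent.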
Verify agent status
You can verify the agent status by doing any of the following:
- Run docker ps -a or podman ps -a to verify whether the agent container is running.
- Review the agent container logs.
- Verify the agent health check endpoint (http://localhost:8090/healthz).
- On the Fivetran dashboard, go to Account Settings > General > Hybrid Deployment Agents and verify the agent status.
NOTE: You can manage all the agents associated with your Fivetran account on the Hybrid Deployment Agents tab.
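The health check endpoint can also be polled from a script, for example right after starting the agent. The following is a minimal sketch: wait_for_agent is a hypothetical helper, it assumes curl is installed, and it treats any HTTP success status as healthy because the response body of the endpoint is not described here.

```shell
#!/bin/bash
# Hypothetical helper: poll the health check endpoint until it answers
# with an HTTP success status or the attempts run out.
# Arguments: URL, number of attempts, delay between attempts in seconds.
wait_for_agent() {
  local url=${1:-http://localhost:8090/healthz}
  local attempts=${2:-10} delay=${3:-3}
  local i
  for ((i = 1; i <= attempts; i++)); do
    # -f makes curl fail on HTTP error statuses; -sS hides progress but keeps errors.
    if curl -fsS -o /dev/null "$url"; then
      echo "agent healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "agent not healthy after $attempts attempt(s)"
  return 1
}
```

If you changed the port parameter in config.json, pass the matching URL as the first argument.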
Agent configuration parameters
The only mandatory configuration parameter needed in the config.json file to start the agent is token. You can find the agent token on the Fivetran dashboard while creating the agent. The token is unique to each agent and is used to establish a secure connection to Fivetran. All other configurations are optional, with the default settings being sufficient for most use cases.
If you need any additional configuration specific to your environment, this section contains the list of supported configuration parameters and their descriptions. To apply these settings, create a config.json file with the required parameters and start the agent using this configuration file. We recommend that you store the configuration file in $HOME/fivetran/conf.
You can define all the configuration parameters either in the configuration file (recommended) or as environment variables when starting the agent. Environment variables take precedence and override the values in the configuration file. If you do not update any parameter value, the default settings will be applied.
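The precedence rule above (environment variable first, then configuration file, then default) can be illustrated with a small lookup function. The following is a sketch of the rule only, not the agent's implementation: resolve_param is a hypothetical function, and it assumes that the environment variable is the upper-case form of the parameter name, which this guide shows only for CONTAINER_ENV_TYPE.

```shell
#!/bin/bash
# Illustration of the precedence rule: environment variable, then
# configuration file, then default value.
# Assumption: the environment variable is the upper-case parameter name.
# Arguments: parameter name, default value, path to config.json.
resolve_param() {
  local name=$1 default=$2 config_file=$3
  local env_name file_value
  env_name=$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')
  # 1. A non-empty environment variable wins.
  if [ -n "${!env_name:-}" ]; then
    echo "${!env_name}"
    return 0
  fi
  # 2. Otherwise read a simple string, number, or boolean from a flat JSON file.
  file_value=$(sed -n 's/.*"'"$name"'": *"\{0,1\}\([^",} ]*\)"\{0,1\}.*/\1/p' "$config_file" 2>/dev/null | head -n 1)
  # 3. Fall back to the default when the file does not set the parameter.
  echo "${file_value:-$default}"
}
```

For example, resolve_param container_env_type docker conf/config.json prints the value from the file, unless CONTAINER_ENV_TYPE is set in the environment.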
Example of a basic configuration:
config.json
{
"token": "YOUR_TOKEN_HERE",
"container_env_type": "docker",
"host_persistent_storage_mount_path": "~/fivetran/data",
"host_selinux_enabled": false,
"save_controller_logs_to_file": true
}
The configuration options supported with their default values are listed in the table below.
Parameter | Default Value | Description |
---|---|---|
container_cpu_limit | 0 | Agent container CPU limit. The default value 0 indicates that the container does not have any CPU limit. |
container_env_type | docker | Container runtime used to run the agent containers. Default value: docker. Possible values: docker, podman. |
container_memory_limit_gigabytes | 0 | Agent container memory limit in GB. The default value 0 indicates that the container does not have any memory limit. |
container_podman_sock_file_mount_path | unix:///run/user/1000/podman/podman.sock | Default agent container Podman socket. |
controller_disk_space_abort_threshold_bytes | 102400 | Maximum threshold for agent host disk space. The agent aborts its operations at this threshold. Default value: 100 MB. |
controller_disk_space_threshold_bytes | 4294967296 | Maximum threshold for agent host disk space warning. A warning appears on your dashboard at this threshold. Default value: 4 GB. |
docker_pull_retries | 10 | Retry count for pulling new container images. |
docker_pull_timeout_seconds | 300 | Timeout value in seconds for pulling new container images. |
host_persistent_storage_mount_path | ~/fivetran/data | Default persistent storage location used during pipeline data processing. This location should have sufficient disk space to hold your full dataset during the initial sync. |
host_persistent_temp_storage_mount_path | null (no value) | This location is optional. Temporary storage location used during pipeline data processing. If set, this location should have sufficient disk space to hold your full dataset during the initial sync. Example: "~/fivetran/tmp" |
host_selinux_enabled | false | If SELinux is enabled on the host running your agent containers, set this parameter to true. Default value: false. |
log_clean_frequency_milliseconds | 1800000 | How often logs must be cleaned up. |
log_folder_path | /logs | Default location inside agent container where logs will be stored. |
log_retention_days | 3 | Log file retention in days. |
poll_container_status_interval_seconds | 10 | Poll interval used for checking the container status. |
port | 8090 | Agent health-check port number. |
profile | docker | Agent profile to use. Default value: docker. Possible values: docker, podman. |
save_controller_logs_to_file | true | Enables saving the agent container log to local file. Default value: true. |
save_job_logs_to_file | false | Enables saving the job (container) logs to local files. Default value: false. |
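Some defaults in the table are expressed in bytes or milliseconds. The following is a minimal sketch that converts two of them to more readable units with shell integer arithmetic; the function names bytes_to_gib and ms_to_minutes are hypothetical.

```shell
#!/bin/bash
# Convert a byte count to whole GiB (integer division).
bytes_to_gib() { echo $(( $1 / 1024 / 1024 / 1024 )); }

# Convert milliseconds to whole minutes (integer division).
ms_to_minutes() { echo $(( $1 / 1000 / 60 )); }

bytes_to_gib 4294967296   # controller_disk_space_threshold_bytes default, prints 4
ms_to_minutes 1800000     # log_clean_frequency_milliseconds default, prints 30
```

The same functions help when you choose custom values, for example to confirm a new threshold before you add it to config.json.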
Related articles
Hybrid Deployment Overview
Hybrid Deployment FAQ
API Hybrid Deployment Agent Management