Hybrid Deployment Frequently Asked Questions
Read answers to frequently asked questions about Hybrid Deployment.
How does Fivetran secure the connection between its cloud and the agent?
In the Hybrid Deployment model, you establish an outbound connection to a secured Fivetran endpoint using mTLS. You do not have to open any ports because Fivetran does not connect to your local environment. You can also limit the outbound traffic to this endpoint.
When registering a Hybrid Deployment Agent, Fivetran generates and provides an auth.json file containing the certificates specific to your account and agent.
What data does Fivetran receive from my local environment?
Fivetran receives only metadata from your local environment.
The agent in your environment sends the following information to the Fivetran cloud environment:
- Registration information: When you start the agent on your machine, it sends the following data to the orchestration component of the Fivetran cloud:
- Orchestration server hostname
- Orchestration server port number
- Client certificate and key
- Hybrid Deployment Agent logs: Fivetran logs only the internal tracing messages for your agent, and the agent sends them securely to the Fivetran cloud. These logs contain the agent registration status, sync job status, container errors, etc. A copy of these logs is available on your local machine for review. By default, these logs are available in ~/Fivetran/logs/<processing_agent_name>. However, you can change their location. You can also disable the logging in your local environment.
- Sync (data pipeline processing) logs: The sync logs contain internal tracing messages of all internal Fivetran processes and events, such as sync start and end times and sync errors. These logs contain only the metadata of synced objects to indicate when Fivetran processes the objects and their status. A copy of these logs is available on your local machine for review. By default, these logs are available in ~/Fivetran/logs/<processing_agent_name>/jobs/<process_id>. However, you can disable logging in your local environment.
- Hybrid Deployment Agent metrics: The agent sends the following metrics to indicate its performance and status:
- Start time and run duration of the agent
- Initiation status of the sync jobs and their failures, internal integration ID, process ID, and Docker container metadata for debugging purposes
- Initiation status of the test jobs (for validating connector and destination credentials) and their internal integration ID, process ID, and Docker container metadata
- Initiation status of the schema config jobs (for retrieving connector schema and object details), internal integration ID, process ID, and Docker container metadata
- Sync job metrics: The agent sends the following sync job metrics:
- Number of rows extracted or loaded per object
- Volume of data extracted or loaded
- Processing time for data extraction or loading
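To review the local copies of these logs, a small helper like the following can find the most recently modified file in a log directory. This is a minimal sketch: the function name latest_log is hypothetical and not part of the Fivetran tooling, and the path in the usage comment assumes the default log location with your own agent name substituted.

```shell
# Print the path of the most recently modified non-directory entry in a directory.
# Hypothetical helper; not part of the Fivetran tooling.
latest_log() {
    dir=$1
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # ls -t sorts newest first; -p marks directories with a trailing slash
    # so that they can be filtered out (for example, the jobs subdirectory).
    newest=$(ls -tp "$dir" 2>/dev/null | grep -v '/$' | head -n 1)
    [ -n "$newest" ] || { echo "no files in $dir" >&2; return 1; }
    printf '%s/%s\n' "$dir" "$newest"
}

# Example (replace <processing_agent_name> with your agent's name):
#   tail -n 50 "$(latest_log "$HOME/Fivetran/logs/<processing_agent_name>")"
```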
Does the Hybrid Deployment Agent connect to the source or destination?
No, the agent does not connect directly to the source or destination. The primary role of the agent is to start the jobs that handle various tasks in the data pipeline. These tasks include performing connection tests (setup tests) for the source and destination, retrieving the source schema (for example, tables and columns), and executing extract, process, and load operations. While the agent initiates and manages these jobs, including cleaning them up after completion, it does not establish direct connections to the source or destination.
How does Fivetran maintain the security of container images?
We regularly scan all the container images to detect any potential security vulnerabilities. We perform these scans using Static Application Security Testing (SAST) and Software Composition Analysis (SCA) tools.
Can any Hybrid Deployment Agent run my sync jobs?
No, a Hybrid Deployment Agent is specific to a particular Fivetran user account and destination. Each combination of user account, destination, and agent has a unique registration in Fivetran and has a specific TLS certificate.
When the Fivetran Cloud Orchestrator schedules a job, it creates a new OAuth token and sends it to the agent. This token is specific to a user account and valid only for 60 minutes. The OAuth token lets the agent download the necessary containers from our Artifact Registry.
Are all connections to the Fivetran dashboard (UI) secure?
Yes, Fivetran uses TLS (v1.2 or above) to encrypt all connections to its dashboard. It does not allow any direct connections between the dashboard and the agent, processes, or containers.
How does Fivetran secure my source and destination credentials?
By default, your connections to your source and destination are SSL-encrypted. Fivetran securely stores your credentials in a key management system backed by a hardware security module. Fivetran's cloud service provider manages this hardware security module. You can also use your own keys for additional control over the encryption Fivetran uses.
Is the Hybrid Deployment Agent FIPS 140-2 compliant?
Fivetran has not yet tested the Hybrid Deployment Agent on a FIPS 140-2-enabled machine. Therefore, it is not currently FIPS 140-2 certified.
Where can I access Fivetran's compliance reports, security certifications, and policies?
You can access them in Fivetran's Trust Center.
Which external IP addresses does Hybrid Deployment access?
In addition to the connector and destination, Hybrid Deployment uses the following outbound connections:
- mTLS connection to the Fivetran Orchestration Service: 35.188.225.82 - ldp.orchestrator.fivetran.com
- HTTPS with secure token to Fivetran Public API: 35.236.237.87 - api.fivetran.com
- Google Artifact Registry - us-docker.pkg.dev
- GitHub repository hosting the automated installation script for Docker and Podman: raw.githubusercontent.com/fivetran/hybrid_deployment
- Logs used by the Fivetran Platform Connector - storage.googleapis.com/fivetran-metrics-log-sr
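If you restrict outbound traffic, it can help to confirm that the host can reach these endpoints before you install the agent. The sketch below is one way to do that. It assumes port 443 for every endpoint (the list above does not state ports), assumes that bash and the coreutils timeout command are installed, and uses hypothetical function names. It checks TCP reachability only, not the mTLS or token authentication.

```shell
# Hostnames taken from the list above; port 443 is an assumption.
hd_endpoints() {
    cat <<'EOF'
ldp.orchestrator.fivetran.com 443
api.fivetran.com 443
us-docker.pkg.dev 443
raw.githubusercontent.com 443
storage.googleapis.com 443
EOF
}

# Return 0 if a TCP connection to host:port opens within 5 seconds.
# Relies on bash's /dev/tcp pseudo-device, so bash must be installed.
check_endpoint() {
    timeout 5 bash -c ': >"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Check every endpoint and print one result line per endpoint.
check_all_endpoints() {
    hd_endpoints | while read -r host port; do
        if check_endpoint "$host" "$port"; then
            echo "OK   $host:$port"
        else
            echo "FAIL $host:$port"
        fi
    done
}
```

Running check_all_endpoints prints one OK or FAIL line per endpoint. A FAIL line usually points to a firewall or proxy rule rather than to a problem with the agent itself.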
How can I create a new Linux user?
Log in to the machine where you want to host the Hybrid Deployment Agent.
Run the following command to create a user group for Fivetran:
sudo groupadd <group_name>
NOTE: Replace <group_name> in the command with a user group name of your choice.
Run the following command to create a user for Fivetran:
sudo useradd -g <group_name> -m <username>
NOTE: Replace <group_name> in the command with the name of the user group you created and <username> with a username of your choice.
Run the following command to switch to the user you created for Fivetran:
sudo su - <username>
NOTE: Replace <username> in the command with the name of the user you created for Fivetran.
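After creating the account, you may want to confirm that the user exists and belongs to the intended group. The following is a small sketch built on the standard id command; the function name user_in_group is hypothetical.

```shell
# Return 0 if the given user exists and is a member of the given group.
user_in_group() {
    user=$1
    group=$2
    # id -nG prints the user's group names separated by spaces;
    # it fails if the user does not exist.
    memberships=$(id -nG "$user" 2>/dev/null) || return 1
    for g in $memberships; do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}

# Example (replace the placeholders with your own names):
#   user_in_group <username> <group_name> && echo "user is set up"
```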
How can I configure Docker in rootless mode?
For better security, we recommend that you always run Docker in rootless mode. For more information about how to configure Docker in rootless mode, see Docker documentation.
IMPORTANT:
- Using the xfs filesystem for /home ($HOME) is highly recommended.
- A minimum of 50GB of free disk space for /home is recommended.
- Before you enable Docker in rootless mode, make sure you stop the system-wide Docker service and socket with the following command:
sudo systemctl disable --now docker.service docker.socket
- The Docker rootless binaries will be available in the $HOME/bin folder.
Example steps using the Docker-supplied installation script (run as the non-root user):
curl -fsSL "https://get.docker.com/rootless" | sh
The above command should already have started the Docker service. If it has not started, do the following to enable and start it:
Run the following as the non-root user:
systemctl --user enable --now docker.service
systemctl --user start --now docker.service
systemctl --user status docker.service
Run the following command to allow services to run even after the user logs out:
sudo loginctl enable-linger <username>
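Once the service is running, you can check that the daemon your Docker client talks to is the rootless one. The sketch below relies on docker info listing name=rootless among its security options when the daemon runs in rootless mode. The function name is hypothetical, and it assumes the docker client is on your PATH and points at the rootless socket.

```shell
# Return 0 if the Docker daemon reports rootless mode.
docker_is_rootless() {
    # SecurityOptions is a list such as
    # [name=seccomp,profile=builtin name=rootless name=cgroupns].
    docker info --format '{{.SecurityOptions}}' 2>/dev/null | grep -q 'name=rootless'
}

# Example:
#   docker_is_rootless && echo "rootless" || echo "not rootless, or daemon unreachable"
```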
How can I configure Podman in rootless mode?
For better security, we recommend that you always run Podman in rootless mode. By default, Podman runs in rootless mode on most systems.
IMPORTANT:
- Using the xfs filesystem for /home ($HOME) is highly recommended.
- A minimum of 50GB of free disk space for /home is recommended.
To configure Podman in rootless mode, do the following:
Ensure that the XDG_RUNTIME_DIR environment variable is set.
export XDG_RUNTIME_DIR=/run/user/$(id -u)
NOTE: To persist this, you can add it to the .bashrc file associated with the user.
Run the following commands to start the Podman socket in rootless mode:
systemctl --user enable --now podman.socket
systemctl --user start --now podman.socket
Switch to a root user and run the following command to allow services to run even after the user logs out:
sudo loginctl enable-linger <username>
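To confirm that the runtime directory and lingering are set up as described, you can run two quick checks. The following is a minimal sketch with hypothetical function names; it assumes a systemd-based host where loginctl is available.

```shell
# Return 0 if systemd lingering is enabled for the given user.
linger_enabled() {
    # loginctl prints a line of the form "Linger=yes" or "Linger=no".
    [ "$(loginctl show-user "$1" --property=Linger 2>/dev/null)" = "Linger=yes" ]
}

# Return 0 if XDG_RUNTIME_DIR has the value exported in the step above.
runtime_dir_ok() {
    [ "${XDG_RUNTIME_DIR:-}" = "/run/user/$(id -u)" ]
}

# Example (replace <username> with the user you created):
#   linger_enabled <username> && runtime_dir_ok && echo "ready"
```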
What are the storage requirements when using containers?
We recommend allocating at least 50GB of disk space for Docker or Podman to store container images and run containers.
Monitor this location, because it may require additional space when you process large datasets.
The location that Docker or Podman uses for storing and running containers depends on its configuration, specifically whether it runs in rootful or rootless mode. Rootless mode is more secure and is the recommended option.
In rootful mode, the default location is usually under /var/lib/, for example:
- Docker: /var/lib/docker
- Podman: /var/lib/podman
In rootless mode, the default location is in the user's home folder, for example: $HOME/.local
NOTES:
- These locations are configurable. Use the docker info (Docker Root Dir) or podman info commands to identify the location.
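As a rough way to monitor this, the sketch below reports the free space on the filesystem that holds a given directory and compares it with the 50GB recommendation. The function names are hypothetical. The directory would typically be the value reported by docker info --format '{{.DockerRootDir}}' or podman info --format '{{.Store.GraphRoot}}'.

```shell
# Print the free space, in whole GB, of the filesystem containing a path.
free_gb() {
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Return 0 if the path has at least the given number of GB free (default 50).
has_min_free_gb() {
    [ "$(free_gb "$1")" -ge "${2:-50}" ]
}

# Example:
#   has_min_free_gb "$(docker info --format '{{.DockerRootDir}}')" || echo "low disk space"
```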
How can I update my agent to use the new token-based workflow?
You can update your existing agent using one of the following methods:
- Automated update (recommended)
- Manual update
Automated update
Use this method if you are using the default $HOME/fivetran location for your Hybrid Deployment Agent.
If you are using a custom location, use the manual update steps in the next section.
Stop the agent container.
Regenerate the token for the agent.
You can do this from the Hybrid Deployment Agents tab in the Fivetran dashboard: go to Account Settings > General > Hybrid Deployment Agents, identify your agent, and regenerate the agent token.
Because you are using the default $HOME/fivetran location, you can now run the automated installation command with the newly generated token.
The install script detects your current config.json, updates it, and moves it to $HOME/fivetran/conf. The auth.json and docker-compose.yaml files are left as is and can be removed after the update.
Install and start the new agent using the automated installation command.
Manage the agent container using the hdagent.sh script.
Manual update
Use this method if you are not using the default location $HOME/fivetran.
Stop the agent container.
Regenerate the token for the agent.
You can do this from the Hybrid Deployment Agents tab in the Fivetran dashboard: go to Account Settings > General > Hybrid Deployment Agents, identify your agent, and regenerate the agent token to use in the next step.
Install and start the new agent manually.
How can I start or stop my agent containers?
You use the hdagent.sh script to manage (start or stop) your agent container. To use this script, run the following:
./hdagent.sh [-r docker] start|stop|status
NOTE: If you are using Podman, replace docker with podman in the command.
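If you script around hdagent.sh, you might select the runtime automatically. This sketch assumes that the script is in the current directory and that the -r option accepts docker or podman as shown above. The detect_runtime helper is hypothetical and simply prefers Docker when both clients are installed.

```shell
# Print "docker" or "podman" depending on which client is installed.
detect_runtime() {
    if command -v docker >/dev/null 2>&1; then
        echo docker
    elif command -v podman >/dev/null 2>&1; then
        echo podman
    else
        echo "neither docker nor podman found" >&2
        return 1
    fi
}

# Example:
#   runtime=$(detect_runtime) && ./hdagent.sh -r "$runtime" status
```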
How can I configure my agent to automatically start after the host machine is rebooted?
Depending on your container runtime environment, run one of the following commands to stop the agent:
Run the following command for Docker:
./hdagent.sh [-r docker] stop
Run the following command for Podman:
./hdagent.sh [-r podman] stop
Go to the agent installation directory ($HOME/fivetran) and open the hdagent.sh file.
In the hdagent.sh file, replace --restart "on-failure:3" with --restart "always".
Start the HD agent.
./hdagent.sh [-r docker|podman] start
Depending on your container runtime environment, run one of the following commands to start the agent:
Run the following command for Docker:
./hdagent.sh [-r docker] start
Run the following command for Podman:
./hdagent.sh [-r podman] start
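The edit to hdagent.sh can also be made non-interactively. The following is a sketch that assumes the file contains the exact text --restart "on-failure:3" as described above. The function name is hypothetical, and a backup copy is written next to the original with a .bak suffix.

```shell
# Replace the restart policy in the given script, keeping a .bak backup.
set_restart_always() {
    file=$1
    # Refuse to touch the file if the expected text is not present.
    grep -q -- '--restart "on-failure:3"' "$file" || {
        echo "restart policy not found in $file" >&2
        return 1
    }
    sed -i.bak 's/--restart "on-failure:3"/--restart "always"/' "$file"
}

# Example (stop the agent first, then start it again afterwards):
#   set_restart_always "$HOME/fivetran/hdagent.sh"
```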
How can I verify whether my Persistent Volume Claim is working?
When you deploy the Hybrid Deployment Agent in Kubernetes, you must have a persistent shared storage across the worker nodes.
To create the persistent shared storage, you must specify the Storage Class (SC), define the Persistent Volume (PV), and then create a Persistent Volume Claim (PVC). Once you create the PVC, you can run a quick test by creating a Pod that can access the storage and write a file to it.
To check your persistent shared storage, create a YAML file using the sample content below. The sample content uses a PVC named your-data-vol-claim-name and a namespace named default. You can modify these values according to your environment and deploy the file using kubectl apply -f <file_name>.
apiVersion: v1
kind: Pod
metadata:
  name: test-pvc-pod
  namespace: default
spec:
  containers:
    - name: test-pvc-container
      image: alpine
      command: ["/bin/sh"]
      args: ["-c", "echo 'Testing pvc storage is working' >> /data/$(date -u).txt; tail -f /dev/null"]
      volumeMounts:
        - name: test-storage
          mountPath: /data
  volumes:
    - name: test-storage
      persistentVolumeClaim:
        claimName: your-data-vol-claim-name
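After applying the manifest, you can confirm that the Pod started and wrote its test file, and then clean up. The commands below are a sketch that uses the Pod name and namespace from the sample above; the wrapper function name and the 120-second timeout are arbitrary choices.

```shell
# Wait for the test Pod, list the files it wrote to the volume, then delete it.
verify_pvc_pod() {
    ns=${1:-default}
    kubectl wait --for=condition=Ready pod/test-pvc-pod -n "$ns" --timeout=120s || return 1
    # A .txt file named after the Pod's start time should appear in /data.
    kubectl exec -n "$ns" test-pvc-pod -- ls -l /data || return 1
    kubectl delete pod test-pvc-pod -n "$ns"
}

# Example:
#   verify_pvc_pod default
```

If the Pod stays in the Pending state, kubectl describe pod test-pvc-pod -n default usually shows whether the claim failed to bind.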