Amazon S3 as Target
Fivetran HVR supports integrating changes into an S3 location. This section describes the configuration requirements for integrating changes into an S3 location using Integrate and Refresh.
Customize Integrate
Defining action Integrate is sufficient for integrating changes into an S3 location. However, the default file format written into a target file location is HVR's own XML format and the changes captured from multiple tables are integrated as files into one directory. The integrated files are named using the integrate timestamp.
You may define other actions to customize the default integration behavior described above. The following are a few examples of actions that can be used to customize integration into the S3 location:
Group | Table | Action | Annotation |
---|---|---|---|
S3 | * | FileFormat | This action may be defined to change the format of the files written to the target location from the default XML format. |
S3 | * | Integrate | To segregate and name the files integrated into the target location, define parameter RenameExpression. For example, if RenameExpression={hvr_tbl_name}/{hvr_integ_tstamp}.csv is defined, then for each table in the source, a separate folder (with the same name as the table) is created in the target location, and the files replicated for each table are saved into these folders. This also enforces unique file names, because each file is named with the timestamp of the moment it was integrated into the target location. |
S3 | * | ColumnProperties | This action defines properties for a column being replicated. |
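The effect of the RenameExpression example in the table can be illustrated with a small sketch. HVR performs the real substitution itself; the function name, the table names, and the timestamp format below are assumptions made only for illustration.

```python
from datetime import datetime, timezone

# Pattern taken from the RenameExpression example in the table.
RENAME_EXPRESSION = "{hvr_tbl_name}/{hvr_integ_tstamp}.csv"


def expand_rename_expression(pattern: str, table: str, integrated_at: datetime) -> str:
    """Mimic the substitution of the two placeholders used in the example."""
    # Assumption: a compact, sortable timestamp; HVR's actual format may differ.
    stamp = integrated_at.strftime("%Y%m%d%H%M%S%f")
    return pattern.format(hvr_tbl_name=table, hvr_integ_tstamp=stamp)


# Hypothetical integrate moment and source tables.
when = datetime(2024, 1, 15, 10, 30, 0, 123000, tzinfo=timezone.utc)
keys = [expand_rename_expression(RENAME_EXPRESSION, t, when) for t in ("orders", "customers")]
for key in keys:
    print(key)
```

Each table ends up under its own folder prefix, and the timestamp component keeps file names distinct within that folder.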
S3 Encryption
HVR supports client or server-side encryption of the files uploaded into S3 locations. For more information about S3 data encryption, refer to the AWS Documentation.
In HVR's UI, client or server-side encryption for an S3 location can be enabled while creating a location or by editing an existing location's source and target properties. The available options are listed below; the location property equivalent to each UI field is shown in parentheses.
- Client-Side with Master Key: Enables client-side encryption using a master symmetric key for AES.
  - MASTER SYMMETRIC KEY (S3_Encryption_Master_Symmetric_Key) must be supplied for this encryption method.
  - MATERIALS DESCRIPTION (S3_Encryption_Materials_Description) field is optional.
- Client-Side with KMS: Enables client-side encryption with customer master keys (CMKs) stored in AWS Key Management Service (KMS).
  - CUSTOMER MASTER KEY ID (S3_Encryption_KMS_Customer_Master_Key_Id) must be specified. This can be the ID of a key available in your own account or the full ARN if the key is in another account.
  - Options for Client-Side with KMS:
    - No KMS Credentials: Use the region and credentials of the S3 connection.
    - KMS Region and Key: Specify the region and credentials. KMS REGION (S3_Encryption_KMS_Region), KMS KEY ID (S3_Encryption_KMS_Access_Key_Id), and KMS SECRET KEY (S3_Encryption_KMS_Secret_Access_Key) must be specified.
    - KMS Instance Profile: Use an AWS IAM role. IAM ROLE (S3_Encryption_KMS_IAM_Role) must be specified.
  - MATERIALS DESCRIPTION (S3_Encryption_Materials_Description) field is optional.
- Server-Side (S3_Encryption_SSE): Enables server-side encryption with Amazon S3 managed keys.
  - MATERIALS DESCRIPTION (S3_Encryption_Materials_Description) field is optional.
- Server-Side with KMS (S3_Encryption_SSE_KMS): Enables server-side encryption with customer master keys (CMKs) stored in AWS Key Management Service (KMS).
  - CUSTOMER MASTER KEY ID (S3_Encryption_KMS_Customer_Master_Key_Id): This can be the ID of a key available in your own account or the full ARN if the key is in another account. If CUSTOMER MASTER KEY ID is not specified, a KMS-managed CMK is used.
  - MATERIALS DESCRIPTION (S3_Encryption_Materials_Description) field is optional.
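The required fields for each encryption method can be summarized as a small check. This is only a sketch: the property names come from the list above, but the helper function, the use of the UI labels as dictionary keys, and the sample values are assumptions and not part of HVR.

```python
# Required location properties per encryption method, as listed above.
# The UI labels are used here only as dictionary keys.
REQUIRED_PROPERTIES = {
    "Client-Side with Master Key": ["S3_Encryption_Master_Symmetric_Key"],
    "Client-Side with KMS": ["S3_Encryption_KMS_Customer_Master_Key_Id"],
    "Server-Side": [],
    "Server-Side with KMS": [],  # the customer master key ID is optional here
}

# Extra properties required by each credential option of Client-Side with KMS.
KMS_CREDENTIAL_PROPERTIES = {
    "No KMS Credentials": [],
    "KMS Region and Key": [
        "S3_Encryption_KMS_Region",
        "S3_Encryption_KMS_Access_Key_Id",
        "S3_Encryption_KMS_Secret_Access_Key",
    ],
    "KMS Instance Profile": ["S3_Encryption_KMS_IAM_Role"],
}


def missing_properties(method, properties, kms_option="No KMS Credentials"):
    """Return the required property names that are absent or empty."""
    required = list(REQUIRED_PROPERTIES[method])
    if method == "Client-Side with KMS":
        required += KMS_CREDENTIAL_PROPERTIES[kms_option]
    return [name for name in required if not properties.get(name)]


# Placeholder values, for illustration only.
example = {
    "S3_Encryption_KMS_Customer_Master_Key_Id": "example-key-id",
    "S3_Encryption_KMS_Region": "eu-west-1",
}
missing = missing_properties("Client-Side with KMS", example, "KMS Region and Key")
print(missing)
```

In this example the access key ID and secret access key are reported as missing, because the KMS Region and Key option requires all three KMS fields.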
State Directory
By default, HVR creates its internal state files in a sub-directory named _hvr_state within the location’s top directory.
This option in the HVR UI allows you to specify a custom path for HVR’s internal state files, which are used during file replication. The state directory can be configured as a path within the location’s top directory or placed outside of it. If a relative path (e.g., ../work) is specified, it is interpreted relative to the location’s top directory.
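The path resolution described above can be sketched as follows. This is a minimal illustration, assuming POSIX-style paths; the function name and the directory /data/hvr/target are hypothetical, and HVR's own resolution logic may differ in detail.

```python
import posixpath


def resolve_state_directory(top_directory: str, state_directory: str = "_hvr_state") -> str:
    """Resolve a state directory setting against the location's top directory."""
    # posixpath.join keeps only the second argument when it is an absolute path,
    # so absolute settings are returned unchanged (after normalization).
    return posixpath.normpath(posixpath.join(top_directory, state_directory))


default_dir = resolve_state_directory("/data/hvr/target")
relative_dir = resolve_state_directory("/data/hvr/target", "../work")
print(default_dir)   # default sub-directory inside the top directory
print(relative_dir)  # relative path that ends up outside the top directory
```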
If the state directory is on the same file system as the location’s top directory, HVR ensures that file move operations are ‘atomic,’ so users only see fully written files and never partial versions.
This option is equivalent to the location property File_State_Directory.
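The reason a shared file system matters for atomic moves can be shown with a generic write-then-rename pattern on a local file system. This sketch is not HVR's implementation; the function name and file names are hypothetical.

```python
import os
import tempfile


def publish_atomically(state_dir: str, final_path: str, data: bytes) -> None:
    """Write the complete file in the state directory, then move it into place."""
    fd, tmp_path = tempfile.mkstemp(dir=state_dir)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    # os.replace is an atomic rename only when both paths are on the same
    # file system; readers then see either no file or the complete file.
    os.replace(tmp_path, final_path)


with tempfile.TemporaryDirectory() as top:
    state = os.path.join(top, "_hvr_state")
    os.mkdir(state)
    target = os.path.join(top, "orders.csv")
    publish_atomically(state, target, b"id,amount\n1,9.99\n")
    with open(target, "rb") as handle:
        published = handle.read()
    leftovers = os.listdir(state)
```

If the temporary file and the final path were on different file systems, the move would fall back to a copy, during which a reader could observe a partially written file.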
Intermediate Directory
This option in the HVR UI allows you to specify a directory path for storing intermediate (temporary) files generated during Compare. These files are created during both "direct file compare" and "online compare" operations.
Using an intermediate directory can enhance performance by ensuring that temporary files are stored in a location optimized for the system's data processing needs.
This setting is particularly relevant for target file locations, as it determines where the intermediate files are placed during the Compare operation. If this option is not enabled, the intermediate files are stored by default in the integratedir/_hvr_intermediate directory, where integratedir is the replication DIRECTORY (File_Path) defined for the target file location.
This option is equivalent to the location property Intermediate_Directory.
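The default described above can be expressed as a short sketch. The function name and the example paths are hypothetical; only the _hvr_intermediate sub-directory name and the fallback to the replication directory come from the text.

```python
import posixpath
from typing import Optional


def intermediate_directory(file_path: str, configured: Optional[str] = None) -> str:
    """Return the directory used for intermediate files during Compare."""
    if configured:
        # An explicitly configured Intermediate_Directory takes precedence.
        return configured
    # Default: a sub-directory of the replication directory (File_Path).
    return posixpath.join(file_path, "_hvr_intermediate")


default_path = intermediate_directory("/my-bucket/replication")
custom_path = intermediate_directory("/my-bucket/replication", "/scratch/hvr_tmp")
print(default_path)
print(custom_path)
```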
Intermediate Directory is Local
This option in the HVR UI specifies that the Intermediate Directory is created on the local drive of the file location's server.
This setting can improve performance because storing intermediate files locally reduces network latency and avoids potential permission issues associated with remote storage.
This option is particularly beneficial when the HVR Agent has access to ample local storage, enabling it to handle large data volumes without relying on networked storage solutions.
This option is equivalent to the location property Intermediate_Directory_Is_Local.
Integrate Limitations
By default, for file-based target locations, HVR does not replicate delete operations performed at the source location.