Azure Data Lake Storage as Target
Fivetran HVR supports integrating changes into an Azure Data Lake Storage (ADLS) location. This section describes the configuration requirements for integrating changes into an ADLS location using Integrate and Refresh.
Due to technical limitations, Azure Data Lake Storage is not supported in HVR releases 6.1.5/3 through 6.1.5/9.
Customize Integrate
Defining action Integrate is sufficient for integrating changes into an ADLS location. However, the default file format written into a target file location is HVR's own XML format and the changes captured from multiple tables are integrated as files into one directory. The integrated files are named using the integrate timestamp.
You may define other actions to customize the default integration behavior described above. The following are a few examples of actions that can be used to customize integration into the ADLS location:
Group | Table | Action | Annotation |
---|---|---|---|
ADLS | * | FileFormat | This action may be defined to: |
ADLS | * | Integrate | To segregate and name the files integrated into the target location, define parameter RenameExpression. For example, if RenameExpression={hvr_tbl_name}/{hvr_integ_tstamp}.csv is defined, a separate folder (named after the table) is created in the target location for each source table, and the files replicated for each table are saved into its folder. This also gives each file a unique name, because the file is named with the timestamp of the moment it was integrated into the target location. |
ADLS | * | ColumnProperties | This action defines properties for a column being replicated. This action may be defined to: |
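To make the RenameExpression example concrete, the following Python sketch shows how the pattern {hvr_tbl_name}/{hvr_integ_tstamp}.csv could map tables to file paths. It is only an illustration, not HVR's implementation: the function name, the table names, and the timestamp format are assumptions made for this example.

```python
from datetime import datetime, timezone


def expand_rename_expression(expression: str, table_name: str, integrated_at: datetime) -> str:
    """Expand {hvr_tbl_name} and {hvr_integ_tstamp} in a rename pattern.

    Hypothetical helper for illustration; only these two substitutions
    are handled here.
    """
    # Assumption: the timestamp is rendered as compact digits with
    # microseconds. The format HVR actually writes may differ.
    timestamp = integrated_at.strftime("%Y%m%d%H%M%S%f")
    return (
        expression
        .replace("{hvr_tbl_name}", table_name)
        .replace("{hvr_integ_tstamp}", timestamp)
    )


expression = "{hvr_tbl_name}/{hvr_integ_tstamp}.csv"
integrated_at = datetime(2024, 1, 15, 10, 30, 0, tzinfo=timezone.utc)

# Each table gets its own folder; the timestamp keeps file names unique.
for table in ("orders", "customers"):
    print(expand_rename_expression(expression, table, integrated_at))
```

With this pattern, files for the hypothetical tables orders and customers land in separate folders, which is the segregation described in the Integrate row above.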
State Directory
By default, HVR creates its internal state files in a sub-directory named _hvr_state within the location’s top directory.
This option in the HVR UI allows you to specify a custom path for HVR’s internal state files, which are used during file replication. The state directory can be a path within the location’s top directory or outside of it. A relative path (e.g., ../work) is interpreted relative to the location’s top directory.
If the state directory is on the same file system as the location’s top directory, HVR ensures that file move operations are ‘atomic,’ so users only see fully written files and never partial versions.
This option is equivalent to the location property File_State_Directory.
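The path rules and the atomic move described above can be sketched in Python. This is a minimal illustration on a local file system, not HVR's code and not an ADLS client: the function names and example paths are hypothetical, and the write-then-rename pattern is the general technique for publishing only fully written files.

```python
import os
import posixpath
import tempfile


def resolve_state_directory(top_directory: str, state_directory: str = "_hvr_state") -> str:
    """Resolve a state directory setting against the location's top directory."""
    # posixpath.join keeps an absolute state_directory unchanged and treats
    # anything else as relative to top_directory; normpath collapses "..".
    return posixpath.normpath(posixpath.join(top_directory, state_directory))


def publish_atomically(state_dir: str, final_path: str, payload: bytes) -> None:
    """Write the payload in state_dir, then move it into place in one step."""
    os.makedirs(state_dir, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=state_dir)
    with os.fdopen(fd, "wb") as tmp_file:
        tmp_file.write(payload)
    # os.replace is a single rename when both paths are on the same file
    # system, so readers see either no file or the complete file.
    os.replace(tmp_path, final_path)


print(resolve_state_directory("/data/top"))             # default sub-directory
print(resolve_state_directory("/data/top", "../work"))  # relative path outside the top directory

with tempfile.TemporaryDirectory() as top:
    state_dir = os.path.join(top, "_hvr_state")
    final_path = os.path.join(top, "orders.csv")
    publish_atomically(state_dir, final_path, b"id,amount\n1,9.99\n")
    print(sorted(os.listdir(top)))
```

The rename is only a single step when the temporary file and the final file share a file system, which is why the placement of the state directory matters.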
Intermediate Directory
This option in the HVR UI allows you to specify a directory path for storing intermediate (temporary) files generated during Compare. These files are created during both "direct file compare" and "online compare" operations.
Using an intermediate directory can enhance performance by ensuring that temporary files are stored in a location optimized for the system's data processing needs.
This setting is particularly relevant for target file locations, as it determines where the intermediate files are placed during the Compare operation. If this option is not enabled, the intermediate files are stored by default in the integratedir/_hvr_intermediate directory, where integratedir is the replication DIRECTORY (File_Path) defined for the target file location.
This option is equivalent to the location property Intermediate_Directory.
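As a small sketch of the default described above, the following Python function picks the directory used for intermediate files. The function name and example paths are hypothetical. How HVR interprets a relative Intermediate_Directory value is not stated here, so the sketch returns a configured value unchanged.

```python
import posixpath
from typing import Optional


def intermediate_directory(file_path: str, configured: Optional[str] = None) -> str:
    """Return the directory that would hold intermediate Compare files."""
    if configured:
        # Assumption: a configured value is used exactly as given.
        return configured
    # Default from the text above: integratedir/_hvr_intermediate,
    # where integratedir is the location's File_Path.
    return posixpath.join(file_path, "_hvr_intermediate")


print(intermediate_directory("/lake/target"))
print(intermediate_directory("/lake/target", "/fast/scratch/hvr"))
```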
Intermediate Directory is Local
This option in the HVR UI specifies that the Intermediate Directory is created on the local drive of the file location's server.
This setting is crucial for optimizing performance, as it reduces network latency and avoids potential permission issues associated with remote storage. By storing intermediate files locally, HVR can process data more efficiently, taking advantage of the speed and reliability of local storage.
This option is particularly beneficial when the HVR Agent has access to ample local storage, enabling it to handle large data volumes without relying on networked storage solutions.
This option is equivalent to the location property Intermediate_Directory_Is_Local.
Integrate Limitations
By default, for file-based target locations, HVR does not replicate delete operations performed at the source location.