Capture
Action Capture instructs Fivetran HVR to capture changes from a location. Various parameters are available to modify the functionality and performance of capture.
For a database location, HVR gives you an option to capture changes using the location's Capture_Method property (log-based or trigger-based method). HVR recommends using the log-based data capture because it has less impact on database resources as it reads data directly from its logs, without affecting transactions, manages large volumes of data and supports more data operations, such as truncates, as well as DDL capture. In contrast, the trigger-based data capture creates triggers on tables that require change data capture, so firing the triggers and storing row changes in a shadow table slow down transactions and introduces overhead.
When defined on a file location this action instructs HVR to capture files from a file location's directory. Changes from a file location can be replicated both to a database location and to a file location if the channel contains table information. In this case any files captured are parsed (see action FileFormat).
If Capture is defined on a file location without table information then each file captured is treated as a 'blob' and is replicated to the integrate file locations without HVR recognizing its format. If such a 'blob' file channel is defined with only actions Capture and Integrate (no parameters) then all files in the capture location's directory (including files in sub-directories) are replicated to the integrate location's directory. The original files are not touched or deleted, and in the target directory the original file names and sub-directories are preserved. New and changed files are replicated, but empty sub-directories and file deletions are not replicated.
Bidirectional replication (replication in both directions with changes happening in both file locations) is not currently supported for file locations. File deletion is not currently captured by HVR.
If Capture is defined on a file location without parameter DeleteAfterCapture and location property File_State_Directory is used to define a state directory outside of the file location's top directory, then HVR file capture becomes read only; write permissions are not needed.
Parameters
This section describes the parameters available for action Capture. By default, only the supported parameters for the selected location type(s) are displayed in the Capture dialog.
Following are the two tabs/ways, which you can use for defining action parameters in this dialog:
- Regular: Allows you to define the required parameters by using the UI elements like checkbox and text field.
- Text: Allows you to define the required parameters by specifying them in the text field. You can also copy-paste the action definitions from HVR documentation, emails, or demo notes.
Parameter | Argument | Description |
---|---|---|
IgnoreSessionName | sess_name | Instruct the capture job to ignore changes performed by the specified session name. Multiple ignore session names can be defined for a job, either by defining IgnoreSessionName multiple times or by specifying a comma-separated list of names as its value. Normally HVR capture avoids recapturing changes made during HVR integration by ignoring any changes made by sessions named hvr_integrate. This prevents looping during bidirectional replication but means that different channels ignore each other's changes. The session name actually used by integration can be changed using action Integrate with parameter SessionName. For more information, see Managing Recapturing Using Session Names. If this parameter is defined for any table with log based capture, then it affects all tables captured from that location. An example for using this parameter is available in section Using IgnoreSessionName (below). |
Coalesce | Causes coalescing of multiple operations on the same row into a single operation. For example, an INSERT and an UPDATE can be replaced by a single INSERT; five UPDATEs can be replaced by one UPDATE, or an INSERT and a DELETE of a row can be filtered out altogether. The disadvantage of not replicating these intermediate values is that some consistency constraints may be violated on the target database. This parameter should not be used together with parameter SapUnpack in action Transform. | |
NoBeforeUpdate | Do not capture 'before row' for an update unless it is a key update. By default when an update happens HVR will capture both the 'before' and 'after' version of the row. This lets integration only update columns which have been changed and also allows collision detection to check the target row has not been changed unexpectedly. Defining this parameter can improve performance, because less data is transported. But that means that integrate will update all columns (normally HVR will only update the columns that were actually changed by the update statements and will leave the other columns unchanged). If this parameter is defined for any table with log based capture, then it affects all tables captured from that location. | |
NoTruncate | Do not capture SQL truncate table statements such as TRUNCATE in Oracle and modifymytblto truncated in Ingres. If this parameter is not defined, then these operations are replicated using hvr_op value 5. For Db2 for z/OS, this parameter affects only TRUNCATE IMMEDIATE. HVR will always capture TRUNCATE if used without IMMEDIATE option (this will be replicated using hvr_op value 0). This parameter is not supported for Microsoft SQL Server. | |
AugmentIncomplete | col_type | During capture, HVR may receive partial/incomplete values for certain column types. Partial/incomplete values are the values that HVR cannot capture entirely due to technical limitations in the database interface. This parameter instructs HVR to perform additional steps to retrieve the full value from the source database, this is called augmenting. This parameter also augments the missing values for key updates.
Valid values for col_type are:
In certain situations, the default behavior changes and you should not override it to a 'lesser' value to prevent data inconsistency.
|
IgnoreCondition | sql_expr | Ignore (do not capture) any changes that satisfy expression sql_expr (e.g. Prod_id < 100). This logic is added to the HVR capture rules/triggers and procedures. This parameter differs from the Restrict /CaptureCondition as follows:
|
IgnoreUpdateCondition | sql_expr | Ignore (do not capture) any update changes that satisfy expression sql_expr. This logic is added to the HVR capture rules/triggers and procedures. This parameter requires location property Capture_Method set to DB_TRIGGER. |
HashBucketsIngres | int | Identify the number int of hash buckets, with which the capture table is created. This implies that Ingres capture tables have a hash structure. This reduces the chance of locking contention between parallel user sessions writing to the same capture table. It also makes the capture table larger and I/O into it sparser, so it should only be used when such locking contention could occur. Row level locking ( |
HashKeyIngres | col_list | Identify the list of columns col_list, the values of which are used to calculate the hash key value. The The key specified does not have to be unique; in some cases concurrency is improved by choosing a non-unique key for hashing. This parameter requires location property Capture_Method set to DB_TRIGGER. |
DeleteAfterCaptureFile/FTP/Sharepoint | Delete file after capture, instead of capturing recently changed files. If this parameter is defined, then the channel moves files from the location. Without it, the channel copies files if they are new or modified. | |
PatternFile/FTP/Sharepoint | pattern | Only capture files whose names match pattern. The Possible patterns are:
|
IgnorePatternFile/FTP/Sharepoint | pattern | Ignore files whose names match pattern. For example, to ignore all files underneath sub-directory qqq specify ignore pattern qqq/**/*. The rules and valid forms for IgnorePattern are the same as for Pattern, except that 'named patterns' are not allowed. |
IgnoreUnterminatedFile/FTP/Sharepoint | pattern | Ignore files whose last line does not match pattern. This ensures that incomplete files are not captured. This pattern matching is supported for UTF 8 files but not for UTF 16 file encoding. |
IgnoreSizeChangesFile/FTP/Sharepoint | Changes in file size during capture is not considered an error when capturing from a file location. | |
AccessDelayFile/FTP/Sharepoint | secs | Delay reading file for secs seconds to ensure that writing is complete. HVR will ignore this file until its last create or modify timestamp is more than secs seconds old. |
UseDirectoryTimeFile/FTP/Sharepoint | When checking the timestamp of a file, check the modify timestamp of the parent directory (and its parent directories), as well as the file's own modify timestamp. This can be necessary on Windows when parameter DeleteAfterCapture is not defined to detect if a new file has been added by someone moving it into the file location's directory; on Windows file systems moving a file does not change its timestamp. It can also be necessary on Unix/Windows if a sub-directory containing files is moved into the file location directory. The disadvantage of this parameter is that when one file is moved into a directory, then all of the files in that directory will be captured again. This parameter cannot be defined with parameter DeleteAfterCapture (it is not necessary). |
Writing Files while Fivetran HVR is Capturing Files
It is often better to avoid having HVR capture from files while they are still being written. One reason is to prevent HVR replicating an incomplete version of the file to the integrate machine. Another problem is that if parameter DeleteAfterCapture is defined, then HVR will attempt to delete the file before it is even finished.
Capture of incomplete files can be avoided by defining AccessDelay or IgnoreUnterminated.
Another technique is to first write the data into a filename that HVR capture will not match (outside the file location directory or into a file matched with IgnorePattern) and then move it when it is ready to a filename that HVR will match. On Windows this last technique only works if DeleteAfterCapture is defined, because the file modify timestamp (that HVR capture would otherwise rely on) is not changed by a file move operation.
A group of files can be revealed to HVR capture together by first writing them in sub-directory and then moving the whole sub-directory into the file location's top directory together.
- If column hvr_op is not defined, then it default to 1 (insert). Value 0 means delete, and value 2 means update.
- Binary values can be given with the format attribute (see example above).
- If the name attribute is not supplied for the <column> tag, then HVR assumes that the order of the <column> tags inside the <row> matches the order in the HVR repository tables (column col_sequence of the HVR_COLUMN repository table).
Examples
This section includes an example of using the parameter IgnoreSessionName.
Using IgnoreSessionName
HVR allows to run a purge process on an Oracle source location without stopping active replication. Purging is deleting obsolete data from a database. To ensure that the deleted data does not replicate to a target location, the purge process must be started by a database user (e.g. PurgeAdmin) other than the user (e.g. hvruser) under which the replication process is running, and HVR must be configured to ignore the session name of the PurgeAdmin.
The steps for implementing this scenario are as follows:
In a source database, create a new user PurgeAdmin that will run a purge script against this database.
Grant the applicable permissions to user PurgeAdmin, e.g. a privilege to delete rows in another schema:
grant delete any table to PurgeAdmin;
In the UI, update action Capture defined on the existing channel by adding parameter IgnoreSessionName:
- In the Actions panel, click on the row containing action Capture.
- In the Action: Capture dialog, select parameter IgnoreSessionName and specify the user name PurgeAdmin.
- Click OK.
Perform Activate Replication to re-activate the capture job and apply the changes in action Capture:
- Click Only Specific Replication Components and enable only Jobs. Activating Replication with option Jobs will suspend and restart the affected jobs automatically.
- Click Only Specific Replication Components and enable only Jobs. Activating Replication with option Jobs will suspend and restart the affected jobs automatically.