Extended Data Type Support
Since v5.3.1/19
There are database-specific data types that are not natively supported by HVR. These data types are called "extended data types" in HVR. Different extended data types need different capture and integrate expressions to be defined on a channel to convert them to the ones supported by HVR. Generally, a cast to a varchar or clob column works well, though some types might be better represented by a numeric data type. For nested object types, a stored procedure to serialize the object may be defined. For examples, see section Expression Library below.
When a table with an extended data type is added to a channel, HVR's Table Explore displays the extended data types as a data type name enclosed in special markers: <<datatype>>. The datatype is the name of the data type as defined in a particular database and can be used in data type pattern matching similar to the regular data types.
Configuring HVR for Extended Data Types
Action ColumnProperties with parameters /DataTypeMatch, /CaptureExpression, /CaptureExpressionType, /IntegrateExpression, and /DataType should be defined for converting extended data types to supported data types during capture, compare or refresh (source) and back to extended data types during integrate, compare or refresh (target).
Right-click a channel, navigate to New Action, and click ColumnProperties.
In the New Action: ColumnProperties window, select /DatatypeMatch and choose the required <<datatype>> from the drop-down list. The /DatatypeMatch parameter in combination with the {{hvr_col_name}} pattern in the capture and integrate expressions allow you to define capture expression based on the data type of a column, rather than the name of the column. This parameter fully supports extended data types, allowing you to specify a single expression for all columns of a specific extended data type in all tables in all channels.
Select the /CaptureExpression option and type in the required expression depending on the database involved in the replication. See section Expression Library below for the extended data types in different databases and their relevant capture and integrate expressions.
Select the /CaptureExpressionType option and then select SQL_WHERE_ROW from the drop-down list. The /CaptureExpressionType automatically generates extra SQL required to run the specified capture expression against a table being replicated.
When /CaptureExpressionType=SQL_WHERE_ROW is defined, HVR will:
- execute a capture expression once for each row
- add a FROM clause for the table being replicated
- add a WHERE clause matching the key of the current row.
For more information on the /CaptureExpressionType option, refer to the relevant section of the ColumnProperties page.
Example:
The parameters defined as /CaptureExpression=cast(col as varchar) and /CaptureExpressionType=SQL_WHERE_ROW cause HVR to generate the following SQL expression against a table being replicated:
select (cast(col as varchar)) from schema.table where key1=key1, ...
Select the /IntegrateExpression option and type in the required expression depending on the database involved in the replication. See section Expression Library below for the extended data types in different databases and their relevant capture and integrate expressions.
The following screenshot demonstrates an example setup of the ColumnProperties action to capture an extended data type.
Table Create
The extended data type defined as <<datatype>> in HVR is just a base name of a specific data type without any attributes like NOT NULL, DEFAULT, or allowed values in enumeration-like data types. This might not be sufficient to create a table on a target side. So, to enable table creation on the target side, you need to define action ColumnProperties with the /DatatypeMatch and /DataType parameters on the target location.
Right-click a target location, navigate to New Action and select ColumnProperties from the list.
In the New Action: ColumnProperties window, select the /DatatypeMatch parameter and choose the required extended data type from the drop-down list.
Select the /Datatype parameter and select the same from the drop-down list.
Click the Text tab in the bottom left corner. In the text editor, type the necessary attributes that HVR will put in the CREATE TABLE statement on the target side. An example expression for the /Datatype parameter can be /DataType=<<datatype(42) NOT NULL DEFAULT '(zero)'::datatype>>.
Final Expression
An example expression for the entire channel with columns of extended data type <<datatype>> for table creation can be as follows:
Group | Table | Action/Parameters |
---|---|---|
* | * | ColumnProperties /DatatypeMatch=<<datatype>>, /CaptureExpression="from_datatype({{hvr_col_name}})", /CaptureExpressionType=SQL_WHERE_ROW, /IntegrateExpression="to_datatype({{hvr_col_name}})", /DataType="<<datatype(42) NOT NULL DEFAULT '(zero)'::datatype>>" |
Excluding Extended Data Types from Replication
To ignore replication of an extended data type that is enrolled in the table definitions of the HVR catalogs, define the ColumnProperties /Absent parameter. Since capture of /Absent columns requires a capture expression, you can use a dummy expression to satisfy that requirement. If you add the /CaptureExpression of type SQL_PER_CYCLE, the number of executions of this expression is reduced to one per cycle negating the performance cost. Reasons for doing this include the convenience of enrolling tables as they are in the source database when a target database cannot accept this data type.
Group | Table | Action/Parameters |
---|---|---|
* | * | ColumnProperties /DatatypeMatch=<<datatype>>, /CaptureExpression="0", /CaptureExpressionType=SQL_PER_CYCLE, /Absent |
Advantages and Disadvantages
The primary advantage is that the extended data type feature allows HVR to access data types that are not supported, and allow full flexibility to tune the use of otherwise unsupported or totally custom data types, even in heterogeneous replication scenarios. The downside includes performance cost and the requirement to add custom expressions.
Capture Performance
Since HVR can not process the native representation of extended data types, there is a performance cost of capturing these types. For each row, HVR will need to do a query to the source database to augment in the value as a supported type. This also causes the consistency to change to eventual consistency since there will be a time discrepancy between the commit and the execution of the capture expression in the order of the capture latency.
Integration has no noticeable overhead.
Bulk Refresh Performance
In general, during bulk refresh, HVR uses the bulk load APIs of the target database platform. These, however, require native support for all data types of the interface, and generally, do not allow expressions. If extended data types are present in a table, this table drops down to batched SQL statements for insertion.
Bulk compare or row-wise refresh and compare have no noticeable overhead.
Coercion
HVR coerces data types from the source database to the target database, but can not do this for extended data types. However, HVR ensures that the data type returned by the capture expression is localized to a data type supported by the integrate location. It is the responsibility of the integrate expression to deal with possible incompatibilities that might result from interpreting a value. This means that features like TableProperties /CoerceErrorPolicy only apply to the localization of the data type, not to the processing of the integrate expression.
AdaptDDL
While using AdaptDDL in combination with extended data types the following should be noted:
- Tables with extended data types require expressions, and AdaptDDL can add tables. If AdaptDDL adds tables to your channel, but they do not have any expressions defined beforehand, the channel will fail. If you use /DatatypeMatch to define the expressions of data types that will be adapted in the future, they will be used.
- There is a restriction in comparing extended data types. HVR does not assign meaning to the name the database gives the data type, that is located in the type inside markers << >>. For comparison, HVR considers all extended data types equal to each other and can not detect differences for the purpose of updating channel definitions or executing ALTER TABLE statements on the target.
Restrictions
- Executing capture expressions during Capture requires a WHERE clause containing key information, so a key column with an extended data type cannot be used during Capture.
- Extended data types cannot be used on a primary key column.
Expression Library
If the /CaptureExpression converts an extended data type on a source location to a generic data type that is supported on a target location, then the /IntegrateExpression is not required on the target location.
For example, if /CaptureExpression=cast({{hvr_col_name}} as char) is defined on a MySQL source location for the set data type, the /IntegrateExpression is not required on a Snowflake target location because Snowflake supports the char data type.
MySQL
Extended Data Type: set
- /CaptureExpression=cast({{hvr_col_name}} as char)
- /IntegrateExpression={{hvr_col_name}}
Extended Data Type: enum
/CaptureExpression=coalesce(convert({{hvr_col_name}},char),'')
/CaptureExpressionType=SQL_WHERE_ROW
/IntegrateExpression={{hvr_col_name}} /Datatype=varchar /Length=x
- /Datatype=varchar is not required if the target table is defined with the correct enum data type on the corresponding column. If you are uncertain about the maximum length, clob can be used instead of varchar.
- The value defined in /Length must be the same or greater than the length in the source table's column.
Oracle
Extended Data Type: xmltype
- /CaptureExpression=xmltype.getClobVal({{hvr_col_name}})
- /IntegrateExpression=xmltype.createXml({{hvr_col_name}})
Extended Data Type: SDO_GEOMETRY
- /CaptureExpression=SDO_UTIL.TO_WKTGEOMETRY({{hvr_col_name}})
- /IntegrateExpression=SDO_UTIL.FROM_WKTGEOMETRY({{hvr_col_name}})
PostgreSQL
- Extended Data Type: interval
- /CaptureExpression=cast({{hvr_col_name}} as varchar(100))
- /IntegrateExpression=cast({{hvr_col_name}} as interval)
SQL Server
Extended Data Type: sql_variant
- /CaptureExpression=convert(nvarchar,{{hvr_col_name}}, 1)
- /IntegrateExpression=cast({{hvr_col_name}} as nvarchar)
Using these expressions integrates each values with a BaseType of nvarchar.
- Extended Data Type: geometry
- /CaptureExpression={{hvr_col_name}}.ToString()
- /IntegrateExpression=geometry::STGeomFromText({{hvr_col_name}},0)
- Extended Data Type: geography
- /CaptureExpression={{hvr_col_name}}.STAsText()
- /IntegrateExpression=geography::STGeomFromText({{hvr_col_name}},4326)
- Extended Data Type: hierarchyid
- /CaptureExpression={{hvr_col_name}}.ToString()
- /IntegrateExpression=hierarchyid::Parse({{hvr_col_name}})