OpenSearch Beta
Updated November 16, 2023
OpenSearch is a fork of Elasticsearch. It is a document-based NoSQL database. It stores JSON documents in a distributed, RESTful search and analytics engine built on Lucene.
Supported services
Fivetran supports the following OpenSearch services:
Supported configurations
Fivetran supports the following OpenSearch configurations:
Supportability Category | Supported Values |
---|---|
Database versions | 1.0.0 or above (OpenSearch); 1.4.0 - 1.13.3 (Open Distro) |
Connector limit per database | No limit |
Transport Layer Security (TLS) | TLS 1.1 - 1.3 |
Known limitations
OpenSearch field names are case-sensitive, but columns are case-insensitive in your Fivetran destination. We therefore reject fields with the same name but different capitalization as duplicate columns. For example, `Field` and `field` are different field names in OpenSearch but would both map to the `field` column in your destination. To avoid this error, use different names for each field.
Learn more in Fivetran's naming conventions documentation.
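To check an index for this problem before connecting it, you can group its field names case-insensitively. The following is a minimal sketch, not Fivetran's own validation: the helper name `find_case_collisions` is hypothetical, and lowercasing with `str.lower()` is an assumption about how names are folded.

```python
from collections import defaultdict


def find_case_collisions(field_names):
    """Group field names that differ only by capitalization."""
    groups = defaultdict(list)
    for name in field_names:
        # Assumption: a case-insensitive destination folds names to lowercase.
        groups[name.lower()].append(name)
    # Keep only groups where more than one distinct source name collides.
    return {key: names for key, names in groups.items() if len(set(names)) > 1}


collisions = find_case_collisions(["Field", "field", "title", "created_at"])
print(collisions)  # prints {'field': ['Field', 'field']}
```

Any key in the returned dictionary points at source fields that would need to be renamed.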
Features
Feature Name | Supported | Notes |
---|---|---|
Capture deletes | Yes | All tables and fields |
Custom data | Yes | |
Data blocking | Yes | Column level |
Column hashing | Yes | |
Re-sync | Yes | Connector and table level |
History | | |
API configurable | | |
Priority-first sync | | |
Fivetran data models | | |
Private networking | | |
Setup guide
For specific instructions on how to set up your OpenSearch connector, see the guide for your service type:
Sync overview
Once Fivetran is connected to your OpenSearch instance, we fetch all historical data up to the present state. We then sync only the most recent inserts and updates at regular intervals using the sequence number and version fields on the documents. We capture deleted data using Fivetran Teleport Sync.
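The general idea of an incremental sync driven by sequence numbers can be sketched in memory as follows. This illustrates the technique only and is not Fivetran's implementation. `_seq_no` and `_version` are OpenSearch document metadata fields; the helper `select_changed` and the single global checkpoint are assumptions made for brevity. In a real cluster, sequence numbers are assigned per shard, so a checkpoint would have to be tracked per shard.

```python
def select_changed(documents, checkpoint):
    """Return documents with a sequence number above the checkpoint,
    together with the checkpoint to use for the next sync."""
    changed = [doc for doc in documents if doc["_seq_no"] > checkpoint]
    # If nothing changed, the checkpoint stays where it was.
    new_checkpoint = max((doc["_seq_no"] for doc in changed), default=checkpoint)
    return changed, new_checkpoint


# Hypothetical current state of one index (single shard assumed).
index_state = [
    {"_id": "a", "_seq_no": 2, "_version": 2},  # updated after the last sync
    {"_id": "b", "_seq_no": 1, "_version": 1},  # unchanged since the last sync
    {"_id": "c", "_seq_no": 3, "_version": 1},  # inserted after the last sync
]

changed, checkpoint = select_changed(index_state, checkpoint=1)
print([doc["_id"] for doc in changed], checkpoint)  # prints ['a', 'c'] 3
```

Deleted documents never appear in such a scan, which is why deletes need a separate mechanism.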
Fivetran Teleport Sync
Fivetran Teleport Sync is a proprietary database replication method that offers the completeness of snapshots while approaching the speed of log-based systems. With this sync mechanism, Fivetran can incrementally sync deleted data with no additional setup other than a read-only connection.
Fivetran Teleport Sync's queries perform the following operations on your OpenSearch instance:
- Do a full scan of each synced index's unique IDs
- Aggregate a compressed unique ID snapshot in the instance's memory
For optimum Fivetran Teleport Sync performance, we recommend that you make the following resources available in your OpenSearch instance:
- 1 GB of free RAM
- 1 free CPU core
- Available IOPS (Teleport Sync times decrease linearly as available IOPS increase)
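Teleport Sync is proprietary and its internals are not described here, but the principle of detecting deletes from ID snapshots can be sketched as a set difference. The helper `find_deleted_ids` and the sample IDs are hypothetical, and the sketch uses plain Python sets instead of a compressed snapshot.

```python
def find_deleted_ids(previous_snapshot, current_snapshot):
    """IDs seen in the previous scan that are missing from the current one."""
    return set(previous_snapshot) - set(current_snapshot)


previous = {"doc-1", "doc-2", "doc-3"}  # IDs collected by the last full scan
current = {"doc-1", "doc-3", "doc-4"}   # IDs collected by this full scan

deleted = find_deleted_ids(previous, current)
print(sorted(deleted))  # prints ['doc-2']
```

Because both snapshots must be held for comparison, memory use grows with the number of document IDs, which is consistent with the RAM recommendation above.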
Schema information
Fivetran tries to replicate the exact indices from your OpenSearch source database to your destination. For every index in the OpenSearch database that you connect to Fivetran, we create a table in your destination that maps to its native schema.
NOTE: We replicate all selected indices into a single destination schema, which uses the destination schema name of your choice.
Fivetran-generated columns
Fivetran adds the following columns to every table in your destination:
- `_fivetran_deleted` (BOOLEAN) marks rows that were deleted in the source database.
- `_fivetran_synced` (UTC TIMESTAMP) indicates the time when Fivetran last successfully synced the row.
We add these columns to give you insight into the state of your data and the progress of your data syncs.
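As an illustration of how these columns might be used downstream, the sketch below filters out deleted rows and finds the most recent sync time. The rows are hypothetical sample data held as Python dictionaries; in practice you would more likely apply the same filter in a query against your destination.

```python
from datetime import datetime, timezone


def active_rows(rows):
    """Rows that have not been marked as deleted in the source."""
    return [row for row in rows if not row["_fivetran_deleted"]]


def latest_sync(rows):
    """Most recent _fivetran_synced value across the given rows."""
    return max(row["_fivetran_synced"] for row in rows)


# Hypothetical destination rows for one synced index.
rows = [
    {"id": "a", "_fivetran_deleted": False,
     "_fivetran_synced": datetime(2023, 11, 16, 8, 0, tzinfo=timezone.utc)},
    {"id": "b", "_fivetran_deleted": True,
     "_fivetran_synced": datetime(2023, 11, 16, 9, 0, tzinfo=timezone.utc)},
]

print([row["id"] for row in active_rows(rows)])  # prints ['a']
print(latest_sync(rows).isoformat())  # prints 2023-11-16T09:00:00+00:00
```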
Type transformations and mapping
As we extract your data, we match OpenSearch data types to types that Fivetran supports. Our system attempts to infer the types of any columns with data types that we don't recognize.
The following table illustrates how we transform your OpenSearch data types into Fivetran supported types:
OpenSearch Type | Fivetran Type | Fivetran Supported |
---|---|---|
BINARY | BINARY | True |
BOOLEAN | BOOLEAN | True |
TEXT | STRING | True |
KEYWORD | STRING | True |
INTEGER | INT | True |
SHORT | SHORT | True |
BYTE | | False |
DOUBLE | DOUBLE | True |
FLOAT | FLOAT | True |
HALF_FLOAT | FLOAT | True |
SCALED_FLOAT | BIGDECIMAL | True |
LONG | LONG | True |
DATE | INSTANT | True |
DATE_NANOS | INSTANT | True |
OBJECT | JSON | True |
NESTED | JSON | True |
JOIN | JSON | True |
ARRAY | JSON | True |
ALIAS | | False |
Like Elasticsearch, OpenSearch allows you to put more than one value into a field as an array. Our syncs identify array types from the source table and sync the data as JSON to your destination.
If we are missing an important data type that you need, please reach out to support.
In some cases, when loading data into your destination, we may need to convert Fivetran data types into data types that are supported by the destination. For more information, see the individual data destination pages.
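The type table above can be expressed as a simple lookup. The sketch below applies it to the `properties` section of an index mapping. The function name `map_field_types` and the sample mapping are hypothetical. The sketch returns `None` for unsupported or unrecognized types and does not attempt the type inference described above. Arrays have no dedicated mapping type in OpenSearch, so the `array` entry is included only to mirror the table.

```python
# OpenSearch type -> Fivetran type, transcribed from the table above.
FIVETRAN_TYPES = {
    "binary": "BINARY", "boolean": "BOOLEAN",
    "text": "STRING", "keyword": "STRING",
    "integer": "INT", "short": "SHORT", "long": "LONG",
    "double": "DOUBLE", "float": "FLOAT", "half_float": "FLOAT",
    "scaled_float": "BIGDECIMAL",
    "date": "INSTANT", "date_nanos": "INSTANT",
    "object": "JSON", "nested": "JSON", "join": "JSON", "array": "JSON",
}


def map_field_types(properties):
    """Map each field in a mapping's `properties` to a Fivetran type."""
    result = {}
    for field, spec in properties.items():
        # A field with sub-properties and no explicit type is an object field.
        os_type = spec.get("type", "object")
        result[field] = FIVETRAN_TYPES.get(os_type)  # None if not in the table
    return result


sample_properties = {
    "title": {"type": "text"},
    "views": {"type": "integer"},
    "flag": {"type": "byte"},  # listed as unsupported in the table
    "author": {"properties": {"name": {"type": "keyword"}}},
}
print(map_field_types(sample_properties))
```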
Nested data
If your data is nested, we extract the topmost layer of data and sync the rest as JSON. For example, the following source document...
    {
      "foo": 1,
      "bar": 2,
      "nested": {
        "baz": 3
      }
    }
...is converted to the following table when we load it into your destination:
foo INTEGER | bar INTEGER | nested JSON |
---|---|---|
1 | 2 | {"baz":3} |
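The conversion shown above can be approximated with a short function that keeps top-level scalar values and serializes anything nested as a JSON string. This is a minimal sketch of the behavior described, not Fivetran's code. The function name `flatten_top_level` is hypothetical, and the compact separators are chosen only to match the example output.

```python
import json


def flatten_top_level(document):
    """Keep top-level scalars as-is; serialize nested objects and arrays as JSON."""
    row = {}
    for key, value in document.items():
        if isinstance(value, (dict, list)):
            row[key] = json.dumps(value, separators=(",", ":"))
        else:
            row[key] = value
    return row


source = {"foo": 1, "bar": 2, "nested": {"baz": 3}}
print(flatten_top_level(source))
# prints {'foo': 1, 'bar': 2, 'nested': '{"baz":3}'}
```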
Index aliases
We ignore index aliases. We sync only the original index names to your destination.
Excluding source data
If you don't want to sync all the data from your source database, you can exclude indices from your syncs on your Fivetran dashboard. To do so, go to your connector details page and uncheck the indices you want to omit from syncing. For more information, see our Data Blocking documentation.