OpenSearch Beta
OpenSearch is a fork of Elasticsearch. It is a distributed, RESTful search and analytics engine built on Lucene that stores JSON documents, so it can also serve as a document-based NoSQL database.
Supported services
Fivetran supports the following OpenSearch services:
Supported configurations
Fivetran supports the following OpenSearch configurations:
Supportability Category | Supported Values |
---|---|
Database versions | 1.0.0 or above (OpenSearch); 1.4.0 - 1.13.3 (Open Distro) |
Connector limit per database | No limit |
Transport Layer Security (TLS) | TLS 1.1 - 1.3 |
Known limitations
OpenSearch field names are case-sensitive, but columns are case-insensitive in your Fivetran destination. We therefore reject fields with the same name but different capitalization as duplicate columns. For example, `Field` and `field` are different field names in OpenSearch but would both map to the `field` column in your destination. To avoid this error, give each field a name that differs by more than capitalization.
Learn more in Fivetran's naming conventions documentation.
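To check an index for this problem before syncing, you can group its field names by their case-insensitive form. The following is a minimal sketch, not Fivetran's actual validation logic: the function name `find_case_collisions` is hypothetical, and lowercasing is only an assumption about how a destination might normalize column names.

```python
from collections import defaultdict


def find_case_collisions(field_names):
    """Group field names that differ only by capitalization.

    Assumption: the destination treats names as equal when their
    lowercase forms match. Real destinations may normalize differently.
    """
    groups = defaultdict(list)
    for name in field_names:
        groups[name.lower()].append(name)
    # Keep only the groups that contain more than one distinct spelling.
    return {key: names for key, names in groups.items() if len(set(names)) > 1}


collisions = find_case_collisions(["Field", "field", "title", "created_at"])
print(collisions)  # {'field': ['Field', 'field']}
```

Any non-empty result lists the fields that would need to be renamed in the source.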
Features
Feature Name | Supported | Notes |
---|---|---|
Capture deletes | check | All tables and fields |
History mode | | |
Custom data | check | |
Data blocking | check | Column level |
Column hashing | check | |
Re-sync | check | Connector and table level |
API configurable | check | API configuration |
Priority-first sync | | |
Fivetran data models | | |
Private networking | | |
Setup guide
For specific instructions on how to set up your OpenSearch connector, see the guide for your service type:
Sync overview
Once Fivetran is connected to your OpenSearch instance, we fetch all historical data up to the present state. We then sync only the most recent inserts and updates at regular intervals using the sequence number and version fields on the documents. We capture deleted data using Fivetran Teleport Sync.
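To illustrate what an incremental read based on sequence numbers could look like, here is a hedged sketch of a search request body. It is not Fivetran's implementation. It assumes that the `_seq_no` metadata field can be used in a range query and a sort, and that the `seq_no_primary_term` and `version` search options return those values on each hit. The helper names and the single cursor are hypothetical simplifications: sequence numbers are assigned per shard, so a real sync would need to track progress per shard.

```python
import json


def build_incremental_query(last_seq_no, page_size=1000):
    """Build a search body for documents changed after last_seq_no."""
    return {
        "size": page_size,
        # Ask for _seq_no/_primary_term and _version on each hit.
        "seq_no_primary_term": True,
        "version": True,
        # Assumption: _seq_no is queryable and sortable on the instance.
        "query": {"range": {"_seq_no": {"gt": last_seq_no}}},
        "sort": [{"_seq_no": "asc"}],
    }


def next_cursor(hits, last_seq_no):
    """Advance the cursor to the highest sequence number seen in a page."""
    return max([last_seq_no] + [hit["_seq_no"] for hit in hits])


body = build_incremental_query(last_seq_no=41)
print(json.dumps(body, indent=2))
```

The body would be sent to the index's `_search` endpoint, and `next_cursor` would be called on the returned hits to decide where the next sync starts.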
Fivetran Teleport Sync
Fivetran Teleport Sync is a proprietary incremental sync method that captures deleted data with no setup beyond a read-only connection.
Fivetran Teleport Sync's queries perform the following operations on your OpenSearch instance:
- Do a full scan of each synced index's unique IDs
- Aggregate a compressed unique ID snapshot in the instance's memory
For optimum Fivetran Teleport Sync performance, we recommend that you make the following resources available in your OpenSearch instance:
- 1 GB of free RAM
- 1 free CPU core
- IOPS (Teleport Sync times decrease linearly as available IOPS increase)
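Teleport Sync is proprietary, so its internals are not documented here. The general idea of finding deletes from a full scan of unique IDs can still be sketched as a comparison between two ID snapshots. The function name and sample IDs below are hypothetical, and plain Python sets stand in for the compressed snapshot.

```python
def detect_deleted_ids(previous_ids, current_ids):
    """Return the IDs present in the previous snapshot but missing now."""
    # Set difference: everything we saw last time that no longer exists.
    return set(previous_ids) - set(current_ids)


previous_snapshot = ["doc-1", "doc-2", "doc-3"]  # IDs seen on the last sync
current_snapshot = ["doc-1", "doc-3", "doc-4"]   # IDs from the latest full scan

deleted = detect_deleted_ids(previous_snapshot, current_snapshot)
print(sorted(deleted))  # ['doc-2']
```

Because every ID has to be read on each comparison, the cost of this approach grows with index size, which is consistent with the memory, CPU, and IOPS recommendations above.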
Schema information
Fivetran tries to replicate the exact indices from your OpenSearch source database to your destination. For every index in the OpenSearch database that you connect to Fivetran, we create a table in your destination that maps to its native schema.
NOTE: We replicate the selected indices into a single schema that uses the destination schema name you choose.
Fivetran-generated columns
Fivetran adds the following columns to every table in your destination:
- `_fivetran_deleted` (BOOLEAN) marks rows that were deleted in the source database.
- `_fivetran_synced` (UTC TIMESTAMP) indicates the time when Fivetran last successfully synced the row.
We add these columns to give you insight into the state of your data and the progress of your data syncs. For more information about these columns, see our System Columns and Tables documentation.
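As a small illustration of how these two columns can be used downstream, the sketch below filters out deleted rows and finds the most recent sync time. The rows, the `id` column, and the helper names are made-up sample data, not output from a real destination.

```python
from datetime import datetime, timezone

# Hypothetical rows as they might appear in a destination table.
rows = [
    {"id": "a1", "_fivetran_deleted": False,
     "_fivetran_synced": datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc)},
    {"id": "b2", "_fivetran_deleted": True,
     "_fivetran_synced": datetime(2024, 1, 2, 4, 0, tzinfo=timezone.utc)},
]


def active_rows(rows):
    """Keep only rows that still exist in the source."""
    return [row for row in rows if not row["_fivetran_deleted"]]


def last_synced(rows):
    """Return the most recent successful sync time across the rows."""
    return max(row["_fivetran_synced"] for row in rows)


print([row["id"] for row in active_rows(rows)])  # ['a1']
print(last_synced(rows).isoformat())             # 2024-01-02T04:00:00+00:00
```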
Type transformations and mapping
As we extract your data, we match OpenSearch data types to types that Fivetran supports. Our system attempts to infer the types of any columns with data types that we don't recognize.
The following table shows how we transform your OpenSearch data types into Fivetran-supported types:
OpenSearch Type | Fivetran Type | Fivetran Supported |
---|---|---|
BINARY | BINARY | True |
BOOLEAN | BOOLEAN | True |
TEXT | STRING | True |
KEYWORD | STRING | True |
INTEGER | INT | True |
SHORT | SHORT | True |
BYTE | | False |
DOUBLE | DOUBLE | True |
FLOAT | FLOAT | True |
HALF_FLOAT | FLOAT | True |
SCALED_FLOAT | BIGDECIMAL | True |
LONG | LONG | True |
DATE | INSTANT | True |
DATE_NANOS | INSTANT | True |
OBJECT | JSON | True |
NESTED | JSON | True |
JOIN | JSON | True |
ARRAY | JSON | True |
ALIAS | | False |
Like Elasticsearch, OpenSearch allows you to put more than one value into a field as an array. Our syncs identify array types from the source table and sync the data as JSON to your destination.
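The mapping table above can be expressed as a simple lookup. This is only a sketch that restates the table: the names `TYPE_MAP`, `UNSUPPORTED`, and `fivetran_type` are hypothetical, and it does not model the type inference that Fivetran applies to unrecognized types.

```python
# OpenSearch type -> Fivetran type, copied from the table above.
TYPE_MAP = {
    "BINARY": "BINARY",
    "BOOLEAN": "BOOLEAN",
    "TEXT": "STRING",
    "KEYWORD": "STRING",
    "INTEGER": "INT",
    "SHORT": "SHORT",
    "DOUBLE": "DOUBLE",
    "FLOAT": "FLOAT",
    "HALF_FLOAT": "FLOAT",
    "SCALED_FLOAT": "BIGDECIMAL",
    "LONG": "LONG",
    "DATE": "INSTANT",
    "DATE_NANOS": "INSTANT",
    "OBJECT": "JSON",
    "NESTED": "JSON",
    "JOIN": "JSON",
    "ARRAY": "JSON",
}

# Types the table marks as not supported.
UNSUPPORTED = {"BYTE", "ALIAS"}


def fivetran_type(opensearch_type):
    """Return the mapped Fivetran type, or None if the type is unsupported.

    Raises KeyError for types that do not appear in the table at all.
    """
    key = opensearch_type.upper()
    if key in UNSUPPORTED:
        return None
    return TYPE_MAP[key]


print(fivetran_type("scaled_float"))  # BIGDECIMAL
print(fivetran_type("byte"))          # None
```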
If we are missing an important data type that you need, reach out to support.
In some cases, when loading data into your destination, we may need to convert Fivetran data types into data types that are supported by the destination. For more information, see the individual data destination pages.
Nested data
If your data is nested, we extract the topmost layer of data and sync the rest as JSON. For example, the following source document...
{
  "foo": 1,
  "bar": 2,
  "nested": {
    "baz": 3
  }
}
...is converted to the following table when we load it into your destination:
foo (INTEGER) | bar (INTEGER) | nested (JSON) |
---|---|---|
1 | 2 | {"baz":3} |
NOTE: The text in parentheses next to the column name indicates the data type of that column. For example, "`foo` (INTEGER)" means the column name is `foo` and it stores INTEGER data.
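The nested-data example above can be reproduced with a short sketch that keeps top-level scalar values as they are and serializes nested objects and arrays to JSON strings. The function name `flatten_top_level` is hypothetical, and this is an illustration of the behavior described here rather than Fivetran's actual code.

```python
import json


def flatten_top_level(document):
    """Keep top-level scalars; serialize nested objects and arrays as JSON."""
    row = {}
    for key, value in document.items():
        if isinstance(value, (dict, list)):
            # Compact separators give output like {"baz":3}.
            row[key] = json.dumps(value, separators=(",", ":"))
        else:
            row[key] = value
    return row


source = {"foo": 1, "bar": 2, "nested": {"baz": 3}}
print(flatten_top_level(source))
# {'foo': 1, 'bar': 2, 'nested': '{"baz":3}'}
```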
Index aliases
We ignore index aliases. We sync only the original index names to your destination.
Excluding source data
If you don't want to sync all the data from your primary database, you can exclude indices from your syncs on your Fivetran dashboard. To do so, go to your connector details page and uncheck the objects you would like to omit from syncing. For more information, see our Data Blocking documentation.