OpenSearch Beta

OpenSearch is a fork of Elasticsearch. It is a document-based NoSQL database that stores JSON documents in a distributed, RESTful search and analytics engine built on Lucene.

Supported services

Fivetran supports the following OpenSearch services:

Supported configurations

Fivetran supports the following OpenSearch configurations:

Supportability Category	Supported Values
Database versions	OpenSearch: 1.0.0 or above; Open Distro: 1.4.0 - 1.13.3
Connection limit per database	No limit
Transport Layer Security (TLS)	TLS 1.1 - 1.3

Known limitations

OpenSearch field names are case-sensitive, but column names are case-insensitive in your Fivetran destination. We therefore reject fields whose names differ only in capitalization as duplicate columns. For example, Field and field are different field names in OpenSearch, but both would map to the field column in your destination. To avoid this error, rename the fields so that their names differ by more than capitalization.
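To check an index for this problem before you connect it, you can look for field names that become identical once capitalization is ignored. The following is a minimal sketch, not Fivetran's actual validation logic. The function name `find_case_collisions` and the sample field names are hypothetical, and the sketch assumes that the destination compares column names in lower case.

```python
from collections import defaultdict


def find_case_collisions(field_names):
    """Group field names that differ only by capitalization."""
    groups = defaultdict(list)
    for name in field_names:
        # Assumption: the destination folds column names to lower case.
        groups[name.lower()].append(name)
    # Keep only the columns that more than one source field maps to.
    return {column: names for column, names in groups.items() if len(names) > 1}


collisions = find_case_collisions(["Field", "field", "created_at"])
print(collisions)  # {'field': ['Field', 'field']}
```

Any field name that appears in the result would need to be renamed in the source before the sync.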

Learn more in Fivetran's naming conventions documentation.

Features

Feature Name	Supported	Notes
Capture deletes		All tables and fields
History mode
Custom data
Data blocking		Column level
Column hashing
Re-sync		Connector and table level
API configurable		API configuration
Priority-first sync
Fivetran data models
Private networking
Authorization via API

Setup guide

For specific instructions on how to set up your OpenSearch connection, see the guide for your service type:

Sync overview

Once Fivetran is connected to your OpenSearch instance, we fetch all historical data up to the present state. We then sync only the most recent inserts and updates at regular intervals using the sequence number and version fields on the documents. We capture deleted data using Fivetran Teleport Sync.
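The idea behind a sequence-number-based incremental sync can be sketched as follows. This is an illustration only, not Fivetran's implementation. It assumes that the fields mentioned above are the `_seq_no` and `_version` document metadata fields, and that all documents live on a single shard (OpenSearch assigns sequence numbers per shard). The function name `select_changed` and the sample documents are hypothetical.

```python
def select_changed(documents, last_seq_no):
    """Return the documents written after the checkpoint and the new checkpoint."""
    # A document that was inserted or updated since the last sync carries a
    # sequence number greater than the stored checkpoint.
    changed = [doc for doc in documents if doc["_seq_no"] > last_seq_no]
    changed.sort(key=lambda doc: doc["_seq_no"])
    # Advance the checkpoint only when something new was seen.
    new_checkpoint = changed[-1]["_seq_no"] if changed else last_seq_no
    return changed, new_checkpoint


# Hypothetical index state: "a" was updated (version 2) after "b" was inserted.
docs = [
    {"_id": "a", "_seq_no": 2, "_version": 2},
    {"_id": "b", "_seq_no": 1, "_version": 1},
    {"_id": "c", "_seq_no": 3, "_version": 1},
]

changed, checkpoint = select_changed(docs, last_seq_no=1)
print([doc["_id"] for doc in changed])  # ['a', 'c']
print(checkpoint)  # 3
```

Deleted documents no longer appear in the index, so a checkpoint comparison like this one cannot see them. That is why deletes are captured separately.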

Fivetran Teleport Sync

Fivetran Teleport Sync is a proprietary incremental sync method that captures deleted data with no additional setup beyond a read-only connection.

Fivetran Teleport Sync's queries perform the following operations on your OpenSearch instance:

Do a full scan of each synced index's unique IDs
Aggregate a compressed unique ID snapshot in the instance's memory
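Conceptually, a snapshot of unique IDs makes deletes visible: an ID that was present in the previous snapshot but is missing from the current one belongs to a deleted document. The sketch below illustrates only that comparison. Teleport Sync is proprietary, so the real snapshot format and compression are not shown here. Plain Python sets stand in for the compressed snapshot, and the function name `detect_deleted_ids` is hypothetical.

```python
def detect_deleted_ids(previous_ids, current_ids):
    """Return the IDs present in the previous snapshot but not in the current one."""
    # Assumption: each snapshot holds every unique document ID of one index.
    return set(previous_ids) - set(current_ids)


previous_snapshot = {"a", "b", "c"}
current_snapshot = {"a", "c", "d"}

deleted = detect_deleted_ids(previous_snapshot, current_snapshot)
print(sorted(deleted))  # ['b']
```

In this illustration, "d" is a new document and is ignored by the comparison, because inserts and updates are picked up by the regular incremental sync.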

For optimum Fivetran Teleport Sync performance, we recommend that you make the following resources available in your OpenSearch instance:

1 GB of free RAM
1 free CPU core
Available IOPS (Teleport Sync times decrease linearly as available IOPS increase)

Schema information

Fivetran tries to replicate the exact indices from your OpenSearch source database to your destination. For every index in the OpenSearch database that you connect to Fivetran, we create a table in your destination that maps to its native schema.

We replicate the selected indices into a single schema in your destination. The schema uses the destination schema name that you choose.

Fivetran-generated columns

Fivetran adds the following columns to every table in your destination:

_fivetran_deleted (BOOLEAN) marks rows that were deleted in the source database.
_fivetran_synced (UTC TIMESTAMP) indicates the time when Fivetran last successfully synced the row.

We add these columns to give you insight into the state of your data and the progress of your data syncs. For more information about these columns, see our System Columns and Tables documentation.
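As an illustration of how these columns can be used downstream, the sketch below filters out rows that are marked as deleted and finds the most recent sync time. The rows are hypothetical sample data held in memory. In practice you would apply the same conditions in a query against your destination.

```python
from datetime import datetime, timezone

# Hypothetical rows as they might appear in a destination table.
rows = [
    {"id": 1, "_fivetran_deleted": False,
     "_fivetran_synced": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)},
    {"id": 2, "_fivetran_deleted": True,
     "_fivetran_synced": datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)},
    {"id": 3, "_fivetran_deleted": False,
     "_fivetran_synced": datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)},
]


def active_rows(rows):
    """Keep only the rows that still exist in the source."""
    return [row for row in rows if not row["_fivetran_deleted"]]


def last_synced(rows):
    """Return the most recent sync time across all rows."""
    return max(row["_fivetran_synced"] for row in rows)


print([row["id"] for row in active_rows(rows)])  # [1, 3]
print(last_synced(rows).isoformat())  # 2024-01-03T12:00:00+00:00
```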

Type transformations and mapping

As we extract your data, we match OpenSearch data types to types that Fivetran supports. Our system attempts to infer the types of any columns with data types that we don't recognize.

The following table shows how we transform your OpenSearch data types into Fivetran-supported types:

OpenSearch Type	Fivetran Type	Fivetran Supported
BINARY	BINARY	True
BOOLEAN	BOOLEAN	True
TEXT	STRING	True
KEYWORD	STRING	True
INTEGER	INT	True
SHORT	SHORT	True
BYTE		False
DOUBLE	DOUBLE	True
FLOAT	FLOAT	True
HALF_FLOAT	FLOAT	True
SCALED_FLOAT	BIGDECIMAL	True
LONG	LONG	True
DATE	INSTANT	True
DATE_NANOS	INSTANT	True
OBJECT	JSON	True
NESTED	JSON	True
JOIN	JSON	True
ARRAY	JSON	True
ALIAS		False

Like Elasticsearch, OpenSearch allows you to store more than one value in a field as an array. Our syncs identify array types in the source index and sync the data as JSON to your destination.
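The mapping table can also be expressed as a simple lookup. The sketch below transcribes the table into a Python dictionary. The names `TYPE_MAP`, `UNSUPPORTED`, and `fivetran_type` are hypothetical and are not part of any Fivetran API. Types that are missing from the table raise an error here, because the type inference mentioned above is not modeled.

```python
# OpenSearch type -> Fivetran type, transcribed from the table above.
TYPE_MAP = {
    "binary": "BINARY",
    "boolean": "BOOLEAN",
    "text": "STRING",
    "keyword": "STRING",
    "integer": "INT",
    "short": "SHORT",
    "double": "DOUBLE",
    "float": "FLOAT",
    "half_float": "FLOAT",
    "scaled_float": "BIGDECIMAL",
    "long": "LONG",
    "date": "INSTANT",
    "date_nanos": "INSTANT",
    "object": "JSON",
    "nested": "JSON",
    "join": "JSON",
    "array": "JSON",
}

# Types that the table lists as not supported.
UNSUPPORTED = {"byte", "alias"}


def fivetran_type(opensearch_type):
    """Return the Fivetran type for an OpenSearch type, or None if unsupported."""
    key = opensearch_type.lower()
    if key in UNSUPPORTED:
        return None
    if key not in TYPE_MAP:
        # Assumption: unknown types are out of scope for this sketch.
        raise ValueError(f"type not listed in the mapping table: {opensearch_type}")
    return TYPE_MAP[key]


print(fivetran_type("SCALED_FLOAT"))  # BIGDECIMAL
print(fivetran_type("byte"))  # None
```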

If we are missing an important data type that you need, reach out to support.

In some cases, when loading data into your destination, we may need to convert Fivetran data types into data types that are supported by the destination. For more information, see the individual data destination pages.

Nested data

If your data is nested, we extract the topmost layer of data and sync the rest as JSON. For example, the following source document...

{
  "foo": 1,
  "bar": 2,
  "nested": {
    "baz": 3
  }
}

...is converted to the following table when we load it into your destination:

foo (INTEGER)	bar (INTEGER)	nested (JSON)
1	2	`{"baz":3}`

The text in parentheses next to the column name indicates the data type of that column. For example, "foo (INTEGER)" means the column name is foo and it stores INTEGER data.
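The conversion shown above can be sketched in a few lines. This is an illustration of the behavior described in this section, not Fivetran's code. The function name `to_row` is hypothetical, and the compact JSON formatting is chosen only to match the example output.

```python
import json


def to_row(document):
    """Keep top-level scalar values and serialize nested values as JSON text."""
    row = {}
    for key, value in document.items():
        if isinstance(value, (dict, list)):
            # Nested objects and arrays become a single JSON column.
            row[key] = json.dumps(value, separators=(",", ":"))
        else:
            row[key] = value
    return row


source = {"foo": 1, "bar": 2, "nested": {"baz": 3}}
print(to_row(source))  # {'foo': 1, 'bar': 2, 'nested': '{"baz":3}'}
```

Only the topmost layer is split into columns. Anything below it, however deeply nested, stays inside the JSON value.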

Index aliases

We ignore index aliases. We sync only the original index names to your destination.

Excluding source data

If you don't want to sync all the data from your OpenSearch instance, you can exclude indices from your syncs on your Fivetran dashboard. To do so, go to your connection details page and uncheck the indices you would like to omit from syncing. For more information, see our Data Blocking documentation.