Elasticsearch Beta
Elasticsearch is a document-based NoSQL database. It stores JSON documents in a distributed, RESTful search and analytics engine built on Lucene.
Supported services
Fivetran supports the following Elasticsearch services:
- Elastic Cloud
- Self-Hosted Elasticsearch
Supported configurations
Fivetran supports the following Elasticsearch configurations:
Supportability Category | Supported Values |
---|---|
Database versions | 7.10.0 or above |
Connector limit per database | No limit |
Transport Layer Security (TLS) | TLS 1.1 - 1.3 |
Known limitations
- We do not support unmapped fields in an index. Indices that have an index mapping defined with the `dynamic` field set to `off` may contain unmapped fields and may result in sync failures.
- Elasticsearch field names are case-sensitive, but columns are case-insensitive in your Fivetran destination. We therefore reject fields with the same name but different capitalization as duplicate columns. For example, `Field` and `field` are different field names in Elasticsearch but would both map to the `field` column in your destination. To avoid this error, use different names for each field.

TIP: Learn more in Fivetran's naming conventions documentation.
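
The case-sensitivity limitation can be checked before you connect an index. The following is a minimal sketch, not part of Fivetran: the helper name `find_case_collisions` is hypothetical, and it assumes destination column names are compared after lowercasing, as the `Field`/`field` example suggests.

```python
from collections import defaultdict


def find_case_collisions(field_names):
    """Group field names that become identical once lowercased.

    Assumption: the destination compares column names case-insensitively,
    so lowercasing approximates the column each field would map to.
    """
    groups = defaultdict(list)
    for name in field_names:
        groups[name.lower()].append(name)
    # Keep only columns that more than one distinct source field maps to.
    return {col: names for col, names in groups.items() if len(set(names)) > 1}


collisions = find_case_collisions(["Field", "field", "user_id", "created_at"])
print(collisions)  # {'field': ['Field', 'field']}
```

Any field listed in the result would need to be renamed in the source index to avoid the duplicate-column error.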
Features
Feature Name | Supported | Notes |
---|---|---|
Capture deletes | check | |
History mode | | |
Custom data | check | |
Data blocking | check | |
Column hashing | check | |
Re-sync | check | |
API configurable | check | API configuration |
Priority-first sync | | |
Fivetran data models | | |
Private networking | | |
Authorization via API | check | |
Setup guide
For specific instructions on how to set up your Elasticsearch connector, see the guide for your Elasticsearch service type:
Sync overview
Once Fivetran is connected to your Elasticsearch instance, we fetch all historical data up to the present state. We then sync only the most recent inserts and updates at regular intervals using the sequence number and version fields on the documents. We capture deleted data using Fivetran Teleport Sync.
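
To illustrate the sequence number and version fields mentioned above, here is a hedged sketch of how a client could ask Elasticsearch to return them and then keep only documents changed since a saved checkpoint. This is not Fivetran's implementation. The function names and the checkpoint logic are hypothetical, and the sketch assumes a single-shard index, because sequence numbers are assigned per shard. The `seq_no_primary_term` and `version` search options are standard Elasticsearch search parameters.

```python
def build_search_body(page_size=1000):
    """Search request body that asks for _seq_no, _primary_term and _version on each hit."""
    return {
        "size": page_size,
        "query": {"match_all": {}},
        "seq_no_primary_term": True,  # include _seq_no and _primary_term in hits
        "version": True,              # include _version in hits
    }


def changed_since(hits, last_seq_no):
    """Keep hits whose sequence number is newer than the saved checkpoint.

    Assumption: all hits come from one shard, so sequence numbers are comparable.
    """
    return [hit for hit in hits if hit["_seq_no"] > last_seq_no]


# Hypothetical hits, shaped like the entries of response["hits"]["hits"].
hits = [
    {"_id": "a", "_seq_no": 4, "_primary_term": 1, "_version": 1},
    {"_id": "b", "_seq_no": 9, "_primary_term": 1, "_version": 3},
]
print([hit["_id"] for hit in changed_since(hits, last_seq_no=5)])  # ['b']
```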
Fivetran Teleport Sync
Fivetran Teleport Sync is a proprietary incremental sync method that can incrementally sync deleted data with no additional setup other than a read-only connection.
Fivetran Teleport Sync's queries perform the following operations on your Elasticsearch instance:
- Do a full scan of each synced index's unique IDs
- Aggregate a compressed unique ID snapshot in the instance's memory
For optimum Fivetran Teleport Sync performance, we recommend that you make the following resources available in your Elasticsearch instance:
- 1 GB of free RAM
- 1 free CPU core
- Available IOPS (Teleport Sync times decrease linearly as available IOPS increase)
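
Fivetran Teleport Sync is proprietary, so its internals are not documented here. The general idea of detecting deletes by comparing ID snapshots can still be sketched as follows. The function names are hypothetical, and plain Python sets stand in for the compressed snapshot described above.

```python
def snapshot_ids(documents):
    """Collect the unique _id of every document (a stand-in for the full ID scan)."""
    return {doc["_id"] for doc in documents}


def deleted_ids(previous_snapshot, current_snapshot):
    """IDs that were present in the previous snapshot but are missing now."""
    return previous_snapshot - current_snapshot


previous = {"a", "b", "c"}
current = snapshot_ids([{"_id": "a"}, {"_id": "c"}])
print(sorted(deleted_ids(previous, current)))  # ['b']
```

Rows whose IDs appear in the result would be the ones marked as deleted in the destination.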
Schema information
Fivetran tries to replicate the exact indices from your Elasticsearch source database to your destination. For every index in the Elasticsearch database that you connect to Fivetran, we create a table in your destination that maps to its native schema.
NOTE: We replicate the selected indices to a single schema named after the destination schema name of your choice.
Fivetran-generated columns
Fivetran adds the following columns to every table in your destination:
- `_fivetran_deleted` (BOOLEAN) marks rows that were deleted in the source database.
- `_fivetran_synced` (UTC TIMESTAMP) indicates the time when Fivetran last successfully synced the row.
We add these columns to give you insight into the state of your data and the progress of your data syncs. For more information about these columns, see our System Columns and Tables documentation.
Type transformations and mapping
As we extract your data, we match Elasticsearch data types to types that Fivetran supports. Our system attempts to infer the types of any columns with data types that we don't recognize.
The following table illustrates how we transform your Elasticsearch data types into Fivetran supported types:
Elasticsearch Type | Fivetran Type | Fivetran Supported |
---|---|---|
BINARY | BINARY | True |
BOOLEAN | BOOLEAN | True |
TEXT | STRING | True |
KEYWORD | STRING | True |
CONSTANT_KEYWORD | STRING | True |
WILDCARD | STRING | True |
INTEGER | INT | True |
SHORT | SHORT | True |
BYTE | | False |
DOUBLE | DOUBLE | True |
FLOAT | FLOAT | True |
HALF_FLOAT | FLOAT | True |
SCALED_FLOAT | BIGDECIMAL | True |
LONG | LONG | True |
UNSIGNED_LONG | LONG | True |
DATE | INSTANT | True |
DATE_NANOS | INSTANT | True |
OBJECT | JSON | True |
FLATTENED | JSON | True |
NESTED | JSON | True |
JOIN | JSON | True |
ARRAY | JSON | True |
ALIAS | | False |
Elasticsearch allows you to put more than one value into a field as an array. Our syncs identify array types from the source index and sync the data as JSON to your destination.
If we are missing an important data type that you need, reach out to support.
In some cases, when loading data into your destination, we may need to convert Fivetran data types into data types that are supported by the destination. For more information, see the individual data destination pages.
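
For quick reference in scripts, the type table above can be expressed as a lookup. This is an illustrative sketch only: the names `ES_TO_FIVETRAN` and `fivetran_type` are hypothetical, and the values are copied from the table rather than taken from any Fivetran API.

```python
# Keys are Elasticsearch mapping types (lowercase, as they appear in index mappings).
# A value of None means the table lists the type as not supported.
# Arrays are omitted: Elasticsearch has no array mapping type, so they are
# identified from the data and synced as JSON.
ES_TO_FIVETRAN = {
    "binary": "BINARY",
    "boolean": "BOOLEAN",
    "text": "STRING",
    "keyword": "STRING",
    "constant_keyword": "STRING",
    "wildcard": "STRING",
    "integer": "INT",
    "short": "SHORT",
    "byte": None,
    "double": "DOUBLE",
    "float": "FLOAT",
    "half_float": "FLOAT",
    "scaled_float": "BIGDECIMAL",
    "long": "LONG",
    "unsigned_long": "LONG",
    "date": "INSTANT",
    "date_nanos": "INSTANT",
    "object": "JSON",
    "flattened": "JSON",
    "nested": "JSON",
    "join": "JSON",
    "alias": None,
}


def fivetran_type(es_type):
    """Return the Fivetran type for a mapping type, or None if it is listed as unsupported.

    Raises KeyError for types that do not appear in the table at all.
    """
    return ES_TO_FIVETRAN[es_type.lower()]


print(fivetran_type("scaled_float"))  # BIGDECIMAL
print(fivetran_type("byte"))          # None
```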
Nested data
If your data is nested, we extract the topmost layer of data and sync the rest as JSON. For example, the following source document...
{
  "foo": 1,
  "bar": 2,
  "nested": {
    "baz": 3
  }
}
...is converted to the following table when we load it into your destination:
foo (INTEGER) | bar (INTEGER) | nested (JSON) |
---|---|---|
1 | 2 | {"baz":3} |
NOTE: The text in parentheses next to the column name indicates the data type of that column. For example, "`foo` (INTEGER)" means the column name is `foo` and it stores INTEGER data.
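
The transformation in this example can be approximated with a few lines of Python. This is a sketch of the behavior described above, not Fivetran's code: the function name `flatten_top_level` is hypothetical, and it assumes that any object or array value is serialized as compact JSON while scalar values pass through unchanged.

```python
import json


def flatten_top_level(document):
    """Keep top-level scalar fields as-is and serialize nested values as JSON text."""
    row = {}
    for key, value in document.items():
        if isinstance(value, (dict, list)):
            # Nested objects and arrays become a single JSON column.
            row[key] = json.dumps(value, separators=(",", ":"))
        else:
            row[key] = value
    return row


row = flatten_top_level({"foo": 1, "bar": 2, "nested": {"baz": 3}})
print(row)  # {'foo': 1, 'bar': 2, 'nested': '{"baz":3}'}
```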
Index aliases
We ignore index aliases. We sync only the original index names to your destination.
Excluding source data
If you don’t want to sync all the data from your primary database, you can exclude indices from your syncs on your Fivetran dashboard. To do so, go to your connector details page and uncheck the objects you would like to omit from syncing. For more information, see our Data Blocking documentation.