Amazon DynamoDB
Amazon DynamoDB is a fully-managed, proprietary NoSQL database service that is offered as part of Amazon Web Services (AWS).
Supported configurations
Fivetran supports the following Amazon DynamoDB configurations:
Supportability Category | Supported Values |
---|---|
Connector limit per database | No limit |
Features
Feature Name | Supported | Notes |
---|---|---|
Capture deletes | check | |
History mode | check | |
Custom data | check | |
Data blocking | check | |
Column hashing | check | |
Re-sync | check | |
API configurable | check | API configuration |
Priority-first sync | | |
Fivetran data models | | |
Private networking | check | |
Authorization via API | check | |
Setup guide
Follow our step-by-step Amazon DynamoDB setup guide to connect Amazon DynamoDB with your destination using Fivetran connectors.
Sync overview
Once Fivetran is connected to your database, we pull a full dump of all selected data from your database. We then use Amazon DynamoDB Streams to pull new and changed data at regular intervals.
Pack mode options
Pack modes determine the form in which Fivetran delivers your data. There are two pack modes: packed and unpacked.
NOTE: In the tables below, the text in parentheses next to the column name indicates the data type of that column. For example, "foo (INTEGER)" means the column name is foo and it stores INTEGER data.
Packed mode (default)
In packed mode, the following source table
{
"foo": 1, <== partition key and/or sort key
"bar": 2,
"nested": {
"baz": 3
}
}
is delivered to your destination as
foo (INTEGER) | data (JSON) |
---|---|
1 | {"foo":1, "bar":2, "nested":{"baz":3}} |
Unpacked mode
In unpacked mode, Fivetran unpacks one layer of nested fields and infers their types.
In unpacked mode, the following source table
{
"foo": 1, <== partition key and/or sort key
"bar": 2,
"nested": {
"baz": 3
}
}
is delivered to your destination as
foo (INTEGER) | bar (INTEGER) | nested (JSON) |
---|---|---|
1 | 2 | {"baz":3} |
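The difference between the two pack modes can be sketched in a few lines of Python. This is an illustrative sketch only, not Fivetran's implementation: the function names `pack_row` and `unpack_row` and the `key_names` parameter are hypothetical, and type inference is left out.

```python
import json


def pack_row(item, key_names):
    """Packed mode: key columns plus one JSON `data` column holding the whole item."""
    row = {name: item[name] for name in key_names}
    row["data"] = json.dumps(item, separators=(",", ":"))
    return row


def unpack_row(item):
    """Unpacked mode: top-level fields become columns; nested values stay as JSON."""
    row = {}
    for name, value in item.items():
        if isinstance(value, (dict, list)):
            # Only one layer is unpacked, so deeper structure is kept as JSON text.
            row[name] = json.dumps(value, separators=(",", ":"))
        else:
            row[name] = value
    return row


item = {"foo": 1, "bar": 2, "nested": {"baz": 3}}
packed = pack_row(item, key_names=["foo"])  # assumes "foo" is the partition key
unpacked = unpack_row(item)
print(packed)
print(unpacked)
```

With the example item used above, `packed` has the columns foo and data, while `unpacked` has foo, bar, and nested.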
Switching pack modes
You can switch pack modes at any time in your Fivetran dashboard. When you change the pack mode for your connector, we automatically perform a full re-sync.
To change the pack mode for your connector, do the following:
- In the connector dashboard, go to the Setup tab.
- Click Edit connection details.
- In the connector setup form, change the Pack mode.
- Click Save & Test.
Schema information
Fivetran tries to replicate the exact schema and tables from your Amazon DynamoDB source database to your destination.
Fivetran-generated columns
Fivetran adds the following columns to every table in your destination:
- _fivetran_deleted (BOOLEAN) marks rows that were deleted in the source database.
- _fivetran_synced (UTC TIMESTAMP) indicates the time when Fivetran last successfully synced the row.
We add these columns to give you insight into the state of your data and the progress of your data syncs. For more information about these columns, see our System Columns and Tables documentation.
Type transformations and mapping
As we extract your data, we match Amazon DynamoDB data types to types that Fivetran supports. If we don't support a data type, we automatically change that type to the closest supported type or, in some cases, don't load that data at all. Our system automatically skips columns with data types that we don't accept or transform.
The following table illustrates how we transform your Amazon DynamoDB data types into Fivetran supported types:
Amazon DynamoDB Data Type | Fivetran Data Type | Fivetran Supported | Notes |
---|---|---|---|
NUMBER | BIG_DECIMAL | True | |
STRING | | True | We infer the data type based on the value present in the field |
BINARY | BINARY | True | |
BOOLEAN | BOOLEAN | True | |
NULL | | True | We infer the data type based on the non-null value present in the column |
LIST | JSON | True | |
MAP | JSON | True | |
NUMBER_SET | JSON | True | |
STRING_SET | JSON | True | |
BINARY_SET | JSON | True | |
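As a rough illustration of the table above, the mapping can be written as a lookup keyed by the type descriptors DynamoDB uses in its JSON attribute format (`N`, `S`, `B`, `BOOL`, `NULL`, `L`, `M`, `NS`, `SS`, `BS`). The `fivetran_type` helper and the `INFERRED` placeholder are hypothetical names used only for this sketch; the table gives no fixed target type for STRING and NULL.

```python
# Placeholder for types that are inferred from the data.
# Assumption: this is an illustrative marker, not a real Fivetran type name.
INFERRED = "INFERRED"

FIVETRAN_TYPE_BY_DESCRIPTOR = {
    "N": "BIG_DECIMAL",
    "S": INFERRED,
    "B": "BINARY",
    "BOOL": "BOOLEAN",
    "NULL": INFERRED,
    "L": "JSON",
    "M": "JSON",
    "NS": "JSON",
    "SS": "JSON",
    "BS": "JSON",
}


def fivetran_type(attribute_value):
    """Return the mapped type for one DynamoDB attribute value such as {"N": "42"}."""
    if len(attribute_value) != 1:
        raise ValueError("expected exactly one type descriptor")
    (descriptor,) = attribute_value  # the single key is the type descriptor
    return FIVETRAN_TYPE_BY_DESCRIPTOR[descriptor]


print(fivetran_type({"N": "42"}))
print(fivetran_type({"M": {"baz": {"N": "3"}}}))
```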
Excluding source data
If you don’t want to sync all the data from your primary database, you can exclude schemas or tables from your syncs on your Fivetran dashboard. To do so, go to your connector details page and uncheck the objects you would like to omit from syncing. For more information, see our Data Blocking documentation.
NOTE: During a sync, if your connector's Schema Change Handling option is set to Allow columns or Block all, we do not check the Fivetran role's access to tables you have excluded by revoking their permissions. If you have already granted access to these tables and want to include them in your syncs, trigger a schema reload from your connector's dashboard or using the REST API.
Alternatively, you can change the permissions of the Fivetran-specific IAM role you created and restrict its access to certain tables. To restrict the IAM role's access to specific tables only, add the table names to the Resource section of the IAM policy as shown in the following example:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "dynamodb:ListTables",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"dynamodb:DescribeStream",
"dynamodb:DescribeTable",
"dynamodb:GetRecords",
"dynamodb:GetShardIterator",
"dynamodb:Scan"
],
"Resource": [
"arn:aws:dynamodb:{region}:{account-number}:table/{table-name}*"
]
}]
}
NOTE: Make sure you add * in the Resource element at the end of each table ARN.
You can find the ARN of a table using the following instructions:
- In your AWS console, select DynamoDB, then select Tables.
- Select a table you want to provide access to.
- Go to the General information section and click Additional info.
- At the bottom of this section, you will find the Amazon Resource Name (ARN).
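If you restrict access to several tables, building the policy document programmatically can help avoid forgetting the trailing * on a table ARN. The following is a minimal sketch; the `build_policy` function, the region, the account number, and the table names are hypothetical values for illustration. The actions and the ARN pattern are taken from the example policy above.

```python
import json

# Actions granted on each selected table, as in the example policy.
TABLE_ACTIONS = [
    "dynamodb:DescribeStream",
    "dynamodb:DescribeTable",
    "dynamodb:GetRecords",
    "dynamodb:GetShardIterator",
    "dynamodb:Scan",
]


def build_policy(region, account_number, table_names):
    """Build an IAM policy document limited to the given tables."""
    resources = [
        # The trailing * is required at the end of each table ARN.
        f"arn:aws:dynamodb:{region}:{account_number}:table/{name}*"
        for name in table_names
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "dynamodb:ListTables", "Resource": "*"},
            {"Effect": "Allow", "Action": TABLE_ACTIONS, "Resource": resources},
        ],
    }


# Hypothetical region, account number, and table names.
policy = build_policy("us-east-1", "123456789012", ["orders", "customers"])
print(json.dumps(policy, indent=2))
```

Note that a pattern ending in * also matches any other table whose name begins with the same prefix, so choose table names with that in mind.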
Initial sync
When Fivetran connects to a new Amazon DynamoDB database, we scan through each of your selected tables one at a time to fetch your data. We recommend provisioning a high read capacity for your tables so that Fivetran doesn't encounter provisioned throughput throttling errors.
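To give a sense of what a full table scan involves, here is a minimal sketch of paginating the DynamoDB Scan operation. It is not Fivetran's implementation. The `scan_table` helper is hypothetical, and `FakeClient` is a stand-in so the example runs without AWS credentials; it mimics the `scan` method of a boto3 DynamoDB client, which returns `Items` and, when more pages remain, `LastEvaluatedKey`.

```python
def scan_table(client, table_name):
    """Yield every item of one table, following LastEvaluatedKey pagination."""
    kwargs = {"TableName": table_name}
    while True:
        page = client.scan(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return  # no more pages
        kwargs["ExclusiveStartKey"] = last_key


class FakeClient:
    """Stand-in for a boto3 DynamoDB client that serves two fixed pages."""

    def scan(self, TableName, ExclusiveStartKey=None):
        if ExclusiveStartKey is None:
            return {
                "Items": [{"foo": {"N": "1"}}],
                "LastEvaluatedKey": {"foo": {"N": "1"}},
            }
        return {"Items": [{"foo": {"N": "2"}}]}


items = list(scan_table(FakeClient(), "example_table"))
print(items)
```

Each page of a scan consumes read capacity, which is why low provisioned throughput can lead to throttling during the initial sync.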
Updating data
Fivetran performs incremental updates of any new or modified data from your source database. We use Amazon DynamoDB Streams to fetch only the data that has changed since our last sync.
We merge changes to your tables into the corresponding tables in your destination:
- Every inserted row in the source generates a new row in the destination with _fivetran_deleted = FALSE.
- Every updated row in the source updates the data in the corresponding row in the destination, with _fivetran_deleted = FALSE.
- For every deleted row, the _fivetran_deleted column value is set to TRUE for the corresponding row in the destination.
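The merge rules above can be sketched as follows. This is a simplified illustration, not Fivetran's implementation: the `apply_change` function and the in-memory `destination` dictionary are hypothetical. The event names INSERT, MODIFY, and REMOVE are the ones DynamoDB Streams uses for its records, and rows are shown as plain Python values rather than DynamoDB attribute values.

```python
def apply_change(destination, event_name, key, new_image=None):
    """Apply one change event to a destination table keyed by primary key."""
    if event_name in ("INSERT", "MODIFY"):
        # Inserts create the row; updates overwrite its data.
        destination[key] = {**new_image, "_fivetran_deleted": False}
    elif event_name == "REMOVE":
        # Deletes are soft: the row stays and is only flagged.
        row = destination.get(key)
        if row is not None:
            row["_fivetran_deleted"] = True
    else:
        raise ValueError(f"unknown event name: {event_name}")


destination = {}
apply_change(destination, "INSERT", 1, {"foo": 1, "bar": 2})
apply_change(destination, "MODIFY", 1, {"foo": 1, "bar": 5})
apply_change(destination, "REMOVE", 1)
print(destination)
```

After the three events, the row with key 1 is still present in `destination`, holds its last known data, and is flagged as deleted.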
Amazon DynamoDB streams data retention limit
Amazon DynamoDB streams have a retention period of 24 hours. If your syncs fail for more than 24 hours, we automatically trigger a re-sync to capture the data changes we missed.
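The retention rule amounts to a simple time comparison, sketched below. The `needs_resync` function and the example timestamps are hypothetical and serve only to illustrate the 24-hour window; how Fivetran actually tracks sync state is not described here.

```python
from datetime import datetime, timedelta, timezone

# DynamoDB streams keep change records for 24 hours.
STREAM_RETENTION = timedelta(hours=24)


def needs_resync(last_successful_sync, now):
    """Return True when changes may have expired from the stream unread."""
    return now - last_successful_sync > STREAM_RETENTION


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
recent = datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc)  # 6 hours earlier
stale = datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc)  # 25 hours earlier
print(needs_resync(recent, now))
print(needs_resync(stale, now))
```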
Deleted rows
We do not delete rows from your destination. When a row is deleted from the source table, we set the _fivetran_deleted column value of the corresponding row in the destination to TRUE.