DynamoDB
Updated 6 days ago
Amazon DynamoDB is a fully-managed, proprietary NoSQL database service that is offered as part of Amazon Web Services (AWS).
Supported configurations
Fivetran supports the following DynamoDB configurations:
Supportability Category | Supported Values |
---|---|
Connector limit per database | No limit |
Features
Feature Name | Supported | Notes |
---|---|---|
Capture deletes | check | All tables and fields |
Custom data | check | All tables and fields |
Data blocking | check | Column level, table level, and schema level |
Column hashing | check | |
Re-sync | check | Table level |
History | check | Supports history mode. |
API configurable | check | API configuration |
Priority-first sync | | |
Fivetran data models | | |
Private networking | check | AWS PrivateLink (DynamoDB on EC2 only) |
Setup guide
Follow our step-by-step DynamoDB setup guide to connect DynamoDB with your destination using Fivetran connectors.
Sync overview
Once Fivetran is connected to your database, we pull a full dump of all selected data from your database. We then use DynamoDB Streams to pull new and changed data at regular intervals.
Pack mode options
Pack modes determine the form in which Fivetran delivers your data. There are two pack modes: packed and unpacked.
Packed mode (default)
In packed mode, the following source table
{
"foo": 1, <== partition key and/or sort key
"bar": 2,
"nested": {
"baz": 3
}
}
is delivered to your destination as
foo INTEGER | data JSON |
---|---|
1 | {"foo":1, "bar":2, "nested":{"baz":3}} |
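As a rough illustration of the packed layout, the following Python sketch builds a destination row from a source item. The function name `pack_item` and the `key_names` parameter are hypothetical names chosen for this example; this is not Fivetran's implementation, only a model of the result shown above.

```python
import json


def pack_item(item, key_names):
    """Return a packed-mode row: key attributes as columns, the whole item as JSON."""
    # Hypothetical helper: only the partition/sort key attributes become columns.
    row = {name: item[name] for name in key_names}
    # The full item, keys included, is kept in a single JSON column named "data".
    row["data"] = json.dumps(item, separators=(",", ":"))
    return row


item = {"foo": 1, "bar": 2, "nested": {"baz": 3}}
row = pack_item(item, key_names=["foo"])
print(row)
```

The row has two columns, `foo` and `data`, matching the table above.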
Unpacked mode
In unpacked mode, Fivetran unpacks one layer of nested fields and infers their types.
In unpacked mode, the following source table
{
"foo": 1, <== partition key and/or sort key
"bar": 2,
"nested": {
"baz": 3
}
}
is delivered to your destination as
foo INTEGER | bar INTEGER | nested JSON |
---|---|---|
1 | 2 | {"baz":3} |
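The one-layer unpacking can be sketched in Python as follows. The function name `unpack_item` is hypothetical, and the sketch simply keeps Python's own value types rather than modelling Fivetran's type inference, so treat it as an approximation of the result shown above.

```python
import json


def unpack_item(item):
    """Return an unpacked-mode row: one column per top-level attribute."""
    row = {}
    for name, value in item.items():
        if isinstance(value, (dict, list)):
            # Nested structures are not unpacked any further; they stay as JSON.
            row[name] = json.dumps(value, separators=(",", ":"))
        else:
            # Scalars become plain columns (type inference is not modelled here).
            row[name] = value
    return row


item = {"foo": 1, "bar": 2, "nested": {"baz": 3}}
row = unpack_item(item)
print(row)
```

Only the top level is flattened: `nested` remains a single JSON column instead of producing a `baz` column.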
Switching pack modes
You can switch pack modes at any time in your Fivetran dashboard. When you change the pack mode for your connector, we automatically perform a full re-sync.
To change the pack mode for your connector, do the following:
- In the connector dashboard, go to the Setup tab.
- Click Edit connection details.
- In the connector setup form, change the Pack mode.
- Click Save & Test.
Schema information
Fivetran tries to replicate the exact schema and tables from your DynamoDB source database to your destination.
Fivetran-generated columns
Fivetran adds the following columns to every table in your destination:
- _fivetran_deleted (BOOLEAN) marks rows that were deleted in the source database.
- _fivetran_synced (UTC TIMESTAMP) indicates the time when Fivetran last successfully synced the row.
We add these columns to give you insight into the state of your data and the progress of your data syncs.
Type transformations and mapping
As we extract your data, we match DynamoDB data types to types that Fivetran supports. If we don't support a data type, we automatically change that type to the closest supported type or, in some cases, don't load that data at all. Our system automatically skips columns with data types that we don't accept or transform.
The following table illustrates how we transform your DynamoDB data types into Fivetran supported types:
DynamoDB Data Type | Fivetran Data Type | Fivetran Supported | Notes |
---|---|---|---|
NUMBER | BIG_DECIMAL | True | |
STRING | | True | We infer the data type based on the value present in the field |
BINARY | BINARY | True | |
BOOLEAN | BOOLEAN | True | |
NULL | | True | We infer the data type based on the non-null value present in the column |
LIST | JSON | True | |
MAP | JSON | True | |
NUMBER_SET | JSON | True | |
STRING_SET | JSON | True | |
BINARY_SET | JSON | True | |
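The mapping above can be written as a small lookup, sketched here in Python. The keys are the type descriptors that DynamoDB's low-level API uses in attribute values (`S`, `N`, `B`, `BOOL`, `NULL`, `L`, `M`, `SS`, `NS`, `BS`). The names `FIVETRAN_TYPE` and `fivetran_type` are hypothetical, and `None` stands in for "inferred from the data", which this sketch does not attempt to model.

```python
# DynamoDB type descriptor -> Fivetran type, following the table above.
# None means the destination type is inferred from the values, not fixed.
FIVETRAN_TYPE = {
    "N": "BIG_DECIMAL",   # NUMBER
    "S": None,            # STRING (inferred)
    "B": "BINARY",        # BINARY
    "BOOL": "BOOLEAN",    # BOOLEAN
    "NULL": None,         # NULL (inferred from non-null values in the column)
    "L": "JSON",          # LIST
    "M": "JSON",          # MAP
    "NS": "JSON",         # NUMBER_SET
    "SS": "JSON",         # STRING_SET
    "BS": "JSON",         # BINARY_SET
}


def fivetran_type(attribute_value):
    """Look up the destination type for an attribute value such as {"N": "1"}."""
    # An attribute value holds exactly one descriptor key.
    (descriptor,) = attribute_value.keys()
    return FIVETRAN_TYPE[descriptor]


print(fivetran_type({"N": "1"}))
print(fivetran_type({"M": {"baz": {"N": "3"}}}))
```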
Excluding source data
If you don’t want to sync all the data from your source database, you can exclude schemas or tables from your syncs on your Fivetran dashboard. To do so, go to your connector details page and uncheck the objects you would like to omit from syncing. For more information, see our Column Blocking documentation.
Alternatively, you can change the permissions of the Fivetran-specific IAM role you created and restrict its access to certain tables. To restrict the IAM role's access to specific tables only, add the table ARNs to the Resource section of the IAM policy, as shown in the following example:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "dynamodb:ListTables",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"dynamodb:DescribeStream",
"dynamodb:DescribeTable",
"dynamodb:GetRecords",
"dynamodb:GetShardIterator",
"dynamodb:Scan"
],
"Resource": [
"arn:aws:dynamodb:{region}:{account-number}:table/{table-name}*"
]
}]
}
NOTE: Make sure you add * at the end of each table ARN in the Resource element.
To find the ARN of a table, do the following:
- In your AWS console, select DynamoDB, then select Tables.
- Select a table you want to provide access to.
- Go to the General information section and click Additional info.
- At the bottom of this section, you will find the Amazon Resource Name (ARN).
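If you restrict access to several tables, the policy above can be generated rather than edited by hand. The Python sketch below does so; `build_policy`, the region, the account number, and the table names are all placeholders invented for this example. It also illustrates a likely reason for the trailing `*`: a DynamoDB stream's ARN extends its table's ARN with a `/stream/<label>` suffix, so without the wildcard the stream actions would not match the stream resource. `fnmatchcase` is used only to mimic the wildcard and is not IAM's actual matcher.

```python
import json
from fnmatch import fnmatchcase

TABLE_ACTIONS = [
    "dynamodb:DescribeStream",
    "dynamodb:DescribeTable",
    "dynamodb:GetRecords",
    "dynamodb:GetShardIterator",
    "dynamodb:Scan",
]


def build_policy(region, account_number, table_names):
    """Build the table-restricted policy shown above for the given tables."""
    resources = [
        # Trailing "*" so the table's stream ARN is covered as well.
        f"arn:aws:dynamodb:{region}:{account_number}:table/{name}*"
        for name in table_names
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "dynamodb:ListTables", "Resource": "*"},
            {"Effect": "Allow", "Action": TABLE_ACTIONS, "Resource": resources},
        ],
    }


# Placeholder values for illustration only.
policy = build_policy("us-east-1", "123456789012", ["orders", "customers"])
print(json.dumps(policy, indent=2))

pattern = policy["Statement"][1]["Resource"][0]
table_arn = "arn:aws:dynamodb:us-east-1:123456789012:table/orders"
stream_arn = table_arn + "/stream/2024-01-01T00:00:00.000"
print(fnmatchcase(table_arn, pattern), fnmatchcase(stream_arn, pattern))
```

Note that a pattern such as `table/orders*` also matches any other table whose name starts with `orders`, so check your table names for shared prefixes.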
Initial sync
When Fivetran connects to a new DynamoDB database, we scan through each of your selected tables one at a time to fetch your data. We recommend that you provision a high read capacity for your tables so that Fivetran doesn't encounter provisioned throughput throttling errors.
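To get a feel for how much read capacity an initial scan needs, the Python sketch below gives a back-of-the-envelope lower bound on scan time. It relies on DynamoDB's capacity model, in which one read capacity unit covers one strongly consistent read per second of up to 4 KB, or two eventually consistent reads. The function name `min_scan_seconds` is hypothetical, and the read consistency Fivetran uses is not stated here, so both cases are shown. The estimate ignores other traffic on the table, burst capacity, and per-partition limits.

```python
READ_UNIT_BYTES = 4096  # one RCU = one strongly consistent 4 KB read per second


def min_scan_seconds(table_bytes, read_capacity_units, eventually_consistent=False):
    """Rough lower bound on the time to scan a table at full provisioned capacity."""
    bytes_per_second = read_capacity_units * READ_UNIT_BYTES
    if eventually_consistent:
        # Eventually consistent reads cost half as much capacity.
        bytes_per_second *= 2
    return table_bytes / bytes_per_second


one_gib = 1024 ** 3
print(min_scan_seconds(one_gib, 100))
print(min_scan_seconds(one_gib, 100, eventually_consistent=True))
```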
Updating data
Fivetran performs incremental updates of any new or modified data from your source database. We use DynamoDB Streams to fetch only the data that has changed since our last sync.
We merge changes to your tables into the corresponding tables in your destination:
- Every inserted row in the source generates a new row in the destination with _fivetran_deleted = FALSE.
- Every updated row in the source updates the data in the corresponding row in the destination, with _fivetran_deleted = FALSE.
- For every deleted row, the _fivetran_deleted column value is set to TRUE for the corresponding row in the destination.
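The three merge rules above can be modelled with a short Python sketch that applies change events to an in-memory stand-in for the destination table. The event names `INSERT`, `MODIFY`, and `REMOVE` are the ones DynamoDB Streams records carry; the function `apply_change` and the dictionary used as a destination are hypothetical and only illustrate the behavior described, not Fivetran's implementation.

```python
def apply_change(destination, event_name, key, new_image=None):
    """Apply one stream change to a dict that maps primary key -> destination row."""
    if event_name in ("INSERT", "MODIFY"):
        # Inserts create the row and updates overwrite it; both are live rows.
        row = dict(new_image)
        row["_fivetran_deleted"] = False
        destination[key] = row
    elif event_name == "REMOVE":
        # Soft delete: the row stays in the destination, only the flag changes.
        destination.setdefault(key, {})["_fivetran_deleted"] = True
    else:
        raise ValueError(f"unexpected event name: {event_name}")
    return destination


dest = {}
apply_change(dest, "INSERT", 1, {"foo": 1, "bar": 2})
apply_change(dest, "MODIFY", 1, {"foo": 1, "bar": 5})
apply_change(dest, "REMOVE", 1)
print(dest)
```

After the three events, the row is still present with its last known values and is flagged as deleted.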
DynamoDB Streams data retention limit
DynamoDB Streams records have a retention period of 24 hours. If your syncs fail for more than 24 hours, we automatically trigger a re-sync to make sure we capture the data changes we missed.
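The retention rule amounts to a simple time comparison, sketched below in Python. The function name `needs_resync` and the timestamps are hypothetical; the sketch only illustrates the 24-hour condition described above and says nothing about how Fivetran tracks sync state internally.

```python
from datetime import datetime, timedelta, timezone

STREAM_RETENTION = timedelta(hours=24)


def needs_resync(last_successful_sync, now):
    """True when changes may have expired from the stream since the last sync."""
    return now - last_successful_sync > STREAM_RETENTION


# Example timestamps for illustration only.
now = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)
print(needs_resync(datetime(2024, 1, 3, 6, 0, tzinfo=timezone.utc), now))
print(needs_resync(datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc), now))
```

The first call covers a 6-hour gap, which is within retention; the second covers a 30-hour gap, which is not.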
Deleted rows
We do not delete rows from your destination. When a row is deleted from the source table, we set the _fivetran_deleted column value of the corresponding row in the destination to TRUE.