AWS Lambda

AWS Lambda is a serverless computing platform that runs code in response to events and automatically manages the compute resources required by that code.

Connector SDK is now generally available.

If you're considering building a custom connector, we recommend you use the Fivetran Connector SDK. Connector SDK offers a simpler development experience and your custom connector is hosted by Fivetran.

Features

Feature Name	Supported	Notes
Capture deletes		The function must include the `softDelete` field in its response.
History mode
Custom data		All tables and fields
Data blocking		Column level
Column hashing
Re-sync		Connection level
API configurable		API configuration
Priority-first sync
Fivetran data models
Private networking		AWS PrivateLink
Authorization via API

Supported languages

AWS Lambda supports the following languages:

Node.js (JavaScript)
Python
Java (Java 8 compatible)
C# (.NET Core)
Go

Function request and response

Fivetran uses the AWS SDK for the AWS Lambda Functions connector. For more information about how Fivetran syncs data from your cloud function, see our Sync overview documentation.

Request format

Fivetran's request has a standard format. It is a JSON object with the following fields:

agent is an informational string that identifies the calling connection (see the example request).
state is a JSON object that contains cursors from the previous successful function execution. It is key to performing incremental updates. A cursor is a bookmark that marks the data Fivetran has already synced, for example, a timestamp, ID, or index. For the initial sync, state is an empty JSON object {}. Fivetran expects an updated state object in every response.
For more information about the state object, see How to Use the state Object.
secrets (optional) is a JSON object that contains access keys or API keys for the upstream APIs. Secrets allow you to store information (API tokens or database passwords) that you don’t want to maintain in your code. We use encryption at rest to store the secrets. We pass the secrets into your function every time we call the function.
For Lambda connections created on or after October 4, 2022, in the connection setup form, click + Add secrets and then specify the key-value pairs. For example, if you want to pass the secrets as {'apiKey': 'yourApiKey', 'consumerKey': 'test'}, add a key-value pair for each entry in the JSON structure. From all the key-value pairs, we construct the JSON object and then pass it to the Lambda function. For example, if you add ('apiKey', 'yourApiKey') and ('consumerKey', 'test') as key-value pairs in the setup form, Fivetran passes secrets: {'apiKey': 'yourApiKey', 'consumerKey': 'test'} to the Lambda function.
For Lambda connections created before October 4, 2022, in the connection setup form, edit your secrets in JSON format if secrets are already configured; otherwise, use + Add secrets. For more information about the secrets object, see How to Use the secrets Object.
customPayload (optional) is a JSON object as a set of key-value pairs that can be used to specify custom information. We pass the custom payloads into your function every time we call the function.
setup_test (optional) is a boolean that lets the function know that Fivetran has invoked the function for the setup tests. The function runs a lightweight job to test the connectivity and returns a JSON object with the hasMore field set to false.
We don't add this field to the request during syncs.
sync_id (optional) is the Fivetran sync identifier (UUID). You can find the sync_id in your connection's dashboard logs and use it to debug and link function logs with connection logs.
When the setup_test field is set to true, we add the setup-test value to the sync_id field in the request. We call the function once with the sync_id, and if we get an error, we then call the function without the sync_id.
bucket (optional) is a string that provides the S3 bucket name the function needs to use to push data if you have opted to use the Sync through S3 bucket method when configuring the connection.
file (optional) is a string that provides the file name the function needs to create when pushing data to the bucket if you have opted to use the Sync through S3 bucket method when configuring the connection.

Example request

{
    "agent" : "Fivetran AWS Lambda Connector/<external_id>/<schema>",
    "state": {
        "cursor": "2020-01-01T00:00:00Z"
    },
    "secrets": {
        "apiToken": "abcdefghijklmnopqrstuvwxyz_0123456789"
    },
    "customPayload": {
        "samplePayload": "payload_value"
    },
    "sync_id": "468b681-c376-4117-bbc0-25d8ae02ace1",
    "bucket": "s3-test-bucket",
    "file": "1657006633.json"
}

In this example,

external_id is the unique ID tied to your connection. You can find the ID in your connection setup form. You need this to configure your AWS account to connect with Fivetran.
schema is the destination schema name you enter when you first set up your connection.
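
As a rough illustration of how a function might read these fields, the sketch below pulls each one out of the request with a safe default. It is a minimal sketch, not a reference implementation: the helper name `parse_request`, the returned key names, and the choice of defaults are assumptions. Only the request field names come from the request format above.

```python
def parse_request(request):
    """Extract the documented request fields, defaulting the optional ones."""
    return {
        "agent": request.get("agent"),
        # state is an empty object on the initial sync
        "state": request.get("state") or {},
        "secrets": request.get("secrets") or {},
        "custom_payload": request.get("customPayload") or {},
        # setup_test is only present for setup tests, never during syncs
        "setup_test": bool(request.get("setup_test", False)),
        "sync_id": request.get("sync_id"),
        # bucket and file are only sent for the Sync through S3 bucket method
        "bucket": request.get("bucket"),
        "file": request.get("file"),
    }


def lambda_handler(request, context):
    fields = parse_request(request)
    if fields["setup_test"]:
        # Setup test: run a lightweight connectivity check here, then stop.
        return {"state": fields["state"], "hasMore": False}
    # Regular sync logic would go here (hypothetical placeholder).
    return {"state": fields["state"], "insert": {}, "hasMore": False}
```

Reading the optional fields with defaults keeps the same handler usable for the initial sync, for setup tests, and for both sync methods.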

Response format

The response is a JSON object with the following fields:

state contains the updated state value(s).
insert (optional) specifies the entities and records to be inserted. Fivetran reads the data and infers the data type and the number of columns. Fivetran doesn't consider this field's content if you have opted to use the Sync through S3 bucket method when configuring the connection.
delete (optional) specifies the entities and records to be deleted. Use this field to mark records as deleted. Fivetran doesn't delete the record; instead, it marks the record as deleted by setting the _fivetran_deleted column value to true. If you specify the delete field values, you must also specify the schema field values. Fivetran doesn't consider this field's content if you have opted to use the Sync through S3 bucket method when configuring the connection.
Fivetran creates the _fivetran_deleted column in the destination table only if your function response has the delete field.
schema (optional) specifies primary key columns for each entity. You must be very consistent with the schema field and the primary key columns to avoid any unwanted behavior. If you don’t specify the schema, Fivetran appends the data.
hasMore is an indicator for Fivetran to make a follow-up call for fetching the next set of data. Fivetran keeps making repeated calls until it receives hasMore = false.
For more information about the hasMore field, see How to Use the hasMore Object.
softDelete (optional) specifies the list of entities to be soft deleted. Fivetran marks the records of these entities as deleted by setting the value of the _fivetran_deleted column to true. We recommend that you use this field if you do not want to specify the individual records to be deleted in the delete field. If a table is specified in both delete and softDelete, the delete section has no effect for that table.
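
The interplay between state and hasMore can be sketched as follows. This is a minimal illustration under stated assumptions: `SOURCE_ROWS`, `PAGE_SIZE`, and the `offset` cursor name are hypothetical stand-ins for a real upstream API, and `simulate_sync` only imitates the follow-up calls that Fivetran makes.

```python
# Hypothetical in-memory source standing in for an upstream API.
SOURCE_ROWS = [{"id": i, "amount": i * 10} for i in range(1, 6)]
PAGE_SIZE = 2  # assumed page size for the illustration


def lambda_handler(request, context):
    state = request.get("state") or {}
    offset = state.get("offset", 0)  # empty state means initial sync
    page = SOURCE_ROWS[offset:offset + PAGE_SIZE]
    new_offset = offset + len(page)
    return {
        "state": {"offset": new_offset},
        "insert": {"transaction": page},
        "schema": {"transaction": {"primary_key": ["id"]}},
        # Ask for another call only while rows remain
        "hasMore": new_offset < len(SOURCE_ROWS),
    }


def simulate_sync(initial_state):
    """Imitate the caller: keep invoking until hasMore is false."""
    state, rows, calls = initial_state, [], 0
    while True:
        response = lambda_handler({"state": state}, None)
        calls += 1
        rows.extend(response["insert"]["transaction"])
        state = response["state"]
        if not response["hasMore"]:
            return state, rows, calls
```

Because every response carries the updated cursor, a later sync that starts from the saved state picks up where the previous one stopped instead of re-reading earlier rows.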

Example response

{
    "state": {
        "transaction": "2020-01-02T00:00:00Z"
    },
    "insert": {
        "transaction": [
            {"id":1, "amount": 100},
            {"id":3, "amount": 50}
        ]
    },
    "delete": {
        "transaction": [
            {"id":2}
        ]
    },
    "schema" : {
        "transaction": {
            "primary_key": ["id"]
        }
    },
    "hasMore" : false,
    "softDelete" : ["transaction"]
}

In this example,

state contains the transaction cursor.
transaction is an entity. Fivetran creates the TRANSACTION table with id and amount columns.
The function inserts records 1 and 3 into the TRANSACTION table.
The function marks record 2 as deleted from the TRANSACTION table.
hasMore is set to false to indicate that there are no more records.
softDelete marks all the records of the TRANSACTION table as deleted by setting the value of its _fivetran_deleted column to true.
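
The rules for these fields can be checked before a response is returned. The helper below is a sketch only: `check_response` is a hypothetical name, and the checks simply restate the field descriptions above (state and hasMore present, schema required when delete is used, softDelete taking precedence over delete). Applying the schema rule per table is an assumption about how that rule should be read.

```python
def check_response(response):
    """Return a list of human-readable problems found in a response dict."""
    problems = []
    if "state" not in response:
        problems.append("missing state")
    if "hasMore" not in response:
        problems.append("missing hasMore")
    schema = response.get("schema", {})
    for table in response.get("delete", {}):
        # delete needs primary keys so the records can be identified
        if not schema.get(table, {}).get("primary_key"):
            problems.append(f"delete on '{table}' requires schema primary_key")
        if table in response.get("softDelete", []):
            problems.append(f"delete on '{table}' is ignored because of softDelete")
    return problems
```

Run against the example response above, this reports that the delete entry for transaction has no effect, because the same table also appears in softDelete.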

Storage bucket file format

If you have opted to use the Sync through S3 bucket method when configuring the connection, you must provide the data in an S3 bucket file.

The file should contain data in a JSON format with the following fields:

insert specifies the entities and records to be inserted. Fivetran reads the data and infers the data type and the number of columns.
delete (optional) specifies the entities and records to be deleted. Use this field to mark records as deleted. Fivetran doesn't delete the record; instead, it marks the record as deleted by setting the _fivetran_deleted column value to true. If you specify the delete field values, you must also specify the schema field values.
Fivetran creates the _fivetran_deleted column in the destination table only if your function response has the delete field.

Example file

{
    "insert": {
        "transaction": [
            {"id":1, "amount": 100},
            {"id":3, "amount": 50}
        ]
    },
    "delete": {
        "transaction": [
            {"id":2}
        ]
    }
}

In this example,

transaction is an entity. Fivetran creates the TRANSACTION table with id and amount columns.
The function inserts records 1 and 3 into the TRANSACTION table.
The function marks record 2 as deleted from the TRANSACTION table.
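
Writing such a file from inside the function might look like the sketch below. It assumes an S3 client with a boto3-style `put_object(Bucket=..., Key=..., Body=...)` method and treats the request's file value as the object key, which is an assumption. `push_to_bucket` and `FakeS3Client` are hypothetical names; the fake client exists only so the example runs without AWS access. In a real function you would pass `boto3.client("s3")` instead.

```python
import json


def push_to_bucket(s3_client, request, inserts, deletes=None):
    """Write insert/delete data to the bucket and file named in the request."""
    payload = {"insert": inserts}
    if deletes:
        payload["delete"] = deletes
    s3_client.put_object(
        Bucket=request["bucket"],
        Key=request["file"],
        Body=json.dumps(payload).encode("utf-8"),
    )
    return payload


class FakeS3Client:
    """Test double that records calls instead of contacting AWS."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Bucket, Key, Body):
        self.objects[(Bucket, Key)] = Body
```

Passing the client in as an argument keeps the upload logic testable locally and leaves credential handling to the Lambda execution role.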

Custom error handling

Cloud functions may fail due to various reasons, including code execution errors, runtime issues, or internal errors. Add an error handling mechanism in your Lambda function response to report an error on your Fivetran dashboard.

Design your Lambda function to report an error:

Use the errorMessage field in your response to indicate function execution errors. Fivetran creates an Error on the connection dashboard with your custom error message. For example, for the following response, Fivetran creates an Error with the message "This is an error" on your dashboard:
```
{
"errorMessage": "This is an error"
}
```
Use the errorType and stackTrace fields to pass additional information about the error. You must specify the errorMessage field to use the errorType and stackTrace fields. For example, for the following response, Fivetran creates an Error alert on your dashboard with the error type and stack trace details:
```
{
"errorMessage": "name 'response' is not defined",
"errorType": "NameError",
"stackTrace": [
    [
    "/var/task/lambda_function.py",
    35,
    "lambda_handler",
    "response['errorMessage'] = \"This is an error\""
    ]
]
}
```

The following sample function demonstrates how you can use custom error handling:

import traceback

import requests

def lambda_handler(request, context):
    try:
        url = "https://api.example.com/resource"
        params = {"key1": "value1", "key2": "value2"}

        # Send the values as query parameters; the timeout avoids hanging on a slow endpoint
        api_response = requests.get(url, params=params, timeout=30)
        api_response.raise_for_status()  # Raise an exception for 4xx and 5xx status codes

        # Process successful response data here
        return {"state": request.get("state", {}), "hasMore": False}

    except requests.exceptions.RequestException as e:
        # Report the actual failure to Fivetran instead of raising
        return {
            "errorMessage": str(e),
            "errorType": type(e).__name__,
            "stackTrace": traceback.format_exc().splitlines(),
        }
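
The stackTrace in the JSON example above lists one entry per frame (file, line number, function name, source text). A helper that produces that layout from a caught exception could look like the following sketch. `error_response` is a hypothetical name, and whether Fivetran requires exactly this frame shape is an assumption. The sketch only mirrors the example.

```python
import traceback


def error_response(exc):
    """Build an error payload from a caught exception."""
    frames = traceback.extract_tb(exc.__traceback__)
    return {
        # Fall back to the class name when the exception has no message
        "errorMessage": str(exc) or type(exc).__name__,
        "errorType": type(exc).__name__,
        # One [file, line, function, source text] entry per frame
        "stackTrace": [[f.filename, f.lineno, f.name, f.line] for f in frames],
    }


def lambda_handler(request, context):
    try:
        # Raises KeyError when the request carries no state
        return {"state": request["state"], "hasMore": False}
    except Exception as exc:
        return error_response(exc)
```

Because errorMessage is always populated, the errorType and stackTrace fields remain valid under the rule that they require errorMessage.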

Setup guide

Follow our step-by-step AWS Lambda setup guide to connect AWS Lambda with your destination.

Limitations

AWS Lambda limits invocation payloads to 6 MB. Because Fivetran captures the data from the Lambda invocation response, the data size is limited to 6 MB.

Fivetran recommendations

We recommend using the Sync through S3 bucket method if your data size exceeds 6 MB. In this sync method, we expect that your function pushes the data to a file in your S3 bucket. You must provide the file’s object key in the Lambda request. After Lambda completes the function execution, we fetch the data from the S3 bucket and sync it to your destination.
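
To judge during development whether a response is likely to stay under the limit, you can measure its serialized size. The sketch below is an approximation under stated assumptions: `MAX_RESPONSE_BYTES` treats 6 MB as 6 * 1024 * 1024 bytes, and the helper names are hypothetical. Check the current AWS Lambda quotas for the exact figure.

```python
import json

# Assumption: 6 MB expressed in bytes; check the current AWS Lambda quota.
MAX_RESPONSE_BYTES = 6 * 1024 * 1024


def response_size_bytes(response):
    """Size of the response once serialized as UTF-8 JSON."""
    return len(json.dumps(response).encode("utf-8"))


def fits_in_lambda_response(response, limit=MAX_RESPONSE_BYTES):
    """True when the serialized response is within the assumed limit."""
    return response_size_bytes(response) <= limit
```

The sync method is chosen when you configure the connection, so a check like this works best as a guard in tests or logs that tells you when to switch to the Sync through S3 bucket method.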

View function logs

You can access detailed logs about your function and request processing. You can use these logs to track and debug errors:

Access and analyze function logs in Amazon CloudWatch. For more information, see Using Amazon CloudWatch logs with AWS Lambda.
Use Fivetran's AWS CloudWatch log service to connect and stream Fivetran log events.
Fivetran generates and logs several types of data related to your account and destinations. You can use these logs to monitor your connections, track your usage, and audit changes. Use the Fivetran Platform Connector to deliver your logs to a schema in your destination.
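
To make function logs easier to match with connection logs, you can include the request's sync_id in each log line. The sketch below uses Python's standard logging module. The logger name `fivetran_function`, the helper `log_with_sync_id`, and the line format are assumptions chosen for illustration.

```python
import logging

logger = logging.getLogger("fivetran_function")  # hypothetical logger name
logger.setLevel(logging.INFO)


def log_with_sync_id(request, message):
    """Prefix a log line with the request's sync_id, if one was sent."""
    sync_id = request.get("sync_id", "unknown")
    line = f"[sync_id={sync_id}] {message}"
    logger.info(line)
    return line
```

Searching the function's log group for a given sync_id then returns the lines written during that sync.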

Frequently asked questions

For more information about cloud functions, see the following: