Best Practices
Need to get your connector up and running quickly?
Our team of Professional Services experts is available to provide free advisory services to help you build your first Connector SDK connection. This includes guidance on setup, troubleshooting, and best practices. To get started, simply file a support ticket.
Learn best practices to leverage the Connector SDK:
- Declare primary keys
- Declare columns and data types
- Use a revision control system
- Consider centralizing deployment to a code deployment system
- Upgrade your Fivetran Connector SDK
- Use fivetran debug
- Create your git CI/CD pipeline
Declare primary keys
To optimize the uploading of your data to your destination, we recommend that you use the Schema() method to declare the primary key for each of your tables.
The use of the Schema() method is optional. Fivetran automatically syncs any table, whether or not you have declared it in the Schema() method. For tables not declared, Fivetran creates a surrogate primary key column named _fivetran_id. See our System Columns documentation for more details.
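As a minimal sketch of declaring primary keys, the function below assumes the schema is expressed as a Python function that receives the connector configuration and returns a list of table dictionaries with `table` and `primary_key` entries. The table and column names (`users`, `order_items`, and so on) are hypothetical, and how the function is registered with the connector is not shown.

```python
def schema(configuration: dict) -> list:
    """Declare each table's primary key; column types are left to inference."""
    return [
        # Single-column key: one column identifies a row.
        {"table": "users", "primary_key": ["id"]},
        # Composite key: both columns together identify a row.
        {"table": "order_items", "primary_key": ["order_id", "line_number"]},
    ]
```

Any table that is synced but not listed here would still be delivered, with the surrogate _fivetran_id key described above.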
Declare columns and data types
We do not recommend using the Schema() method to declare all data columns and their data types. The Connector SDK infers the type of data in any column you send to us using Fivetran's standard data type inference. Declaring data types is useful if you have a particular data type you want to ensure we use.
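The sketch below illustrates pinning a single column's type while leaving the rest to inference. It assumes a table dictionary may carry a `columns` mapping of column name to type name and that `UTC_DATETIME` is an accepted type name; the `invoices` table and its columns are hypothetical.

```python
def schema(configuration: dict) -> list:
    """Pin the type of one column; all other columns are inferred."""
    return [
        {
            "table": "invoices",
            "primary_key": ["invoice_id"],
            # Only the column whose type must be guaranteed is declared.
            "columns": {"issued_at": "UTC_DATETIME"},
        }
    ]
```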
Use fivetran debug
Use fivetran debug to test and troubleshoot your connector's behavior with your source's actual data. The tester works by emulating Fivetran's core. However, it runs on your local machine, so it will not perform as well as your connector does in production. We recommend running fivetran debug on multiple small samples of your data to fully test your connector's logic. Deploy to production once you are ready to test with larger data sets. State management and configuration can be used to control exactly what is tested with each run.
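One way to keep debug runs small is to cap the number of records through a configuration value. The helper below is a sketch under assumptions: `limit_for_debug` and the `sample_limit` key are hypothetical names, not part of the SDK, and configuration values are treated as strings.

```python
from itertools import islice
from typing import Iterable, Iterator


def limit_for_debug(records: Iterable[dict], configuration: dict) -> Iterator[dict]:
    """Yield at most `sample_limit` records; yield everything if the key is absent."""
    # "sample_limit" is a hypothetical configuration key, read as a string.
    limit = int(configuration.get("sample_limit", "0"))
    # A limit of 0 (or a missing key) means no cap, as in production.
    return islice(records, limit) if limit > 0 else iter(records)
```

Setting the value for local runs and omitting it in production leaves the connector's logic otherwise unchanged between the two.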
Large data set recommendation
For high-volume data sets that take a long time to complete a sync, you must follow these checkpointing best practices:
- Checkpoint regularly (for example, every 10 minutes). Checkpointing saves the state, which records the sync progress. If a sync fails with an error and no checkpoint was made, the next sync starts from the previous sync's starting point. With regular checkpointing, the next sync starts from the last checkpoint of the failed sync, which saves time.
- Avoid calling checkpoints too frequently during long syncs, although frequent checkpoints may be acceptable for shorter syncs. Excessive checkpointing can cause performance issues, so it’s best to checkpoint approximately once every 10 minutes.
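The checkpointing guidance above can be sketched as a time-based loop. This is an illustration, not the SDK's implementation: `sync_with_checkpoints`, the `cursor` state key, and the `save_checkpoint` callback are hypothetical names, with the callback standing in for whatever checkpoint call your connector uses. The clock is injectable so the interval logic can be tested without waiting.

```python
import time
from typing import Callable, Iterable

CHECKPOINT_INTERVAL_SECONDS = 10 * 60  # roughly once every 10 minutes


def sync_with_checkpoints(
    rows: Iterable[dict],
    state: dict,
    save_checkpoint: Callable[[dict], None],
    clock: Callable[[], float] = time.monotonic,
    interval: float = CHECKPOINT_INTERVAL_SECONDS,
) -> dict:
    """Process rows, saving state only when `interval` seconds have elapsed."""
    last_checkpoint = clock()
    for row in rows:
        # ... send the row to the destination here ...
        state["cursor"] = row["id"]  # remember how far the sync has progressed
        now = clock()
        if now - last_checkpoint >= interval:
            save_checkpoint(dict(state))  # pass a copy so later updates don't alter it
            last_checkpoint = now
    # Final checkpoint so a completed sync records its end position.
    save_checkpoint(dict(state))
    return state
```

Checking elapsed time rather than a row count keeps the checkpoint rate steady regardless of how fast rows arrive, which avoids the excessive checkpointing the second bullet warns about.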
Use a revision control system
Any time you are writing code, unexpected events can happen. We strongly recommend using a revision control system such as Git, hosted on a service like GitHub or Bitbucket, to keep all your code changes tracked and secure. This is also hugely beneficial when collaborating on a custom connector with other members of your team.
Consider centralizing deployment to a code deployment system
Deploying code from individual machines is generally not considered good practice. The Connector SDK is a command line-based tool designed to work smoothly with standard automated code integration and code deployment tools.
Upgrade your Fivetran Connector SDK
We recommend using the latest version of the Fivetran Connector SDK, as it contains all the latest definitions and updates. You can easily upgrade it via pip by running the following command:
pip install --upgrade fivetran_connector_sdk