How Fivetran Deals With Data Types
At Fivetran, our goal is to help you centralize your data with as much fidelity as possible so that you can use it to solve pressing business problems. Although storage costs in the age of the cloud are low, they still matter at scale, and data types still influence performance. To that end, Fivetran compares every field we integrate against a hierarchy of data types and assigns the smallest appropriate type.
In our hierarchy, JSON and TEXT sit at the top as the largest and most encompassing data types, with theoretical size limits in the gigabytes. Each data type below them has a smaller maximum footprint than its predecessor. The exact limits vary by the data warehouse you're using, but the data types themselves are universal across all major data warehouses.
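The narrowing idea can be sketched in a few lines. This is an illustrative assumption, not Fivetran's actual implementation: the hierarchy order, type names, and infer_type helper below are made up for the example.

```python
import json

# Predicates for whether a raw string value fits a given type.
def is_integer(v):
    try:
        int(v)
        return True
    except (ValueError, TypeError):
        return False

def is_decimal(v):
    try:
        float(v)
        return True
    except (ValueError, TypeError):
        return False

def is_json(v):
    try:
        json.loads(v)
        return True
    except (ValueError, TypeError):
        return False

# Hypothetical hierarchy, narrowest type first; TEXT is the catch-all.
HIERARCHY = [
    ("INTEGER", is_integer),
    ("DECIMAL", is_decimal),
    ("JSON", is_json),
    ("TEXT", lambda v: True),
]

def infer_type(values):
    """Return the narrowest type that can hold every value in the field."""
    for name, fits in HIERARCHY:
        if all(fits(v) for v in values):
            return name
    return "TEXT"
```

For example, infer_type(["1", "2", "3"]) narrows the field to INTEGER, while a single value like "2.5" in the list would push it up to DECIMAL.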
Suppose you wanted to store the column number_of_employees from the Salesforce Accounts table. Since people can only be counted in whole units, there is no reason to store the values as a 16-byte DECIMAL instead of an 8-byte INTEGER. Fivetran detects that the field consists entirely of integers and assigns the data type accordingly. Over one field and a few hundred or thousand rows, the difference between 16 and 8 bytes per record won't mean much in storage or memory use. Over multiple fields and millions of rows, though, those bytes add up.
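The scale effect is simple arithmetic. Here is a back-of-the-envelope sketch; the field and row counts are hypothetical, chosen only to show how quickly the gap grows:

```python
# Hypothetical table: 50 numeric fields, 10 million rows.
fields, rows = 50, 10_000_000
decimal_bytes, integer_bytes = 16, 8

# Bytes saved by storing 8-byte INTEGERs instead of 16-byte DECIMALs.
savings = fields * rows * (decimal_bytes - integer_bytes)
print(f"{savings / 1e9:.1f} GB saved")  # 4.0 GB saved
```

One field in one small table is noise; the same 8-byte difference across a whole warehouse is not.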
Automatically assigning each field to its appropriate place in the hierarchy is part of our strategy for making data integration as seamless and reliable as possible. Designing databases and tuning data types for performance isn't easy, and we don't think you should dirty your hands with gratuitously low-level details like specifying VARCHAR(10), or suffer slowdowns from unsuitable data types. We faithfully replicate your data from its source to your data warehouse so that you can use it immediately.
If you want to see for yourself the impact that this hierarchy can have on your organization, let us know!