Data Blocking and Column Hashing
When you don't want sensitive data such as personally identifiable information (PII) to be synced to your destination, you can use either of the following methods:
- Data Blocking - exclude the tables and columns containing sensitive data from your syncs.
- Column Hashing - hash the values of the columns that store sensitive data. Hashing anonymizes data by replacing each actual value with a hash value. With this method, the columns are still synced to your destination, but they store hashed values instead of the original ones. This lets you join datasets using the hashed columns as keys.
The following sections describe the Data Blocking and Column Hashing features in detail, list their specifics and limitations, and provide tips and answers to frequently asked questions.
Data blocking
You can block specific tables and columns from replicating to your destination. Data Blocking lets you avoid sending personally identifiable information (PII) to your destination. This may be helpful as a part of your strategy for GDPR compliance.
NOTE: You cannot block primary key columns.
NOTE: Column blocking does not reduce monthly active rows (MAR). Fivetran still captures and writes a record to the destination even if the only change is in a blocked column.
All Fivetran connectors, except Magic Folder connectors, support data blocking.
Does data blocking prevent data from being stored on Fivetran's servers?
We retain your data for as little time as possible. After you uncheck a table or a column on the Schema tab of the connector details page, the data in that table or column still passes through our systems and may be stored temporarily during a sync. See our Retention of Customer Data documentation for more details.
Learn how to configure Data Blocking and Column Hashing in our Configure Data Blocking and Column Hashing Guide.
Column hashing
Column hashing is a method for anonymizing data in your destination while preserving its analytical value. You can join across datasets without introducing sensitive data to your destination. Because column hashing lets you anonymize personally identifiable information (PII) before it is stored in your destination, it may help with GDPR compliance.
NOTE: Hashing is a one-way operation, unlike encryption, where data can be encrypted and then decrypted. Once data is hashed, it cannot be restored to its original value.
All Fivetran connectors, except Magic Folder connectors, support column hashing.
When you select Hashed, the next time your data syncs, Fivetran ingests your data, hashes it, and then writes the hashed data to your destination. To add an extra layer of security, Fivetran uses a unique salt per destination to ensure that the data cannot be decoded based on knowledge of the default Fivetran algorithm.
The salt is per destination so that all identical fields, such as email addresses, are still joinable between all data Fivetran loads into that destination.
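To illustrate why a shared salt keeps hashed columns joinable, here is a minimal Python sketch. It assumes the procedure described under Calculate your own hashes (salt appended as a suffix, SHA-256, Base64 output). The salt values, table contents, and the helper names hash_with_salt and hash_emails are hypothetical and used only for illustration.

```python
import base64
import hashlib


def hash_with_salt(value: str, salt: str) -> str:
    """Hash value + salt with SHA-256 and return the digest as Base64 text."""
    digest = hashlib.sha256((value + salt).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def hash_emails(rows, salt):
    """Return copies of the rows with the 'email' column replaced by its hash."""
    return [{**row, "email": hash_with_salt(row["email"], salt)} for row in rows]


# Hypothetical salts; a real salt is generated per destination.
DESTINATION_SALT = "example_salt_a"
OTHER_SALT = "example_salt_b"

# Two hypothetical source tables that share an email column.
crm_rows = [{"email": "ann@example.com", "plan": "pro"}]
billing_rows = [{"email": "ann@example.com", "amount": 42}]

# Both tables land in the same destination, so they share one salt.
crm = hash_emails(crm_rows, DESTINATION_SALT)
billing = hash_emails(billing_rows, DESTINATION_SALT)

# Join on the hashed email: identical inputs produce identical hashes.
billing_by_email = {row["email"]: row for row in billing}
joined = [
    {**row, **billing_by_email[row["email"]]}
    for row in crm
    if row["email"] in billing_by_email
]

# With a different salt, the same email hashes to a different value.
same_email_other_salt = hash_emails(crm_rows, OTHER_SALT)[0]["email"]
print(len(joined), same_email_other_salt != crm[0]["email"])
```

The join succeeds because both tables were hashed with the same salt. Hashes produced with a different salt do not match, so they cannot be used as join keys against this data.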
NOTE: Once you choose to hash a column, we apply column hashing to all future syncs. If you also wish to hash the historical data in that column, you must re-sync.
If you load data into the same destination with your own process and want to match our hashing, a user with the Account Administrator role in your Fivetran account can view the column hashing salt on the Data Security Settings tab of that destination. The hashing algorithm is SHA-256.
Calculate your own hashes
To generate the same hash value that we use for columns, make sure you have the Account Administrator role, and then perform the following steps:
1. Add the salt value as a suffix to the original value of the column. For example, if the value of your column is foo_bar and the salt value is i_am_the_secret_salt, the concatenated value is foo_bari_am_the_secret_salt.
   TIP: If the original column value is null, replace the value with an empty string.
2. Convert the value (from Step 1) to byte array (byte[]) format using UTF-8 encoding.
   TIP: In Java, you can convert using the value.toString().getBytes("UTF-8") command.
3. Use the byte array (from Step 2) as an input to the SHA-256 hashing algorithm.
4. Convert the output byte array to a Base64 encoded string.
This encoded string is the final hash value.
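The steps above can be sketched in Python as follows. This is an illustrative sketch, not Fivetran's implementation. The function name salted_hash is hypothetical, and two details are assumptions not confirmed by this page: non-string values are converted with str(), which may not match Java's toString() for every data type, and the output uses the standard Base64 alphabet with padding.

```python
import base64
import hashlib
from typing import Optional


def salted_hash(value: Optional[object], salt: str) -> str:
    """Return the Base64-encoded SHA-256 digest of the value with the salt appended."""
    # A null value is replaced with an empty string (see the TIP in Step 1).
    # Assumption: str() is an adequate stand-in for Java's toString().
    text = "" if value is None else str(value)

    # Step 1: add the salt as a suffix to the original value.
    salted = text + salt

    # Step 2: convert the salted value to bytes using UTF-8 encoding.
    data = salted.encode("utf-8")

    # Step 3: hash the bytes with SHA-256.
    digest = hashlib.sha256(data).digest()

    # Step 4: convert the digest to a Base64 string.
    # Assumption: standard alphabet with "=" padding.
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Example values taken from Step 1; the salt is a placeholder.
    print(salted_hash("foo_bar", "i_am_the_secret_salt"))
```

To check that your process matches, hash a known value with your destination's salt and compare the result with the value stored in the hashed column.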
Learn how to configure Data Blocking and Column Hashing and schema change settings in our Configure Data Blocking and Column Hashing Guide.