Set Up Google Cloud Storage Data Lake
This tutorial explains how to configure a Google Cloud Storage (GCS) data lake as your destination using the Fivetran Managed Data Lake Service. It walks through the required Google Cloud setup, the Fivetran destination configuration, and the validation of data ingestion.
What you will learn
This tutorial covers the following steps:
Configure Google Cloud Storage
- Create a Cloud Storage bucket for your data lake
- Select the appropriate storage class and region
- Define the folder structure using prefixes
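As a rough illustration of the bucket and prefix choices above, the sketch below checks a candidate bucket name against a basic subset of the GCS naming rules and builds object paths under a prefix. The helper names and the example bucket and prefix values are hypothetical; the rules cover only dot-free names, so treat it as a pre-check rather than a full validator.

```python
import re

# Basic GCS bucket-name rules for dot-free names: 3-63 characters, lowercase
# letters, digits, dashes and underscores, starting and ending with a letter
# or digit. Names with dots have extra rules that this sketch does not cover.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes the basic checks described above."""
    # Names may not start with "goog" or contain "google".
    if name.startswith("goog") or "google" in name:
        return False
    return _BUCKET_RE.fullmatch(name) is not None


def normalize_prefix(prefix: str) -> str:
    """Reduce a folder-style prefix to 'a/b/c' form, without outer slashes."""
    return "/".join(part for part in prefix.strip().split("/") if part)


def object_path(bucket: str, prefix: str, *segments: str) -> str:
    """Build a gs:// URI under the prefix.

    GCS has no real folders; a "folder" is only a shared object-name prefix.
    """
    pieces = [bucket, normalize_prefix(prefix), *segments]
    return "gs://" + "/".join(piece for piece in pieces if piece)


if __name__ == "__main__":
    # Hypothetical bucket and prefix, for illustration only.
    print(is_valid_bucket_name("acme-data-lake"))
    print(object_path("acme-data-lake", "/fivetran/raw/", "sales", "orders"))
```

Normalizing the prefix once, before entering it anywhere else, helps keep the path you type into the destination form identical to the one you later inspect in the bucket.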
Configure authentication and access
- Create a service account in your Google Cloud project
- Grant the required permissions to access the bucket
- Generate and securely store service account credentials
- Apply least-privilege access principles
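To make the least-privilege idea concrete, here is a minimal sketch that builds a bucket-level IAM binding for the service account and flags broad roles in an existing policy. The binding shape (`role`, `members`) follows the standard IAM policy format, but the choice of `roles/storage.objectAdmin` is an assumption, not a confirmed requirement; check the role that the Fivetran setup guide actually asks for. The function names and the example email are hypothetical.

```python
from typing import Dict, List

# Assumption: object-level access on a single bucket is sufficient.
# Confirm the exact role against the destination setup guide.
ASSUMED_ROLE = "roles/storage.objectAdmin"

# Roles that grant far more than object access and are worth flagging.
BROAD_ROLES = {"roles/owner", "roles/editor", "roles/storage.admin"}


def bucket_binding(service_account_email: str, role: str = ASSUMED_ROLE) -> Dict:
    """Return one IAM binding granting `role` to the service account."""
    return {"role": role, "members": [f"serviceAccount:{service_account_email}"]}


def overly_broad_roles(policy: Dict, service_account_email: str) -> List[str]:
    """List the broad roles that a policy grants to the service account."""
    member = f"serviceAccount:{service_account_email}"
    return sorted(
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", []) and binding["role"] in BROAD_ROLES
    )


if __name__ == "__main__":
    # Hypothetical service account email, for illustration only.
    email = "lake-writer@my-project.iam.gserviceaccount.com"
    policy = {"bindings": [bucket_binding(email)]}
    print(overly_broad_roles(policy, email))
```

Attaching the binding to the bucket's policy rather than the project's keeps the service account from reaching other buckets in the same project.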
Configure the destination in Fivetran
- Create a new Managed Data Lake Service destination
- Select Google Cloud Storage
- Provide the project ID and bucket details
- Upload or provide service account credentials
- Specify the prefix path
- Select the appropriate cloud region
- Test the connection
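Before providing credentials and testing the connection, it can help to sanity-check the service account key file locally. The sketch below assumes the key is a standard JSON service account key and verifies that the usual fields are present and that its project matches the project ID entered for the destination. The function name is hypothetical, and the check does not contact Google Cloud or Fivetran.

```python
import json
from typing import List

# Fields found in a standard JSON service account key file.
REQUIRED_KEY_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
)


def check_service_account_key(raw_json: str, expected_project_id: str) -> List[str]:
    """Return a list of problems found in the key; an empty list means it passed."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}" for name in REQUIRED_KEY_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append("type is not 'service_account'")
    if key.get("project_id") and key["project_id"] != expected_project_id:
        problems.append("project_id does not match the destination project")
    return problems
```

A call such as `check_service_account_key(open("key.json").read(), "my-project")` returns an empty list when the file looks structurally sound; the file name and project ID here are placeholders.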
Ingest and validate data
- Add a connector to start data ingestion
- Verify that Fivetran writes data to the expected bucket and path
- Confirm schema and file structure
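The verification steps above can be sketched as a small check over a list of object names from the bucket. This example assumes a `<prefix>/<schema>/<table>/...` layout, which is an illustrative guess rather than a documented guarantee; adjust it to the layout you actually observe. The function name and sample object names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def summarize_objects(object_names: Iterable[str], prefix: str) -> Tuple[Dict[str, int], List[str]]:
    """Count objects per '<schema>/<table>' under the prefix.

    Returns (counts, unexpected), where `unexpected` holds names that fall
    outside the prefix or sit too shallow to belong to a table.
    """
    root = prefix.strip("/")
    root = root + "/" if root else ""
    counts: Dict[str, int] = defaultdict(int)
    unexpected: List[str] = []
    for name in object_names:
        if not name.startswith(root):
            unexpected.append(name)
            continue
        parts = name[len(root):].split("/")
        if len(parts) >= 3:  # schema / table / at least one file component
            counts[f"{parts[0]}/{parts[1]}"] += 1
        else:
            unexpected.append(name)
    return dict(counts), unexpected


if __name__ == "__main__":
    # Hypothetical listing. With the google-cloud-storage package installed,
    # names could instead come from:
    #   [b.name for b in storage.Client().list_blobs(bucket_name, prefix=prefix)]
    names = [
        "fivetran/raw/sales/orders/data/part-0001.parquet",
        "fivetran/raw/sales/orders/metadata/v1.metadata.json",
        "other/file.txt",
    ]
    print(summarize_objects(names, "fivetran/raw"))
```

Iceberg tables conventionally keep `data/` and `metadata/` directories and Delta Lake tables keep a `_delta_log/` directory, so seeing those under each table path is a reasonable sign that the writes landed where expected.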
Understand storage formats and catalog integration
- Use the Fivetran Iceberg REST catalog with supported query engines
- Understand how data is written in Delta Lake format
- Enable downstream analytics workflows
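For the catalog step, a query client typically needs a small set of connection properties. The sketch below assembles properties in the form that the PyIceberg REST catalog accepts (`type`, `uri`, `credential`, `warehouse`). The endpoint URL, client ID, secret, and warehouse name are all placeholders, and the helper name is hypothetical; use the values shown for your destination in Fivetran.

```python
from typing import Dict
from urllib.parse import urlparse


def rest_catalog_properties(uri: str, client_id: str, client_secret: str, warehouse: str) -> Dict[str, str]:
    """Build connection properties for an Iceberg REST catalog client."""
    parsed = urlparse(uri)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("catalog URI must be an https:// URL")
    return {
        "type": "rest",
        "uri": uri,
        # PyIceberg expects OAuth client credentials as "id:secret".
        "credential": f"{client_id}:{client_secret}",
        "warehouse": warehouse,
    }


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    props = rest_catalog_properties(
        "https://catalog.example.com/api", "my-client-id", "my-secret", "my_lake"
    )
    print(sorted(props))
    # With the pyiceberg package installed, the properties could be used as:
    #   from pyiceberg.catalog import load_catalog
    #   catalog = load_catalog("fivetran", **props)
    #   print(catalog.list_namespaces())
```

Keeping the secret out of source control, for example by reading it from an environment variable, applies here just as it does to the service account key.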
Summary
After completing this tutorial, you will have a working Google Cloud Storage data lake integrated with Fivetran and be able to query your data using supported analytics tools.