Data integration is a buyer’s market with plenty of choices. So many that it can be challenging to identify and select the right data transformation methodology for your business. In this blog, we will discuss the differences between SQL-coded transformations and tools using a graphical user interface (GUI).
Challenges facing the modern business
Businesses that have yet to adopt a data transformation solution or have a legacy method in place are faced with the following challenges:
- Reduced data team impact: Data teams struggle to keep up with the high volume of requests to add new sources, update existing models, and create new reports. This means teams are spending too much time building and maintaining, instead of focusing on high-value activities like data analytics and exploration.
- Extended time to insights: Access and technical barriers can slow down the integration process. Additionally, we hear from centralized data teams that modeling source data they are unfamiliar with takes a long time.
- Data is outdated: In data analytics platforms where data is handled by disparate systems, latency issues can arise. Data becomes stale and unusable, leading to eroded trust by data consumers.
- Unreliable data: A lack of transparency, visibility and collaboration amongst team members results in duplicated data efforts, causing conflicting results and inaccurate reporting.
Are SQL or GUI transformations better at addressing these challenges and integrating into the modern data stack?
Graphical user interfaces (GUIs)
Instead of writing code, GUI tools offer users pre-designed components to drag and drop onto a canvas. The component parameters then need to be set by the user to configure the component.
Pros
- Easy to access: Low or no code is (seemingly) simple to use.
- Wider visibility: Since jobs are built out visually they are more easily understood by the wider business
Cons
- Steep learning curve: While code-free is perceived to be easier and faster, there is a steep learning curve to learn the new platform. These skills are also less transferable.
- SQL layer is obscured: The components generate SQL in the background, but this is obscured, so it can be difficult to understand what is going on under the hood.
- Vendor lock-in: GUIs are proprietary to each vendor, resulting in vendor lock-in as these jobs are not transferable from one application to another. If a user wants to switch their tool or method, they can expect to rebuild jobs from scratch.
- Limited capabilities: Users can only manipulate data in ways prescribed by the platform’s components. Without free scripting components, this limits what users can do with their data.
SQL scripting
Users can hand-code transformations using SQL, the universal language in databases and data analytics. Users can run SQL from a CLI connected to their cloud data warehouse or from directly within the warehouse.
Pros
- Universal language: Since SQL is so widely used, it is easy to find data professionals experienced with SQL transformations.
- Transferable: Not only are the skills transferable, but the jobs can be ported between platforms (though possibly with some modest translations between dialects). This helps to prevent vendor lock-in.
- Flexible: Unlike the prescripted transformations for components, as long as it is possible in SQL, data analysts and engineers can transform the data to solve for unique business logic.
Cons
- Complexity: As data models grow in complexity and interdependencies, it can be difficult to track hundreds of lines of code.
- Black box: Transformations that are coded in SQL are not easily understood by non-technical data consumers. Because of this, data reliability questions can arise.
- Slow to build jobs: Hand coding has been criticized for being slow to develop.
Modernizing SQL
Many of the cons associated with SQL reflect how SQL has been used in legacy ELT. Luckily, new tooling has been developed and released to alleviate these and address the current data analytics challenges being faced by data teams.
dbt Core, by dbt labs, is one of these modern data transformation providers. More than just a SQL tool, dbt Core comes with a set of advanced features to modernized SQL. These include industry and engineering best practices such as testing, version control and data lineage. Documentation includes directed acyclic graphs (DAGs) to help visualize SQL model dependencies to build trust in the data.
Because of these advancements, we find our customers favor a SQL-based approach to transformations using dbt Core. To support our customers, Fivetran uses dbt Core to power transformations for customers for an automated and reliable solution. We have also built Fivetran data models with prebuilt SQL logic to reduce the technical barrier to entry, and help data teams accelerate the time to insights.
Request a demo to learn more about how Fivetran Transformations for dbt Core* can help your data team modernize your data transformations.
*dbt Core is a trademark of dbt Labs, Inc. All rights therein are reserved to dbt Labs, Inc. Fivetran Transformations is not a product or service of or endorsed by dbt Labs, Inc.