Here’s how to choose the right relational database for your next project.
MySQL and PostgreSQL are the top two database sources that Fivetran replicates to data warehouses for data analytics. That’s not surprising — they’re also the top two open source relational databases, serving as the back ends of countless commercial, open source, and in-house applications.
Postgres (as it’s often known) and MySQL have both been around for a long time. They’re both secure RDBMSes with support for clustering and network fault tolerance. But for all they have in common, PostgreSQL and MySQL have a number of characteristics that set them apart from each other, as we’ll see in a moment.
But first, a bit of history.
In the beginning was the word, and the word was Ingres. Ingres was one of the first relational database management systems, starting out as a project at UC Berkeley in 1973. It was released under a Berkeley Source Distribution (BSD) license, which made it open source software, and it became the base for many commercial database products. The story of Ingres and all the databases that grew out of it is fascinating, but mostly because it lent (part of) its name to another open source database.
One of the original developers of Ingres returned to Berkeley in 1985 (after founding a company that commercialized Ingres) to develop a successor to Ingres that he named Postgres. The name was officially changed to PostgreSQL to take advantage of the reference to Structured Query Language, but the project uses both names. The first production release, PostgreSQL 6.0, came out in 1997. Now at version 14 (beta), Postgres is developed by an “unincorporated association of volunteers and companies who share code under the PostgreSQL Licence,” according to a project FAQ.
PostgreSQL was not the only option for an open source database back in the 20th century. The ‘90s were a golden age for relational databases. In Sweden, Michael “Monty” Widenius and associates began developing their own RDBMS, dubbed MySQL, in 1994. MySQL is now at version 8.0, and released under both the GNU Public License version 2 (GPLv2) and proprietary licenses.
Unlike PostgreSQL, MySQL has always been under corporate control. Original developer MySQL AB was acquired by Sun Microsystems in 2008, shortly before Sun was itself acquired by Oracle in 2010. The day that the Oracle acquisition was announced, Widenius forked MySQL and founded MariaDB Corp. to develop an RDBMS that wasn’t under the control of a corporation synonymous with a commercial database competitor. (Fun fact: MySQL was named after Widenius’s daughter My. MariaDB is named after his other daughter.)
Neither origin story is likely to become a blockbuster film anytime soon, but either database could be a superhero for your organization. Which is the better choice? The answer depends on your application-specific needs from back-end data infrastructure. Let’s compare the two most popular databases — MySQL and PostgreSQL — across a range of different application needs, first in this table and then in more depth in the following text.
network address, JSON,
XML, HSTORE, arrays,
character, spatial, JSON
|B-tree, hash, GiST, SP-GiST,|
GIN, and BRIN
|Primarily B-tree; R-tree, hash,|
and inverted indexes for
certain data types
|Performance||Suited for applications with|
high volume of both reads
|Suitable for applications with|
high volume of reads
|Security||Access control, multiple|
encrypted connection options
|Access control, encrypted|
Companies that have their
own release of PostgreSQL
may offer support around it.
|Community support, plus|
PostgreSQL is an object-relational database, while MySQL is purely relational. This means PostgreSQL offers more complex data types and allows objects to inherit properties, but it also makes working with PostgreSQL more complex.
PostgreSQL has a single, ACID-compliant storage engine. MySQL has support for 16 different storage engines suitable for different use cases. The default storage engine, InnoDB, provides index-organized tables.
PostgreSQL spawns a new system process with its own memory allocation for each client connection it establishes, so it requires a lot of memory on systems with a high number of client connections. MySQL uses a single process and maintains one thread (or path of execution) per connection, which works well for most applications of less than enterprise scope.
Three common database features are views, triggers, and stored procedures. PostgreSQL has more robust views, and supports materialized views, which can improve performance for complex queries.
Both databases support AFTER and BEFORE triggers for SQL INSERT, UPDATE, and DELETE statements; PostgreSQL also offers an INSTEAD OF trigger, and can execute complex SQL statements in a trigger using functions. Both databases support standard SQL stored procedures, but PostgreSQL offers in addition the ability to call procedures written in languages other than SQL.
Postgres offers a wider variety of data types than MySQL. If your application deals with any of the unique data types it has available, or unstructured data, PostgreSQL may be a better pick. If you’re using only basic character and numeric data types, both databases will suit you.
Databases use indexes to speed up queries. With multiple indexing options to pick from, you can fine-tune your database performance as your data grows to get faster query responses from your database and an improved user experience for your application users.
Both databases support user and group management and granting SQL privileges to roles. PostgreSQL supports IP-based client filtering and authentication using PAM and Kerberos, while MySQL supports PAM, native windows services, and LDAP for user authentication. In terms of security, the two databases have comparable options.
You can judge for yourself each vendor’s available documentation, since you can find each online. Postgres, because it’s not developed by a company, lacks support engineers, but its community support forums are unusually good. MySQL offers similar forums, as well as paid support plans you can buy from Oracle.
If you’re developing an application with a database back end, which of the two should you use? Consider PostgreSQL for any application that might grow to enterprise scope, with complex queries and frequent write operations. If you’re new to the world of databases and don’t expect your application to scale up, or you’re looking for a quick tool for prototyping, then consider MySQL.
That’s a good rule of thumb, but there are other considerations. Your cloud platform provider might offer benefits when it comes to running one database over the other, or the application framework you use might be better suited for one, or your fellow developers may have opinions. MySQL is more widely used than PostgreSQL, which means more developers and DBAs are familiar with it, and more third-party tools are available for it.
And of course Postgres and MySQL aren’t your only two database choices, or even your only two open source database choices.
Where do PostgreSQL and MySQL play in the data analytics world? Both are operational databases, meant for transaction processing and not for analytics.
To get a complete 360-degree view of your business, you should extract transactional data from your PostgreSQL or MySQL database and feed it into a data warehouse, where it can be used to power your data analytics and business intelligence (BI) tools like Tableau and Looker. This is where Fivetran can help — our automated, maintenance-free pipelines allow you to extract data from not only databases but also applications, files, and events, then securely load it into your data warehouse.
Extracting, synchronizing, and integrating data across an enterprise is a complex problem that requires serious effort to get right. Join us for a demo of Fivetran and learn how to bring your different data sets into one warehouse, speeding up your ability to unlock actionable insights and increase top-line revenue.