Which is the best open source database: MySQL or PostgreSQL?
Which is the best open source database: MySQL or PostgreSQL?
Which is the best open source database: MySQL or PostgreSQL?
If you plan to choose a free, open source database for your project, then you may be hesitating between MySQL and PostgreSQL. Both MySQL and PostgreSQL are free, open source, powerful, and feature-rich databases.
Your main question may be: Which is the best open source database, MySQL or PostgreSQL? Which open source database should I choose?
When choosing a database, you are making a long-term decision, because it will be very difficult and expensive to change the decision later.
You want to get it right in the first place. Two popular open source databases, MySQL and PostgreSQL, are often the products of last resort.
A high-level overview of these two open source databases will help you choose the one that best suits your needs.
MySQL
MySQL is relatively young, first appearing in 1994. It claims to be the most popular open source database.
MySQL is the M in LAMP (a software package for web development, including Linux, Apache, and Perl/PHP/Python).
MySQL is used by most applications built on top of the LAMP stack, including well-known applications such as WordPress, Drupal, Zend, and phpBB.
From the beginning, MySQL was designed to be a fast web server backend, using the fast Indexed Sequence Access Method (ISAM) and not supporting ACID.
After early rapid development, MySQL began to support more storage engines, and implemented ACID through the InnoDB engine.
MySQL also supports other storage engines, provides the function of temporary tables (using the MEMORY storage engine), and implements high-speed read databases through the MyISAM engine.
In addition, there are other core storage engines and third-party engines.
MySQL documentation is very rich, there are many free reference manuals, books and online documentation of good quality, as well as training and support from Oracle and third-party vendors.
MySQL has gone through ownership changes and some drama in recent years. It was originally developed by MySQL AB, then sold to Sun for $1 billion in 2008, and Sun was acquired by Oracle in 2010.
Oracle supports multiple versions of MySQL: Standard, Enterprise, Classic, Cluster, Embedded, and Community.
Some of them are free to download, while others cost a fee.
Its core code is based on the GPL license, and commercial licenses are available for developers and manufacturers who do not want to use the GPL license.
Now, there are even more databases to choose from based on the original MySQL code, as several core MySQL developers have released forks of MySQL.
Michael “Monty” Widenius, one of the original MySQL creators, seemed to regret selling MySQL to Sun, so he developed his own MySQL fork, MariaDB, which is free and licensed under the GPL.
Drizzle, a branch created by the well-known MySQL developer Brian Aker, has rewritten it a lot, especially optimized for multi-CPU, cloud, network applications and high concurrency.
PostgreSQL
PostgreSQL bills itself as the most advanced open source database in the world. Some fans of PostgreSQL say it is comparable to Oracle, but without the expensive price and arrogant customer service.
It has a long history, originally developed at the University of California, Berkeley in 1985 as a successor to the Ingres database.
PostgreSQL is a completely community-driven open source project maintained by more than 1,000 contributors around the world.
It provides a single full-featured edition, unlike MySQL, which provides multiple different community editions, commercial editions, and enterprise editions.
PostgreSQL is based on the free BSD/MIT license, and organizations can use, copy, modify, and redistribute the code, as long as they provide a copyright notice.
Reliability is PostgreSQL’s highest priority. Known for its rock-solid quality and good engineering, it supports high-transaction, mission-critical applications.
PostgreSQL is very well documented, with a large free online manual and an archived reference manual for older versions.
PostgreSQL’s community support is fantastic, as well as commercial support from independent vendors.
Data consistency and integrity are also high priority features of PostgreSQL. PostgreSQL fully supports ACID features.
It provides strong security guarantees for database access and makes full use of enterprise security tools, such as Kerberos and OpenSSL.
You can define your own checks to ensure data quality according to your own business rules.
Among the many management features, point-in-time recovery (PITR) is a great feature.
This is a flexible high-availability feature that provides capabilities such as creating hot backups for failure recovery and snapshotting and recovery. But this is not all about PostgreSQL.
The project also provides several methods to manage PostgreSQL to achieve high availability, load balancing and replication, etc., so that you can use functions that suit your specific needs.
Platform
Both MySQL and PostgreSQL are present on some high-traffic Web sites:
- MySQL: Slashdot, Twitter, Facebook, and Wikipedia
- PostgreSQL: Yahoo uses a modified PostgreSQL database to handle hundreds of millions of events per day, along with Reddit and Disqus
Both MySQL and PostgreSQL can run on multiple operating systems, such as Linux, Unix, Mac OS X and Windows.
They are all open source and free, so the only cost when testing them is your time and hardware.
They are flexible and scalable, and can be used on both small systems and large distributed systems.
One area where MySQL goes further than PostgreSQL is that its tentacles extend to the embedded area, which is achieved through libmysqld.
PostgreSQL does not support embedded applications and still sticks to the traditional client/server architecture.
MySQL is generally regarded as a fast database backend for websites and applications, capable of fast reads and high-volume query operations, but it is not satisfactory in terms of complex features and data integrity checks.
PostgreSQL is a no-nonsense, full-featured database for transactional enterprise applications, supporting strong ACID features and many data integrity checks.
Both of them are very fast on certain tasks, and the behavior of different MySQL storage engines is quite different.
The MyISAM engine is the fastest because it only performs few data integrity checks, which is suitable for sites with more back-end read operations, but it is a disaster for read/write databases containing sensitive data, because MyISAM tables may eventually be damaged.
MySQL provides tools to repair MySQL tables, but for sensitive data, InnoDB that supports ACID features is a better choice.
In contrast, PostgreSQL is a fully integrated database with a single storage engine. You can improve performance by adjusting parameters in the postgresql.conf file, as well as tuning queries and transactions.
The PostgreSQL documentation provides a very detailed introduction to performance tuning.
Both MySQL and PostgreSQL are highly configurable and can be optimized for different tasks. They both support extensions to add additional functionality.
A common misconception is that MySQL is easier to learn than PostgreSQL. Relational database systems are very complex, and the learning curves for these two databases are actually similar.
Standard Compatibility
PostgreSQL is designed for SQL compatibility (the current standard is ANSI-SQL:2008).
MySQL is compatible with most SQL, but it also has its own extensions to support NoSQL features, which are introduced in the reference manual.
Each approach has pros and cons. Compliant standards make DBAs, database developers, and application developers more comfortable because it means they only need to learn one set of standards, one set of features, and one set of commands.
This saves time, improves efficiency, and doesn’t lock you into a particular vendor.
Proponents of the use of non-standard customizations argue that this enables rapid adoption of new features without having to wait for the standards process to complete. ANSI/ISO standards are constantly evolving, so standard compatibility is also a moving target: the well-known relational databases Microsoft SQL Server, Oracle, and IBM DB2 are only partially compatible with the standard.
Conclusion
Although there are different histories, engines and tools, there is no clear reference as to which of the two databases is suitable for all situations.
Many organizations like to use PostgreSQL because it is reliable, it is good at protecting data, and it is a community project that does not lock itself into a vendor cage. MySQL is more flexible and provides more options to tailor it for different tasks.
Many times, proficiency in the use of a piece of software is more important to an organization than idiosyncratic reasons.
Advantages of PostgreSQL over MySQL
1. The standard implementation of SQL is more perfect than MySQL, and the function implementation is more rigorous;
2. The function support of stored procedures is better than that of MySQL, and it has the ability to cache execution plans locally;
3. The support for table joins is more complete, the function of the optimizer is more complete, the supported index types are many, and the complex query ability is strong;
4. The PG main table is stored in a heap table, and MySQL uses an index-organized table, which can support a larger data volume than MySQL.
5. PG’s primary and backup replication is physical replication. Compared with MySQL’s binlog-based logical replication, data consistency is more reliable, replication performance is higher, and the impact on host performance is smaller.
6. MySQL’s storage engine plug-in mechanism has the problem that the lock mechanism is complex and affects concurrency, but PG does not exist.
Advantages of MySQL over PG
1. InnoDB’s MVCC mechanism based on rollback segments is superior to the XID-based MVCC mechanism in which old and new data of PG are stored together. New and old data are stored together, and VACUUM needs to be triggered regularly, which will bring redundant IO and database object locking overhead, and cause the overall concurrency capability of the database to decline. Moreover, if VACUUM is not cleaned up in time, it may also cause data expansion;
2. MySQL uses index-organized tables. This storage method is very suitable for query and deletion operations based on primary key matching, but there are constraints on the design of the table structure;
3. The optimizer of MySQL is relatively simple, and the implementation of system tables, operators, and data types is very streamlined, which is very suitable for simple query operations;
4. The implementation of MySQL partition table is better than PG’s partition implementation based on inheritance table, which is mainly reflected in the large difference in processing performance after the number of partitions reaches tens of thousands.
5. MySQL’s storage engine plug-in mechanism makes its application scenarios more extensive. For example, in addition to innodb being suitable for transaction processing scenarios, myisam is suitable for static data query scenarios.
Summary
Open source databases are not very perfect, and the commercial database oracle is still a lot more perfect in terms of architecture and functions.
In terms of application scenarios, PG is more suitable for strict enterprise application scenarios (such as finance, telecommunications, ERP, CRM), while MySQL is more suitable for Internet scenarios with relatively simple business logic and low data reliability requirements (such as google, facebook, alibaba ).
Comparing MySQL and PostgreSQL, which one to choose
In order to understand the difference between PostgreSQL and MySQL, I searched for the keyword: MySQL vs PostgreSQL, and read a few articles on the first page. The following is a brief summary:
The difference between MySQL and PostgreSQL
MySQL is a DBMS created by application developers; PostgreSQL is a DBMS created by database developers.
In other words, MySQL tends to the user’s point of view, answering the question “what problem do you want to solve”; while PostgreSQL tends to the theoretical point of view, answering the question “how should the database solve the problem”.
MySQL generally hands over data legality verification to customers; PostgreSQL is stricter in terms of legality. For example, inserting the time “2012-02-30” in MySQL will succeed, but the result will be “0000-00-00”; PostgreSQL does not allow this value to be inserted.
Generally, PostgreSQL is considered feature-rich, while MySQL is considered faster. But this point of view is basically a matter of MySQL 4.x / PostgreSQL 7.x. Now the situation has changed. PostgreSQL has made great improvements in the speed of version 9.x, and MySQL features have also increased.
In terms of architecture, MySQL is divided into two layers: the upper SQL layer and several storage engines (such as InnoDB, MyISAM). PostgreSQL has only one storage engine that provides both functions.
Both of these database systems can be optimized and customized according to the application situation, and it is difficult to say which one has better performance. The MySQL project focused on speed from the beginning, while PostgreSQL focused on features and specification standards from the beginning.
Which one to choose?
Perhaps MySQL is more popular among developers for historical reasons. At least we didn’t hear about PostgreSQL when we were in school.
At that time, it was either MS SQL Server or MySQL, and MySQL was open source. In fact, PostgreSQL did not officially support Windows systems until 8.0.
If there are no historical reasons (for example, the system has been based on MySQL for many years), or technical accumulation reasons (there are many MySQL experts among colleagues), then I think there is nothing wrong with choosing PostgreSQL.
Interestingly, when I searched for “switch postgresql to mysql” on Google, the first page of the results was all “Switch to PostgreSQL from MySQL”, and the second page finally had from PostgreSQL to MySQL, but it was the only one, and The reason is not that PostgreSQL is bad, but because the author has more MySQL experience.
