Running MySQL/MariaDB at scale (1)


Defining scalability

MySQL is really popular. When it was first released in the mid-nineties as one of the first free open source SQL database systems, it took the computing world by storm. Since then it has held the number one position in its category (relational) uncontested. Not even Microsoft or Oracle, who now owns MySQL, were able to oust it with their competitor products. There is a good reason for this. MySQL is a workhorse. It is free, mature and feature-rich. It covers the requirements of typical business applications. The same is obviously true for MariaDB, which is essentially a different flavour of the same product. MariaDB was forked off the MySQL codebase in 2010 and is largely compatible and exchangeable with MySQL.

Because so many organisations use MySQL for a long time, it is natural that the requirements of some of these organisations have grown. MySQL does a lot of things very well, but does it also scale well? What does it even mean to scale MySQL?

MariaDB/MySQL Scaling

There are different dimensions of scalability which introduce different challenges. The first one is size and it relates to the amount of data stored by the database. The size dimension is complex, because it is multidimensional in itself. There are the number of rows and columns in a table, the table size, the row size, the amount of data in a single database and several other parameters that need to be considered. MySQL has hard limits for all of these, though they are very generous. For example, the theoretical limit for an InnoDB table at 4KB page size is 16TB. However, you will probably never have a table growing that large, because MySQL performance begins to degrade much earlier.

Hence, the real question about size is: what is practical with MySQL/MariaDB? Regarding table size the rule of thumb is that performance begins to degrade when the number of rows reaches eight digits and/or the table size reaches Gigabytes. Obviously, these limits are fuzzy and cannot be pinned down exactly, as they depend on data definitions, storage engine and configuration. But there is definitely a point when you need to break down tables into smaller units. This is done typically by sharding. Sharding a table means to partition it across rows into smaller units which are then distributed across different physical locations or even across different database servers. There is software that can do this automatically, such as MySQL Cluster, which provides auto-sharding. But let’s get back to the topic of scale dimensions before we go into clustering and sharding.

The second dimension of database scalability is load. This relates to the number of commands that can be executed on a database system in a given time. The key performance indicator here is QPS which stands for query per seconds. An average standalone MySQL database server can handle somewhere between a few hundred and a few thousand QPS. Once again, these limits are fuzzy as they depend on the data and queries themselves. The rule of thumb is that if your MySQL database load reaches multiple thousand QPS, you need to think about scaling it. The performance of some queries will degrade before this load level is reached, but probably not because of concurrency. The problem with scaling load is that with MySQL it is much harder to scale load horizontally than size. Therefore, many users choose to scale MySQL vertically instead, if they have to handle increased work loads.

Vertical scaling is the easiest and possibly also the cheapest option. However, there is a limit to how many processors and how much physical RAM you can add to a single machine. Once the database load exceeds these physical limits, there is very little you can do to prevent the system from degrading or even failing. Therefore, it is an attractive option only if you expect database load to be relatively constant over time. Another common approach to scale database load is to add some sort of in-memory processing. This is especially suitable for applications that need to deliver “hot data” that is to say a narrow subset of the entire data range that is requested/processed at high frequencies. For example, a server that delivers financial data may contain current data which is accessed at a much higher rate than historical data.

While MySQL and MariaDB have their own internal caching mechanisms, it is much more effective to add an external cache at the application layer to serve predictable queries. This could be Memcached, Redis, Ignite, GridGain, or a similar product. The performance gain from these technologies is substantial, however, it comes at a cost. First of all, you now need to synchronise the two locations where data is stored, which is not a trivial task. Second, your applications have to use an additional API and perform routing and decide whether to retrieve data from the cache or the database backed. The latter can possibly be taken over by an SQL router, such as ProxySQL which then effectively hides the caching implementation from the applications. Either way, the operational complexity increases significantly and with it, unfortunately, the possibility for bugs and failure points.

The third dimension of database scalability is availability. You may also have heard the terms resilience, redundancy, fault-tolerance, disaster recovery, five nines uptime and so forth. All of these relate to the percentage of time during which a database system is available to its users over its lifetime. In many scenarios, downtime is highly undesirable and costly and in some it is even catastrophic (think air traffic control). High availability (HA) is achieved by eliminating single points of failure, by correct and immediate failure detection, and by providing reliable crossover mechanisms. Fault-tolerant database systems operate on the principle of redundancy. Both data and physical components are duplicated. In case of failure, the duplicate replaces the function of the original component or respectively provides data storage replacement. This crossover must happen automatically in order to achieve a significant (99.9% or higher) HA. Why is HA included into the dimensions of scalability? Because it is not an architectural feature, but a functional requirement. As such, it is comparable to the requirement for increased size or increased processing capacity. All of these have implications for system design.

In the next section of this article we will look at the options you have for scaling size, load and availability with MySQL and MariaDB.