Content Copyright © 2012 Bloor. All Rights Reserved.
Also posted on: The IM Blog
Over the last decade or more a great many companies, then start-ups, have based their business on MySQL. And a lot of those companies have now grown up and MySQL is creaking at the seams: either it cannot handle the volume of transactions that the company is now handling or it is too slow to perform queries and analytics against the collected data, or both.
Not surprisingly, therefore, there are a host of vendors targeting MySQL environments and promising to solve your performance problems. These fall into four broad categories: adding more hardware, replacing MySQL with something else that still looks like MySQL but performs better for OLTP, replacing MySQL with something else that still looks like MySQL but performs better for analytics, and replacing the storage engine.
Hardware upgrades such as implementing SSDs (solid state disks) are expensive and, in my opinion, represent a short term solution at best. Implementing a Schooner appliance may fix your data loading issues but it focuses on just one aspect of the problems without resolving any others: it may make sense if that’s your only area where you have an issue, but that probably isn’t the case. You may also be considering sharding but this creates a significant management overhead.
Or you could opt for an OLTP replacement such as Xeround or an analytics replacement from the likes of Calpont or Infobright. All of these look like MySQL and allow you to port your MySQL schema into the new environment. However, none of them are like MySQL under the covers: Xeround is a NewSQL database and InfiniDB (Calpont) and Infobright are both columnar databases. Personally, I wouldn’t have any qualms about switching to any of these because of concerns about their architecture but I can understand how some people might feel differently.
But the big problem with these replacement databases is that they only solve one half of the equation. If you have a problem with both OLTP and analytic performance then you would need two replacement databases. So maybe a better solution would be to look for a new storage engine that will address both issues. I have been looking at one product, TokuDB from Tokutek, which aims to do just that.
The big difference between TokuDB and the standard storage engines (such as InnoDB) that underpin MySQL is that it uses Fractal Tree Indexes instead of B-Trees where a Fractal Tree Index is just a marketing (and trademarked) name for what is otherwise known as a cache oblivious streaming B-Tree (normal B-Trees are cache aware). The fundamental difference is that a B-Tree has a single cache for the whole tree whereas a Fractal Tree uses multiple caches. The result is that a B-Tree writes to disk much more frequently, and in smaller blocks, than a Fractal Tree-based approach, and is therefore much less efficient.
In particular, one of the problems with B-Trees, whether implemented in MySQL or any other database, is that when it writes data to disk, it mixes in old data with new, which means it needs a lot of writes to get all the new data written. And having more indexes makes it worse because every time you update the data on disk you also have to update all the relevant indexes so that you get a lot of writing to disk, which slows down load speeds. As a result, you have to limit the number of indexes you can support which in turn slows down query response. Fractal Tree Indexes get over this issue. When a Fractal Tree Index is used to write data, it’s all new, which means you do a lot less writing. Further, Fractal Tree Indexes write data in much larger blocks (measured in megabytes as opposed to 16k, which is typical for MySQL) which yields more effective compression . As a result you do not have to short-change your environment in terms of the number and richness of indexes you can support. So you get both better OLTP performance and better query performance. And when I say better I don’t mean just a little bit better but a whole lot better. So, if you’ve got a MySQL problem that spans your whole environment (or, actually, even just a part of it) then TokuDB is well worth looking at.
Some figures are worth including here. Tokutek claims 6 to 20 or even 25 times compression ratios against a practical maximum of around 2.5 for InnoDB. Similarly, the company claims 20 to 80 times faster write performance and, in its latest release, it has added XA compliant replication capability that, the company says “removes the slave lag bottleneck” that plagues most MySQL implementations.
Finally, one further point: Tokutek also supports MariaDB (a close cousin of MySQL) and it also has an interface to support BerkeleyDB. So if you use either of those and have performance issues then you might also think about investigating TokuDB. It will be interesting to see if the company ports its technology to any other database environments in the future.