Distributing MySQL databases

Content Copyright © 2014 Bloor. All Rights Reserved.
Also posted on: The IM Blog

ScaleBase is both a vendor and a product for distributing MySQL databases. The product is not a database itself but a layer of software that sits on top of MySQL databases. It therefore offers a different solution from those proposed by the likes of NuoDB and Clustrix, which require investment in their own specific databases.

Like both of those products, ScaleBase is ACID and XA compliant and offers continuous availability. Because it is independent of the underlying database instances, it does not matter to ScaleBase what those instances are being used for, whether transaction processing or analytics. In either case, ScaleBase’s distributed MySQL database retains all relational characteristics.

Where ScaleBase is particularly cool is in what it calls the Analysis Genie, which is available as a free SaaS offering and download. The Analysis Genie provides out-of-the-box recommendations to help you decide how best to distribute your data, along with a percentage score that tells you how efficient your data distribution is. Because that score changes as the environment changes, the Genie also serves as an ongoing monitoring tool. It examines your data, your data relationships and the way your data is used to give you a data distribution policy that is tuned for your application’s unique requirements. There are also facilities to examine and distribute any existing MySQL databases that you want to move into the ScaleBase environment, without requiring any change to the applications that use them.
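To make the idea of a percentage score concrete, here is a minimal sketch of one way such a number could be derived. It is not ScaleBase's formula, which is not described here. It simply assumes that a distribution is "efficient" when most queries can be answered by a single shard, and the function name and sample workload are hypothetical.

```python
from typing import Iterable, Set


def single_shard_score(query_shards: Iterable[Set[int]]) -> float:
    """Return the percentage of queries that touch exactly one shard.

    Assumption (not from ScaleBase): each query is represented by the
    set of shard numbers it has to read from or write to.
    """
    queries = list(query_shards)
    if not queries:
        return 100.0  # nothing to route, so nothing is inefficient
    single = sum(1 for shards in queries if len(shards) == 1)
    return 100.0 * single / len(queries)


# Hypothetical workload: three queries stay on one shard, one spans two.
workload = [{0}, {1}, {0, 1}, {2}]
score = single_shard_score(workload)
print(f"distribution score: {score:.0f}%")
```

Re-running a calculation like this as the workload changes is what would turn a one-off recommendation into the kind of ongoing monitoring described above.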

A major differentiator for ScaleBase is that distribution is policy-based and transparent. That is, with ScaleBase you can see and control the variables that impact the distribution policy, whereas competitors take more of a black-box approach. A second differentiator is that ScaleBase treats a data distribution policy as dynamic, so the software provides policy lifecycle management capabilities to analyse, monitor, maintain and optimise data distributions (and any required re-distributions) to match your application’s ever-evolving requirements.

Distributed database technologies are growing in importance for a number of reasons. For example, many companies want to move to the cloud, and a distributed approach makes sense for them, especially where the result is to be a hybrid environment with some data retained on premise and some in the cloud. Another significant market is geographically dispersed environments. A third, to which ScaleBase is especially suited, is small companies that have the potential to grow rapidly, such as those building social apps and games for mobiles. Many companies start with MySQL because it is freely available, and that’s fine as long as you don’t suddenly develop a mega-hit game. If you do, you need to expand your database capacity hugely, and distributed technology allows you to do that rapidly, relatively inexpensively and without changing your application. The same, of course, would apply to a lot of Web 2.0 companies.

ScaleBase is available in the Cloud (Amazon EC2 and RDS, Rackspace, IBM and so on) and also supports hybrid and on-premise environments.

Of course, it is possible to take a DIY approach to distributing MySQL databases. I don’t recommend it. As the environment changes you have to keep changing your sharding, changing your applications and testing those changes. There is no measure of how efficient your distribution is and, overall, this is a high-maintenance approach that cannot guarantee efficiency.
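A small sketch shows why the DIY route is so maintenance-heavy. It assumes the simplest hand-rolled scheme, in which the application picks a shard by taking a numeric user ID modulo the number of shards. The function names and the shard counts are hypothetical and chosen only for illustration.

```python
def shard_for(user_id: int, shard_count: int) -> int:
    """Naive application-side routing: user ID modulo the shard count."""
    return user_id % shard_count


def moved_fraction(ids, old_count: int, new_count: int) -> float:
    """Fraction of IDs whose shard changes when the shard count changes."""
    moved = sum(
        1 for i in ids if shard_for(i, old_count) != shard_for(i, new_count)
    )
    return moved / len(ids)


# Hypothetical growth step: 1,000 users, going from 4 shards to 5.
ids = range(1000)
fraction = moved_fraction(ids, 4, 5)
print(f"{fraction:.0%} of rows change shard")  # prints "80% of rows change shard"
```

Adding a single shard under this scheme relocates four rows out of five, and every application that embeds the routing rule has to be changed and retested at the same time.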

From my perspective, if you need to distribute a MySQL environment then ScaleBase is the obvious solution or, at least, one that you should look at seriously. You’d also do well to consider ScaleBase if you are looking to migrate from an expensive on-premise relational database to lower-cost, cloud-based distributed computing.