Database appliances in transaction processing

I wrote last month about the lessons one might learn from the use of appliances in a variety of circumstances, concluding that they improve performance and/or reduce complexity, and that these in turn lead to other benefits such as reduced initial price, faster time to value, simplified maintenance and administration, lower total cost of ownership, and so on.

I then went on to discuss a couple of potential areas where the concept of appliances might usefully be applied but has not been to date, and I promised to discuss one of these possibilities in a further article: hence today’s scribbling. The possibility I want to discuss is the application of the appliance approach to transactional database environments.

Now, if you consider what the data warehouse appliance vendors do to get better performance, the first thing that the likes of Netezza and DATAllegro have done is to put processing as close as possible to disk: this allows you to retrieve data more quickly and thereby get performance improvements. And because of the way this is done you also remove a lot of the complexity from the data warehousing environment.

Why hasn’t this same approach been adopted for transactional databases? Put simply, because a new database vendor in this market would have a tough time: there isn’t the same sort of gap in the market that Netezza and other appliance vendors have been able to exploit for data warehousing. The real issue, therefore, is not how you build an appliance to compete with Oracle, DB2 or SQL Server in a transactional environment but how you build an appliance to co-operate with these products to provide a more efficient and less complex operating environment when these are in use. So, how would you do that?

Let’s go back to the first principle of data warehouse appliances: putting processing close to the disk. In merchant databases such as Oracle there is a storage manager provided as a part of the database and this co-ordinates access to disk: not what you would call close to disk. Moreover, if a SAN (storage area network) or similar infrastructure is in place then this will add an extra layer of complexity.

In effect, what most environments have is one or more horizontal layers of software that manage and co-ordinate access to disk, which is, if you think about it, effectively a vertical activity. A more efficient approach would be to put the processing close to disk, as in a warehouse appliance, but then you would have to persuade the existing software that you were looking after the disk access: in other words, you would have to make your device look like a conventional environment to the SAN supplier and to Oracle’s storage manager.
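To make the idea of "looking like a conventional environment" a little more concrete, here is a toy sketch in Python. It is purely illustrative and assumes nothing about how any real product works: the `BlockDevice` interface, the layer names and the `Appliance` class are all hypothetical names of my own. It shows only that a caller written against one interface cannot tell whether its request passes through two forwarding layers or is served by a device sitting next to the disk.

```python
from typing import Protocol


class BlockDevice(Protocol):
    """Hypothetical interface that the existing software expects to talk to."""

    def read_block(self, block_no: int) -> bytes: ...


class RawDisk:
    """Stand-in for the physical disk: a simple in-memory block store."""

    def __init__(self, blocks: dict[int, bytes]) -> None:
        self.blocks = blocks

    def read_block(self, block_no: int) -> bytes:
        return self.blocks[block_no]


class PassThroughLayer:
    """One horizontal layer (hypothetical) that simply forwards each request."""

    def __init__(self, name: str, below: BlockDevice, trace: list[str]) -> None:
        self.name = name
        self.below = below
        self.trace = trace

    def read_block(self, block_no: int) -> bytes:
        self.trace.append(self.name)  # record each hop the request makes
        return self.below.read_block(block_no)


class Appliance:
    """Hypothetical device: same interface, but the work happens beside the disk."""

    def __init__(self, disk: RawDisk, trace: list[str]) -> None:
        self.disk = disk
        self.trace = trace

    def read_block(self, block_no: int) -> bytes:
        self.trace.append("appliance")
        return self.disk.read_block(block_no)


def fetch(device: BlockDevice, block_no: int) -> bytes:
    """Stands in for the existing software, which knows only the interface."""
    return device.read_block(block_no)


data = {0: b"accounts", 1: b"orders"}

# Conventional arrangement: two forwarding layers sit above the disk.
layered_trace: list[str] = []
layered = PassThroughLayer(
    "storage-manager",
    PassThroughLayer("san", RawDisk(data), layered_trace),
    layered_trace,
)

# Appliance arrangement: one device presents the same interface.
appliance_trace: list[str] = []
appliance = Appliance(RawDisk(data), appliance_trace)

layered_result = fetch(layered, 1)
appliance_result = fetch(appliance, 1)
```

Both calls return the same block, but `layered_trace` records two hops where `appliance_trace` records one. A real device would of course have to speak whatever storage protocols the existing software uses, which is where the hard engineering lies; the sketch says nothing about that.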

What would be the benefits of such an approach? Well, you would get much faster disk access. This would mean reduced hardware costs, since you would need less hardware to reach a given level of performance. It would also mean that you could use bigger, intrinsically slower and cheaper disks and still, probably, get significantly better performance than you can right now. That in turn would mean that some of the issues to do with information lifecycle management and archival would go away, or at least be less pressing. Further, because you are cutting through two horizontal layers of management (the storage manager and the SAN), you automatically simplify the environment and thereby reduce maintenance and administration. In other words, you would get all the sorts of benefits that derive from the application of appliance technology in other environments.

Sounds like a good idea to me.