Oracle and the X-Men

Content Copyright © 2008 Bloor. All Rights Reserved.

[Image: Drawing of Sage from the X-Men]

Oracle has just announced the Oracle Exadata Storage Server and the HP Oracle Database Machine as its answer to the likes of Netezza and other appliance vendors. The project went under the codename of Sage and, while Oracle didn’t tell me more than that, I am guessing that this actually relates to the Marvel character of the same name, pictured right, a member of the X-Men and X-treme X-Men. She is described on the official Marvel site as “a mutant who possesses a cyberpathic mind that functions like a computer with unlimited storage capacity. Sage is able to record and analyze vast amounts of data… and can also calculate complex statistics in mere seconds… like a computer, Sage is able to perform multiple tasks at once by allocating a partition of her brain to each task.”

Anyway, down to the serious stuff. Briefly, the Database Machine is the data warehouse offering and the Exadata Storage Server provides massively parallel capabilities that back-end onto your conventional Oracle database to enable the Database Machine. What happens is that when a query is processed, data is read from disk, unwanted rows and columns are filtered out by the Storage Server and the remaining data is passed to the database for processing. This will provide significantly better performance for queries where you retrieve a lot of extraneous data from disk but will have less impact where that is not the case.
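To make the idea concrete, here is a minimal toy sketch of storage-side filtering in Python. It is not Oracle's implementation: the table, the `region` predicate, the column names and the functions `plain_scan`, `filtering_scan` and `values_shipped` are all hypothetical, and the only thing it measures is how many values would have to be passed from storage to the database.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def plain_scan(table: Iterable[Row]) -> Iterator[Row]:
    """Conventional storage: every row and every column goes to the database."""
    for row in table:
        yield row


def filtering_scan(
    table: Iterable[Row],
    predicate: Callable[[Row], bool],
    columns: List[str],
) -> Iterator[Row]:
    """Storage-side filtering: drop unwanted rows and columns before shipping."""
    for row in table:
        if predicate(row):
            yield {name: row[name] for name in columns}


def values_shipped(rows: Iterable[Row]) -> int:
    """Count the individual column values passed up to the database."""
    return sum(len(row) for row in rows)


# Hypothetical toy table: 1,000 rows of 6 columns; one row in four is "EU".
table = [
    {"id": i, "region": "EU" if i % 4 == 0 else "US", "amount": i * 10,
     "c4": 0, "c5": 0, "c6": 0}
    for i in range(1000)
]

shipped_plain = values_shipped(plain_scan(table))
shipped_filtered = values_shipped(
    filtering_scan(table, lambda r: r["region"] == "EU", ["id", "amount"])
)

print(shipped_plain, shipped_filtered)
```

In this toy case the filtered scan ships far fewer values because most rows and columns are unwanted. If the query needed every row and column, the two counts would be identical, which is the "less impact" case described above. Note also that the storage layer still reads every row from disk in both scans; only the traffic to the database shrinks.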

Oracle is claiming up to 10x performance benefits and this seems reasonable. However, that doesn’t necessarily mean that Oracle will be able to compete effectively with other products. Take a query that needs a full table scan, and suppose that the table has 1 million rows of 60 columns each and that you only need to retrieve data from 3 of those columns. A column-based database such as Sybase IQ or Vertica reads only those 3 columns, so it has 20x less work to do than Oracle. And that doesn’t mean that Oracle will be merely twice as slow (assuming the full 10x performance enhancement), because the filtering process (unnecessary if using columns) is still required.
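The arithmetic behind that example can be written out as a back-of-the-envelope calculation. This is a deliberately naive model that counts only values read from disk and takes Oracle's claimed best case at face value; the constant names are mine, and it ignores the cost of the filtering step itself, which is why the real gap would be wider than the figure it produces.

```python
ROWS = 1_000_000
TOTAL_COLUMNS = 60
NEEDED_COLUMNS = 3

# A row-based full table scan reads every column of every row.
row_store_values = ROWS * TOTAL_COLUMNS

# A column-based store reads only the columns the query asks for.
column_store_values = ROWS * NEEDED_COLUMNS

# How much less the column store has to read.
io_ratio = row_store_values // column_store_values

# Assumption: Oracle achieves its full claimed "up to 10x" improvement.
ASSUMED_SPEEDUP = 10

# Remaining gap in this naive model, before any filtering overhead.
naive_gap = io_ratio / ASSUMED_SPEEDUP

print(f"column store reads {io_ratio}x less; naive remaining gap {naive_gap}x")
```

Under these assumptions the column store reads 20x less, and even a full 10x speed-up leaves Oracle a factor of two behind before the filtering overhead is counted.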

To take another example, Netezza doesn’t just filter the data close to the disk but processes it there too—it is only collation that is done centrally—so you would still expect appliance vendors to outperform the HP Oracle Database Machine.

The margin of performance benefit from appliance vendors will be reduced in some instances but you also have to consider the impact of the Oracle environment as a whole. The key to getting good performance out of Oracle is having predefined indexes, materialised views and so on. It is when you have unplanned queries or complex analytics, for which no such structures have been defined, that you can run into a performance black hole when using Oracle, and these are exactly the workloads that appliance vendors are particularly good at. You may get some benefits from using the Database Machine in these environments but I expect them to pale in comparison to what the appliances offer.

It is noteworthy that no benchmarks have been presented by Oracle in terms of performance: I suspect that this is because, while it is much better than it was before, it still can’t compete across the board with all the new boys on the block. It could probably have put out good benchmarks against IBM and Microsoft but everybody would have spotted the absence of Greenplum, Netezza, ParAccel and the rest, so it wouldn’t have worked as a marketing tool.

Also worth bearing in mind is that while the database may have been pre-installed it will still require administration, and it is not for nothing that Oracle has a reputation as the database requiring the most DBA attention. If you think that low or minimal administration is a defining feature of an appliance then this isn’t one.

Leaving that aside, this is certainly a significant step forward but it isn’t ground-breaking. It will encourage existing Oracle shops but I would recommend a proof of concept. In addition, I expect it to hurt IBM and Microsoft (because Oracle should now have clear performance advantages over these vendors in appropriate situations) more than it does the specialist data warehousing vendors. The latter may suffer where it is a close call between staying with Oracle or going elsewhere, but otherwise the appliance and column-based suppliers should still be able to beat Oracle hands down, at least where performance is a major issue.

Which only leaves one question: if the data warehouse is Sage who does that make Larry? Dr Xavier or Magneto?