Columns aren’t enough anymore

Content Copyright © 2010 Bloor. All Rights Reserved.

I have been preaching the columnar message for data warehousing for the best part of ten years. That argument has been won. However, it is now clear that columns on their own aren’t enough. Yes, they give you a great performance boost; yes, they are better able to support improved compression; and, yes, they require less administration. But now that almost every mother’s son has a column-based approach, it is clear that in today’s competitive environment you need something more than just a columnar database if you want to compete effectively.

You can see this if you look at the various vendors’ offerings. Vertica has projections, Infobright has its knowledge grid and Calpont’s recently launched InfiniDB has an Extent Map (which is similar to the knowledge grid except that it is optimised for I/O). All of these are aimed at getting better at answering queries.

However, such approaches are also limiting because they suit particular types of analytics. Of course, you could say that about columns in general, but what these extra features do is make products even more specialised. Thus a major focus for Infobright is the analysis of web data, whether in isolation or in conjunction with back-end data, while Vertica has enjoyed particular success in low latency analytics, and Calpont reckons that the Extent Map is best suited to data that has some sort of inherent pattern (for example, by time). Of course, these environments overlap, but there are subtle distinctions.
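The point about inherent patterns can be illustrated with a small sketch. It assumes that block-level metadata of this kind records the minimum and maximum value held in each block of a column, so that a range query can skip any block whose range cannot match. That is an assumption made for illustration, not a description of any vendor's actual implementation, and the names `build_min_max_map`, `blocks_to_scan` and `BLOCK_SIZE` are hypothetical.

```python
import random

BLOCK_SIZE = 1000  # hypothetical number of values per block


def build_min_max_map(values, block_size=BLOCK_SIZE):
    """Record (min, max) for each fixed-size block of a column."""
    return [
        (min(values[i:i + block_size]), max(values[i:i + block_size]))
        for i in range(0, len(values), block_size)
    ]


def blocks_to_scan(min_max_map, low, high):
    """Return indexes of blocks whose value range could overlap [low, high]."""
    return [
        idx for idx, (lo, hi) in enumerate(min_max_map)
        if hi >= low and lo <= high
    ]


# The same 100,000 values, once loaded in order (as time-stamped data
# often is) and once in random order.
ordered = list(range(100_000))
shuffled = ordered[:]
random.Random(0).shuffle(shuffled)

ordered_hits = blocks_to_scan(build_min_max_map(ordered), 20_000, 20_999)
shuffled_hits = blocks_to_scan(build_min_max_map(shuffled), 20_000, 20_999)

print(len(ordered_hits), "of 100 blocks must be read for ordered data")
print(len(shuffled_hits), "of 100 blocks must be read for shuffled data")
```

With ordered data the query touches a single block, whereas with shuffled data nearly every block's range straddles the query and very little can be skipped. Under the stated assumption, this is why such a structure would favour data with a natural ordering.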

This raises the spectre of vendors becoming specialised (I hesitate to say niche because this might be regarded as pejorative) in particular segments of the market. Of course, this is nothing new: it is exactly how Sybase built up a head of steam with Sybase IQ, though it is now broadening its attack rather than the reverse.

And then there are things like support for MapReduce or R, which can also be seen in the same light. Addressing the MySQL market, with MySQL front-ends or MySQL compatibility, is also a hot area, with Infobright, Calpont and Kickfire all active in this sub-market. Open source is yet another such segment, again with Infobright and Calpont but also with Ingres waiting in the wings, not to mention MonetDB and LucidDB.

Thus we are witnessing a segmentation of the market. This is a classic scenario: you get a build-up of momentum in the market (any market) with many new entrants coming into it; then the market starts to get saturated (in terms of numbers of vendors) so you start to see segmentation; and finally, once the segmentation phase has run its course, you get consolidation. So, I think we are now in the penultimate phase, segmentation. The question is how long it will last before we see significant numbers of acquisitions. I think it will be a while, as we are in the early days of segmentation, but be in no doubt that the writing is on the wall.