We have just published our latest report into analytic warehousing. We've called it that because we are interested in environments where there is at least a significant analytic requirement that is complex, unpredictable or both.
The report is in five parts. The first, and largest, is our Market Guide (available here). The remaining four are Market Updates: one is for enterprise data warehouses that are required to support not only analytics but also a substantial amount of operational intelligence and/or master data management, and where there is therefore a significantly greater requirement for mixed query workload management.
The three Market Updates for analytic data marts are scale-dependent. That is, they cover up to 5TB, from 5 to 50TB and 50TB+ respectively. We had originally thought to have a comparable breakdown for enterprise data warehouses but, in practice, the vendors that have sufficient mixed query workload capability also have sufficient scalability to cover this entire range, with one or two exceptions at the low end.
The reason that we have broken the market down into four categories is simple: a chart that shows all of the vendors in a single diagram (or should I say quadrant?) isn't a lot of use to a company wanting to make a buying decision. MySQL is not comparable to Teradata, for example, and we wanted to make sure we compared apples with apples.
Of course, reports like this are only a snapshot in time: with 20-odd vendors there are new releases coming out all the time, so this one should be treated only as a starting point.
What's most interesting is to think about what will change between this report and the next in, say, 18 months' time. By then IBM's Smart Analytics will be available, as will Microsoft's Madison and, no doubt, we shall hear more from Oracle about Exadata. This will mean that the other vendors will be under much more pressure: there will be no more easy pickings. Sybase and Teradata are big enough that this shouldn't be a problem, while SAS tends not to market its warehouse per se but uses it as the underpinning for its analytic applications, so it's sort of out of the equation. The remaining $1bn+ vendor, HP, whose results to date could be summed up in one word, disappointing, may be under more pressure.
Of the new guys (in which number I include Kognitio, even though it's not new), Netezza is clearly best positioned: it's got as many installations as all the others put together. Moreover, it's already starting to broaden its portfolio through acquisition and we expect more purchases as we go forward, so Netezza should be fine.
For the rest it's more of a shooting match. If you'd asked me a year ago who was second behind Netezza I would have said Greenplum; now I think it's Vertica, and ParAccel is coming up on the rails. But each of these is going to be under pressure.
Vendors such as Exasol, illuminate and VectorNova have an advantage in their home markets because people like to buy from compatriots, but I reckon that they have 18 months to two years, at most, to capitalise on that advantage. Then there is the market for analytics as a service or data warehousing as a service: the leading specialist in this area is 1010Data but the big boys, again, will be moving into this space. In yet another category, Aster Data clearly has a lead in the MapReduce arena (if you want to combine SQL and MapReduce) but, again, the competition will catch up.
The bottom line is that the shake-up is going to start soon. I don't count Dataupia here because its troubles have really been a consequence of the recession rather than competition. How many companies will I feel I need to report on by 2011? My guess is that it will be a lot fewer than now.