Greenplum 3

Content Copyright © 2008 Bloor. All Rights Reserved.

Greenplum has just introduced version 3 of its eponymous data warehousing product, which it describes as the “first internet-scale database”. The label reflects, at least in part, the fact that the company has a number of customers in this space, such as Skype and Comcast, though it also has customers in a variety of other sectors (around 40 in total).

Perhaps the most interesting thing is that Greenplum reports that it is competing for enterprise data warehousing (EDW) business as well as for analytic data marts, and the company has enhanced its capabilities in this area with more workload optimisation (as well as other features) in this release.

However, what is really driving the success of Greenplum, to my mind, is its partnership with Sun and the use of the latter’s Thumper technology. Thumper was designed as a high-performance computing platform that leverages streaming technology for taking data off disk, and the combination with Greenplum is truly impressive. For example, using Thumper you can implement a 50TB warehouse in a single rack, with all of the green implications that has for reduced footprint, reduced power and heating requirements and, of course, reduced costs. No wonder that half of Greenplum’s sales have been on Thumper platforms and that the company derives three quarters of its revenues from those sales. With the boot on the other foot, Greenplum is the biggest driver for sales of Thumper systems.

In fact, the Sun Data Warehouse Appliance is a pre-installed, pre-packaged implementation of both products (and Solaris) that is sold and supported by Sun. There are also joint marketing, sales and partner training efforts currently underway across Sun’s global field organisation.

On the technical front, Greenplum offers compression (as more or less everybody else does nowadays), excellent parallel load rates and, new in the latest release, embedded analytics with native support for relevant parallel functions such as those provided by SAS. Also new in this release is support for bitmap indexing, in addition to the B-Tree, hash and other methods that were already available. There is also support for a variety of other things, such as Google’s MapReduce, SQL 2003 OLAP and multiple languages, including Python, Perl and R (which is an open source language). Further, in Greenplum 3 there is support for external data streams that allow queries to be executed against external data sources such as web pages, RSS feeds, web services, other databases and so on. This suggests a potential tie-up with CEP (complex event processing) vendors, though this is not a market that the company is currently addressing.
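For readers unfamiliar with MapReduce, the following is a minimal, single-process sketch of the general programming model in Python, one of the languages mentioned above. It is purely illustrative: it is not Greenplum's interface, and every function name in it (`map_phase`, `shuffle`, `reduce_phase`, `word_count` and so on) is hypothetical. A real implementation would spread the map and reduce steps across many nodes in parallel, which this sketch does not attempt.

```python
from collections import defaultdict

# Illustrative only: a single-process sketch of the MapReduce model.
# None of these names come from Greenplum; they are assumptions
# made for the sake of the example.


def map_phase(records, mapper):
    """Apply the mapper to every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Emit (word, 1) for each whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(key, values):
    """Sum the counts emitted for one word."""
    return sum(values)


def word_count(lines):
    """Run the full map -> shuffle -> reduce pipeline over some lines."""
    pairs = map_phase(lines, word_mapper)
    return reduce_phase(shuffle(pairs), count_reducer)


if __name__ == "__main__":
    print(word_count(["sun thumper", "Sun Greenplum"]))
```

The point of the model is that the mapper and reducer are independent of how the work is distributed, which is what makes it attractive on a parallel database: the same two functions could, in principle, be run against data spread over many disks.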

At least in terms of the number of its users, I would put Greenplum in clear second place amongst the new data warehouse (appliance) vendors. One of the reasons for this is that the company decided early on to employ a salesperson in Asia, where most of the other newer suppliers do not play: as a result, it has gained a number of customers in that region. Conversely, the company has not been very active in Europe, where it currently has no presence. However, this will change: the company is already recruiting additional sales staff, one or more of whom will be located in Europe. Put this alongside Sun’s sales efforts and Greenplum looks like a success story that could run and run.