Data warehousing update 4: ParAccel

Content Copyright © 2007 Bloor. All Rights Reserved.

ParAccel has now officially launched the ParAccel Analytic Database, along with initial customers, several TPC-H benchmarks and a number of industry partners.

Let’s start with the product, which comes in two flavours: ParAccel AMIGO and ParAccel MAVERICK (hereafter written in lower case). The former is designed as an accelerator for SQL Server and Oracle environments, with built-in query routing and synchronisation; the latter is a stand-alone, ANSI SQL product. However, the line between Amigo and Maverick is not quite as clear-cut as this. For example, LatiNode, a US-based telecommunications company, is one of ParAccel’s first customers. It originally looked at implementing Amigo in the days when this was the only product that ParAccel offered. It particularly liked the fact that Amigo included syntax coverage for (in this case) SQL Server, which meant that its existing queries, developed using Panorama NovaView, could run unchanged. However, the company concluded that it would prefer a stand-alone solution (it was customer feedback such as this that prompted ParAccel to develop Maverick), although it still wanted the syntax coverage from Amigo. So, to some extent, you can mix the two offerings.

In terms of the technical details, ParAccel is a column-based database running on an MPP (massively parallel processing) platform. The company is a software-only provider, but you can have the software pre-installed and configured on a suitable hardware system from Sun (a major partner), HP or another supplier. As one might expect, ParAccel has no use for indexes (columns more or less equate to indexes) and achieves significant compression ratios. However, it differs significantly from its rivals in its use of memory. In addition to conventional disk-based warehouses you can also have in-memory-only solutions or a mixture of the two. Of course, a memory-only implementation is likely to be quite small, even allowing for compression and the absence of indexes: ParAccel quotes 16GB of memory holding around 40GB of raw data, so a total (raw) size of some hundreds of gigabytes is likely to be the sweet spot. Indeed, all of the TPC-H benchmarks that the company has just announced were run in-memory.
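To put rough numbers on that sweet spot, here is a minimal back-of-the-envelope sketch in Python. It takes the quoted ratio (16GB of memory for around 40GB of raw data) and assumes, purely for illustration, that memory scales linearly with raw data volume and that each node contributes 16GB. Neither assumption comes from ParAccel, and the function names and sample sizes are hypothetical.

```python
import math

# Figures quoted in the text: about 40 GB of raw data held in 16 GB of memory.
QUOTED_MEMORY_GB = 16
QUOTED_RAW_GB = 40


def memory_needed_gb(raw_gb):
    """Estimate total memory for raw_gb of raw data.

    Assumption (not from the vendor): the quoted 16:40 ratio scales linearly.
    """
    return raw_gb * QUOTED_MEMORY_GB / QUOTED_RAW_GB


def nodes_needed(raw_gb, memory_per_node_gb=QUOTED_MEMORY_GB):
    """Estimate node count, assuming a hypothetical 16 GB of memory per node."""
    return math.ceil(memory_needed_gb(raw_gb) / memory_per_node_gb)


if __name__ == "__main__":
    # Illustrative raw data sizes in GB, chosen only for the example.
    for raw_gb in (100, 300, 1000):
        print(
            f"{raw_gb} GB raw -> about {memory_needed_gb(raw_gb):.0f} GB memory, "
            f"{nodes_needed(raw_gb)} nodes at 16 GB each"
        )
```

On these assumptions, a few hundred gigabytes of raw data needs a cluster of fewer than ten such nodes, while a terabyte already needs around 25. That is consistent with the suggestion that memory-only implementations will tend to be modest in size.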

These new TPC-H benchmarks cover both performance and price/performance at the 1TB, 300GB and 100GB levels. Regular readers will know that I am not persuaded by the merits of such figures but, nevertheless, they no doubt give potential users a warm feeling.

In terms of competition, ParAccel, in some senses, doesn’t have any. The only other vendor that can run SQL Server and Oracle queries unchanged (at least currently) is Dataupia, and that company is more concerned with making very large data volumes available for analytics than with extreme query performance. Thus I expect ParAccel to concentrate on this SQL Server/Oracle market, where it has a major differentiator against other vendors in this space.

My guess would be that there will be relatively few sales of Maverick that don’t start out as Amigo sales, or that don’t at least require syntax coverage. This isn’t, incidentally, any criticism of Maverick: just an observation that you naturally take advantage of differentiators where you have them and that, with relatively limited sales resources, you target the opportunities that you are most likely to win. Anyway, unless and until other vendors introduce such capabilities, ParAccel is going to have a big advantage in this area, and if its technology can compete in performance terms (and there’s every reason to believe it can) then it represents an interesting addition to the market.