Aerospike

Content Copyright © 2014 Bloor. All Rights Reserved.
Also posted on: The IM Blog

Aerospike is a NoSQL database. However, unlike Hadoop, it is focused not on storing petabytes of data for data scientists to explore but on extreme performance in transaction-oriented environments. I’ll explore that, and the potential use cases for Aerospike, a little later on.

Where Aerospike differs from other NoSQL databases is that it is an in-memory/in-flash database. In other words, it uses memory as much as it can but is otherwise designed to run off flash storage. Typically, indexes will be in memory and data either in memory or on flash. Now, there are plenty of traditional database vendors running in the same environment; where Aerospike differs from these is that it was designed from the outset to leverage flash. Flash isn’t organised the same way that hard disks are, so the traditional vendors, whose products were designed around hard disks, cannot get optimal performance out of flash.
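To make the "index in memory, data on flash" split concrete, here is a minimal sketch in Python. It is not Aerospike's implementation: the `TinyStore` class and its methods are hypothetical names, and an in-memory `io.BytesIO` buffer stands in for the flash device. It only shows the general shape of the idea, in which a lookup touches memory to find a record's location and then does a single read against the storage device.

```python
import io


class TinyStore:
    """Toy key-value store: index held in RAM, record bytes in an append-only log.

    Hypothetical illustration only; this is not how Aerospike is implemented.
    """

    def __init__(self, device=None):
        # io.BytesIO stands in for a flash device (an assumption for the sketch).
        self.device = device if device is not None else io.BytesIO()
        # key -> (offset, length); this dictionary lives entirely in memory.
        self.index = {}

    def put(self, key, value: bytes):
        # Append the record at the end of the device and remember where it went.
        self.device.seek(0, io.SEEK_END)
        offset = self.device.tell()
        self.device.write(value)
        self.index[key] = (offset, len(value))  # the newest write wins

    def get(self, key):
        # One in-memory lookup, then one read from the device.
        entry = self.index.get(key)
        if entry is None:
            return None
        offset, length = entry
        self.device.seek(offset)
        return self.device.read(length)
```

A store built this way never has to search the device for a key: `TinyStore().get("some-key")` either misses in memory or goes straight to the right offset.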

Some other facts and figures: Aerospike was first deployed some three years ago but was only made generally available a little over a year ago. It has single-row ACID properties, is a key-value store (technically, a distributed hash table) and runs on low-cost clustered hardware with continuous availability: that is, there is no such thing as either planned or unplanned downtime, short of things like power cuts, and even then you can have geographically distributed systems that will avert such problems. In addition, the database supports smart clients that know where data is located, secondary indexes so that you can run real-time (distributed) queries, and user-defined functions that run in-database. There is automatic load balancing and support for workload management, in the sense that you can prioritise transactions over queries and background tasks. You can also store unstructured data, such as documents, as well as structured data. Programmatic access is via C, PHP, C++, Java or a variety of other languages.
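The idea of a smart client over a distributed hash table can be sketched as follows. This is an illustration under stated assumptions rather than Aerospike's actual client: the partition count, the use of SHA-256, the round-robin partition map and the `SmartClient` class are all hypothetical, and plain dictionaries stand in for the remote nodes. The point is simply that the client hashes the key, looks up the owning node in a local map, and goes directly to that node without any intermediate routing hop.

```python
import hashlib

N_PARTITIONS = 16  # hypothetical value, kept small for illustration


def partition_of(key: str) -> int:
    """Map a key to a partition by hashing it (hash choice is an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % N_PARTITIONS


class SmartClient:
    """Toy client that knows which node owns each partition."""

    def __init__(self, nodes):
        # Partition map held by the client: partition -> node name.
        # Round-robin assignment is an assumption made for this sketch.
        self.partition_map = {p: nodes[p % len(nodes)] for p in range(N_PARTITIONS)}
        # Dictionaries stand in for the servers' storage.
        self.storage = {node: {} for node in nodes}

    def node_for(self, key: str) -> str:
        # The client computes the destination itself.
        return self.partition_map[partition_of(key)]

    def put(self, key: str, value):
        self.storage[self.node_for(key)][key] = value

    def get(self, key: str):
        return self.storage[self.node_for(key)].get(key)
```

With `client = SmartClient(["node-a", "node-b", "node-c"])`, a call to `client.put("user:1", {"visits": 3})` followed by `client.get("user:1")` reads from the same node the write went to, because both calls derive the node from the key alone.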

So, what does this all add up to? Well, the long and short of it is that Aerospike is blazingly fast: comparable to event processing but, of course, you actually get to store the data. And when I say comparable to event processing, I don’t just mean in performance terms but also with respect to scale. One of Aerospike’s customers is processing 2 million transactions per second. Compare that to, say, Cassandra, which is good for a few tens of thousands of transactions per second.

The major market where Aerospike has been successful to date is the Ad-Tech space, though it has a number of customers in other areas. Its most notable client is eBay, where it beat off a number of other NoSQL vendors to win the account: no small feat given how much eBay knows about the NoSQL market. Moving forward, though, I can see the product competing with the event processing vendors both in terms of cost and, especially, where you also need to store data and not just process it. The company has identified fraud as an area of opportunity. I also think that wherever you have ‘lookers’ and ‘bookers’ (for example, travel booking sites, price comparison sites and the like), Aerospike will make a good solution.

I must say that I am impressed with Aerospike. There are hundreds (literally) of NoSQL vendors out in the market and I don’t have the time or inclination to speak to all of them. But I had heard good things about Aerospike from a couple of sources and decided to organise a briefing with them: I’m glad I did.