Objectivity and InfiniteGraph

Objectivity has been around for a long time: since 1988. It is one of only two vendors (along with Versant) to have survived the big hype about object-oriented databases in the mid-90s while remaining independent and continuing to develop and market databases.

Arguably, Objectivity’s main claim to fame is that it implemented the world’s first certified database to exceed one petabyte. I have seen other vendors claim that they were first, but that’s because their marketing people haven’t done their research.

Perhaps the most important thing to know about Objectivity/DB, apart from the fact that it is an object database (but with a SQL interface), is that it is a distributed database, which is how it could be talking about petabyte scale back in the last century. Actually, perhaps a better way to describe the architecture would be to say that it provides a federated database, in much the same way that Composite Software or Denodo might provide a federated query environment across heterogeneous environments. In this case, of course, the environment is homogeneous, and this is leveraged by the product’s parallel query engine.
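To make the idea of a parallel query across a homogeneous federation a little more concrete, here is a minimal sketch in Python. It is not Objectivity/DB code and does not use its API: the partition names, the row layout and the federated_query function are all assumptions of mine, made up purely for illustration. The same predicate is sent to every member database in parallel and the partial results are merged.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical federation: three member databases with identical structure.
# The names and rows are invented for illustration only.
partitions = {
    "db_east": [{"id": 1, "minutes": 12}, {"id": 2, "minutes": 95}],
    "db_west": [{"id": 3, "minutes": 130}, {"id": 4, "minutes": 4}],
    "db_north": [{"id": 5, "minutes": 61}],
}


def scan(rows, predicate):
    """Run the predicate against a single member database."""
    return [row for row in rows if predicate(row)]


def federated_query(partitions, predicate):
    """Send the same predicate to every member in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        futures = [pool.submit(scan, rows, predicate)
                   for rows in partitions.values()]
        merged = []
        for future in futures:
            merged.extend(future.result())
    # Sort so the merged answer does not depend on partition order.
    return sorted(merged, key=lambda row: row["id"])


long_calls = federated_query(partitions, lambda row: row["minutes"] > 60)
print([row["id"] for row in long_calls])
```

Because every member has the same structure, there is no translation step between sources, which is the part that makes heterogeneous federation hard.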

Historically, Objectivity/DB has been particularly strong within the telecommunications sector, where its sweet spot is in managing switching networks. However, a couple of years ago the company was approached by one of its customers that had a problem with its fraud system, which matched suspects with call detail records. Its existing provider could only detect relationships between suspects that were two or three degrees apart, and even then it was taking several hours to do so, or days if more distant relationships needed to be investigated. To cut a long story short, Objectivity developed an application for this customer that can easily handle seven or eight degrees of separation. This was the foundation of what subsequently became the InfiniteGraph graph database. While on this topic, it’s worth observing that there are lots of these types of fraud applications where you need to discover relationships: benefits fraud, anti-money laundering and insurance fraud all involve similar requirements.
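For readers who want to see what "degrees of separation" means in practice, the following is a minimal sketch in Python of a breadth-first search that stops at a given number of hops. It is my own illustration and not the application Objectivity built: the call records, the names and the within_degrees function are all hypothetical.

```python
from collections import deque


def within_degrees(graph, start, max_degrees):
    """Map each party reachable from start in at most max_degrees hops
    to its degree of separation from start."""
    degrees = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if degrees[node] == max_degrees:
            continue  # do not expand beyond the requested depth
        for neighbour in graph.get(node, ()):
            if neighbour not in degrees:
                degrees[neighbour] = degrees[node] + 1
                queue.append(neighbour)
    return degrees


# Hypothetical call detail records: (caller, callee).
calls = [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("dan", "eve")]

# Treat a call as a relationship in both directions.
graph = {}
for caller, callee in calls:
    graph.setdefault(caller, set()).add(callee)
    graph.setdefault(callee, set()).add(caller)

result = within_degrees(graph, "ann", 3)
print(result)
```

With a limit of three, "eve" (four hops from "ann") is not found; raising the limit brings her in. The cost of each extra degree grows with the number of neighbours at each hop, which is why deeper searches are so much harder for systems that are not built to navigate relationships directly.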

In order to understand how this was possible I need to describe a bit more about Objectivity/DB. In the database, objects can be linked to other objects using named uni-directional or bi-directional links. The links can have a cardinality of 1:1, 1:many, many:1 or many:many, and the database uses object identifiers (OIDs) to speed up the navigation of networks of objects. Now, in a graph database you have triples: 1:1:1 or 1:many:many and so on. It is not difficult to see that by extending the existing facilities of Objectivity/DB (perhaps concatenating would be a better word, or even squaring) you could support graph-based capabilities. So, in effect, InfiniteGraph consists of Objectivity/DB with an added layer to support the triple store.
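The step from named links to triples can be sketched in a few lines of Python. This is a toy model of my own, not the Objectivity/DB or InfiniteGraph API: the ObjectStore class, its methods and the sample data are all hypothetical. Objects are addressed by an integer OID, links are named and optionally bi-directional, and the same links can be read back as (subject, link name, object) triples.

```python
from collections import defaultdict


class ObjectStore:
    """Toy object store: objects addressed by OID, joined by named links."""

    def __init__(self):
        self._next_oid = 1
        self.objects = {}               # OID -> attributes
        self.links = defaultdict(set)   # (source OID, link name) -> target OIDs

    def create(self, **attrs):
        oid = self._next_oid
        self._next_oid += 1
        self.objects[oid] = attrs
        return oid

    def link(self, source, name, target, inverse=None):
        """Add a named link; passing an inverse name makes it bi-directional."""
        self.links[(source, name)].add(target)
        if inverse is not None:
            self.links[(target, inverse)].add(source)

    def follow(self, source, name):
        """Navigate a named link directly by OID, with no search or join."""
        return sorted(self.links.get((source, name), ()))

    def triples(self):
        """Read every link back as a (subject, link name, object) triple."""
        return sorted((source, name, target)
                      for (source, name), targets in self.links.items()
                      for target in targets)


store = ObjectStore()
alice = store.create(name="Alice")
bob = store.create(name="Bob")
phone = store.create(number="555-0100")

store.link(alice, "called", bob, inverse="called_by")  # bi-directional
store.link(alice, "owns", phone)                       # uni-directional

print(store.follow(bob, "called_by"))
print(store.triples())
```

Nothing here constrains cardinality: a 1:many link is simply a link name with several targets. A real implementation would enforce the declared cardinality and persist the links, neither of which this sketch attempts.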

There are some significant differences between InfiniteGraph and the other graph databases I have previously discussed. I have already mentioned that it scales out rather than up (but not via the sort of cluster that you would get with Hadoop, which would destroy performance), but that’s not all. For example, instead of using SPARQL for queries, Objectivity uses Google’s Pregel (named after the river in Königsberg) or Gremlin. Further, it is integrated with third-party query and visualisation tools, notably Tom Sawyer Perspectives, and it also supports TinkerPop Blueprints (sort of like JDBC for graph databases). It therefore has a richer ecosystem than other products I have looked at.