CEP and analytics

Content Copyright © 2009 Bloor. All Rights Reserved.

The first time I wrote about Apama (now Progress Apama), back in May 2002, my opening observation was that the product had “been designed to address the need for real-time query environments”.

The fact that event processing is really about very low latency analytics seems to have largely got lost since then, in the scrabble for capital markets. And it hasn’t been helped by the fact that SeeWhy, which is clearly a product in the analytics space (it now seems to be focusing on abandoned online shopping baskets), didn’t like me referring to it as a CEP (complex event processing) vendor in the first place, even though that is what it is. However, I am glad to say that I have been speaking recently with two companies that are both clear that their CEP products are in the real-time query and analytics space. These are Starview, which refers to itself as providing Analytical Event Processing, and IBM, which is positioning InfoSphere Streams as a real-time analytics platform (RTAP as opposed to OLAP or OLTP).

One other interesting thing about Streams is that it now supports PMML (Predictive Model Markup Language), which is the XML-based standard for exchanging data mining models between tools. It allows you to take a model that you have built using SAS Enterprise Miner or SPSS (now IBM) Clementine (actually it isn’t called that anymore but Clementine is such a good name) or any other mining tool that supports PMML (not all of them do) and import it into Streams so that you can score incoming data against that model to detect anomalies, exceptions and so on. This is so obviously a sensible idea that I used to ask the CEP vendors if they could import mining models or if they had plans to support PMML. But the answers I got back were so dispiriting (mostly completely blank looks) that I more or less gave up asking the question after about 2006. So, hooray for IBM! Finally, a company with some sense.
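
To make the idea of scoring incoming events against an imported model concrete, here is a minimal sketch in Python. It is not how Streams does it: the XML is a simplified, namespace-free fragment in the style of a PMML regression model, and the function names, field names (temperature, pressure), coefficients and the threshold rule for flagging an anomaly are all hypothetical, chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified fragment in the style of a PMML RegressionModel (an assumption:
# a real PMML file also carries a namespace, a DataDictionary and a MiningSchema).
PMML_DOC = """
<PMML version="4.4">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="10.0">
      <NumericPredictor name="temperature" coefficient="0.5"/>
      <NumericPredictor name="pressure" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def load_linear_model(pmml_text):
    """Return (intercept, {field: coefficient}) from the first RegressionTable."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find(".//RegressionTable")
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("NumericPredictor")
    }
    return intercept, coefficients


def score(event, intercept, coefficients):
    """Linear score: intercept plus the weighted sum of the event's fields."""
    return intercept + sum(c * event[name] for name, c in coefficients.items())


def flag_anomalies(events, pmml_text, threshold):
    """Yield (event, score) for each event whose score exceeds the threshold.

    The threshold test is a hypothetical stand-in for whatever rule defines
    an anomaly or exception in a given application.
    """
    intercept, coefficients = load_linear_model(pmml_text)
    for event in events:
        event_score = score(event, intercept, coefficients)
        if event_score > threshold:
            yield event, event_score


if __name__ == "__main__":
    incoming = [
        {"temperature": 20.0, "pressure": 1.0},
        {"temperature": 40.0, "pressure": 5.0},
    ]
    for event, event_score in flag_anomalies(incoming, PMML_DOC, threshold=30.0):
        print(event, event_score)
```

The point of the design is that the model is built offline in a mining tool and only its exported description travels to the event processing engine, which then applies it to each event as it arrives.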

And so to Starview, which was founded in 2002. It’s not exactly a household name but that’s precisely because it focuses on applications in manufacturing, utilities, logistics, telecommunications fraud, and so on, rather than on capital markets. From a technical point of view one big difference in these environments is that the platform actually has to collect the data and you therefore need relevant connectors in order to support that, in much the same way that you do in a data integration environment. Of course, IBM has a separate product, WebSphere Premises Server, which does that, but in Starview’s case it is built into the product. The only other CEP vendor that does that, as far as I know, is the Australian Event Zero. Like Event Zero, Starview provides a distributed environment, which is more or less essential where you may have inputs coming from thousands of different sources.

All of this would suggest that a split is occurring in the CEP market between vendors that specialise in analytics and those that don’t. However, the latter market is also split between capital market applications (and to a lesser extent government applications) on the one hand and those using CEP for other purposes on the other, such as AptSoft (IBM) being employed for event-driven business processes and AgentLogic (Informatica) for event-driven data integration processes. We could thus say that we have event-driven analytics, event-driven applications and event-driven processes, each as separate markets with separate requirements. And we should also bear in mind that large parts of the security market are event-driven, as is compliance monitoring. All of which suggests that event-driven architectures are creeping up on us by stealth, except that there is no overall architecture involved, these all being separate and distinct. In other words, we are heading for a series of siloed event-driven solutions, none of which touch each other and all of which are based on different products and technologies. That is a) not a good recipe and b) an opportunity for someone to build a broader platform to support a multitude of these use cases.