CEP and Big Data 2

Written By: Philip Howard
Published: 24th January, 2012
Content Copyright © 2012 Bloor. All Rights Reserved.

There have been a couple of things floating around in the ether about CEP (complex event processing) recently. The first is the question, reportedly raised by Curt Monash, of whether it should be called something different.

I've been going back through my records. When I first wrote a product evaluation of what is now Progress Apama in 2002, I stated that "the company's contention is that conventional approaches to real-time queries are only suitable for small scale environments or those in which limited numbers of data feeds are being monitored. In particular, its view is that these solutions cannot cope with environments where large numbers of data feeds need to be combined in a complex and dynamic fashion." That was the only use of the word "complex" in a total of nearly 3,000 words. Events were mentioned several times, streams not at all.

As an aside, look at Wikipedia and other sources about the development of CEP and you'll see lots of mentions of David Luckham, who coined the CEP term in his book "The Power of Events", published in 2001. You will also see various other attributions to American scholarship but no mention at all of Cambridge (that's UK not Harvard), which is where Apama came from. I guess all the writers are American.

Anyway, to get back to the subject, I wrote the following in our report on CEP, published in 2006: "the subject under discussion is frequently referred to as either complex event processing (CEP) or as event stream processing (ESP). We believe that both of these names are misleading: the former suggests that the technology is not also suitable for processing simple events, while you could infer from the latter (processing streams) that this was only about high performance. We prefer event processing as a neutral term to cover all of these possibilities." Frankly, I gave up this argument years ago.

The second piece of discussion that has recently hit the blogosphere is from Chris Carlson at Informatica. He is suggesting, quite rightly, that CEP isn't simply about real-time processing and that, in fact, it is misleading to refer to it as such. Again, from our 2006 report, "event processing is suitable for use in a wide range of diverse environments. Some of these are more about event streaming (low latency) and some are more about complex processing and some potentially both" and "what event processing does is to reduce the data latency, insight latency and, sometimes, the decision latency involved in taking action when compared to traditional approaches." In other words, CEP is about processing real-time data - it isn't necessarily about making real-time decisions.

Finally, I want to make a point about big data and CEP. Back before Christmas I wrote that, as far as I can tell, StreamBase and IBM are the only major vendors currently targeting their products at general-purpose operational intelligence environments, as opposed to specific areas such as capital markets and security services, or application environments such as SOA. I did mention that SAS will be bringing out a product later this year and that there is also Darkstar from Cloud Event Processing (CEP - ycch!) which, naturally, runs in the Cloud. What I didn't mention was that there are a number of companies/products that have been specifically designed to work in conjunction with Hadoop, namely HStreaming, S4 (from Yahoo!) and Storm (from Twitter). I haven't looked at any of these in detail, so I can't comment on them, but that is definitely an area that is heating up.
