Streaming analytics platforms
Analyst Coverage: Philip Howard
Streaming analytics is the branch of analytics that is needed when a) analysis must happen in real time and b) the volume of events or transactions is too large to be handled effectively by conventional technology. Most commonly (but not always: it depends on volume) the data is processed before it is stored, so that the delays associated with database update and query are avoided. Alternatively, the data may never be stored at all or, possibly, only aggregated information is stored.
A streaming analytics platform is therefore an environment that delivers enough performance to process anything from tens of thousands to millions of events per second, depending on the platform.
Historically, streaming analytics platforms developed out of complex event processing (CEP), which provided much the same capabilities but was targeted at algorithmic trading and similar environments within capital markets. Over the last few years the technology has become more oriented towards query processing, especially in the light of big data and the Internet of Things.
Traditional query techniques involve storing the data and then running a query against it. However, ingesting and then storing the data takes time. When very large amounts of data have to be processed and the query latency requirements are very low, the overhead of landing the data in a database and then running a database query is too great. Streaming analytics works by having the data pass through a query during the ingestion process, thereby providing much better performance.
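To illustrate what passing data through a query during ingestion can look like, here is a minimal Python sketch. It is not any vendor's implementation: the event shape (a timestamp and a value) and the function name windowed_average are assumptions made for the example. The query is a trailing time-window average that is updated as each event arrives, so nothing has to be landed in a database before a result is available.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

# Hypothetical event shape for this sketch: (timestamp in seconds, value).
Event = Tuple[float, float]


def windowed_average(
    events: Iterable[Event], window_seconds: float
) -> Iterator[Tuple[float, float]]:
    """Yield (timestamp, trailing-window average) as each event arrives."""
    window: Deque[Event] = deque()
    running_sum = 0.0
    for ts, value in events:
        window.append((ts, value))
        running_sum += value
        # Evict events that have aged out of the trailing window.
        while window[0][0] <= ts - window_seconds:
            _, old_value = window.popleft()
            running_sum -= old_value
        # The window always contains at least the event just added.
        yield ts, running_sum / len(window)
```

Because the function is a generator, each result is produced as soon as its event is consumed. Only the events inside the current window are held in memory, and the caller decides whether results are stored, aggregated further or discarded.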
However, it is not always as simple as just passing the data through a query. It may be more a question of pattern recognition, whereby a series of correlated events together meets or fails to meet an expected pattern. For example, credit card fraud detection is a common use case for this technology, as is the identification of “low and slow” attacks against corporate infrastructures.
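The pattern recognition described above can be sketched in the same way. The rule below is purely illustrative and is not taken from any real fraud system: it assumes a hypothetical Transaction record and flags a card that is used in more than a given number of distinct countries within a time window. What matters is that no single event is suspicious on its own; only the correlated series matches the pattern.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import DefaultDict, Deque, Iterable, Iterator


@dataclass(frozen=True)
class Transaction:
    # Hypothetical record layout, assumed for this sketch only.
    card_id: str
    timestamp: float  # seconds
    country: str


def detect_country_hopping(
    transactions: Iterable[Transaction],
    window_seconds: float = 3600.0,
    max_countries: int = 2,
) -> Iterator[Transaction]:
    """Yield each transaction that takes its card past max_countries
    distinct countries within the trailing time window."""
    recent: DefaultDict[str, Deque[Transaction]] = defaultdict(deque)
    for tx in transactions:
        history = recent[tx.card_id]
        history.append(tx)
        # Drop this card's transactions that are older than the window.
        while history[0].timestamp < tx.timestamp - window_seconds:
            history.popleft()
        countries = {t.country for t in history}
        if len(countries) > max_countries:
            yield tx
```

State is kept per card and only for the length of the window, which is what allows this style of correlation to run against the stream rather than against stored data. A production rule set would be considerably richer than this single illustrative rule.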
Streaming analytics is about real-time analysis of large volumes of data. There are many potential use cases, including telecommunications, smart meters, monitoring applications of various sorts and fraud detection, so it is often of interest to governance and control departments.
Some solutions in this space (those that can store data) may be able to support functions such as real-time trending, which will be useful in some environments.
For a long time, complex event processing was in search of significant numbers of use cases outside capital markets. The advent of big data and the Internet of Things has provided just such opportunities, so the main trend in the market is the shift away from complex event processing and towards streaming analytics. It is also notable that there are a number of open source initiatives around streaming: for example, Apache Spark.
The early leaders in the complex event processing space have largely been acquired by the major vendors. However, newer companies such as SQLStream have emerged to challenge the hegemony of the 800lb gorillas. Nevertheless, it is likely that this market will be dominated by the major players.
Further resources to broaden your knowledge:
Streaming Analytics 2016
Streaming analytics enables businesses to respond appropriately and in real-time to context-aware insights delivered from fast data.
Streams, events and analytics
Stream processing is enjoying a resurgence thanks to big data and the emergence of the Internet of Things.
Big data integration
The fourth issue for big data is integration, where agility will be required (as it is for governance)
TIBCO transforms big data into big opportunity
TIBCO came to London for their user conference (transFORM2013). This year's theme was all about big data and TIBCO's senior executives outlined their strategy for their platform.
CEP and Big Data 2
Should it be called CEP? Is CEP only about real-time BI? These were questions we answered 6 years ago. Also, a mention of some Hadoop-based CEP engines.
The C in CEP may stand for "complex" but CEP itself is not complex
Untangling events part 3
There has been some delay in this, the third and final article on untangling events. And just as well. I initially asked the question as to whether the log and event management vendors were more...
Untangling events part 2
The purpose of this series of articles is to identify if all of the different approaches to handling events are part of a single market or whether they should be treated as separate. In the first...
Applying data warehousing principles to event processing
It is now pretty much agreed that in data warehousing environments you need a massively parallel processing (MPP) architecture in order to handle very large data volumes. The important factor is...
Complex event processing, business event processing, security event management, log management, data retention systems, event-driven architecture, event warehousing: are these topics and other...
IBM events: not the whole story
A couple of months ago I reported on the announcement of IBM InfoSphere Streams as the company's high end (complex) event processing platform. I also described IBM's approach to event processing...
(Complex) Event Processing
This paper seeks to explain what events are, why they are important to your business, and what the options are for processing and managing these events.