Real-time predictive analytics with Zementis

Content Copyright © 2016 Bloor. All Rights Reserved.
Also posted on: The IM Blog

Zementis offers a predictive analytics scoring engine for the rapid deployment and real-time execution of predictive models built in any PMML-compliant data mining tool. Deploying machine learning and statistical models into operational IT environments remains a manual, time-consuming process in many IT organisations. The Zementis approach is to automate model deployment, eliminating much of the manual coding and the transcription errors that come with it, and reducing the time to results from months to hours or minutes.

First, an aside on PMML, one of IT’s success stories. PMML (Predictive Model Markup Language) is an XML-based standard for the vendor-independent exchange of predictive analytics, data mining and machine learning models (call them what you like) between the design tools in which they originate and the production execution platform. Developed by the Data Mining Group in the late 1990s, PMML has matured quietly to the point where it now has extensive vendor support and has become the backbone of big data predictive analytics. Zementis parses the PMML to build and execute the pre-processing data transformations, apply the required analysis model (for example, neural network, decision tree, Bayes classifier), and execute any post-processing requirements (typically some form of thresholds or rules for mapping scores or probabilities into actual business decisions).
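To make the three stages concrete, here is a minimal sketch in Python of what scoring from PMML involves. It is not how Zementis works internally. The XML is a hand-written fragment in the style of a PMML 4.3 regression model, and the field names (debt_ratio, late_payments), the clamping rule, the threshold of 2.0 and the "refer"/"approve" labels are all assumptions made up for illustration. In a real PMML file the pre- and post-processing would normally be described in the document itself. Here they are plain Python functions to keep the example short.

```python
import xml.etree.ElementTree as ET

# Namespace used by PMML 4.3 documents.
PMML_NS = {"pmml": "http://www.dmg.org/PMML-4_3"}

# Hypothetical, hand-written model: score = 0.25 + 2.0*debt_ratio + 0.5*late_payments
PMML_DOC = """<PMML version="4.3" xmlns="http://www.dmg.org/PMML-4_3">
  <Header/>
  <DataDictionary numberOfFields="3">
    <DataField name="debt_ratio" optype="continuous" dataType="double"/>
    <DataField name="late_payments" optype="continuous" dataType="double"/>
    <DataField name="risk" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="toy_risk" functionName="regression">
    <MiningSchema>
      <MiningField name="debt_ratio"/>
      <MiningField name="late_payments"/>
      <MiningField name="risk" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="0.25">
      <NumericPredictor name="debt_ratio" coefficient="2.0"/>
      <NumericPredictor name="late_payments" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""


def load_regression(pmml_text):
    """Read the intercept and numeric coefficients from the regression table."""
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", PMML_NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("pmml:NumericPredictor", PMML_NS)
    }
    return intercept, coefficients


def preprocess(record):
    """Pre-processing stand-in (assumption): clamp debt_ratio to [0, 1]."""
    cleaned = dict(record)
    cleaned["debt_ratio"] = min(max(cleaned["debt_ratio"], 0.0), 1.0)
    return cleaned


def score(record, intercept, coefficients):
    """Apply the linear model to one pre-processed record."""
    return intercept + sum(c * record[name] for name, c in coefficients.items())


def decide(score_value, threshold=2.0):
    """Post-processing stand-in (assumption): map the score to a decision."""
    return "refer" if score_value >= threshold else "approve"


if __name__ == "__main__":
    intercept, coefficients = load_regression(PMML_DOC)
    record = preprocess({"debt_ratio": 0.5, "late_payments": 3})
    value = score(record, intercept, coefficients)
    print(value, decide(value))
```

The point of the standard is that everything the scoring side needs is read from the exchanged document rather than re-coded by hand, which is where the transcription errors mentioned earlier tend to creep in.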

Zementis enables the data scientist to work in a wide range of tools and formats (for example IBM SPSS, R, SAS, KNIME, Python). Under the hood, Zementis reads and compiles the PMML for optimised, in-memory execution, with scalability across multi-server clusters. Management tools are provided for logging and monitoring, as well as for model management operations such as deploying and deleting models.

Two deployment options are offered. ADAPA (Adaptive Decision and Predictive Analytics) delivers the Zementis real-time and batch-based scoring engine through a Web Services API for web and cloud applications. The ADAPA scoring engine is available on cloud platforms including Amazon Web Services (AWS) and Microsoft Azure, or for on-premises deployment. UPPI (the snappily named Universal PMML Plug-In) delivers the scoring engine as an in-database plug-in, and includes support for Hadoop-based data storage platforms and distributions including Cloudera, Hortonworks and MapR. It is worth noting that Zementis can score models against both batch and real-time data, with support for Hive, Spark and Storm as well as the market-leading proprietary stream processing engines.

The Zementis go-to-market approach is a healthy mix of direct and partner channels. Notable partners include IBM, Datameer, Teradata and Software AG. It has also targeted specific market verticals and use cases (notably risk management and financial services) to gain traction, and is using this base to expand further into newer markets such as utilities and the wider Internet of Things.

Viewpoint

When Zementis set up in business in 2004, Big Data was barely a dot on the IT horizon, and the predictive analytics market, while growing steadily, was not exactly setting the world on fire. What a difference a few years can make. Businesses are now looking at their big data assets, looking beyond the early phase of first-order analytics, and trying to understand how they can react earlier and faster to changing market and operational conditions. Zementis has held its nerve, and is now well positioned to lead this market, with a clear understanding of its market positioning, a mature product suite, a standards-based approach, and a strong value proposition built on the operationalisation of model deployment and execution.