SAS is headquartered in North Carolina and was founded in 1976. The company started life by offering a 4GL-based statistical package for financial and economic analysis but has expanded to offer a multi-faceted product set, providing organisations with an information delivery system that spans all aspects of analytics and associated technologies. This includes business analysis, analytics and data science as well as practical applications thereof (customer intelligence and so forth). In addition, complementary capabilities such as data quality and data integration are also offered.
Since its foundation SAS has expanded primarily in an organic manner and is now a multi-billion-dollar company that has offices and partners worldwide. It remains privately owned. A notable point is that SAS consistently invests a significant proportion of its revenues into R&D, typically at around 25%. This emphasis on R&D, plus the company’s refusal to listen to the blandishments of the stock market (where short-term trends are more important than long-term profits), has meant that SAS is, by some margin, the largest pure-play vendor in the analytics market.
Last Updated: 29th January 2013
The SAS solution in the data discovery and profiling space is part of the SAS DataFlux Data Management Platform. DataFlux started life as a data quality vendor but, after it was acquired by SAS, has expanded its capabilities to include data integration (including both ETL and data virtualisation/federation), master data management and complex event processing along with a business glossary (see other SAS product pages). The company has been active in the data governance space for some years. Until recently it acted as an independent subsidiary of SAS but that has now changed, with the brand having been brought directly under the SAS umbrella. This will be good for SAS customers but will likely mean that the company loses traction as an independent vendor looking for sales in non-SAS accounts. Given the quality of the product this is unfortunate.
The DataFlux Data Management Platform is arguably the leading product in the market in technical terms and is certainly a leading product. It is especially strong in its technical competency for both profiling and profiling combined with discovery. In particular, DataFlux has very strong support for users, both at a business and technical level.
Historically, DataFlux has had a broad focus across industry sectors and across organisation sizes. SAS, on the other hand, is well-known for focusing on the top end of the market. It is common when companies are absorbed like this that the parent company will continue to claim that the relevant products will be marketed independently but in practice that never happens: salespeople prefer low hanging fruit and that is the existing client base. For all intents and purposes therefore we expect DataFlux sales to be concentrated on the SAS user base. This is also likely to mean that companies that are otherwise non-SAS users may move to other platforms in the fullness of time.
SAS has a rich partner program but it is difficult to isolate partners that are specific to DataFlux data profiling/discovery.
Customer numbers are measured in terms of thousands though how many of these are using profiling/discovery as opposed to other elements of the Data Management Platform is not clear. In any case it will be a significant number and many of these are household names.
Data profiling and discovery is a part of the DataFlux Data Management Platform, which is an integrated (truly integrated) platform with common metadata across the entire suite. It runs under Windows, Linux and AIX. JDBC and ODBC are supported and there are native drivers for Greenplum, DB2, Informix, MySQL, Oracle, Postgres, SQL Server and SAP Sybase. However, there are no facilities for supporting non-normalised data such as Oracle nested tables. On the other hand, COBOL copybooks are supported, along with text files, XML and spreadsheets, as well as Hadoop and Hive.
SAS is a major worldwide organisation with offices all around the globe. DataFlux thus enjoys the consulting, support and training capabilities that one would expect from a company of the size of SAS.
SAS Event Stream Processing (2018)
Last Updated: 6th December 2018
Mutable Award: Gold 2018
SAS has been offering stream processing for a number of years, especially for fraud applications. However, a major driver is now the Internet of Things (IoT) and applications such as predictive maintenance and smart “anything”. The basic principle is to capture and apply analytics to data in motion.
In terms of the products themselves, SAS Event Stream Processing (ESP) comes with a variety of options and components, as illustrated in Figure 1. The plus signs indicate additional license options. CAS stands for Cloud Analytic Services. The ESP for Edge option will run on something as small as a Raspberry Pi and supports analytics on those small but powerful platforms. Both event-based and window-based (time sliced) options are available.
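The difference between event-based and window-based (time sliced) processing can be illustrated with a minimal sketch. This is generic Python and not the SAS ESP API: the Event type, its fields and both function names are hypothetical, chosen only to show the two styles side by side.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    timestamp: float  # seconds since an arbitrary start (hypothetical field)
    sensor: str
    value: float


def event_based(events: Iterable[Event], threshold: float) -> list[Event]:
    """Event-based: evaluate a rule against every event individually."""
    return [e for e in events if e.value > threshold]


def window_based(events: Iterable[Event], width: float) -> dict[tuple[int, str], float]:
    """Window-based: average the values per sensor within each time slice."""
    sums: dict[tuple[int, str], float] = defaultdict(float)
    counts: dict[tuple[int, str], int] = defaultdict(int)
    for e in events:
        key = (int(e.timestamp // width), e.sensor)  # index of the time slice
        sums[key] += e.value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


stream = [
    Event(0.5, "pump", 10.0),
    Event(1.5, "pump", 30.0),
    Event(2.5, "pump", 50.0),
]
alerts = event_based(stream, threshold=40.0)   # only the 50.0 reading qualifies
averages = window_based(stream, width=2.0)     # two slices of two seconds each
```

In the event-based case each reading is judged on its own, whereas the window-based case only produces a result per time slice, which smooths out individual spikes.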
"Subsea surveillance of oil platforms to help avoid unplanned production interruptions, resulted in the prediction of a submersible pump failure that saved the company $3 million."
Major energy company
"Our transportation assets are equipped with edge devices that manage hundreds of data elements per second to optimize their operation. We use SAS Event Stream Processing in our edge devices to enable our customers with analytical insights on their data in real time to minimize operating costs, identify predictive maintenance issues before they become a problem."
Major rail transportation manufacturer
Perhaps the biggest differentiator for SAS in this space is its analytic capabilities, though it’s also notable for its performance. The options available are illustrated in Figure 2, many of which have only been introduced in the latest release (5.1). A key feature is the continuous improvement of in-stream models, using machine learning.
As far as Figure 2 is concerned, DS2 is a SAS specific language and ASTORE, also proprietary, is used to support algorithms that are trained offline but which are scored in-stream. Model Manager, which is a separate product, supports PMML (predictive modelling mark-up language) and will convert supported model types (more than a dozen currently) into SAS code for deployment on the ESP Server. There is also support for RESTful APIs to run other models. Python notebooks are supported to drive the ESP engine, publish events to ESP, and display results, and models written in Python and C are also supported, though not Java-based models. R will be supported later during 2018. Additional facilities in the ESP Server that are worth mentioning include in-stream geofencing, text analytics, in-stream time pattern recognition (including time-series similarity analysis and time-series clustering), and the ability to build data quality rules into the streaming process.
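The pattern of training a model offline and scoring it in-stream can be sketched as follows. This is not ASTORE, which is proprietary, nor any SAS API: it is a minimal illustration in plain Python in which a simple linear model is fitted on historic data, stored as JSON, and then applied to each event as it arrives. All function names are hypothetical.

```python
import json
from typing import Iterator


def train_offline(xs: list[float], ys: list[float]) -> str:
    """Fit y = slope * x + intercept by least squares; return the model as JSON."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return json.dumps({"slope": slope, "intercept": intercept})


def score_in_stream(model_json: str, events: Iterator[float]) -> Iterator[float]:
    """Load the stored model once, then score each event as it arrives."""
    model = json.loads(model_json)
    for x in events:
        yield model["slope"] * x + model["intercept"]


# Offline step: historic data that follows y = 2x + 1.
stored_model = train_offline([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])

# In-stream step: the iterator stands in for a live event source.
scores = list(score_in_stream(stored_model, iter([4.0, 10.0])))
```

The point of the separation is that the expensive fitting step never runs on the streaming path; only the cheap scoring step does.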
Of the other elements within the ESP environment, ESP Studio provides an environment for constructing visual models, designed for use by non-technical personnel, while Streamviewer provides a visual analytic dashboard environment that lets you combine real-time and historic data. Event Stream Manager is used to update or deploy analytic algorithms without requiring any downtime on the server, and similarly, to add new ESP servers as required. It will automatically discover new servers using ESP agent technology for automated monitoring.
SAS provides connectivity to over 300 end points and supports a variety of standard protocols and connectors, including MQTT, a BACnet publisher connector and adaptor (for smart homes), an OPC-UA connector and adaptor (for machine-to-machine communications), a UVC connector (Video4Linux), a WebSocket publisher connector, and a URL publisher connector (for RSS news feeds, JSON from a weather service or news from an HTML page). There are also facilities provided so that you can write your own connectors. In this context it is worth mentioning the new SAS ESP Community (communities.sas.com/IOT), which is moderated by SAS but user-driven.
Machine learning and the Internet of Things are top of mind for many chief data officers and SAS has significant expertise in both of these areas. It is the world’s largest business intelligence and analytics company. It should therefore not come as any great surprise that the company is exceptionally strong when it comes to the breadth of analytic capabilities it provides, regardless of whether this is for data at rest or data in motion. Put this together with the fact that McKinsey has predicted that the Internet of Things will generate $11 trillion of economic impact by 2025 and you have significant opportunities for SAS to exploit its strength in analytics across a range of IoT-based deployments. The company is also actively researching analytics for Blockchain, another environment which is set to have a major impact.
The Bottom Line
As a company, SAS is long-established and has a reputation for enterprise-level capability. At the same time it has been a leading light in the analytics market for 40 years. In our view, that makes SAS a major contender, if not a leader, in the market for streaming analytics platforms.
SAS Event Stream Processing (2021)
Last Updated: 14th December 2021
Mutable Award: Gold 2021
SAS Event Stream Processing (ESP) is an event stream processing platform that enables sophisticated streaming analytics by applying SAS’s existing, and very substantial, analytics capabilities to data in motion. It can either be deployed with Viya (“ESP Viya”) or without it (“ESP Lightweight”). The major difference is in the latter’s significantly smaller footprint; it also misses some functionality, most notably integration with SAS Model Manager. Regardless, ESP can be deployed on-premises or in the cloud, including public, private, and hybrid cloud deployments. Its architecture is shown in Figure 1.
ESP is available on AWS, GCP and Microsoft Azure, and it can leverage a number of cloud services – including native cloud analytics – provided by each of these. Conversely, SAS offers various Azure applications that have been built on ESP, including Intelligent Monitoring, Physical Distance Monitoring, and more. Container-based deployment is supported via Kubernetes, complete with automated cluster monitoring and optimisation. In addition, the ESP edge offering is designed to run on the edge as a lightweight runtime with no functional compromises.
On a final note, cloud pricing is based on either event consumption or total revenue. Edge support is purchased with an additional (but perpetual) access fee.
“Our engineers can now see issues before they impact customer operations and change the truck’s design, so we have the best product on the road.”
Volvo Trucks, North America
“With SAS, we’re working smarter – we’re seeing things that exist in our information that we couldn’t find before, so we can do things more efficiently and effectively, and drive better results for our customers.”
Perhaps the biggest differentiator for SAS is its analytic capabilities, though it’s also notable for its performance. A key feature is the continuous improvement of in-stream models using machine learning. To this end, ESP Viya (though not ESP Lightweight) integrates with SAS Model Manager, a separate product that supports PMML (Predictive Modelling Mark-up Language) and will convert supported model types into SAS code for deployment on ESP. There is also support for RESTful APIs to run other models. Python notebooks can be used to drive the ESP engine, publish events to ESP, and display results, and models written in Python, R, C and Java are also supported. In addition, the product supports ONNX models, which in turn permits the use of PyTorch, TensorFlow, and so on, as well as providing native support for (open source) deep learning. Universal inferencing is also available.
Additional facilities that are worth mentioning include in-stream geofencing, text analytics, in-stream time pattern recognition (including time-series similarity analysis and time-series clustering), and the ability to build data quality rules into the streaming process. Both event-based and window-based (time sliced) processing options are available, and the product boasts full lifecycle support, including health monitoring and high availability. The platform operates in-memory, and a built-in event load manager enables optimised and distributed processing.
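To make the idea of in-stream geofencing concrete, the following minimal sketch tests each incoming position event against a circular fence using the haversine great-circle distance. It is an illustration in plain Python rather than the ESP implementation: the event and fence dictionaries, their keys and the function names are all assumptions, and real geofences may equally be polygons rather than circles.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def inside_fence(event: dict, fence: dict) -> bool:
    """True if the event's position lies within the fence's radius."""
    distance = haversine_km(event["lat"], event["lon"], fence["lat"], fence["lon"])
    return distance <= fence["radius_km"]


# Hypothetical fence: 5 km around a depot.
depot = {"lat": 51.50, "lon": -0.12, "radius_km": 5.0}
positions = [
    {"lat": 51.51, "lon": -0.12},  # roughly 1 km north of the depot
    {"lat": 52.50, "lon": -0.12},  # roughly 111 km north of the depot
]
flags = [inside_fence(p, depot) for p in positions]
```

Because the check is a constant-time calculation per event, it is the kind of rule that can be applied to data in motion without buffering.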
Of the other elements within the ESP platform, ESP Studio (see Figure 2) provides an environment for constructing visual models, designed for use by non-technical personnel, while Streamviewer provides a visual analytic dashboard environment that lets you combine real-time and historic data. Event Stream Manager is used to update or deploy analytic algorithms without requiring any downtime on the server, and similarly, to add new ESP servers as required. Moreover, ESP Studio, Streamviewer and Event Stream Manager are all integrated into the ESP server Kubernetes environment, thereby supporting ESP deployment, monitoring and updating in that context.
Via built-in ESP Connectors, SAS provides connectivity to over 300 end points, and supports a variety of standard protocols and connectors, including MQTT, a BACnet publisher connector and adaptor (for smart homes), an OPC-UA connector and adaptor (for machine-to-machine communications), a UVC connector (Video4Linux), a WebSocket publisher connector, and a URL publisher connector (for RSS news feeds, JSON from a weather service or news from an HTML page). There are also facilities provided so that you can write your own connectors. In this context it is worth mentioning the SAS ESP Community (communities.sas.com/IOT), which is moderated by SAS but user-driven.
SAS is the world’s largest business intelligence and analytics company. It should therefore not come as any great surprise that the company is exceptionally strong when it comes to the breadth of analytic capabilities it provides, regardless of whether it is for data at rest or data in motion. Simply put, SAS has very highly regarded analytics capabilities in general, and via ESP those capabilities can be applied to streaming data just as well as historic data. The company’s expertise when it comes to machine learning and the Internet of Things is a significant draw as well, considering their popularity within, and relevance to, the streaming analytics space.
The Bottom Line
We’ve said it before, and doubtless we’ll say it again: SAS has a reputation for providing enterprise-level quality, and is long-established as a leading light in the analytics market. In our view, this makes the company a major contender within streaming analytics.
SAS Operationalising Analytics
Last Updated: 17th January 2020
SAS offers a variety of products built around the idea of operationalising your analytics and Figure 1 provides a context for this. In practice, this generally means taking analytics and machine learning models that have already been created, curating them, governing them, and finally deploying them as part of an existing business process. What’s more, the capabilities on offer do not stop at deployment. Rather, models deployed via SAS are continually monitored, governed, updated, and, if necessary, replaced, as the situation warrants.
There are two principal products that lead the charge in this respect. The first of these is SAS Model Manager, a model management framework for centrally governing all of your models. This includes storing them, monitoring them, reporting on them, deploying them, and so on and so forth. The second is SAS Decision Manager, which provides a decisioning framework that allows you to integrate your analytics models into your business processes via deployable decision flows. Alternatively, you can leverage your models from within workflows via SAS Workflow Manager, which is included with Model Manager.
Model Manager allows you to store, manage and generally govern all of your models in a single, central location. Emphasis on ‘all’: you do not have to develop your models in SAS to register them in Model Manager. Notably, this means that Model Manager is compatible with models created using open source tools such as R and Python. PMML (predictive modelling mark-up language) is also supported. The product includes model versioning and version control, as well as history tracking. Models can be grouped into projects, and both individual models and projects can be shared between users or team members to enable collaborative working.
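The combination of central storage, projects, versioning and history tracking described above can be pictured with a small in-memory registry. This is a conceptual sketch only and does not reflect how Model Manager is implemented or exposed: the class names, fields and methods are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    version: int
    source: str      # where the model was built, e.g. "SAS", "Python", "R", "PMML"
    artifact: bytes  # serialised model, treated as opaque by the registry


@dataclass
class Registry:
    # project name -> model name -> ordered list of versions
    projects: dict[str, dict[str, list[ModelVersion]]] = field(default_factory=dict)

    def register(self, project: str, name: str, source: str, artifact: bytes) -> ModelVersion:
        """Store a new version; earlier versions are kept for history tracking."""
        versions = self.projects.setdefault(project, {}).setdefault(name, [])
        model_version = ModelVersion(len(versions) + 1, source, artifact)
        versions.append(model_version)
        return model_version

    def latest(self, project: str, name: str) -> ModelVersion:
        return self.projects[project][name][-1]

    def history(self, project: str, name: str) -> list[int]:
        return [v.version for v in self.projects[project][name]]


registry = Registry()
registry.register("churn", "gradient_boosting", "Python", b"first fit")
registry.register("churn", "gradient_boosting", "Python", b"refit on new data")
registry.register("churn", "logistic", "R", b"baseline")

current = registry.latest("churn", "gradient_boosting")
versions = registry.history("churn", "gradient_boosting")
```

Treating the artifact as opaque bytes is what allows models from different tools to sit side by side in the same project.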
Model Manager is accessible via a web browser, through which you can register new models and manage existing ones. ‘Manage’, in this case, includes monitoring (to ensure that models are performing as well as you would like), reporting, deployment, and retraining, among other things. To support these functions, a dashboard is provided to display a variety of analytics and metrics relating to your models, and how they are performing.
On the subject of (re)training, Model Studio, a secondary product available from SAS, allows you to build pipelines for training your models. When you send a model to be retrained, it is reinserted into the appropriate pipeline with new training data attached (alternatively, you can simply flag poorly performing models for a data scientist to look at, and leave it at that). As alluded to above, this can be done at will and ad-hoc.
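One way to picture the monitoring that leads to a model being flagged or sent for retraining is a rolling accuracy check. The sketch below is an assumption about how such a check might look, not a description of SAS functionality: the class name, the window size and the accuracy threshold are all hypothetical.

```python
from collections import deque


class PerformanceMonitor:
    """Tracks recent prediction outcomes and flags a model that has degraded."""

    def __init__(self, window: int, min_accuracy: float) -> None:
        self.outcomes: deque[bool] = deque(maxlen=window)  # oldest results drop off
        self.min_accuracy = min_accuracy

    def record(self, predicted: int, actual: int) -> None:
        self.outcomes.append(predicted == actual)

    def accuracy(self) -> float:
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self) -> bool:
        # Only judge the model once a full window of outcomes is available.
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.accuracy() < self.min_accuracy


monitor = PerformanceMonitor(window=4, min_accuracy=0.75)
for predicted, actual in [(1, 1), (0, 0), (1, 0)]:
    monitor.record(predicted, actual)
flagged_early = monitor.needs_retraining()  # window not yet full

monitor.record(0, 1)
flagged_late = monitor.needs_retraining()   # 2 of 4 correct, below the threshold
```

Whether a flagged model is retrained automatically or simply passed to a data scientist is then a policy decision layered on top of the flag.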
Model Manager is also designed to promote model reuse. This is the idea that any given model, or variants thereof, should only be developed once, but can (and often should) be deployed multiple times to multiple locations. Accordingly, models in Model Manager can a) be deployed as many times as you want and b) be deployed to a variety of targets, including databases, data lakes, and streams (including SAS Event Stream Processing), as well as within the SAS platform itself. Moreover, models can be deployed to multiple targets at once with a single click.
Having stored and governed your models within Model Manager, SAS provides two products for integrating those models within your business processes. The first of these is SAS Decision Manager, a framework for creating deployable decision flows that incorporate analytics models, policy rules and business logic. The second is SAS Workflow Manager, which allows you to create bespoke, automated workflows that include analytics models. Both decision flows and workflows, which are illustrated in Figure 2, are created using drag and drop interfaces within their respective products, and the latter in particular includes notifications and tasks to prompt expedient workflow completion.
SAS also provides functionality for creating and managing tests for your models, as well as model output validation and scoring. In particular, this includes ‘publishing validation’: testing (and therefore validating) your model as it runs on a particular environment with a particular data source. In other words, you can test in the exact conditions your model is going to be running in. This has obvious advantages.
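The idea behind publishing validation can be sketched as a comparison between the scores a reference model produces and the scores the published model produces on the same rows from the target data source. The code below is a hedged illustration in plain Python, not the SAS mechanism: the function name, the tolerance and the two example scorers are hypothetical.

```python
import math
from typing import Callable, Sequence

Scorer = Callable[[Sequence[float]], float]


def validate_publication(
    reference: Scorer,
    published: Scorer,
    rows: Sequence[Sequence[float]],
    tolerance: float = 1e-9,
) -> list[int]:
    """Return the indices of rows where the published model disagrees with the reference."""
    return [
        index
        for index, row in enumerate(rows)
        if not math.isclose(reference(row), published(row), abs_tol=tolerance)
    ]


def reference_model(row: Sequence[float]) -> float:
    return 2 * row[0] + row[1]


def published_ok(row: Sequence[float]) -> float:
    return 2 * row[0] + row[1]


def published_truncating(row: Sequence[float]) -> float:
    # Simulates a target environment that silently truncates to whole numbers.
    return float(int(2 * row[0] + row[1]))


rows = [(1.0, 0.5), (2.0, 1.0)]
clean = validate_publication(reference_model, published_ok, rows)
mismatches = validate_publication(reference_model, published_truncating, rows)
```

A check of this kind catches differences introduced by the target environment itself, such as type conversions, which would not show up when testing the model in isolation.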
Finally, SAS supplies a variety of visual analytics which are applicable to models and help with explainability. Most notably, this includes a decision tree diagram, as well as a ‘root cause’ analysis of your model’s structure.
Analytics, however sophisticated, does not provide business value in and of itself. It must be put into practice in some way – whether via self-service, building it into workflows, or whatever – to influence and thereby enhance your business processes and decision making. This is just as true for (predictive) models as it is for any other kind of business intelligence. Hence, SAS’ model management offering is valuable for its ability to take the models you have already created and rapidly deploy them such that they are deeply integrated with your existing processes. This enables you to maximise the value you are getting out of the models that you have built.
Moreover, SAS provides a great deal of governance for your models. Although this is not necessarily glamorous, the fact remains that AI and machine learning are only becoming more popular. Consequently, analytical models are becoming more and more widespread and abundant: in due course, we expect large enterprises to be deploying thousands if not tens of thousands of models. What’s more, models can become out-of-date in a matter of months and need replacing or updating on a regular basis. There is simply no way to keep track of – let alone govern, deploy, and maintain – such a proliferation of models without proper tooling, such as the model management products provided by SAS.
The Bottom Line
SAS’ model management offering provides two essential functions for the analytics lifecycle: it manages and governs your models centrally, and it provides the means to operationalise them efficiently. SAS also has one of the most mature solutions in the space, and it shows.
SAS Risk Management for Insurance
Last Updated: 13th March 2013
SAS is one of very few companies that can provide both the data governance and risk management capabilities necessary to support a Solvency II solution. Of course, you could implement either part of this solution separately but addressing the whole problem with technology and support from a single vendor is likely to prove a winning combination.
SAS in general has a wide customer base across all industry sectors but, of course, we are here concerned with the insurance sector. Here, as well as in other markets, the company tends to focus on the largest companies.
Overall, SAS has thousands of customers. Relatively few Solvency II case studies are listed on the SAS web site but we suspect (actually we know) that this has as much to do with organisations not wanting their names to be known as anything else. It is also a result of these case studies being specifically linked to the company's risk management solution and not also to its data governance.
To support Solvency II regulations you need both risk management and data governance to ensure that the data is timely, accurate and as complete as possible. You also need data governance because you need to be able to prove that you are actually using the relevant models to support your business, so you need some sort of monitoring capability.
The headline SAS product supporting Solvency II is SAS Risk Management for Insurance, which includes features to support firm-wide risk (calculation of Pillar 1 requirements), market risk management, P&C underwriting risk, life underwriting risk, risk data management and risk reporting (Pillar 2 and 3 requirements). Relevant data models are included in the product along with ETL (extract, transform and load) capabilities to support the creation of data marts for analytic purposes.
The data governance aspect of Solvency II compliance is provided by the SAS DataFlux Data Management Platform. This is an integrated suite of products, built on top of a single infrastructure (so this is truly integrated and not a bunch of acquisitions that have been bolted together) that provides data profiling, data cleansing, data enrichment, ETL and other data integration options, and, for those that require it, master data management. The platform also provides specific facilities to support data stewards and data governance processes.
SAS, as one might expect, provides all the services one could wish for and it also has a number of systems integrators as partners, such as Accenture, Capgemini and Deloitte Consulting, who can also provide relevant services.