Guavus, a Thales company, is an advanced analytics vendor that was formed in 2006. It is headquartered in the USA and operates globally. It specialises in enabling mobile network operators (MNOs) to transition to, and take full advantage of, 5G networks. That said, Guavus analytics is also used in other key verticals – such as transportation, aerospace, and manufacturing – both through its parent company, Thales, and its various partnerships.
To these ends, Guavus offers streaming analytics, NWDAF (Network Data Analytics Function), service assurance, device management analytics, mobile voice analytics, and more. Additionally, Guavus emphasises its ability to deliver massively scalable, real-time insight driven by AI/ML-enabled analytics, in part via its open MLOps architecture. It also promotes an “analyse first, store later” approach to data that stands in contrast to the traditional model of “speculate, acquire, store, then analyse”.
According to Guavus, customers choose its solution over its competitors due to its focus on pure-play data analytics, its expertise (and history of success) with helping organisations transition to 5G, its capacity for multi-vendor interoperability and integration, and its use of open, standards-based approaches to automation.
Guavus SQLstream (2021)
Last Updated: 14th December 2021
Mutable Award: Gold 2021
Guavus SQLstream offers efficient and cost-effective streaming analytics across a range of enterprise environments, including the edge and the cloud. It’s designed to enable real-time, data-driven decisioning and actioning, and to this end it provides live dashboards, time-sensitive alerting, “edge-to-core-to-cloud” data collection, and more. It is highly performant, can process a range of data types, and provides options for containerisation and automatic scale-out.
SQLstream includes s-Server as its stream processing engine, StreamLab for no-code app creation over live data streams, and s-Dashboard for building and deploying real-time visualisations. s-Studio, an Eclipse-based IDE for s-Server, and StreamApps, pre-built solution templates targeted at specific use cases, are also available. In addition, support for various third-party products, including Apache Kafka, Amazon Kinesis, and Azure Event Hubs, is provided.
“Reduced Kinesis ‘Shard Hour’ fees – directly related to SQLstream Blaze enabling them to reduce the number of shards from 600+ to 34 (14 input and 20 output shards).”
“From 180 nodes taking 3 hours to generate metrics on service quality and over 65,000 CPUs working on their real-time auction processes, Rubicon went down to using only 5 servers to run streaming analytics on their entire load (110B records/day): now getting insights continuously and in real time.”
Guavus SQLstream uses ANSI standard SQL to execute continuous queries over arriving data streams, with SQL queries automatically translated into executable processing and analytics pipelines. The design is such that you are encouraged to analyse data as it arrives, then store it if (and only if) you have a use for it. This allows you to reduce storage and maintenance costs, since there will be less stray data floating around your system, while keeping your data stores cleaner and less polluted – and therefore faster and more accessible – without losing any meaning. It also fits neatly with recent compliance mandates, such as GDPR.
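To make the “analyse first, store later” idea concrete, here is a minimal Python sketch of a continuous query. It is not SQLstream's API, and in the product itself this logic would be written in SQL. The sketch assumes a hypothetical stream of latency readings, keeps a trailing one-minute window per device, and retains only the readings whose windowed average crosses a threshold. The names `Reading`, `continuous_avg` and `analyse_then_store`, and the threshold value, are illustrative assumptions.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Reading:
    ts: float          # event time in seconds (hypothetical field)
    device: str        # hypothetical device identifier
    latency_ms: float  # hypothetical measurement


def continuous_avg(stream: Iterable[Reading],
                   window_s: float = 60.0) -> Iterator[Tuple[Reading, float]]:
    """Yield each reading with its device's average latency over the trailing window."""
    windows = defaultdict(deque)  # device -> readings currently inside the window
    totals = defaultdict(float)   # device -> running sum of latencies in the window
    for r in stream:
        win = windows[r.device]
        win.append(r)
        totals[r.device] += r.latency_ms
        # Evict readings that have fallen out of the trailing window.
        while win and win[0].ts <= r.ts - window_s:
            totals[r.device] -= win.popleft().latency_ms
        yield r, totals[r.device] / len(win)


def analyse_then_store(stream: Iterable[Reading],
                       threshold_ms: float = 100.0) -> List[Tuple[Reading, float]]:
    """Analyse every reading as it arrives, but keep only those worth storing."""
    return [(r, avg) for r, avg in continuous_avg(stream) if avg > threshold_ms]
```

The point of the design is that the filter runs before anything is persisted, so data that never becomes interesting never reaches the store.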
The platform also supports Java, Python and Scala, in addition to SQL. This is notable in part because many competing vendors require the use of proprietary languages. In addition, the product provides automatic query optimisation and, with performance features such as lock-free scheduling of query execution, has removed the need for manual tuning whilst delivering excellent throughput and latency performance on a small hardware footprint. The architecture supports distributed processing over server clusters, offers redundancy and recovery options such as high availability and exactly-once processing, and scales well both up to large clusters (as shown in Figure 1) and down to IoT hardware. You can deploy the entire platform through Kubernetes, and you can even run it in-server as an external agent, limiting its functionality but further minimising its footprint. It also exposes a microservice API provided over WebSockets.
In addition, the product includes a geospatial analytics library, as well as a library of data collection and enterprise integration connectors. This includes support for Hadoop, data warehouses and messaging middleware, among other things, and there is an SDK provided for building new connectors and data processing operators. Native support for operational data issues is also included, for example handling delayed, missing or out of time order data in a way that is invisible to the user or application. The architecture itself is also compatible with Java, Python and C++ plugins.
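The review does not say how SQLstream handles late or out-of-order data internally. As an illustration only, one common approach is to buffer events briefly and release them in event-time order once a watermark has passed. The Python sketch below assumes events are `(event_time, payload)` tuples and that anything arriving more than `max_delay_s` behind the newest event can be dropped. Both the function name `reorder` and that drop policy are assumptions, not documented product behaviour.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Event = Tuple[float, str]  # (event_time_seconds, payload); hypothetical shape


def reorder(events: Iterable[Event], max_delay_s: float = 5.0) -> Iterator[Event]:
    """Emit events in event-time order, tolerating arrivals up to max_delay_s late."""
    buffer: List[Event] = []      # min-heap ordered by event time
    watermark = float("-inf")     # nothing at or before this time is still awaited
    for ev in events:
        ts = ev[0]
        if ts < watermark:
            continue              # too late to place in order; dropped in this sketch
        heapq.heappush(buffer, ev)
        watermark = max(watermark, ts - max_delay_s)
        # Release everything the watermark has passed, oldest first.
        while buffer and buffer[0][0] <= watermark:
            yield heapq.heappop(buffer)
    # End of stream: flush whatever is still buffered, in order.
    while buffer:
        yield heapq.heappop(buffer)
```

A consumer iterating over `reorder(...)` sees a correctly ordered stream and never has to deal with the buffering itself, which is the sense in which such handling can be invisible to the application.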
s-Studio and, more notably, StreamLab, provide you with the tools to interact directly with live data streams and build stream processing and streaming analytics pipelines on top of them. StreamLab, in particular, offers visual, no-code, drag-and-drop tooling for building these pipelines, as well as an intelligent recommendation engine: the Scrutiniser. The Scrutiniser parses your data as it arrives based on a predefined set of rules, then analyses it and makes suggestions accordingly. It can even be applied iteratively to generate fully wrangled data sets with little effort on your part. You can also apply the same kind of analysis to historical data, essentially analysing it as if it were only now entering your system. Live data streams can also be correlated against historical data or augmented with stored data sets to provide even richer analytical capability.
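Augmenting a live stream with stored data amounts to a lookup join. The short Python sketch below illustrates the idea in general terms and does not represent StreamLab's implementation. The field name `cell_id`, the function name `enrich` and the sample records are hypothetical.

```python
from typing import Dict, Iterable, Iterator


def enrich(live: Iterable[Dict], reference: Dict[str, Dict],
           key: str = "cell_id") -> Iterator[Dict]:
    """Join each live event to a stored reference record sharing the same key."""
    for event in live:
        extra = reference.get(event.get(key), {})  # no match: pass the event through
        yield {**extra, **event}                   # live fields win on any conflict


# Hypothetical stored data set and live events.
reference = {"c1": {"region": "north"}}
live = [{"cell_id": "c1", "drops": 3}, {"cell_id": "c9", "drops": 1}]
enriched = list(enrich(live, reference))
```

In this sketch unmatched events are passed through unchanged rather than discarded, which keeps the live stream complete even when the stored data set has gaps.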
Data output from these pipelines can then be sent to integrated, real-time SQLstream dashboards (again shown in Figure 1), web apps, and external systems or storage platforms. Notably, these dashboards are designed to display streaming data, at speed, with a comprehensible and approachable UI. Although they are not as fully featured as some dashboards, they are very well suited to streaming data.
SQLstream is based on SQL, rather than a bespoke language, and this offers a number of advantages: there is no need to re-train personnel; installation and compilation time are minimised; and it enables a much greater degree of performance optimisation. The latter in particular has paid dividends, and as a result the product has a reputation for high performance and is operable at a truly massive scale. It also boasts an impressively small footprint, and combined with its ability to scale down, can quite capably deploy on the edge. Together with the aforementioned performance benefits, it’s quite conceivable that SQLstream will reduce your total cost of ownership substantially.
We also appreciate SQLstream’s emphasis on the need to curate your data before you store it. This is important both for compliance reasons and to avoid the kind of “data swamp” scenario that has frustrated many big data deployments. What’s more, we expect the need for this kind of proactive solution to become increasingly important (and apparent) as the volume of streaming data continues to grow. In this regard, we would not be surprised to find that SQLstream is significantly ahead of the curve.
Other notable capabilities include the ease of use provided by StreamLab (particularly in its catering to nontechnical users, and thus enabling self-service), the industry accelerators provided by the platform, and the auto-suggestion capabilities offered by the Scrutiniser. As far as machine learning models are concerned, in contrast to many of its competitors the product will host them (and feed training environments) but will not help you build them. However, it does integrate with various third-party model building environments (such as DataRobot), which will often be sufficient.
The Bottom Line
SQLstream is an impressive product, with a unique position in the market owing to its use of ANSI standard SQL. Its advantages in performance, and its ability to empower analysts as well as developers via StreamLab, do it credit as well.
Given the enduring popularity of SQL, the continuing growth of streaming data (particularly at the edge), and the need to build more and more applications that consume edge data, SQLstream should definitely be on your radar.
Last Updated: 6th December 2018
SQLstream Blaze, which is the name for the SQLstream suite of products, is a streaming analytics platform that is available for both on-premises and cloud-based deployments. Included with the product suite are s-Server, the core stream processing engine; StreamLab, a graphical platform (no coding required) for building streaming analytics applications over live data streams; s-Dashboard, which is an HTML5 platform for building push-based, real-time visualisations, and which is pre-configured with a standard set of panels, widgets and dashboards (see Figure 1); and s-Studio, which is Eclipse-based, provides stream inspection, application development and administration for s-Server, and enables dynamic updates to live applications. There are specific facilities to support Apache Kafka and Amazon Kinesis environments.
In addition, the company also offers StreamApps, which are pre-built solution accelerators that include data collection agents, analytics, rules, stream processing functions (aggregations, filtering and parsing), and real-time dashboards. At present there are two StreamApps available, for Telecommunications and Smart Cities respectively.
"From 180 nodes taking 3 hours to generate metrics on service quality and over 65,000 CPUs working on their real-time auction processes, Rubicon went down to using only 5 servers to run streaming analytics on their entire load (110B records/day): now getting insights continuously and in real time."
"Reduced Kinesis ‘Shard Hour’ fees – directly related to SQLstream Blaze enabling them to reduce the number of shards from 600+ to 34 (14 input and 20 output shards)."
The most notable thing about SQLstream is its use of ANSI SQL – though Java, Python and Scala are also supported – for executing continuous queries over arriving data streams, whereas most other vendors use proprietary languages. The query planner and optimiser is the powerhouse behind the offering. SQLstream has invested in automatic query optimisation and, with performance features such as lock-free scheduling of query execution, has removed the need for manual tuning whilst delivering excellent throughput and latency performance on a small hardware footprint. The architecture supports distributed processing over server clusters, with redundancy and recovery options.
A geospatial analytics library is included, as is a library of data collection and enterprise integration connectors, including support for Hadoop, data warehouses and messaging middleware. There is an SDK for third parties to build new connectors and data processing operators. Native support for operational data issues is included, for example handling delayed, missing or out of time order data in a way that is invisible to the user or application.
StreamLab is aimed at the business analyst, offering a set of drag-and-drop and visual tools for data preparation and application development. No coding is required – SQL is generated dynamically by the software – and there is an intelligent recommendation engine that will make suggestions to the user to help in building applications. With StreamLab, the user interacts directly with live data streams (which may include Kafka – see Figure 2 – or Kinesis-based data, the latter being a significant differentiator), building pipelines of stream processing and streaming analytics operations, the results of which are pushed to integrated real-time dashboards or connected to external systems and storage platforms. Note, however, that as far as machine learning and artificial intelligence are concerned, SQLstream is a platform that will host these capabilities, but it does not offer the option to build these functions within the SQLstream Blaze environment. The company partners with DataRobot but does not support PMML. There are no facilities to combine real-time with historic data.
The streaming analytics space is all about speed. By making the analytics process faster, organisations seek to make more responsive, more effective and more timely decisions, ultimately trying to react in seconds, not hours, to the data that is entering their system. This is necessary not just to meet the demands of consumers, but also in competitive environments.
Of course, the big advantage that SQLstream offers is precisely that it is based on SQL: it requires no re-training of personnel, and staff with SQL skills are relatively easy to find and hire. However, this is not the only advantage of focusing on a declarative language. The company’s work on SQL optimisation has paid off, with SQLstream Blaze having a reputation for high performance.
The industry accelerators also represent significant advantages over the company’s competitors and we like the data preparation and auto-suggestion capabilities built into the platform. However, many of SQLstream’s competitors directly offer support for building machine learning models, in some cases supporting the training of those models in-platform. Provided you are happy using a third-party environment (such as DataRobot) to support artificial intelligence, this shouldn’t be a problem.
The Bottom Line
SQLstream occupies a position of its own within the streaming market. Its use of SQL makes it unique and, given the enduring popularity of SQL, is a significant strength compared to the offerings provided by rival vendors.