RisingWave targets stream processing with SQL

RisingWave is a relatively recent entrant to the streaming space: the company was established in early 2021 and its enterprise release, RisingWave Cloud, is less than a month old at the time of writing. It is a distributed system for stream processing that differentiates itself in large part by being SQL-driven, which is a rarity in the stream processing space.

This means that a) RisingWave provides traditional database functionality (i.e. storage) as well as real-time stream processing, and b) RisingWave queries are written in SQL. The latter point in particular is a considerable advantage, given that SQL is one of the most familiar and widely used query languages out there, if not the most. That familiarity brings clear benefits for onboarding, integration, and ease of use.

The core of RisingWave’s functionality is open source under the Apache license, while its premium release, RisingWave Cloud, offers SaaS cloud deployment (with options for either hosted or bring-your-own cloud) and enterprise support. Features particularly worth mentioning include exactly-once semantics, out-of-order processing, various (managed) security features, and built-in observability and health checks. You can also query your data directly as it is streamed in, or store it (using the built-in storage capabilities mentioned above, for instance) and query it at a later time.
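To make out-of-order processing a little more concrete, here is a minimal, generic sketch of one common technique: event-time tumbling windows closed by a watermark that trails the newest event seen. It is an illustration only and makes no claim about how RisingWave implements the feature internally; the window size, the lateness allowance and the function name are all assumptions chosen for the example.

```python
from collections import defaultdict

WINDOW_SECONDS = 10    # tumbling window size (illustrative assumption)
ALLOWED_LATENESS = 5   # how far behind the newest event a record may arrive


def windowed_counts(events):
    """Count events per tumbling event-time window, tolerating disorder.

    `events` is an iterable of (event_time_seconds, payload) pairs given in
    arrival order, which may differ from event-time order.
    Returns (emitted, dropped): `emitted` lists (window_start, count) in the
    order windows were finalised; `dropped` lists events that arrived after
    their window had already closed.
    """
    open_windows = defaultdict(int)  # window_start -> running count
    emitted, dropped = [], []
    watermark = float("-inf")

    for event_time, payload in events:
        # The watermark trails the largest event time seen so far.
        watermark = max(watermark, event_time - ALLOWED_LATENESS)
        window_start = event_time - event_time % WINDOW_SECONDS

        if window_start + WINDOW_SECONDS <= watermark:
            dropped.append((event_time, payload))  # window already closed
        else:
            open_windows[window_start] += 1

        # Finalise every open window that ends at or before the watermark.
        for start in sorted(open_windows):
            if start + WINDOW_SECONDS <= watermark:
                emitted.append((start, open_windows.pop(start)))

    # End of input: flush whatever is still open.
    for start in sorted(open_windows):
        emitted.append((start, open_windows[start]))
    return emitted, dropped
```

An event stamped 8 that arrives after one stamped 12 is still counted in the 0 to 10 window, because the watermark has not yet passed the end of that window. An event that arrives after its window has been finalised is reported as dropped rather than silently miscounted.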

In terms of performance, RisingWave uses tiered storage caches to balance speed against cost. It also offers elastic scalability and separates storage from compute. According to RisingWave themselves (so take the following as you will), it compares favourably to Apache Flink on performance. This is credited, in part, to RisingWave’s engine being written in Rust rather than, in Flink’s case, Java (and Scala, technically).

Various prefab connectors are built in, much as you would expect, and a no-code wizard for establishing connections is also available. The product features UDF support, wire compatibility with PostgreSQL, and a native SQL API. Notably, the fact that it is SQL-based means that, unlike many of its competitors, it sits within the SQL ecosystem rather than the big data ecosystem. In a world where data lakes, Hadoop and so on have largely fallen out of favour, this can be a pretty sizable advantage. It is also notable that RisingWave can stand alone as an offering: it does not depend on any external software to function. This is worth mentioning because several of its competitors do rely on external software, such as Apache ZooKeeper, in exactly this way. Standing alone has the obvious benefit that you only have a single product to maintain, monitor, update and so on, rather than two or more.
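In practice, PostgreSQL wire compatibility means that standard PostgreSQL clients and drivers should be able to talk to RisingWave without a bespoke SDK. The sketch below builds a PostgreSQL-style connection URI and a continuous query expressed as a materialised view. The host, port, user and database defaults are assumptions to be checked against your own deployment, and the table and view names are hypothetical.

```python
from urllib.parse import quote


def risingwave_dsn(host="localhost", port=4566, user="root", database="dev"):
    """Build a PostgreSQL-style connection URI.

    All default values are assumptions for a local test instance; replace
    them with the settings of your own deployment.
    """
    return f"postgresql://{quote(user)}@{host}:{port}/{quote(database)}"


# Hypothetical names: `page_views` stands in for a streaming source and
# `page_views_per_user` for a continuously maintained result.
CREATE_VIEW = """
CREATE MATERIALIZED VIEW page_views_per_user AS
SELECT user_id, COUNT(*) AS views
FROM page_views
GROUP BY user_id;
""".strip()
```

A URI of this form can be handed to an ordinary PostgreSQL driver (psycopg2's `connect`, for example, accepts one), after which the statement is submitted like any other SQL.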

To sum up, RisingWave is one of the more interesting vendors to enter the stream processing space in quite some time. It is clearly doing a number of things differently from the more established vendors in the space, even as some things (such as its licensing) remain much the same as ever. That said, the most significant difference from its contemporaries is actually one of timing: RisingWave has been built in and for the world of the cloud, whereas older streaming products largely grew up in the shadow of big data (or even before then) and were developed accordingly, with cloud functionality mostly being added on top of what was already there. This doesn’t necessarily mean that a cloud-native product will be better equipped for the modern-day demands of stream processing than an older one, but it is more than enough to make RisingWave worth paying attention to.