RedisGraph
Update solution on September 11, 2020
RedisGraph is the graph database module for Redis where by “module” the company means functionality embedded into the product as opposed to something tacked on top. It is available via a Docker container, downloadable software, and as an optional part of Redis Enterprise. As a relatively young product, it currently lacks some of the advanced features of more mature products (for example, strong consistency is not yet available). That said, it offers a major point of distinction from other graph products: it represents and stores graphs as sparse adjacency matrices instead of adjacency lists. This enables much faster ingestion and query performance.
The product itself is a property graph. Queries are written in (a subset of) the Cypher query language and, further, Redis Labs is participating in the GQL project to create a standardised graph querying language.
Customer Quotes
“We tried several graph database technologies and we really found that RedisGraph is the one that gave us the speed to solve instant real-time problems, yielding a minimum 5x improvement in query speed.”
IBM
Generally speaking, there are two ways to store and represent graphs. The first of these is the adjacency list, which consists of a list of all the nodes in the graph it represents, each paired with the set of nodes with which it shares an edge. This is effectively the industry standard. The second is the adjacency matrix, which represents its graph as a matrix with one row and column for each node within the graph. Within this matrix, nonzero values indicate the presence of an edge between the nodes represented by the corresponding row and column. An example of an adjacency matrix, alongside the graph it represents, is shown in Figure 1.
Fig 01 – A graph and corresponding adjacency matrix
This method has several advantages in terms of performance. For example, determining whether two nodes share an edge is significantly faster when using matrix representation. Queries can be performed using direct mathematical operations such as matrix multiplication, which is often much faster than the traditional approach using adjacency lists. Performing a self-join, for instance, is simply a matter of multiplying a matrix by itself. Similarly, ingestion rates can be improved using adjacency matrices.
The vast majority of matrices representing real world graphs are ‘sparse’ – meaning that almost every value is zero – and RedisGraph stores adjacency matrices in Compressed Sparse Row (or CSR) format, meaning that they are, effectively, only storing nonzero values. This almost always results in a very large saving in terms of memory. Notably, storing matrices in CSR format does not impact RedisGraph’s ability to use them in mathematical operations.
Fig 02 – Breadth-First-Search via adjacency lists and linear algebra
Going further, RedisGraph implements the GraphBLAS engine. GraphBLAS is an open effort whereby ‘BLAS’ stands for Basic Linear Algebra Subprograms. It provides the ability to use linear algebra running against sparse (compressed) matrices and this combination of optimises and simplifies many different graph queries and algorithms. For example, a comparison between implementations of Breadth-First- Search using linear algebra and the standard approach using adjacency lists is shown in Figure 2. It should be clear that the algebraic approach is easier to write and to understand. It is also computationally simpler.
There is one more advantage of the matrix representation that is worth noting. Matrix operations (such as multiplication) are extremely and easily parallelisable, and this property carries over to queries and graph algorithms based on linear algebra. This means that RedisGraph benefits very significantly from parallelisation. It will, in the future, be able to take full advantage of the massive parallelisation offered by GPU based processing.
Fig 03 – Graph visualisation in RedisInsight
The 2.0 and 2.2 releases of RedisGraph offer a number of new features over previous versions. This includes various performance improvements, enhanced support for Cypher features, full-text search via RediSearch, and graph visualisation leveraging either RedisInsight (see Figure 3), Linkurious or Graphileon (Redis is partnered with the latter two). RedisGraph has also adopted the SuiteSparse implementation of GraphBLAS, which has positive implications for performance, as well LAGraph, an open source collection of GraphBLAS algorithms developed primarily for academia. A growing number of community created drivers and connectors are also available.
Firstly, we should mention the recently release of RedisAI and RedisGears, which are other Redis modules in the same way that RedisGraph is. As may be imagined, RedisAI is designed to serve ML/DL models that were trained over standard platforms like TensorFlow and PyTorch, while RedisGears is a fully programmable engine that enables orchestration and data-flow across modules, data-structures and cluster shards. In context to this paper that means that RedisGraph should be able to interoperate with RedisAI and, for that matter, with RedisStreams.
Secondly, RedisGraph’s CSR storage format mitigates the problems with storage and memory usage that the matrix representation of graphs has had in the past, providing a sixty to seventy percent reduction in memory usage.
The Bottom Line
Although RedisGraph has matured significantly since its release in 2018, it is still very much a product in its infancy. That said, while it is still early days in terms of features, the theoretical advantages of using adjacency matrices are considerable. Even if you are not already using Redis, it is certainly worth looking into.
Related Company
Connect with Us
Ready to Get Started
Learn how Bloor Research can support your organization’s journey toward a smarter, more secure future."
Connect with us Join Our Community