One of the key differentiators for Memgraph is its high performance, which it achieves in a number of ways. For starters, it is written in C/C++, which helps to give the product an extremely small footprint: on start-up, it consumes only approximately 30MB of RAM, which means that Memgraph can easily run on edge devices, whether in IoT (Internet of Things) or mobile environments. The fact that Memgraph is an in-memory database is also significant, since it will often mean that the entire graph can be held in memory. Not only will this aid performance in general, but it will be particularly useful when the database needs to support mixed workloads.
Memgraph’s focus is on algorithm scalability and extensibility. In other words, you can implement and extend high-performance, user-customised algorithms and procedures. This is enabled through integration with the data science and machine learning ecosystem. Specifically, Memgraph allows you to extend its query language by implementing your own custom procedures. These procedures are grouped into ‘Query Modules’, which can be loaded on start-up. The most performant and scalable way to implement these procedures is by using the Memgraph C Query Module API. However, to make quick development and iteration possible for data scientists, Memgraph also exposes a Python Query Module API, backed by a Python interpreter embedded inside the database. This makes it easy for data scientists to leverage libraries such as scikit-learn, TensorFlow and PyTorch, and to run analytics directly on data stored inside Memgraph. Finally, Memgraph can be combined with more than 300 graph algorithms from NetworkX and works with machine learning libraries such as StellarGraph (www.stellargraph.io).
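To give a flavour of what a Python query module might look like, here is a minimal sketch rather than a definitive implementation. It assumes the `mgp` module that Memgraph makes available to its embedded interpreter, along with its `read_proc` decorator, `ProcCtx` and `Record` types. The procedure name `histogram`, the helper `degree_histogram` and the output fields are hypothetical names chosen for illustration. The `mgp` import is guarded so that the pure-Python helper can also be tried outside the database.

```python
from collections import Counter

try:
    import mgp  # assumed: supplied by Memgraph's embedded interpreter only
except ImportError:
    mgp = None  # lets the helper below run outside the database


def degree_histogram(degrees):
    """Return sorted (degree, vertex_count) pairs for an iterable of degrees."""
    return sorted(Counter(degrees).items())


if mgp is not None:

    # Hypothetical read-only procedure: one record per distinct vertex degree.
    @mgp.read_proc
    def histogram(ctx: mgp.ProcCtx) -> mgp.Record(degree=int, vertex_count=int):
        # Degree here counts both incoming and outgoing edges of a vertex.
        degrees = (
            sum(1 for _ in v.in_edges) + sum(1 for _ in v.out_edges)
            for v in ctx.graph.vertices
        )
        return [
            mgp.Record(degree=d, vertex_count=c)
            for d, c in degree_histogram(degrees)
        ]
```

If this were saved as a file named, say, `degree_stats.py` (a hypothetical name) in the query modules directory, we would expect it to be callable from a query along the lines of `CALL degree_stats.histogram() YIELD degree, vertex_count`.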
Another way in which the product enables high performance is through concurrency. Memgraph’s data structures are lock-free, and it has implemented MVCC (Multi-Version Concurrency Control) with snapshot isolation to ensure that, for example, reads never block writes and writes never block reads. Not only does this contribute to performance, but the snapshotting used within MVCC combines with write-ahead logging to prevent data loss during system failure, thereby providing a guarantee of durability. Together with the company’s extensive investment in testing and test-driven development, this makes for an eminently robust solution.
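The idea behind MVCC with snapshot isolation can be illustrated with a toy sketch. This is not Memgraph’s implementation, which is lock-free and written in C/C++. It is a single-threaded Python model in which all class and method names are hypothetical. It assumes a simple "first committer wins" rule for conflicting writes, which is a common choice for snapshot isolation but is not confirmed for Memgraph.

```python
class WriteConflict(Exception):
    """Raised when another transaction committed the same key first."""


class VersionedStore:
    """Toy multi-version store: every commit appends new versions."""

    def __init__(self):
        self._versions = {}       # key -> [(commit_ts, value), ...] ascending
        self._last_commit_ts = 0  # timestamp of the most recent commit

    def begin(self):
        # A transaction sees only what was committed before it started.
        return Transaction(self, self._last_commit_ts)

    def read_at(self, key, snapshot_ts):
        # Newest version that is visible to the given snapshot, if any.
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None

    def latest_commit_ts(self, key):
        versions = self._versions.get(key)
        return versions[-1][0] if versions else 0

    def install(self, writes):
        # Old versions are kept, so existing readers are unaffected.
        self._last_commit_ts += 1
        for key, value in writes.items():
            self._versions.setdefault(key, []).append(
                (self._last_commit_ts, value)
            )
        return self._last_commit_ts


class Transaction:
    def __init__(self, store, snapshot_ts):
        self._store = store
        self.snapshot_ts = snapshot_ts
        self._writes = {}  # buffered until commit

    def read(self, key):
        if key in self._writes:  # a transaction sees its own writes
            return self._writes[key]
        return self._store.read_at(key, self.snapshot_ts)

    def write(self, key, value):
        self._writes[key] = value

    def commit(self):
        # First committer wins: abort if a key changed after our snapshot.
        for key in self._writes:
            if self._store.latest_commit_ts(key) > self.snapshot_ts:
                raise WriteConflict(key)
        return self._store.install(self._writes)


if __name__ == "__main__":
    store = VersionedStore()
    setup = store.begin()
    setup.write("balance", 100)
    setup.commit()

    reader = store.begin()  # snapshot taken before the update below
    writer = store.begin()
    writer.write("balance", 50)
    writer.commit()         # does not wait for the reader

    print(reader.read("balance"))         # still the reader's snapshot value
    print(store.begin().read("balance"))  # a new transaction sees the update
```

Because the writer adds a new version instead of overwriting the old one, the reader carries on with its own consistent snapshot and neither transaction has to wait for the other.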
Fig 02 - The Memgraph Lab user interface
We should also mention Memgraph Lab, illustrated in Figure 2. This is a lightweight visual user interface for developers, designed to help with openCypher query and graph development. It provides visualisation (of both graphs and schema), exploration capabilities, and the ability to tune queries through query profiling (with diagnostics and query plan details).
Finally, we must comment on the fact that high-availability replication is not yet available in the product. The company has announced that this will be available later in 2020 for both its Enterprise Edition and its managed service.