TigerGraph
Analyst Coverage: Philip Howard and Daniel Howard
TigerGraph is based in California and has been in existence since 2012, but it remained primarily in stealth mode until mid-2017. It has some prestigious users, including Visa, Uber, Citrix, and Alipay, amongst others that span a variety of industries, including banking, media and entertainment, healthcare, automotive, and retail and hospitality. Energy efficiency analytics and the Internet of Things are also major areas of focus. TigerGraph is VC-backed.
TigerGraph is a native graph parallel database that is available both in on-premises and cloud versions. The company also offers TigerGraph Cloud, which is provided as a managed service. There is a free trial program for enterprises and a free developer edition for non-commercial use. Also available is GraphStudio, which is a visual query builder. The product features one-click deployment to several major cloud marketplaces, including AWS and Microsoft Azure; it supports Docker and Kubernetes containers; and includes direct integration with a number of popular data storage systems, including relational databases (Snowflake, Teradata et al), Hadoop, object storage and various types of file systems, as well as both Kafka and Spark.
TigerGraph (2020)
Last Updated: 11th September 2020
TigerGraph uses a property graph paradigm and has been designed specifically to support real-time (less than one second) analytics. The keys to achieving this are parallelism, compression and the way that, in TigerGraph, graph edges and vertices are not just units of storage but also computational units. The engine supports the processing of these in parallel, and the product also includes a parallel loader, as the product’s architecture, shown in Figure 1, illustrates. Compression can be more than 10x, according to TigerGraph, and compression is also used as a part of the loading and transformation processes, to further improve performance. Also relevant is the graph partitioning, which supports application-specific partitioning as well as mixed partitioning strategies. This is all handled automatically within TigerGraph Cloud. There is also the ability to run multiple graph engines, with each engine hosting identical graphs with different partitioning algorithms tailored for different types of application queries. The front-end server will route application queries to the relevant engines based on the query type.
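To make the routing idea concrete, the following is a minimal sketch of a front end that sends each query to whichever engine has a partitioning suited to that query type. It is not TigerGraph code: the `Engine` and `FrontEndRouter` classes, the query type names and the partitioning labels are all hypothetical, chosen only to illustrate the principle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engine:
    """One graph engine holding a full copy of the graph (hypothetical model)."""
    name: str
    partitioning: str  # label for the partitioning strategy this copy uses


class FrontEndRouter:
    """Routes each query to the engine registered for its query type."""

    def __init__(self, default):
        self._default = default
        self._routes = {}

    def register(self, query_type, engine):
        self._routes[query_type] = engine

    def route(self, query_type):
        # Unregistered query types fall back to the default engine.
        return self._routes.get(query_type, self._default)


# Two engines hosting identical graphs, partitioned differently (assumed labels).
by_account = Engine("engine-a", "hash-by-account")
by_region = Engine("engine-b", "range-by-region")

router = FrontEndRouter(default=by_account)
router.register("fraud_lookup", by_account)
router.register("regional_rollup", by_region)

print(router.route("regional_rollup").name)
print(router.route("unknown_query").name)
```

The design choice illustrated here is that the graph is duplicated rather than repartitioned on demand, so the routing decision reduces to a simple lookup on query type.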
Other significant features include security (single sign-on, support for LDAP and Active Directory, encryption – both in motion and at rest – and role-based access control); more than 20 starter kits for TigerGraph Cloud (examples include data lineage, financial services fraud detection, and in-database machine learning for real-time recommendations); user-defined indexing; and a collaboration service whereby multiple groups can share a single master database, with each having their own view into the database. This has important implications for compliance (not least GDPR) because this service allows you to manage and monitor data access, data lineage and personal data. This includes where a point of data was first acquired, whether consent was given in obtaining it, where it moved over time, where it resides in each system, and how it gets used.
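As an illustration of the lineage information listed above, here is a minimal sketch of a record that tracks where an item of personal data was acquired, whether consent was given, where it has moved and how it has been used. The `LineageRecord` class and its field names are hypothetical and do not represent TigerGraph's schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class LineageRecord:
    """Hypothetical lineage record for one item of personal data."""
    data_id: str
    acquired_from: str
    consent_given: bool
    locations: list = field(default_factory=list)  # systems holding a copy, in order
    uses: list = field(default_factory=list)       # purposes the data was used for

    def move_to(self, system):
        """Note that a copy of the data now resides in another system."""
        self.locations.append(system)

    def record_use(self, purpose):
        """Log a use of the data, refusing if no consent was recorded."""
        if not self.consent_given:
            raise PermissionError(f"no consent recorded for {self.data_id}")
        self.uses.append(purpose)


record = LineageRecord("email:123", "web signup form", consent_given=True)
record.move_to("crm")
record.move_to("warehouse")
record.record_use("marketing")
print(record)
```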
In the latest release (3.0) GraphStudio has been extended to provide a no-code migration capability from relational databases. At present this is limited to supporting PostgreSQL and MySQL, but this is likely to be extended. The company estimates that around 80% of the effort involved in migration will be automated through the use of this tool.
Customer Quotes
“We selected TigerGraph for its superior data warehousing speed and computational processing capacity, which improved performance by an order of magnitude.”
IceKredit
“Alipay streams 2B+ daily events in real time to a graph with 100B+ vertices and 600B+ edges on a cluster of only 20 commodity machines.”
TigerGraph is about real-time analytics for anomaly detection, pattern recognition, IoT applications, making recommendations (next best offer) and similar environments where low latency is required. It supports both supervised and unsupervised machine learning and a target market for the company is in leveraging its graph models to generate training data for machine learning purposes. TigerGraph also supports geolocation capabilities, which are important in many IoT and similar environments. However, it does not offer support for shape files and polygon processing, which is why we refer to it as supporting geolocation rather than geospatial capabilities.
You can access the database via GSQL. As its name suggests, this is “SQL like”. However, the company also offers a browser-based capability called GraphStudio that can be used to create graph models, queries and so forth. This has been built on top of GSQL to make the environment more user friendly, allowing ad hoc exploration of your data. Indeed, in the latest release TigerGraph has added a visual query builder – see Figure 2 – to GraphStudio, which means that anybody can build queries without having any knowledge of GSQL. We expect this to become the de facto standard method for working with TigerGraph.
In addition, a migration toolkit is provided to port queries from Cypher into GSQL, allowing you to easily reuse queries written in that language. There is also a GSQL software developer’s kit (SDK) that third party graph specialists could use to integrate with TigerGraph, and there is a RESTful API capability, which means that it should be relatively easy to integrate with third party tools such as Tableau. We would like to see the company supporting GraphQL as an alternative API. A user extensible library of graph algorithms is also provided. Several algorithms (such as PageRank) are available out of the box.
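For readers unfamiliar with PageRank, the sketch below shows the textbook power-iteration form of the algorithm in plain Python. It is a generic illustration under our own assumptions (the edge list, damping factor and iteration count are arbitrary) and is not the implementation shipped in TigerGraph's algorithm library.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # Nodes with no outgoing links spread their rank evenly over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: "c" is linked to by two nodes, "d" by none.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "c")]
ranks = pagerank(edges)
for node, score in sorted(ranks.items()):
    print(node, round(score, 4))
```

In this toy graph the node with the most incoming links ends up with the highest score, and the node with no incoming links retains only the random-jump share.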
The key point about TigerGraph is its performance. Most other graph databases were built originally to support operational environments and were not intended to be used for complex large-scale and real-time analytics, though they may have been extended in that direction since they were originally designed. TigerGraph, on the other hand, was designed specifically for these environments.
We are also particularly pleased by the introduction of the visual query builder, which should help to democratise the use of TigerGraph by providing self-service capabilities for business analysts and others who do not understand GSQL and do not want to.
The Bottom Line
We should emphasise “complex, large-scale and real-time” as well as “analytics” from the previous section. Add in the ability to process operational data in real-time and you should understand where and why TigerGraph has significant advantages.
TigerGraph (January 2019)
Last Updated: 28th January 2019
TigerGraph is a native graph parallel database that is available both in on-premises and cloud (AWS and Azure) versions. The company has also announced TigerGraph Cloud through which the product will be available as a service. TigerGraph uses a property graph paradigm and its strengths are with processing structured rather than semantically oriented data. Its main areas of focus are anti-fraud, customer intelligence, supply chain intelligence and energy efficiency analytics. The Internet of Things (IoT) is also of increasing interest, assisted by the product’s direct integration with Kafka (among other things – see below). Unlike many graph database products, TigerGraph has been designed specifically to support real-time (less than one second) analytics. The keys to achieving this are parallelism, compression and the way that, in TigerGraph, graph edges and vertices are not just units of storage but also computational units. The engine supports the processing of these in parallel, and the product also includes a parallel loader. Compression can be more than 10x, according to TigerGraph, and compression is also used as a part of the loading and transformation processes, to further improve performance. Also relevant is the graph partitioning, which supports application-specific partitioning and mixed partitioning strategies, as well as automated partitioning. Aligned with this, but not shown in Figure 1, is the ability to run multiple graph engines, with each engine hosting identical graphs with different partitioning algorithms tailored for different types of application queries. The front-end server will route application queries to the relevant engines based on the query type.
TigerGraph 2.0, released in February 2018, added various security enhancements, including single sign-on, support for LDAP and Active Directory, encryption (both in motion and at rest) and role-based access control. Most notably, the company introduced a collaboration service whereby multiple groups can share a single master database, with each having their own view into the database. This has important implications for compliance (not least GDPR) because this service allows you to manage and monitor data access, data lineage and personal data. This includes where a point of data was first acquired, whether consent was given in obtaining it, where it moved over time, where it resides in each system, and how it gets used.
There is a free trial program for enterprises and a free developer edition for non-commercial use. The company has also introduced Graph Gurus, which is a free, educational webinar series. The product features one-click deployment to several major cloud marketplaces, including AWS and Microsoft Azure; supports Docker and Kubernetes containers; and includes direct integration with a number of popular data storage systems, as shown in Figure 1.
Customer Quotes
“We selected TigerGraph for its superior data warehousing speed and computational processing capacity, which improved performance by an order of magnitude.”
IceKredit
“Alipay streams 2B+ daily events in real time to a graph with 100B+ vertices and 600B+ edges on a cluster of only 20 commodity machines.”
TigerGraph is about real-time analytics for anomaly detection, pattern recognition, IoT applications, making recommendations (next best offer) and similar environments where low latency is required. As an example of its use, Figure 2 illustrates the logical architecture deployed for anti-money laundering (AML) at one of TigerGraph’s users. The blue brackets indicate sub-graphs. Note the support for machine learning, both supervised and unsupervised. Note too the fact that this sort of application requires support for real-time processing of operational data, not just analytics.
You can access the database via GSQL. As its name suggests, this is “SQL like”. In addition, a migration toolkit is provided to port queries from Cypher into GSQL, allowing you to easily reuse queries written for Neo4j. The company is planning to add support for Gremlin, part of the Apache Tinkerpop project, in a future release. TigerGraph provides its own graph visualisation capabilities, and also offers a browser-based capability called GraphStudio that can be used to create graph models, queries and so forth. This has been built on top of GSQL to make the environment more user-friendly, allowing ad hoc exploration of your data. There is also a GSQL software developer’s kit (SDK) that third-party graph specialists could use to integrate with TigerGraph. In addition, there is a RESTful API capability, which means that it should be relatively easy to integrate with third-party tools such as Tableau. A user extensible library of graph algorithms is also provided. Several algorithms (such as PageRank) are available out of the box, with more forthcoming as development continues.
TigerGraph also supports machine learning via the ability to generate training data en masse, which can then be extracted into your machine learning solution and used to train models. The training data itself is derived from your graph, and can be exported into your machine learning solution on a continuous basis, for example, every two hours. Due to the complexity of the graph structure on which it is based, this exposes a large quantity of information (particularly relationships) which can then be analysed deeply for connections and patterns. This can dramatically improve the accuracy of your models, particularly compared to simple analysis of relatively uncomplicated training data.
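The general idea of deriving training data from a graph can be sketched as follows. This is a minimal illustration under our own assumptions, not TigerGraph's export mechanism: the toy transfer graph, the `flagged` labels and the chosen features (degree and number of flagged neighbours) are all hypothetical.

```python
import csv
import io

# Hypothetical toy graph: accounts as vertices, transfers as undirected edges.
transfers = [("acc1", "acc2"), ("acc2", "acc3"), ("acc3", "acc1"), ("acc4", "acc1")]
flagged = {"acc3"}  # accounts already labelled as fraudulent (assumed labels)


def build_training_rows(edges, flagged_accounts):
    """Derive one feature row per vertex from its neighbourhood in the graph."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    rows = []
    for vertex in sorted(neighbours):
        linked = neighbours[vertex]
        rows.append({
            "account": vertex,
            "degree": len(linked),
            "flagged_neighbours": len(linked & flagged_accounts),
            "label": int(vertex in flagged_accounts),
        })
    return rows


def to_csv(rows):
    """Serialise rows as CSV text, ready to hand to a separate ML pipeline."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = build_training_rows(transfers, flagged)
print(to_csv(rows))
```

The relationship-based columns (here, how many of an account's neighbours are already flagged) are the kind of information that is awkward to derive from flat records but falls out naturally from a graph. A scheduled job could regenerate the export at whatever interval suits the model.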
The key point about TigerGraph is its performance. Most other graph databases were built originally to support operational environments and were not intended to be used for complex large-scale and real-time analytics, though they may have been extended in that direction since they were originally designed. TigerGraph, on the other hand, was designed specifically for these environments and it is therefore not surprising that benchmarks suggest that TigerGraph outperforms leading rival products.
The Bottom Line
We should emphasise “complex, large-scale and real-time” as well as “analytics” from the previous section. Add in the ability to process operational data in real-time and you should understand where and why TigerGraph has significant advantages.
TigerGraph (June 2019)
Last Updated: 27th June 2019
Mutable Award: Gold 2019
TigerGraph is a native graph parallel database that is available both in on-premises and cloud (AWS and Azure) versions. The company has also announced TigerGraph Cloud through which the product will be available as a service. TigerGraph uses a property graph paradigm and its strengths are with processing structured rather than semantically oriented data. Unlike many graph database products, TigerGraph has been designed specifically to support real-time analytics. The keys to achieving this are parallelism, compression and the way that, in TigerGraph, graph edges and vertices are not just units of storage but also computational units. The engine supports the processing of these in parallel, and the product also includes a parallel loader. Compression can be more than 10x, according to TigerGraph, and compression is also used as a part of the loading and transformation processes, to further improve performance. Also relevant is the graph partitioning, which supports application-specific partitioning and mixed partitioning strategies, as well as automated partitioning.
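The notion of vertices and edges as computational units can be illustrated with a vertex-centric sketch, in which each vertex runs the same small function over its own edges and the results are then combined. This is our own simplified illustration, not TigerGraph's engine: the toy `graph`, the `vertex_step` and `superstep` functions and the use of a thread pool are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy graph: each vertex stores a value and its outgoing edges.
graph = {
    "a": {"value": 1, "edges": ["b", "c"]},
    "b": {"value": 2, "edges": ["c"]},
    "c": {"value": 3, "edges": ["a"]},
}


def vertex_step(vertex_id):
    """Work done 'at' one vertex: emit a message along each outgoing edge."""
    vertex = graph[vertex_id]
    return [(target, vertex["value"]) for target in vertex["edges"]]


def superstep():
    """Run every vertex's step concurrently, then accumulate messages per target."""
    with ThreadPoolExecutor() as pool:
        message_batches = list(pool.map(vertex_step, graph))
    totals = {vertex_id: 0 for vertex_id in graph}
    for batch in message_batches:
        for target, amount in batch:
            totals[target] += amount
    return totals


totals = superstep()
print(totals)
```

Because each vertex's step reads only its own data, the steps can run independently, which is what makes this style of processing amenable to parallel execution. A thread pool in Python shows the structure only and should not be taken as a guide to performance.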
TigerGraph is about real-time analytics for anomaly detection, pattern recognition, IoT applications, making recommendations (next best offer) and similar environments where low latency is required. As an example of its use, Figure 1 illustrates the logical architecture deployed for anti-money laundering (AML) at one of TigerGraph’s users. The blue brackets indicate where TigerGraph is being used: one instance being used for four different purposes. Note the support for machine learning, both supervised and unsupervised. Note, too, that this sort of application requires support for real-time processing of transactional data, not just analytics. TigerGraph is ACID compliant, so the product supports hybrid transactional and analytic processing environments.
Figure 2 shows the logical architecture of TigerGraph, though it doesn’t illustrate the many partitioning and replication options available, such as the fact that you can run multiple graph engines, with each engine hosting identical graphs with different partitioning algorithms tailored for different types of application queries. The front-end server will route application queries to the relevant engines based on the query type.
TigerGraph supports single sign-on, LDAP and Active Directory, encryption (both in motion and at rest) and role-based access control. There is also a collaboration service whereby multiple groups can share a single master database, with each having their own view into the database. This has important implications for compliance (not least GDPR) because this service allows you to manage and monitor data access, data lineage and personal data. This includes where a point of data was first acquired, whether consent was given in obtaining it, where it moved over time, where it resides in each system, and how it gets used. The product features one-click deployment to several major cloud marketplaces, including AWS and Microsoft Azure; supports Docker and Kubernetes containers; and includes direct integration with a number of popular data storage systems, as shown in Figure 2.
You can access the database via GSQL. As its name suggests, this is “SQL like”. The company is also planning to add Cypher and Gremlin migration kits in a future release. TigerGraph provides its own graph visualisation capability as part of a browser-based GraphStudio that can be used to create graph models, queries and so forth. This has been built on top of GSQL to make the environment more user friendly, allowing ad hoc exploration of your data. There is also a GSQL software developer’s kit (SDK) that third party graph specialists could use to integrate with TigerGraph. In addition, there is a RESTful API capability, which means that it should be relatively easy to integrate with third party tools such as Tableau. A user extensible library of graph algorithms is also provided. Several algorithms (such as Shortest Path, Similarity, PageRank, Label Propagation, Centrality, and Community Detection) are available out of the box, with more forthcoming as development continues.
Most databases that target hybrid environments, where real-time analytics are needed in conjunction with transaction processing, either require you to store the data twice (once in row format and again in columnar format) or rely on “multi-temperature” data, with “hot”, “warm” and “cold” data stored separately, or both. Apart from any other considerations, one of the big advantages of graph databases in supporting these hybrid environments is that you only need to store the data once and you don’t need to worry about the temperature of the data if you don’t want to. This represents a significant saving in both costs and complexity. Of course, these benefits are generic to graph technology and not specific to TigerGraph. However, as we have stated, TigerGraph is rare amongst graph database products – especially property graph databases – in targeting real-time analytics.
The Bottom Line
TigerGraph focuses on complex, large-scale and real-time analytics while also supporting transaction processing through its ACID compliance. With the underlying benefits of using a graph-based approach, this makes TigerGraph a serious contender for hybrid processing environments.