Denodo and data governance

Content Copyright © 2023 Bloor. All Rights Reserved.
Also posted on: Bloor blogs


Denodo was founded in Spain in 1999 and moved its headquarters to Palo Alto in 2005. The core of its technology is the ability to provide a data virtualisation layer between source systems and end users. You can think of it as a virtual data warehouse: queries can be made across sources, but without bulk copying the data from transactional source systems into a physical data warehouse.

By late 2023 the privately held company had grown to over 600 employees and had over a thousand customers, including Sanofi, Coca-Cola and the European Commission. Recent years have seen annual growth of around 40%, with the company being profitable.

The Denodo platform does data and metadata discovery, data lineage, data governance and change impact analysis. It is cloud based, running on AWS, Azure, and Google Cloud. It has a wide range of connectors to assorted databases, files and sources, both on-premise and cloud based.

It acts as a data virtualisation layer sitting above any existing data warehouses, data lakes and other source systems. It serves as a kind of semantic layer, displaying data in abstracted form and providing unified data access across multiple sources and locations, with the idea of democratising visibility of data assets. With the average organisation having over 400 separate data sources (according to an IDG study in 2021), there is an evident need to consolidate data in one way or another in order to answer questions that span the enterprise. For example, answering “who is my most profitable customer?” will require access to customer data as well as cost data that may be scattered across several different sales, marketing and finance systems.
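To make the "most profitable customer" example concrete, here is a minimal sketch of the idea of a virtual view over separate sources. It is not Denodo's API: it uses Python's built-in `sqlite3` module, with two in-memory databases standing in for a sales system and a finance system. The table names (`orders`, `costs`), the view name (`customer_profit`) and the sample figures are all assumptions made up for illustration.

```python
import sqlite3


def build_sources() -> sqlite3.Connection:
    """Create two separate in-memory databases and a virtual view over both."""
    # The main database plays the role of a hypothetical "sales" system.
    conn = sqlite3.connect(":memory:")
    # A second, independent in-memory database plays the "finance" system.
    conn.execute("ATTACH DATABASE ':memory:' AS finance")

    conn.execute("CREATE TABLE main.orders (customer_id INTEGER, revenue REAL)")
    conn.execute("CREATE TABLE finance.costs (customer_id INTEGER, cost REAL)")
    conn.executemany(
        "INSERT INTO main.orders VALUES (?, ?)",
        [(1, 500.0), (2, 900.0), (1, 250.0)],
    )
    conn.executemany(
        "INSERT INTO finance.costs VALUES (?, ?)",
        [(1, 300.0), (2, 850.0)],
    )

    # The view stores no data: it is resolved against both sources at query
    # time. (In SQLite only TEMP views may span attached databases.)
    conn.execute(
        """
        CREATE TEMP VIEW customer_profit AS
        SELECT r.customer_id, r.revenue - c.cost AS profit
        FROM (SELECT customer_id, SUM(revenue) AS revenue
              FROM main.orders GROUP BY customer_id) AS r
        JOIN (SELECT customer_id, SUM(cost) AS cost
              FROM finance.costs GROUP BY customer_id) AS c
          ON c.customer_id = r.customer_id
        """
    )
    return conn


def most_profitable_customer(conn: sqlite3.Connection):
    """Return (customer_id, profit) for the customer with the highest profit."""
    return conn.execute(
        "SELECT customer_id, profit FROM customer_profit "
        "ORDER BY profit DESC LIMIT 1"
    ).fetchone()


if __name__ == "__main__":
    print(most_profitable_customer(build_sources()))
```

The point of the sketch is that the consumer queries one logical view while the data stays where it is. A real virtualisation platform adds what this toy omits, such as connectors to heterogeneous systems and cross-source query optimisation.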

Clearly it is a non-trivial task to connect data sources in this way. As well as the physical issue of dealing with distributed or federated queries across differing data sources and formats, there is the issue of meaning. What counts as a “customer” may be defined differently from one system to another, and different versions of customer data may exist across various systems. Denodo is complementary to master data management and physical data warehouses in that it does not focus on resolving or mapping such inconsistent definitions. Instead, it concentrates on being able to run queries across multiple source systems in an efficient way. It uses machine learning to understand data usage patterns and can make recommendations, for example suggesting that certain aggregates be pre-calculated, and it can cache results to speed things up. It can enforce consistency in policy definitions and also has some data quality functionality. It can also build on master data management solutions directly, for example by using a master data management hub as a data source if one already exists.
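The benefit of caching results can be illustrated with a small, generic sketch. This is not how Denodo implements its cache: the class name `CachedQuery`, the time-to-live parameter and the injectable clock are assumptions chosen to keep the example short and testable.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class CachedQuery:
    """Wrap an expensive query function with a simple time-to-live cache."""

    def __init__(
        self,
        run_query: Callable[[Hashable], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps a query key to (time stored, result).
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        """Return a cached result if it is still fresh, else re-run the query."""
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = self._run_query(key)  # the slow trip to the source systems
        self._store[key] = (now, value)
        return value
```

With a wrapper like this, repeated requests for the same aggregate within the time-to-live window are served from memory, and only the first request (or one made after expiry) reaches the underlying sources. The trade-off is staleness: a longer time-to-live means faster answers but potentially older data.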

In the latest software version, there is a natural language capability that leverages OpenAI’s ChatGPT AI, allowing customers to phrase queries in a natural language form rather than having to understand data structures. Denodo has an embedded query optimisation engine in order to resolve these queries as efficiently as possible.

In terms of data governance functions, Denodo has role-based access control and can enforce policies such as data masking based on a user profile (e.g. for GDPR compliance) and on metadata. The Denodo “Design studio” shows a logical model of the data, and the platform also allows query scheduling and monitoring.
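Role-based masking of this kind can be sketched in a few lines. The following is an illustration of the general technique only, not Denodo's policy model: the `POLICY` mapping, the role names and the `mask_row` function are all hypothetical.

```python
from typing import Any, Dict, Mapping, Set

MASK = "****"

# Hypothetical policy: sensitive column -> roles allowed to see it unmasked.
# Columns not listed here are treated as non-sensitive.
POLICY: Dict[str, Set[str]] = {
    "email": {"compliance"},
    "iban": {"compliance", "finance"},
}


def mask_row(
    row: Mapping[str, Any],
    roles: Set[str],
    policy: Mapping[str, Set[str]] = POLICY,
) -> Dict[str, Any]:
    """Return a copy of the row with sensitive columns masked for these roles."""
    masked: Dict[str, Any] = {}
    for column, value in row.items():
        allowed = policy.get(column)
        if allowed is None or roles & allowed:
            masked[column] = value  # not sensitive, or user holds a permitted role
        else:
            masked[column] = MASK
    return masked


if __name__ == "__main__":
    record = {"name": "Ana", "email": "ana@example.eu", "iban": "ES00"}
    print(mask_row(record, {"analyst"}))
    print(mask_row(record, {"finance"}))
```

Applying the policy in the access layer, rather than in each consuming application, is what allows the same rule to hold consistently whichever tool issues the query.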

In terms of data virtualisation Denodo competes with vendors like TIBCO and Starburst. Although Denodo can serve as a data governance platform too, it comes at this problem from a different angle than products like Collibra and Alation, which start with a business glossary and metadata. Indeed, although there is overlap, it is possible for Denodo to work collaboratively with Collibra, for example, treating it as just another data source.

Although there are clearly issues in being able to successfully deploy truly federated queries in terms of handling inconsistent data definitions as well as matters of query efficiency for large or complex queries, Denodo has evidently prospered in this emerging space, growing rapidly and having built up an impressive roster of customers.