Data Governance
Analyst Coverage: Andy Hayler and Daniel Howard
Data governance is the set of processes in an organisation that support high-quality data. This includes policies, standards and procedures that ensure that an organisation can trust its data, and includes business ownership of data. There are many vendors in the space, and the market size depends on exactly whom you include, but in 2023 the data governance market was worth around $3 billion.
Large organisations struggle to maintain consistent and high-quality data: a Deloitte 2022 survey found that two thirds of executives were “not comfortable” with their data, a finding that has been echoed in numerous other surveys over the years. In a global company, data is scattered amongst many systems, with a typical large company having around 400 applications as data sources. There have been various approaches to this problem over the years, including the consolidation of core transaction systems (ERP) and the introduction of master data management hubs. Nonetheless, the problem persists, now with the added complication of data being scattered between on-premises applications and private and public clouds. Experience has shown that assigning data ownership to business staff with executive backing, rather than leaving things up to the IT department, is necessary for any successful attempt at wrangling corporate data, and so data governance came into vogue. Initially this was a stand-alone set of processes and methodologies, but in time more and more vendors have added capabilities to their products to support data governance to a lesser or greater degree.
A data governance solution should be able to handle:
- data quality
- policy management
- data cataloguing and discovery
- data stewardship
- data lineage
- data security management.
Some vendors have solutions built around a data catalogue that address all data, while others choose to specialise in a subset of these capabilities. Specifically, some vendors specialise in data security management and policy enforcement and do not attempt to handle all forms of business metadata or general data quality.
Any company that relies on data, and, let’s face it, these days that is almost all companies, should worry about the underlying quality of that data, and data governance is a major underpinning of controlling corporate data. Some industries like finance and pharmaceuticals have regulators that insist on it, but any company that files its accounts or monitors its operational performance should be concerned that its data is trustworthy, up-to-date, complete and accurate. Experience has shown that IT-driven attempts to consolidate data often founder without business support. IT departments rarely have the political clout to get business units to change their ways, and separate business units will typically be protective of “their” data and unwilling to change their processes to fit in with other departments unless there is high-level executive ownership of the issue. Once you accept that, business-led data governance becomes a necessary step to getting control of the data of a corporation or any large organisation.
Data governance processes and organisations can be put in place without much in the way of software support. You can set up a governance executive council, employ data stewards to be embedded within business lines and assign data ownership so that, for example, it is clear who is ultimately responsible for particular data such as “customer” and “product” and for its classification. However, in many cases it will be beneficial to deploy a software solution in support of this, one that can help you discover relationships between data, catalogue business definitions in a glossary and keep track of ownership and version control of that data, for example keeping a record of changes to product hierarchies, as well as of who has the ability to make changes to sensitive data. Businesses need to decide what software will best meet their needs, and how well that software will fit not just their technology landscape but also their particular organisational structure; for example, some organisations are highly centralised and some very decentralised, so one size may not fit all when it comes to data governance solutions.
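To make the glossary, ownership and change-tracking ideas above more concrete, here is a minimal sketch in Python. It is purely illustrative and does not reflect any vendor's product: the class names (GlossaryEntry, Change), the field names and the example roles ("head_of_sales", "sales_data_steward") are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Change:
    """One recorded change to a governed glossary entry (hypothetical model)."""
    changed_by: str
    description: str
    timestamp: datetime


@dataclass
class GlossaryEntry:
    """A business term with a definition, an accountable owner and a change log."""
    term: str
    definition: str
    owner: str  # the accountable business owner, not an IT contact
    editors: set[str] = field(default_factory=set)  # others allowed to edit
    history: list[Change] = field(default_factory=list)

    def update_definition(self, user: str, new_definition: str) -> None:
        # Only the owner or a named editor may change the entry.
        if user != self.owner and user not in self.editors:
            raise PermissionError(f"{user} may not edit '{self.term}'")
        # Record who changed what, and when, before applying the change.
        self.history.append(
            Change(
                changed_by=user,
                description=f"definition: {self.definition!r} -> {new_definition!r}",
                timestamp=datetime.now(timezone.utc),
            )
        )
        self.definition = new_definition


entry = GlossaryEntry(
    term="customer",
    definition="A party that has placed at least one order.",
    owner="head_of_sales",
    editors={"sales_data_steward"},
)
entry.update_definition(
    "sales_data_steward",
    "A party with at least one order in the last three years.",
)
```

The point of the sketch is that ownership and edit rights are attached to the business term itself, and that every change leaves a record; a real governance tool would add approval workflows, persistence and much richer metadata.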
In recent years there has been a tendency for vendors to offer broader and broader suites of data management solutions. A decade ago, data integration, master data management, data quality and data governance were regarded as largely separate markets with some loose connections between them. More and more, vendors are now offering suites of products that encompass most or all of these elements, with a data governance catalogue at the heart. Whereas data governance used to focus mainly on structured data like financials, more and more important corporate data is now contained in semi-structured or unstructured form like spreadsheets, websites, documents and image libraries. Much of this is now stored in the cloud, whether that is a private cloud or a public cloud like AWS or Azure, and not just in a corporate data centre.
One specific issue companies face is increased pressure to adapt their data governance approaches in response to forthcoming AI regulation. The EU’s draft AI regulations promise to impose considerable fines on companies that fail to comply, up to 6% of their global revenue. Similar legislative initiatives are under way at various levels of government in many countries, including the US and China. Data governance initiatives (and the software that supports these) need to adapt to the ongoing wave of artificial intelligence applications that most companies are adopting, as the use of these tools brings new issues of data security, privacy and reliability.
One emerging trend is the notion of decentralised data governance models. Rather than trying to control everything from a central point, some responsibilities are delegated to business units and subsidiaries, which may suit some types of companies much better than a purely centralised approach.
Data governance is just as affected by the recent wave of interest in artificial intelligence as any industry sector. Data quality solutions have long used machine learning to help with data record matching, presenting human domain experts with candidate matches and learning from their selections. Now every vendor’s PowerPoint sales deck claims some level of artificial intelligence within their solutions, but in many cases these claims are skin deep. However, there are a few innovative solutions emerging that were genuinely designed from scratch around artificial intelligence, and these may be of value in the right circumstances.
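The candidate-matching step described above can be illustrated with a short Python sketch. Note that it substitutes simple string similarity from the standard library (difflib.SequenceMatcher) for a trained machine learning model, and it omits the step of learning from reviewers' selections. The function names, the sample records and the 0.7 threshold are assumptions chosen for illustration only.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_matches(records: list[str], threshold: float = 0.8):
    """Return (score, a, b) tuples at or above the threshold, best first."""
    scored = [(similarity(a, b), a, b) for a, b in combinations(records, 2)]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)


# Hypothetical company-name records containing likely duplicates.
records = ["Acme Corporation", "ACME Corporation Ltd", "Globex Inc", "Acme Corp"]

# Candidate pairs would be shown to a human domain expert to confirm or reject.
for score, a, b in candidate_matches(records, threshold=0.7):
    print(f"{score:.2f}  {a!r} <-> {b!r}")
```

In a real data quality tool the similarity score would typically come from a model that is retrained on the experts' accept and reject decisions; here the threshold is fixed, which keeps the example small but means nothing is learned.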
The data governance space continues to mature, and recent years have seen a steady stream of mergers and acquisitions. Quest Software bought Erwin in January 2021, Precisely bought Infogix in May 2021, Collibra acquired Husprey (a SQL notebook product) in September 2023 and Alation bought DataGroomr (a data preparation tool) in February 2022. Informatica acquired Privitar in June 2023. IBM acquired data lineage specialist Manta in October 2023. Data governance is connected to data quality and, to a degree, to master data management, and indeed many data governance vendors either have partnerships with data quality vendors or have built or acquired their own capabilities.
In general, the data management space is showing signs of consolidation, with vendors offering a range of solutions of which data governance is one part. Some vendors have data governance, data quality and master data management capabilities, and some have data movement/migration tools as well. Customers need to decide whether they prefer to deal with a smaller number of vendors with broad capabilities or to seek out best-of-breed solutions that may be better for specific purposes but will require some level of integration with other technologies.
The bottom line
The data governance market continues to grow and consolidate, as large organisations realise that business ownership of data is a prerequisite to trustworthy, high-quality data on which to base business decisions. The market now offers a wide range of data governance software solutions, ranging from highly specialised ones that focus on a single area like data security to broad platforms that handle business data and processes and go beyond this, offering solutions for data quality, master data management and sometimes data integration too. In order to navigate this labyrinth of solutions, it is recommended that you carry out a detailed investigation and evaluation based on the specific needs of your own organisation, using your own data rather than relying on vendor demonstrations (which are carefully designed to always work). You may find it useful to engage a third-party expert experienced in evaluating these technologies to help you.