From a purely theoretical perspective, data governance is about how you govern your organisation's information assets. That governance is based on corporate policies on the accuracy, timeliness, completeness and appropriateness of the data, together with policies on its security and retention. Some of these policies will be dictated by regulatory requirements and others will be determined internally.
In practice, most data governance initiatives cover relational data only. A few companies have extended this to cover content such as documents. To date, virtually no-one is applying it to big data and, where there is any spreadsheet governance, this is typically treated as entirely separate from data governance. In principle, all sorts of information, in whatever format, should fall within the remit of data governance.
Data governance is about the policies already discussed, and the people and processes that implement and monitor adherence to those policies. In other words, you define an appropriate policy, set up the processes that will ensure the policy is met, and assign responsibility to relevant individuals (often data stewards) who will monitor and assure the results.
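The relationship between policy, process and steward can be pictured with a minimal Python sketch. It is purely illustrative and does not represent any vendor's product: the Policy class, the monitor function, the steward names and the sample records are all hypothetical, and a real initiative would involve people and workflow as well as code.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict  # one row of data, keyed by field name


@dataclass(frozen=True)
class Policy:
    """A governance policy plus the steward accountable for it (hypothetical model)."""
    name: str
    steward: str
    check: Callable[[Record], bool]  # returns True when the record complies


def monitor(policies, records):
    """Apply every policy to every record and list the violations.

    Each violation is a (policy name, steward, record index) tuple, so the
    responsible steward can follow it up.
    """
    return [
        (p.name, p.steward, i)
        for p in policies
        for i, r in enumerate(records)
        if not p.check(r)
    ]


# Hypothetical policies and stewards, for illustration only.
policies = [
    Policy("customer email is present", "alice",
           lambda r: bool(r.get("email"))),
    Policy("country code is two letters", "bob",
           lambda r: isinstance(r.get("country"), str) and len(r["country"]) == 2),
]

records = [
    {"email": "a@example.com", "country": "GB"},
    {"email": "", "country": "GBR"},
]

violations = monitor(policies, records)
for policy_name, steward, row in violations:
    print(f"row {row}: '{policy_name}' failed, refer to {steward}")
```

Here the policy is the named rule, the process is the monitor function that applies it, and the steward is the person to whom each failure is referred.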
Data governance is not a technology. There are technologies that support its monitoring and preventative aspects, such as master data management, data quality, and data profiling and discovery. However, each of these can be deployed without any attempt at data governance, and it is entirely possible, if difficult, to implement data governance without some or all of them.
The only products aimed directly at data governance (and nothing else) are those that capture and manage its policy and process aspects, and these are few and far between.
Data governance is driven by a confluence of interests: the CMO may want more accurate information about customers in order to market to them more effectively, the CSO wants to ensure that data masking is applied to sensitive data, the compliance officer wants to make sure that data archival and retention policies are adhered to, and so on. We also know of cases where the head of personnel has been actively involved in data governance and, indeed, any C-level executive may be actively involved, depending on circumstances. Because of the nature of the technologies involved, the CIO and others within IT will have a significant input into any decision making.
Typically, data governance comes under the aegis of a data governance council that reports at board level, often with a C-level executive on the council. In terms of actual implementation and maintenance, the people most likely to be involved are business analysts and data stewards, who will often work closely together.
Historically, most compliance requirements have been around the processes used to manipulate data. The accuracy of the data itself was of no concern. Sarbanes-Oxley and its derivatives are a classic example of this. However, this is starting to change. Solvency II, MiFID II, Basel III and Dodd-Frank are all examples of legislation that applies to data as well as processes. The words used by both Solvency II and MiFID II are telling: "data should be accurate, complete and appropriate". While these acts do not actually mandate data governance, they come as close to doing so as possible without actually saying so. And note that they do not limit themselves to data in your databases: they apply equally to data in, say, spreadsheets.
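As a rough illustration of what measuring "complete" might involve, the Python sketch below computes the share of required fields that are actually populated. It is an assumption-laden example rather than anything the regulations prescribe: the completeness function, the field names and the sample rows are hypothetical, and the rows could as easily come from a spreadsheet export as from a database.

```python
def completeness(records, required):
    """Fraction of required field values that are populated (0.0 to 1.0).

    A value counts as missing when the field is absent, None or an empty string.
    """
    total = len(records) * len(required)
    if total == 0:
        return 1.0  # nothing is required, so nothing is missing
    filled = sum(
        1
        for r in records
        for f in required
        if r.get(f) not in (None, "")
    )
    return filled / total


# Hypothetical rows and field names, for illustration only.
rows = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
]

score = completeness(rows, ["id", "amount"])
print(f"completeness: {score:.0%}")
```

Accuracy and appropriateness are harder to reduce to a single ratio, since they depend on comparison with a trusted source and on the intended use of the data.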
This is important because we expect more governments to introduce legislation that is focused, at least in part, on data accuracy and completeness. Of course, there is already a significant focus on data privacy.
Vendors have been slow to introduce features specifically designed to support data governance, as opposed to complementary technologies such as data quality. The only common exception is support for a "Data Steward" interface, but this does not actually provide additional capabilities.
In terms of additional functionality, the exceptions are Ataccama, which provides a tracking facility to monitor that processes are being followed, and Kalido, which offers a policy control and management suite that supports both the definition of the policies to be enforced and the monitoring of them. More recently, IBM has introduced its Business Information Exchange product, which is a policy hub not just for data governance policies but also for business, security and privacy requirements that go beyond pure governance.