Update solution on September 22, 2023

Aim

DataBelt® is a single-platform data governance and management software product that can be applied to an organisation’s data in various ways. It is unusual in that, rather than being just a catalog that stores metadata and business rules, dataBelt® was designed and built from the ground up to exploit the latest software technologies under the banner of artificial intelligence. At its core is a neural network built on TensorFlow, an open-source technology originally developed by Google. This technology allows a natural language interface to be provided to end users so that they can explore their corporate data landscape. The map of the landscape itself is built up in dataBelt® in an exploratory stage using software agents, much as Google methodically crawls the World Wide Web looking for websites to index. The dataBelt® technology does not just examine metadata in database catalogs, as some tools do. It is able to read text, images and video, and can classify the data that it finds. For example, it might spot a car number plate in an image, or a national insurance number in a document.
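To make the idea of classifying discovered data concrete, the sketch below shows one simple way a scanning agent could flag a UK National Insurance number in a document's text. It is an illustration only, not dataBelt®'s actual method, which is described above as neural-network based. The function names and the two labels are hypothetical, and the pattern is a simplified version of the published number format (two prefix letters, six digits, a suffix letter A to D).

```python
import re
from typing import List

# Simplified pattern for a UK National Insurance number (assumption: this
# covers the common prefix rules only and is not an exhaustive validator).
NINO_PATTERN = re.compile(
    r"\b(?!BG|GB|NK|KN|TN|NT|ZZ)"            # prefixes that are not allocated
    r"[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z]"   # two permitted prefix letters
    r"\s?\d{2}\s?\d{2}\s?\d{2}\s?[A-D]\b"    # six digits, then suffix A-D
)


def find_national_insurance_numbers(text: str) -> List[str]:
    """Return candidate NI numbers found in text, with spaces removed."""
    return [re.sub(r"\s", "", m.group(0)) for m in NINO_PATTERN.finditer(text)]


def classify_document(text: str) -> str:
    """Label a document 'sensitive' if it appears to contain an NI number."""
    return "sensitive" if find_national_insurance_numbers(text) else "unclassified"


if __name__ == "__main__":
    sample = "Employee record. NI number: AB 12 34 56 C."
    print(find_national_insurance_numbers(sample))
    print(classify_document(sample))
```

A rule of this kind only catches well-formatted identifiers in text. Spotting a number plate in an image, as the example above mentions, would need image recognition rather than pattern matching.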

Customer Quotes

“DataBelt has enabled our organisation to completely transform itself in its approach to data, how it uses that data and what the data teaches us. It supports all our data initiatives from information disclosure, records management to counter fraud and forensic investigations. It really has introduced data science to the organisation in a single system.”
Law enforcement organisation

Mutable Award: Gold 2023

The dataBelt® technology builds a map of a company’s data by way of a virtualised data lake, so it does not itself copy or move the data around. This means that master data is left entirely unchanged. In this profiling and discovery stage, it endeavours to classify the data based on its usage, quality and sensitivity. Once the map is built, the natural language interface of the technology allows users to interrogate and query the data without having to understand its underlying physical structure. The product uses OpenAI’s ChatGPT to provide answers to users about their data, for example answering questions about sales or HR data within the company. In principle, this raises privacy issues, since answers to such questions reside on the servers of OpenAI. This problem can be circumvented by using the solution’s API, which restricts usage to a secure instance of ChatGPT running as a virtual machine within the Microsoft Azure cloud. This environment has been established as secure, with access limited to those users of a particular company who have been granted security access. Through its open API design, the product can also use other generative AI products such as Google Bard. In such cases, a similarly secure environment could be set up within the Google Cloud infrastructure.

Fig 1 – An overview of dataBelt

Once the setup stage is complete, dataBelt® can assist with a range of data quality functionality, including information disclosure, cyber security, records management, forensic investigations, merge/matching of data records, and spotting patterns in the data itself, which could be used for fraud detection, as well as general access to data via a natural language interface. Figure 1 shows the range of functionality.

The dataBelt® technology provides an unusual approach to data governance. Its artificial intelligence underpinnings allow end users to interact with their data via a natural language interface, without needing to understand physical data structures or data access languages such as SQL. This should help engagement with business users, who traditionally have struggled to deal with data models and often resort to having canned reports pre-built by their IT departments. Clearly this approach requires some care, since queries generated by an AI tool from a natural language prompt could produce answers that are not precisely what the business user formulating the query intended. Nonetheless, the additional accessibility to data allowed by this approach could be of considerable value in improving business user engagement with their data. The technology certainly offers a different approach to the data governance field compared to traditional catalog products from Collibra and Alation, or indeed to other tools with data governance capabilities, such as Informatica or Ataccama.

The Bottom Line

DataBelt is a genuinely differentiated product, an interesting example of using artificial intelligence to attack the problem of data governance. Companies looking for a modern data governance solution that has some innovative features should investigate it further.
