Self-service data preparation
By: Philip Howard
Data preparation is the meat in the sandwich between data sources on the one hand and business intelligence, (predictive) analytics and visualisation on the other. For some time, there has been an increasing trend towards self-service business intelligence and analytics, with user departments licensing tools from companies such as Tableau and Qlik so that business analysts can undertake analysis without needing to resort to IT. This self-service approach is fine when it is limited to information that is internal to the relevant department. However, users increasingly want to analyse multiple sources of data, often of different types and in different formats, drawn from a variety of internal and external sources.

There are two issues with such data. First, you need to know what data is available to you and where it resides. Second, such data needs "preparation" to put it into a form that is suitable for analysis. Moreover, both of these capabilities need to be provided in a self-service manner that is not dependent on IT.

Over the last couple of years, a new market segment that enables data preparation has grown up and, more recently, we have seen the emergence of what we refer to as "data cataloguing" tools, which enable you to discover and catalogue your data assets. As we expect these two disciplines to merge over the coming years, both are discussed in this paper. We will discuss what such products offer, why they are important and what sort of features to look for. We start with a brief discussion of data discovery, but the main focus of this paper is on data preparation per se.