Analyst Coverage: Philip Howard
Data integration is a set of capabilities that allow data held in one place to be used in another, regardless of how it is formatted. This may or may not involve physically moving the data. Physical movement technologies include ETL (extract, transform and load), ELT (load the data before transforming it) and variations thereof, as well as data replication and B2B exchange (which is essentially a use case). These may be supported by change data capture and other associated techniques. Data virtualisation is the technology used for accessing data without moving it.
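The difference between ETL and ELT is only where the transformation happens, which a small sketch can make concrete. The example below is illustrative rather than a description of any vendor's product: the source rows, the table names (sales_etl, sales_raw, sales_elt) and the transform function are all hypothetical, and an in-memory SQLite database stands in for the target system.

```python
import sqlite3

# Hypothetical source rows: names and amounts arrive as inconsistent strings.
SOURCE_ROWS = [("  alice ", "10.50"), ("BOB", "3"), ("Carol", "7.25")]


def transform(row):
    """Normalise one source row: tidy the name and parse the amount."""
    name, amount = row
    return (name.strip().title(), float(amount))


def run_etl(conn):
    """ETL: transform in the integration layer, then load the clean rows."""
    conn.execute("CREATE TABLE sales_etl (name TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales_etl VALUES (?, ?)",
        [transform(row) for row in SOURCE_ROWS],
    )


def run_elt(conn):
    """ELT: load the raw rows first, then transform inside the target with SQL."""
    conn.execute("CREATE TABLE sales_raw (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", SOURCE_ROWS)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT UPPER(SUBSTR(TRIM(name), 1, 1)) || LOWER(SUBSTR(TRIM(name), 2)) AS name, "
        "CAST(amount AS REAL) AS amount "
        "FROM sales_raw"
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_etl(conn)
    run_elt(conn)
    for table in ("sales_etl", "sales_elt"):
        rows = conn.execute(f"SELECT name, amount FROM {table} ORDER BY name").fetchall()
        print(table, rows)
```

Both routes end with the same cleaned data. With ETL the work is done by the integration tool before loading, whereas with ELT the raw data lands first and the target's own engine does the transformation.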
Here we specifically define data integration as technologies that encompass more than just ETL or a variant thereof, and which also include replication, B2B exchange and/or data virtualisation as additional capabilities. Data integration providers frequently also offer data quality, data governance and/or master data management (MDM) capabilities (amongst others) as integral parts of their data integration platform.
To combine and use disparate data held in different formats, you need to transform it into a consistent format. With data virtualisation, this means creating a virtual view of the data and then using the tool’s abstraction layer to provide that consistent format. When the data is physically moved, however, the data itself has to be transformed as a separate process. This is often true even for replication, depending on the purpose for which replication is being used (for example, for back-up purposes or to support real-time business intelligence).
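The contrast between a virtual view and a physically transformed copy can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular data virtualisation product works: the two sources (CRM_ROWS, BILLING_ROWS), their field names and both functions are hypothetical, and in-memory lists stand in for real systems.

```python
from datetime import datetime

# Two hypothetical sources holding the same kind of record in different formats.
CRM_ROWS = [{"customer": "Alice", "joined": "03/02/2014"}]          # DD/MM/YYYY
BILLING_ROWS = [{"cust_name": "Bob", "signup_date": "2014-02-05"}]  # ISO 8601


def virtual_customers():
    """Yield records in one consistent format, reading the sources at query time.

    Nothing is copied or stored: each call re-reads the underlying sources,
    so changes made there are visible on the next query.
    """
    for row in CRM_ROWS:
        joined = datetime.strptime(row["joined"], "%d/%m/%Y").date()
        yield {"name": row["customer"], "joined": joined.isoformat()}
    for row in BILLING_ROWS:
        yield {"name": row["cust_name"], "joined": row["signup_date"]}


def materialise_customers():
    """Physical movement: transform once and keep a separate copy."""
    return list(virtual_customers())


if __name__ == "__main__":
    snapshot = materialise_customers()
    CRM_ROWS.append({"customer": "Carol", "joined": "10/03/2014"})
    print(len(snapshot))                   # the copy does not see the new row
    print(len(list(virtual_customers())))  # the virtual view does
```

The materialised copy has to be refreshed by re-running the transformation, whereas the virtual view always reflects the current state of the sources at the cost of doing the conversion on every query.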
One notable use case for data integration technologies is data migration, although this will often require other technologies as well, such as data archival, data discovery, data quality and data masking.
Data integration, in all of its forms, is an enabling technology rather than a solution in its own right: it is used to create data warehouses and to exchange information with business partners and between applications. Thus it is most likely to be of interest to CIOs and IT architects.
However, there is an increasing trend towards the deployment of SaaS (software as a service) applications and this is often done at the behest of line of business managers. Surveys have suggested that 51% of companies deploying new SaaS applications have had issues with data integration as part of the implementation process and 19% of projects were cancelled for the same reason. So, increasingly, business executives need to care about data integration.
Historically, replication, ETL and data virtualisation have been regarded as separate technologies and, indeed, there remain many vendors that specialise in just one of these areas. Even platform vendors in this space often effectively have separate solutions across these areas. However, the move towards the logical data warehouse mandates that these technologies become more closely aligned, as replication, virtualisation and ETL will all be required to support such environments.
In addition, the trend towards SaaS deployments creates a major headache, not just for loading data into the new application environment but also in supporting cross-application integration and information sharing. There needs to be a major shift towards the automation of connectivity in these sorts of environments. Vendors such as Dell (Boomi) and Pervasive are certainly talking about this, but to what extent it is actually happening remains to be seen.
Companies such as IBM, Informatica and Talend continue to make acquisitions to broaden their data integration portfolios. In the case of Talend this means extending beyond data integration and into application integration and business process management.
However, broader product portfolios do not necessarily mean that the products themselves are integrated and, although there are exceptions, this space remains bedevilled by loosely connected, diverse products that are marketed as a ‘platform’.
In general, we believe that data integration tools are not easy enough to use, are not automated enough and do not perform as well as they should. The market is ripe for some innovative developments.
Further resources to broaden your knowledge:
What’s Hot in Data
In this paper, we have identified the potential significance of a wide range of data-based technologies that impact on the move to a data-driven environment.
Comparative costs and uses for Data Integration Platforms in Agile Enterprises
Data Integration should be at the heart of data architecture modernisation initiatives such as next generation analytics, application modernisation etc.
Getting value out of data integration
I am going to be having a live discussion about getting value from data integration tools: come and listen!
Comparative costs and uses for data integration platforms - research and survey results 2014
There is a surprising lack of primary research to support any kind of claim with regards to the cost of, or cost-effectiveness of, data integration solutions.
Integration Enables Instant Gratification in SaaS Adoption
Today's fast-paced, socially connected world has led consumers to crave instant gratification, and your customers and prospects expect just this. If you are a software vendor with a SaaS application, or if you adopt SaaS applications to run your business,
Next steps for Data Integration
This paper examines the issues facing data integration tools today and in the near future.
The economics of cloud managed integration
In this paper we will consider what is required to enable cloud-managed integration from a technical perspective.
Data Migration Survey 2011
The full report is now available. We identify how to save an average of $170,000 per project.
Informatica B2B Data Exchange and Data Sharing - Scalable Multi-Enterprise Data Integration
Join Philip Howard and Informatica in this free webinar.
Data exchange and information sharing
This paper is about data exchange and the ability to share information.
Comparative costs and uses of Data Integration Platforms - research and survey results 2010
Our results show dramatic, and surprising, differences between vendors and products in both overall TCO and in cost per project and cost per source and target s
Avoiding the Integration Tar Pit - Agile Integration for fast results
Almost all projects involve data integration these days, because nearly every IT project involves multiple applications and databases.