Informatica Data Replication

Content Copyright © 2011 Bloor. All Rights Reserved.
Also posted on: The IM Blog

Informatica acquired WisdomForce within the last couple of months. WisdomForce was a pure-play provider of data replication solutions. While little known to the world at large, it had around 100 customers. However, these were not just any customers: many of them are blue-chip household names, Wal-Mart and Coca-Cola to name just two.

WisdomForce had two products: WisdomForce Fast Reader and WisdomForce Database Sync, which have now been renamed Informatica Fast Clone and Informatica Data Replication. Briefly, the former rapidly clones (or extracts and loads) Oracle data into a variety of heterogeneous targets including databases, data warehouse appliances, and flat files, and there is an additional option to directly stream Oracle data into EMC Greenplum or Teradata for optimal performance. Informatica Data Replication has more generic, heterogeneous capabilities and supports a greater range of source systems beyond Oracle. Both products have initial synchronisation capabilities.

The interesting question is why Informatica acquired WisdomForce. Of course, one answer is that it was under pressure from its user base to provide a replication solution. Particularly within data warehousing, conventional data integration (ETL/ELT) and data replication are increasingly seen as complementary: you want the former for bulk loading batch data and the latter for incrementally loading the data that you want to query in real time. However, Informatica could have built such a product or acquired a different vendor: what was so special about WisdomForce?

The answer lies in the company’s customer list. The point is not just that these are major enterprises, though that’s nice, but that they chose a small and relatively unknown vendor as an important supplier of a key technology. So, why did they? According to Informatica there are three major reasons: performance, scalability and ease of use.

Performance and scalability are linked: in this context they mean the ability to replicate large volumes of data very quickly, and by large I mean huge (you could call it ‘big data’, though I wouldn’t: the term is too vague to be useful, but that’s a discussion for another day). By ease of use I mean that these new Informatica products do not require you to write scripts, as has been traditional with data replication tools, but instead allow you to define your topology and mappings via a graphical user interface, and to use the same interface to monitor live processes.

Data replication was first introduced to support the replication of live trading data, and today it has many other operational applications, such as online booking systems, ATM networks and B2B environments. It is also widely used for high and continuous availability, zero-downtime migrations and various other functions. However, the fastest growing area for data replication is in supporting operational BI and data warehousing, thanks to the increasing demand for real-time information.

Informatica Data Replication supports all of these environments, and the company is planning to produce a series of short papers discussing the use of its technology for each of these use cases. The first of these, which focuses on data warehousing (Informatica supports IBM Netezza, HP Vertica and EMC Greenplum as well as more traditional sources), has been written by Bloor Research and is available here for those who want more detail.