AnalytiX Mapping Manager, the missing link in moving data around

It is remarkable, despite the maturity of the market for data migration and the presence of all of the big players in it, that when you watch the process taking place on the ground in organisations big and small there is one omnipresent component that is key to the whole process: the Excel spreadsheet! The spreadsheet is all but universally used to hold the pre-ETL mappings. If any process is only as good as its weakest link, then in data migration this is it.

The people behind AnalytiX come from a background of having actually built data warehouses, and they have seen this issue for themselves: you employ highly skilled data analysts to go out and identify where the data for each field you require in your target systems should be sourced from, and the only tool you give them to support this vital activity is a spreadsheet. Some six years ago they started to build a tool that would fill that gap, and today they have a base of over 400 customers and 20 resellers, including, since November 2011, a relationship with HP whereby HP use AnalytiX as a standard on all of their internal data transfers and introduce their customer base to the advantages that the tool can bring. The tool has been recognised by the Informatica world as being the best complementary technology in the market.

When you see AnalytiX’s Mapping Manager you just wish that you had been provided with a tool like this on all those projects you have worked on in the past. It is simple, elegant, very powerful, and meets a pressing need with a pragmatic solution that does just what is required. It is a web-based tool, so it is universally accessible. It scans metadata and builds a repository, creating the first-cut columns and data maps, and then the analyst can refine them using a drag and drop interface. The scan captures around 50 different elements of metadata. AnalytiX claim the tool will provide a 70% productivity improvement which, having managed this process many times, I consider very conservative. That claim must be at the low end and relate only to the basic mapping, because the tool also manages versioning at a mapping and a project level, something that is impossible with spreadsheets, so the ability to avoid error and track change must improve on the 70% figure by a significant margin on nearly every project.

This tool must be essential for everyone who is looking at data warehouses, data migrations, master data management projects and all aspects of data quality. It is self-documenting and is fully auditable, so it is, to my mind, just essential and I know of no other tool that addresses the requirement.

The tool has three component parts. Firstly there is the Resource Manager, where system users, their security permissions, their projects and the assignment of users to projects are created and managed. Then there is the Systems Manager, where the sources and targets are defined, and from where the metadata scan is initiated to create the baseline metadata from source and target systems. The third component is the Mapping Manager, the drag and drop element where the baseline is refined to build a complete picture of source to target mapping. The tool supports the needs of the data analyst, who creates the mappings, and of the ETL developer, who has their input defined. At present the tool will interface directly with Informatica PowerCenter, IBM DataStage, and Microsoft SSIS and, through the relationship with HP, interfaces to other ETL tools are in their early stages. The tool is also a vital aid to the QA tester, providing the ability to audit ETL jobs for accuracy with a reliable repository of mapping data.
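To make the idea of a source to target mapping a little more concrete, here is a minimal Python sketch of what one row of such a mapping might hold, and of the kind of check a QA tester could run against it. The class, field and function names are my own illustrative assumptions; they are not AnalytiX's data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMapping:
    """One source-to-target mapping row (hypothetical structure)."""
    source_system: str
    source_table: str
    source_column: str
    target_table: str
    target_column: str
    rule: str = "direct"  # name of the transformation applied, if any


def unmapped_targets(mappings, target_columns):
    """Return the (table, column) targets that no mapping populates."""
    mapped = {(m.target_table, m.target_column) for m in mappings}
    return sorted(set(target_columns) - mapped)


# Illustrative mappings; system, table and column names are made up.
mappings = [
    ColumnMapping("CRM", "customers", "cust_nm", "dim_customer", "customer_name", "trim"),
    ColumnMapping("CRM", "customers", "cust_id", "dim_customer", "customer_id"),
]

# Columns the target table is expected to contain (illustrative only).
expected = [
    ("dim_customer", "customer_id"),
    ("dim_customer", "customer_name"),
    ("dim_customer", "country"),
]

gaps = unmapped_targets(mappings, expected)
print(gaps)  # [('dim_customer', 'country')]
```

Holding mappings as structured records rather than spreadsheet cells is what makes a gap check like this a one-line query instead of a manual review.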

Another feature that is worth mentioning is that pre-existing spreadsheets can be uploaded into the tool, which is wizard driven, and I am sure many people will find this of immense value.

Business rules can be built up in the tool. These are reusable transformations; a base set are supplied and they can be configured and augmented as required. The tool allows for impact analysis, identifying who uses data, where it resides, names for data that have been used within systems etc. so all of the things that you need are there.
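As a rough illustration of what reusable transformations and impact analysis amount to, the following Python sketch keeps a small library of named rules and answers the question "which targets are fed by this source column?". Everything here, including the rule names, the column names and the functions, is a hypothetical example of the concept and not how the tool itself is implemented.

```python
# A small library of reusable transformation rules (hypothetical names).
RULES = {
    "direct": lambda value: value,
    "trim": lambda value: value.strip(),
    "upper": lambda value: value.upper(),
}

# Each mapping is (source column, target column, rule name); all made up.
MAPPINGS = [
    ("crm.customers.cust_nm", "dw.dim_customer.customer_name", "trim"),
    ("crm.customers.country", "dw.dim_customer.country_code", "upper"),
    ("crm.customers.cust_nm", "dw.rpt_mailing.addressee", "direct"),
]


def apply_rule(rule_name, value):
    """Apply a named, reusable rule to a single value."""
    return RULES[rule_name](value)


def impact_of(source_column, mappings):
    """List every target column fed by the given source column."""
    return sorted(target for source, target, _ in mappings if source == source_column)


print(apply_rule("trim", "  Smith "))  # Smith
print(impact_of("crm.customers.cust_nm", MAPPINGS))
# ['dw.dim_customer.customer_name', 'dw.rpt_mailing.addressee']
```

Because each rule is defined once and referenced by name, changing it changes every mapping that uses it, and the same table of mappings that drives the ETL also answers the impact question.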

Other features of note are that the tool, in edit mode, locks out other users as you map complex business rules, so you avoid conflicts. The self-documentation feature ensures that you always have a base level of documentation that is version controlled, auditable, and ticks most of the boxes that I can think of that I would require. Testing notes can be attached to the mappings. Supporting documentation like Visio diagrams can be stored in the repository attached to the relevant element. There is a workflow system to ensure that all the relevant bodies must provide approval before a document can advance to be used. Finally from my list of notable features, the system can record and report on planned versus actual level of effort and provide status reports, giving more oversight than is the case on most projects, and without the need to take big chunks out of the working week to produce it.

I was so impressed with the tool that I spoke with executive leadership at AnalytiX about what is next on the roadmap for it. They explained they have introduced a new licensing model: “Free” for the first user of the tool. They have extended the integration with ETL tools and, with version 4 (due out early June 2012), they will introduce the world’s first ETL conversion tool, which will reverse engineer ETL jobs, loosely coupled via XML, convert them into mappings in the Mapping Manager, and then export them back out as an XML file in the format required by the chosen ETL tool (agnostically) to auto-generate the ETL jobs.
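The general principle of exchanging mappings through XML can be sketched in a few lines of Python using the standard library. I should stress that the element and attribute names below are invented for illustration; they are not the XML format of AnalytiX or of any particular ETL tool, each of which defines its own schema.

```python
import xml.etree.ElementTree as ET


def mappings_to_xml(mappings):
    """Serialise a list of mapping dicts to an XML string (made-up schema)."""
    root = ET.Element("mappings")
    for m in mappings:
        ET.SubElement(root, "mapping", source=m["source"], target=m["target"], rule=m["rule"])
    return ET.tostring(root, encoding="unicode")


def xml_to_mappings(xml_text):
    """Parse the same made-up schema back into a list of mapping dicts."""
    root = ET.fromstring(xml_text)
    return [dict(element.attrib) for element in root.findall("mapping")]


# Illustrative mappings; the column names are hypothetical.
original = [
    {"source": "crm.customers.cust_nm", "target": "dw.dim_customer.customer_name", "rule": "trim"},
    {"source": "crm.customers.cust_id", "target": "dw.dim_customer.customer_id", "rule": "direct"},
]

xml_text = mappings_to_xml(original)
round_tripped = xml_to_mappings(xml_text)
print(round_tripped == original)  # True
```

A real conversion would need one reader and one writer per ETL tool, with the mapping repository acting as the neutral format in the middle.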

I am very impressed by this tool; it is just what is required, and it deserves to be adopted by everyone who works with data. How this has not been addressed before is a mystery, but now we have a great solution and I can heartily recommend it.