Data migration snippets

We have recently completed our 2011 survey of the data migration market. I have not analysed all the data yet and it will be some time before we are ready to publish a full report. In the meantime, here are a few highlights.

In the previous survey, in 2007, 84% of projects were running over time or budget. Today that figure is 38%. The two surveys did not talk to the same users so this is not a direct comparison. What is also noticeable is that in 2007 only 10% of projects involved the use of data profiling tools; now that figure is over 70%. Similar figures apply to data cleansing. Also, more than 90% use a formal methodology, compared to some 70% in 2007, and those methodologies are much more likely to be independently derived rather than cobbled together in-house. While we cannot prove a relationship between the increased use of tools and reduced overruns, it would be surprising if there were no connection.
We asked respondents to rate their three most critical success factors. By far, the most commonly cited was business engagement. This only confirms that data migration is a business issue.
The most commonly cited reason for overruns was “poor data quality” or “lack of visibility into data quality issues”, presumably from those companies still not using tools.
The most commonly cited reason for the project was legacy migration (27%). On this topic there was an interesting presentation at the recent DMM4 (Data Migration Matters 4) conference in London, in which Bull discussed some work it had done for the State of Oklahoma. The State wanted to migrate from a mainframe-based navigational database to a relational database (in this case PostgreSQL) on a non-mainframe platform. Bull used Rever’s toolset to migrate the existing COBOL code so that it runs directly, and without change, against the PostgreSQL database. The next phase will be to re-develop the application software. This sort of two-stage process is good practice for large legacy migrations.
60% of all projects involved archival as well as migration.
44% of projects involved sensitive data. 10% of companies simply ignored the issues (naughty, naughty—we know who you are) and are opening themselves up to potential fines. A similar percentage used formal masking tools, which means that most used hand coding. I don’t recommend this: it is too easy to miss data that you should de-sensitise, to use inappropriate methods or to break relationships when they need to be maintained for testing purposes.
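To illustrate the point about relationships: if the same customer identifier is masked differently in two tables, the joins you need for testing stop working. The following is a minimal sketch of one way round this, deterministic pseudonymisation with a keyed hash, and not a substitute for a formal masking tool. The key, table contents, column names and helper functions are all hypothetical and invented for illustration.

```python
import hashlib
import hmac

# Hypothetical secret key, for illustration only; a real key would be
# generated securely and kept out of source control.
SECRET_KEY = b"example-key-not-for-production"


def mask_value(value: str, length: int = 12) -> str:
    """Return a deterministic pseudonym for `value` using HMAC-SHA256.

    The same input always yields the same output, so a key masked in one
    table still matches the same key masked in another table.
    Truncating the digest keeps values short but raises the (small)
    chance of collisions; that trade-off is an assumption of this sketch.
    """
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


def mask_rows(rows, sensitive_columns):
    """Mask only the named columns in each row, leaving the rest untouched."""
    return [
        {
            col: mask_value(val) if col in sensitive_columns else val
            for col, val in row.items()
        }
        for row in rows
    ]


# Hypothetical source tables sharing the key column "customer_id".
customers = [
    {"customer_id": "C001", "name": "Alice Smith"},
    {"customer_id": "C002", "name": "Bob Jones"},
]
orders = [
    {"order_id": 1, "customer_id": "C001"},
    {"order_id": 2, "customer_id": "C002"},
    {"order_id": 3, "customer_id": "C001"},
]

masked_customers = mask_rows(customers, {"customer_id", "name"})
masked_orders = mask_rows(orders, {"customer_id"})

# Every masked order should still point at a masked customer.
known_ids = {c["customer_id"] for c in masked_customers}
orphans = [o for o in masked_orders if o["customer_id"] not in known_ids]
print("orphaned orders after masking:", len(orphans))
```

Note that this only addresses one of the three risks above. It does nothing to find sensitive columns you have overlooked, and a keyed hash may well be an inappropriate method for some data (dates or amounts that tests need to remain realistic, for example).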
Half of all companies profiled the data before estimating timelines and budgets. Of those that did, 68% brought in their projects on schedule. Of those that didn’t, the figure was 55%.
5% of projects involved migration to the cloud. I don’t know the significance of this but it is an interesting factoid. As I said at the beginning, we haven’t analysed all the data yet and the figures detailed above are provisional.

No doubt there will be more insights to be garnered once we look at the figures in detail. In the meantime I hope this is enough for you to chew on.