Where do you store master data?

Content Copyright © 2006 Bloor. All Rights Reserved.

At IBM’s recent Information on Demand conference (which was
excellent, incidentally) the company presented its view of master
data management (MDM). I am glad to say that this has advanced
significantly since its Barcelona conference in May and the company
has now recognised that you need to take a flexible approach to MDM.

The company had already appreciated that MDM needs to be treated
holistically rather than as siloed solutions but it has now
realised that different companies want to implement MDM for a
variety of different reasons. In Bloor Research’s report on MDM we
defined three such categories: analytical MDM,
whereby the emphasis is on understanding customers, products,
suppliers and so forth; synchronisation, where the focus is on
enabling data flow between applications based on unified entity
definitions; and operational MDM, where these definitions are to be
used as an SOA foundation for introducing new functional
capabilities. Of course, some companies may have more than one of these
business drivers underpinning their use of MDM.

IBM has now adopted a similar model, although it refers to
analytical, operational and collaborative MDM. The last of these
is about promoting collaborative authoring environments, while IBM
uses “operational” where we would use
“synchronisation”. Alongside this more flexible
understanding the company is also now more aware of the fact that
if you are not going to do analytics against your master data then
you are unlikely to need a hub-based approach. As a result, IBM is
also now being more proactive in explaining how you can use its
solutions within a registry or repository-based environment.

So, good marks all round for IBM.

However, this brings me to the title of this article. I suspect
that there is a sort of assumption that all master data will be
stored within your data warehouse. This view has been fostered by
the hub-based approach espoused by the likes of Oracle, SAP and
still, to a certain extent, by IBM. But does this approach make sense?

Clearly, if you want to calculate customer lifetime value, for
example, then it makes sense to hold the relevant master data in
your warehouse, because this is exactly the sort of analytic
function for which it was designed. But does this still apply if
you only want one of the other styles of MDM? In this case, the
only queries you are going to be running against the master
data are look-up queries. Moreover, you are probably going to be
running a lot of such queries. Is the warehouse the right place to
support such functionality?

I am inclined to think that the answer to this question is no.
It may be convenient to put master data in the warehouse but I am
not sure that this is the most efficient or cost-effective
approach: wouldn’t it be better to have a dedicated database
optimised for this purpose? Further, if that is a reasonable
proposition then, in a scenario that combines analytical MDM with
either or both of the other approaches, would it be better to still
have a separate MDM server and then replicate that data into the
warehouse (or federate it) for analysis rather than simply
relying on the warehouse?

I am not saying that I know the answers to these questions but I
don’t think that this is an issue that has been much discussed, and
it needs to be.