Privacy Analytics is a Canadian company that specialises in what it calls the de-identification of data. You and I might call this data masking. The company has been in the market for some time (eight years) and has historically specialised in healthcare, though it is now expanding into other areas.
What's special about healthcare is that the patient data collected by hospitals and health authorities is also used for research purposes. Researchers want to know how a particular treatment regime or diagnostic technique works on different segments of the patient population. To support this, researchers will often need to know the age of the patient, the area within which he or she lives, and other information that would normally be masked because it is personally identifiable. In other scenarios you might simply mask all of this information, but then researchers could not do their work. So Privacy Analytics needed to be able to satisfy compliance standards such as HIPAA while still supporting research. As a result, the company built its product around two components: a risk monitor and a de-identification tool. The Privacy Analytics Risk Monitor measures the level of re-identification risk contained in the data, and the de-identification tool applies the appropriate amount of alteration to the data. The more you mask, the lower the risk. The key question is what level of risk is safe from a privacy standpoint while still allowing researchers to do their job.
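To make the trade-off between masking and risk concrete, here is a minimal Python sketch. It is not Privacy Analytics' method, which is not described here. It assumes a common textbook measure: records that share the same values for age and area form a group, and the worst-case re-identification risk is one divided by the size of the smallest group. The function names, the field names and the four toy records are all invented for illustration.

```python
from collections import Counter

def max_reid_risk(records, quasi_identifiers):
    """Worst-case risk: 1 / size of the smallest group of records
    that share the same quasi-identifier values (assumed measure)."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    return 1 / min(class_sizes.values())

def generalise(record):
    """Hypothetical masking step: coarsen age to a ten-year band
    and keep only the first three characters of the postcode."""
    decade = record["age"] // 10 * 10
    return {
        **record,
        "age": f"{decade}-{decade + 9}",
        "postcode": record["postcode"][:3],
    }

# Toy records, made up for this example.
records = [
    {"age": 34, "postcode": "K1A0B1", "diagnosis": "asthma"},
    {"age": 37, "postcode": "K1A0C2", "diagnosis": "diabetes"},
    {"age": 52, "postcode": "M5V2T6", "diagnosis": "asthma"},
    {"age": 58, "postcode": "M5V3L9", "diagnosis": "hypertension"},
]
QUASI_IDENTIFIERS = ["age", "postcode"]

# Unmasked: every record is unique on (age, postcode).
raw_risk = max_reid_risk(records, QUASI_IDENTIFIERS)

# Masked: records fall into two groups of two.
masked = [generalise(r) for r in records]
masked_risk = max_reid_risk(masked, QUASI_IDENTIFIERS)

print(f"risk before masking: {raw_risk}")
print(f"risk after masking:  {masked_risk}")
```

In this toy case the risk falls from 1.0 to 0.5 once ages and postcodes are coarsened, yet a researcher can still see the age band and the broad area. Whether 0.5, or any other figure, is an acceptable threshold is exactly the judgement the article describes.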
So far so good, but late last year Privacy Analytics unbundled its Risk Monitor from its de-identification software so that each can be licensed separately. This means that you can use the Risk Monitor to measure the risk associated with conventional data masking practices. And, perhaps more importantly, it means that you can license Risk Monitor to be used in conjunction with third-party data masking tools. This, it seems to me, is much more significant, not least because I don't know of any other data masking vendor that can offer risk monitoring.
So, why - or where - is risk monitoring important? The most obvious answer is not testing and development, where data masking is most commonly deployed, but the work done by data scientists. Data scientists, whether they use self-service data preparation tools or not, do things like create customer segments for marketing purposes, using a variety of statistical and analytical algorithms so that they can make predictions about next best action or customer churn. They are, in fact, analogous to researchers in healthcare, in the sense that they are researching customer behaviour. They, too, will not normally be allowed to see personally identifiable information, yet they need to see enough information to do their jobs. Using Risk Monitor means that they can verify that their data masking practices are compliant with legal requirements, and the software can also generate appropriate documentation to prove this from an auditing perspective.
As it happens, I have talked to a number of data preparation vendors about data masking and, although one or two have it on the roadmap, it does not seem to be a priority. What they really need is risk monitoring and the ability to ensure that they minimise the risk of re-identification.
No doubt there are other use cases for Risk Monitor, but readers will know that data preparation is a particular interest of mine at present (a new Market Update on the subject is already in progress), hence my focus on this area. In any case, I think the availability of Risk Monitor is a significant step forward for the data masking market as a whole.