I met someone at a recent CMSG meeting who suggested that one aspect of the Big Data opportunity was about to hit major problems: data is being collected for targeted marketing to individuals, and no-one is paying attention to the EU data protection implications. Philip Howard, however, points out that this isn't really a Big Data issue but something that needs to be considered by the Business Intelligence and Analytics applications further down the line.
OK, that makes sense; but reading what lawyers are saying about emerging EU data protection legislation makes me wonder whether that is the whole story, from a governance rather than from an IT point of view.
Looking at the current UK Data Protection Act (DPA) guidance, it seems likely that, whatever Big Data is, if it contains "personal data" and is being "recorded with the intention that it should be processed by means of such [computer] equipment", then it is subject to the DPA provisions even if it isn't actually being processed yet. And why would you go to the expense of collecting and managing this data if you don't intend to use it for analysis and decision support?
According to an article on Out-Law.com, a Pinsent Masons online legal news service, "individuals' consent is 'almost always' required by firms when using personal data in big data projects centred on profiling" and "such consent should be required, for example, for tracking and profiling for purposes of direct marketing, behavioural advertisement, data-brokering, location-based advertising or tracking-based digital market research." Out-Law.com is reporting on an EU working party report that is available in full here.
Now, is implied consent from signing up to a loyalty card, say, or signing a 28-page document no-one reads, sufficient? That's a question for the courts, perhaps, rather than for an IT specialist. And since 'data subjects' can request access to their personal data and even have it changed if necessary, have Big Data projects ensured that such requests can be actioned efficiently and effectively, or is this an unanticipated expense that will bite once the project goes live?
One risk is that disgruntled customers use data protection legislation as a weapon against a company, and the regulations have increasingly serious teeth. A whole range of new penalties is being proposed: anticipated future sanctions include fines starting at 0.5% of annual worldwide company turnover for minor breaches and rising to 2% of annual worldwide turnover for intentional or negligent breaches. So having to satisfy requests (malicious or not) from data subjects may be a significant cost, as may being prosecuted, to say nothing of the reputation risk involved.
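To put those percentages into project-planning terms, here is a rough back-of-envelope sketch. It assumes the proposed 0.5% and 2% rates quoted above apply as flat percentages of annual worldwide turnover, which is my simplification rather than anything the proposals spell out. The function name `fine_exposure` and the turnover figure are purely illustrative.

```python
from decimal import Decimal

# Rates as quoted in the article for the proposed (not yet enacted) sanctions.
MINOR_BREACH_RATE = Decimal("0.005")    # 0.5% of annual worldwide turnover
SERIOUS_BREACH_RATE = Decimal("0.02")   # 2% for intentional or negligent breaches


def fine_exposure(annual_worldwide_turnover):
    """Return indicative fine exposure for a given annual worldwide turnover.

    Assumption: each rate is applied as a flat percentage of turnover.
    """
    turnover = Decimal(annual_worldwide_turnover)
    return {
        "minor": turnover * MINOR_BREACH_RATE,
        "serious": turnover * SERIOUS_BREACH_RATE,
    }


# Hypothetical company with a worldwide turnover of 500 million.
exposure = fine_exposure(500_000_000)
print(f"Minor breach:   {exposure['minor']:,.0f}")
print(f"Serious breach: {exposure['serious']:,.0f}")
```

For this hypothetical turnover of 500 million, that is 2.5 million at the lower rate and 10 million at the higher one, which is the sort of number that belongs in the ROI calculation alongside the hardware and software costs.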
Is collecting large amounts of data and personal IDs (potentially allowing the identification of individuals and their buying patterns), even in advance of actually having any BI or analytics systems processing it, a DPA risk? Well, ask a lawyer, don't ask me—or even (probably) your IT group—but it might be good to find out in advance of doing it. And the answer might affect the anticipated ROI for the project.
I don't want to scaremonger, and there are exclusions: "where businesses engage in big data projects that involve trying to 'detect trends and correlations' from personal information, they may not require individuals' consent to process their data for that purpose providing they put in place certain safeguards". However, do people in your organisation know what these safeguards are, and have they been costed in? They probably include ensuring that the information is kept confidential and secure and that "all necessary technical and organisational measures" have been taken to ensure that "this 'functional separation' of the data [thus preventing identification of individuals] is maintained". How would you prove in court that you've done this?
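For the IT group, one way to picture what 'functional separation' might look like in practice is the sketch below. It is only an illustration of one possible technical measure (keyed pseudonymisation, with the identifying details held apart from the analysis data) and not a statement of what the working party or a court would accept as sufficient. The field names (`customer_id`, `name`, `basket`), the function names and the key are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(customer_id: str, secret_key: bytes) -> str:
    """Derive a stable token from a customer ID using a keyed hash (HMAC-SHA256).

    Without the key, the token cannot be recomputed from a known customer ID.
    """
    return hmac.new(secret_key, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()


def split_record(record: dict, secret_key: bytes):
    """Split one raw record into an identity row and an analytics row.

    Assumption: the identity row is stored separately, under tighter access
    control, and only the analytics row reaches the trend-analysis systems.
    """
    token = pseudonymise(record["customer_id"], secret_key)
    identity_row = {
        "token": token,
        "customer_id": record["customer_id"],
        "name": record["name"],
    }
    analytics_row = {"token": token, "basket": record["basket"]}
    return identity_row, analytics_row


# Hypothetical key and record, for illustration only.
key = b"example-key-kept-outside-the-analytics-environment"
raw = {"customer_id": "C-1001", "name": "A. Customer", "basket": ["tea", "biscuits"]}
identity, analytics = split_record(raw, key)
print(analytics)
```

The analytics row still supports trend and correlation work, because the same customer always maps to the same token, but it carries no name or customer ID. Whether that, plus the organisational controls around the key and the identity table, would satisfy the safeguards is exactly the question to put to a lawyer.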
If you are embarking on a Big Data project (whatever that means—and that could be part of the problem), even a 'proof of concept' using live data, shouldn't you be looking at the governance implications, especially around EU Data Protection directives, as a matter of urgency?
In part 2 of this paper, I talk to some experts in the field about these and associated issues; and locate some more resources.