Big Data bias - An analysis of recent research from Varonis

Content Copyright © 2012 Bloor. All Rights Reserved.

Varonis has just published some interesting research into big data. Now, Varonis specialises in data governance and not just regular data governance but the governance (and security) of unstructured data. I’m all for that as far too many data governance projects are just about the quality of structured data.

Its research was conducted at InfoSec, the security exhibition recently held in London. Now, as you might imagine, this is a pretty specific audience, primarily consisting of IT security specialists, so the results of any research conducted with such a population are bound to be biased. Moreover, Varonis’ focus on governance and security means that the questions it was asking in its survey were also geared towards this market.

You wouldn’t have guessed this from the press release that the company put out. It baldly states that: “when asked how they would like to use Big Data, the respondents had clear ideas – the top three most selected applications were: finding at risk sensitive data, identifying possible malicious activity and finding users with excessive access rights.” Frankly, when I first read this my gast was flabbered. Yes of course there are applications of big data in this space but the idea that these are the top three use cases, without qualification, is ridiculous. This is why I went to the Varonis web site and downloaded the actual report, to discover what was behind this absurd assertion. And, once you realise that this was a survey of people visiting Varonis’ stand at an exhibition then it all starts to make sense; and when you see that all the applications people had to pick from were security applications then it makes even more sense.

Anyway, apart from saying that I don’t think this press release does Varonis any favours by not qualifying its survey more clearly, I will move on to the results themselves, as some of the non-security specific results are quite interesting.

To begin with, almost 60% of respondents felt that “there is a clear definition of big data and its uses for IT”. Frankly, I’m surprised it’s that high but then these are IT people so perhaps they have a better understanding than business people? Less surprising was that almost three quarters of the people surveyed rated themselves at 5 or less on a scale of 1 to 10 when it came to “awareness of and visibility into the big data products currently in the market” and 22% gave themselves a score of only 1. Given that all the hype is around Hadoop with barely a nod to Cassandra and MongoDB (for example) and no mention at all of HPCC (article forthcoming) or of graph databases, I expect that even those who think they are fully aware of the big data market (a little over 5%) are over-estimating what they know.

When asked whether “big data should be a key strategic priority for IT” 69% agreed with this statement and the more people thought they understood big data the more likely they were to agree with it, which is encouraging. However, the caveat is about big data being strategic “for IT”. When you see a question like this do you answer on the basis that IT supports the business and if social media analysis is important for the business then it’s important for IT, or are people just thinking about IT functions like security? Without getting into people’s heads I don’t know the answer to that.

Perhaps the most interesting set of results was in response to the questions as to whether “big data is a strategic initiative currently within your organisation or within 5 years?” and whether “your organisation is currently implementing big data projects?” Note that both of these questions relate to the organisation rather than IT per se. For the first question the answers were 44% (currently) and 57% (within 5 years) respectively, and for the second it was 46%. The discrepancy between 44% and 46% suggests that some of the initial projects are not yet regarded as strategic. Most notably, even small companies are investigating big data: more than 20% of them have existing projects.

So, some interesting statistics, albeit with some caveats.