Big Data is getting the IT industry very excited, but I see a lot less excitement among business people. Part of the problem is that Big Data does not mean a lot to most people. The other part is that most discussions centre on Hadoop clusters and MapReduce functions, at which point I would guess that most people outside IT have already switched off.
I am also of the opinion that far too much of this is technology for the sake of technology. That is partly because many of the big vendors have little to offer beyond a basic capacity to sell big lumps of tin and wire, but do not want to be seen as laggards.
But if you get past the techie outer shell, Big Data should be on everyone's agenda. It is about branching out from the traditional IT world of structured data, stored and accessed through rigid key structures, to include all forms of data: text, speech and video. In fact, if you can communicate it electronically, it can be captured and analysed.
This matters because it should allow companies to start to understand what their customers really think, want and are prepared to pay for. The killer app is a precise understanding of customer and market behaviour within a vertical market: really knowing what is good behaviour and what is bad, and why it is occurring. But to do that you have to be able to find what you are looking for.
So far, most of what I have read suggests that you will use search, just as you would in Google, Yahoo or Bing, to find areas of interest in the unstructured data and then, having found one, swap to standard BI tools to drill down and explore the structured data. That seems to me to miss the point. What we need are simple, intuitive interfaces that everyone can use, i.e. search that looks at all of the data, structured and unstructured, and allows it to be explored together. But how could you do that?
I have now seen Endeca Latitude, and for the first time I have seen something that really hits the mark. The user can put in a search term and it will bring back all of the related references in both the structured and the unstructured data. It does that by means of an index layer in the background that has worked out what is related to what, in what sequence, with what frequency, and so on.
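To make the idea of an index layer a little more concrete, here is a minimal sketch in Python of a single index built over both structured records and free text. It is purely illustrative and is not how Endeca Latitude is implemented: the sample data and the function names `tokenize`, `build_index` and `search` are all assumptions, and the sketch only tracks which terms occur where, not sequence or frequency.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", str(text).lower())


def build_index(records, documents):
    """Map each term to the set of items (rows or documents) that contain it."""
    index = defaultdict(set)
    # Structured data: index every field value of every row.
    for row_id, row in records.items():
        for value in row.values():
            for term in tokenize(value):
                index[term].add(("record", row_id))
    # Unstructured data: index every word of every document.
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(("document", doc_id))
    return index


def search(index, query):
    """Return the items that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical sample data: two order rows and two pieces of free text.
records = {
    1: {"customer": "Acme Ltd", "product": "Router X200", "status": "returned"},
    2: {"customer": "Bolt plc", "product": "Switch S10", "status": "delivered"},
}
documents = {
    "call-17": "Customer says the router X200 keeps dropping the connection.",
    "email-4": "Very happy with the switch, delivered on time.",
}

index = build_index(records, documents)
hits = search(index, "router x200")
```

With this sample data, one query for "router x200" returns both the order row and the call note, which is the point: the user does not need to know which kind of store the answer lives in.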
They call this Agile BI, as it is flexible, responsive and very intuitive. It is a somewhat misleading term, but you have to start somewhere when describing something that is quite radically different and unbelievably effective.
The real key to the agility is that search terms do not have to be a precise match. You enter a term as used in common parlance and the software will find all of the links, reflecting not a structured key sequence but their real-life occurrences; in other words, it is fuzzy matching. You can imagine marketers, call centre operatives and line-of-business managers suddenly being able to find the gems of information held in the mountains of corporate data, gems previously hidden from view by the limitations of traditional BI and its use of highly structured keys and schemas to address something that is far less structured in the real world.
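As a rough illustration of what fuzzy matching means in practice, the sketch below uses Python's standard `difflib` module to map a misspelt search term onto the closest known term. This is a stand-in technique chosen for brevity, not a description of Endeca's matching; the vocabulary, the `fuzzy_lookup` helper and the 0.75 cutoff are all assumptions.

```python
import difflib


def fuzzy_lookup(term, vocabulary, cutoff=0.75):
    """Return known terms that are close to the (possibly misspelt) term."""
    # get_close_matches ranks candidates by similarity, best match first,
    # and drops anything scoring below the cutoff.
    return difflib.get_close_matches(term.lower(), vocabulary, n=5, cutoff=cutoff)


# Hypothetical vocabulary, e.g. the terms already present in a search index.
vocabulary = ["router", "switch", "delivered", "returned", "connection"]

matches = fuzzy_lookup("conection", vocabulary)
no_matches = fuzzy_lookup("zzzz", vocabulary)
```

Here the misspelt "conection" still resolves to "connection", while a term with nothing similar in the vocabulary returns an empty list. A real product would also need to cope with synonyms and everyday phrasing, which simple character similarity does not cover.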
I am not going to pretend that Endeca is yet a complete solution. I believe that it can handle quite large volumes, but not the enormous ones that will soon be commonplace. Even so, it is the nearest thing yet to getting there.