IBM JSON

Readers will be aware that there has been some confusion about the support for JSON (JavaScript Object Notation) documents in the latest release of DB2.

When I was pre-briefed on this back in October last year, I had understood that JSON was going to be supported via a separate storage engine, as is the case with the XML and graph stores in DB2. I was then informed that this was not the case, and that JSON documents were going to be stored in tables. I was not impressed by this, as the implication was that the documents would need to be shredded (and then un-shredded on retrieval), and avoiding that is precisely why an XML storage engine, as in DB2, is so much better than storing XML in tables.

However, on further investigation, this turns out not to be the case: JSON documents are not shredded. What in fact happens is that the name-value pairs are compiled into a binary format and then stored in a single column of a table. This enables such things as predicate evaluation, in addition to the searching that you can do thanks to the indexes that are created.
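To make the distinction from shredding concrete, here is a minimal sketch in Python of the general idea: a document's name-value pairs are packed into one binary value, that value is stored whole in a single column, and predicates are evaluated against the decoded pairs. It uses an in-memory SQLite database purely as a stand-in. The length-prefixed encoding, the `docs` table and the functions `encode_doc`, `decode_doc` and `find` are my own illustrative assumptions and not IBM's actual format or API; the sketch handles flat documents with string and integer values only and does not model the indexes.

```python
import sqlite3
import struct

def encode_doc(doc):
    """Pack a flat dict into bytes (hypothetical layout, not DB2's real format)."""
    out = bytearray()
    for name, value in doc.items():
        name_b = name.encode("utf-8")
        if isinstance(value, int):
            tag, value_b = b"i", struct.pack(">q", value)  # 64-bit signed integer
        else:
            tag, value_b = b"s", str(value).encode("utf-8")
        # name length, name, type tag, value length, value
        out += struct.pack(">H", len(name_b)) + name_b
        out += tag + struct.pack(">I", len(value_b)) + value_b
    return bytes(out)

def decode_doc(blob):
    """Inverse of encode_doc: walk the bytes and rebuild the dict."""
    doc, pos = {}, 0
    while pos < len(blob):
        (name_len,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        name = blob[pos:pos + name_len].decode("utf-8")
        pos += name_len
        tag = blob[pos:pos + 1]
        pos += 1
        (value_len,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        raw = blob[pos:pos + value_len]
        pos += value_len
        doc[name] = struct.unpack(">q", raw)[0] if tag == b"i" else raw.decode("utf-8")
    return doc

def find(conn, predicate):
    """Evaluate a predicate against every stored document (full scan, no index)."""
    rows = conn.execute("SELECT body FROM docs").fetchall()
    return [d for d in (decode_doc(r[0]) for r in rows) if predicate(d)]

conn = sqlite3.connect(":memory:")
# One row per document; the whole document lives in the single 'body' column.
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body BLOB)")
for doc in ({"name": "Ann", "age": 41}, {"name": "Bob", "age": 29}):
    conn.execute("INSERT INTO docs (body) VALUES (?)", (encode_doc(doc),))

older = find(conn, lambda d: d["age"] > 30)
```

The point of the sketch is that the document goes in and comes out as one unit: nothing is spread across multiple relational columns or tables, so there is no un-shredding step on retrieval.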

In terms of how you access and retrieve documents, IBM has built it so that it looks like MongoDB, which has pretty much been the preferred big data option for JSON documents up until now (though MarkLogic is also making a push for this market). But, of course, DB2 offers ACID compliance, recovery and so on (as does MarkLogic).
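For readers who have not used MongoDB, its look and feel comes down to the fact that a query is itself a JSON-like document. As an illustration only, here is a toy matcher in Python for that query style. The function `matches` and its small operator subset (plain equality plus `$gt`, `$lt` and `$in`, which are genuine MongoDB operator names) are my own hypothetical sketch, not IBM's or MongoDB's API, and it simplifies by treating a missing field as a non-match.

```python
import operator

# Hypothetical subset of comparison operators, keyed by MongoDB-style names.
OPERATORS = {
    "$gt": operator.gt,
    "$lt": operator.lt,
    "$in": lambda value, options: value in options,
}

def matches(doc, query):
    """Return True if every condition in the query document holds for doc."""
    for field, cond in query.items():
        if field not in doc:
            return False  # simplification: a missing field never matches
        value = doc[field]
        if isinstance(cond, dict):
            # Operator form, e.g. {"age": {"$gt": 30}}
            if not all(OPERATORS[op](value, arg) for op, arg in cond.items()):
                return False
        elif value != cond:
            # Plain form, e.g. {"city": "York"}, means equality
            return False
    return True

docs = [
    {"name": "Ann", "age": 41, "city": "Leeds"},
    {"name": "Bob", "age": 29, "city": "York"},
    {"name": "Cy", "age": 35, "city": "York"},
]
query = {"city": "York", "age": {"$gt": 30}}
found = [d["name"] for d in docs if matches(d, query)]
```

Here `found` contains only Cy, the one document that satisfies both conditions. An operator outside the three listed raises a `KeyError` in this sketch.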

This JSON capability is currently in “technical preview”, which means that anyone can use it. You might call it gamma, as opposed to beta. It will probably be launched formally in the autumn, and it will be available for both distributed and mainframe platforms, though not at the same time (probably with a two- or three-month interval between them).