NoHow is the antidote to NoSQL

There has been a lot happening with respect to declarative programming recently: that is, telling the computer system what to do but not how to do it (hence NoHow). The classic case is, of course, SQL, and in the NoSQL world there have been a number of Hadoop-based SQL initiatives from Cloudera, Hortonworks, IBM and so on. Recent events, however, are arguably more interesting.
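To make the what-versus-how distinction concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. The orders data and both function names are invented for illustration; the point is only that the first function spells out the procedure while the second states the result wanted and leaves the procedure to the SQL engine.

```python
import sqlite3

# Hypothetical sample data: (customer, amount) pairs.
orders = [
    ("alice", 30.0),
    ("bob", 12.5),
    ("alice", 7.5),
]


def totals_imperative(rows):
    """Spell out HOW: iterate over the rows and accumulate by hand."""
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0.0) + amount
    return totals


def totals_declarative(rows):
    """State WHAT is wanted; the SQL engine decides how to compute it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()
```

Both functions return the same totals per customer; only the second leaves the engine free to choose scan order, grouping strategy and so on.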

First, there was a book: “The Algebra of Data” by our erstwhile founder, Robin Bloor, and Dr Gary Sherman of Algebraix. The book delves into a lot of set theory but the nub of the argument is that you need an algebra (upon which you can build an appropriate declarative language) that will work with all sorts of data, not simply data that is constrained to live in tables that conform to the various normal forms. From a mathematical point of view, I entirely agree. From a practical perspective I am not sure that this will make any difference to the world at large: I am not convinced that the authors are correct in saying that our industry will move in the direction of the algebra outlined. Perhaps it should, but experience suggests that it won’t. Nevertheless, it is interesting stuff and worth a read if you are mathematically inclined.

Anyway, enough of that. The next interesting thing is that Couchbase has introduced N1QL (pronounced “nickel”). This is, in effect, SQL for JSON and it is mostly SQL-92 compliant. Not entirely, of course, because it lacks table expressions (there are no tables) and because it adds new syntax, specifically NEST and UNNEST. The company has called it N1QL because the N1 refers to “not first normal form”: in other words, nested structures (documents) are supported. One notable feature of N1QL is that it supports joins.
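For readers who have not met UNNEST, the following Python sketch imitates what it does conceptually: it flattens a nested array so that each element is paired with its parent document, much like a join between a document and its own children. This is not Couchbase code; the customers data, the field names and the unnest helper are all made up for illustration, and the query in the comment is only roughly the shape such a statement takes.

```python
# Hypothetical JSON-like documents with a nested "orders" array.
customers = [
    {"name": "Alice", "orders": [{"item": "book"}, {"item": "pen"}]},
    {"name": "Bob", "orders": [{"item": "lamp"}]},
    {"name": "Carol", "orders": []},
]


def unnest(documents, array_field):
    """Yield one (parent, element) pair per element of the nested array.

    Documents whose array is empty or missing produce no pairs, which
    mirrors inner-join behaviour.
    """
    for doc in documents:
        for element in doc.get(array_field, []):
            yield doc, element


# Roughly the shape of the declarative equivalent (illustrative only):
#   SELECT c.name, o.item FROM customers AS c UNNEST c.orders AS o
rows = [(c["name"], o["item"]) for c, o in unnest(customers, "orders")]
```

Here rows ends up with one entry per order rather than one per customer, and Carol, who has no orders, does not appear at all.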

Also just announced: Neo4j is open sourcing its Cypher language (the OpenCypher project) with a view to it becoming the standard declarative language for graph databases. The potential competitor to Cypher is SPARQL. However, SPARQL was designed for RDF databases, so I need to explain the difference between an RDF database and a graph database. Put simply, an RDF database consists of entities and relationships. Period. In a graph database, relationships can have properties (attributes) and entities (at least in Neo4j) can have labels, where a label applies context to an entity (for example, Ronald Reagan the president, Ronald Reagan the actor, Ronald Reagan the person). SPARQL doesn’t have any syntax for handling things like properties and labels, so anybody who wants to offer property graph capabilities (and that is the direction of the trend) together with a declarative language has three options: significantly extend SPARQL beyond the current standard, use Cypher or, of course, build their own.
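The difference between the two models can be sketched with a few lines of Python. This is a toy illustration, not how any RDF store or Neo4j represents data internally: the Node and Relationship classes, the identifiers and the nodes_with_label helper are all invented for the example.

```python
from dataclasses import dataclass, field

# RDF style: every fact is a bare (subject, predicate, object) triple.
triples = [
    ("RonaldReagan", "heldOffice", "PresidentOfTheUnitedStates"),
]


@dataclass
class Node:
    """Property graph entity: a name plus any number of labels."""
    name: str
    labels: set = field(default_factory=set)


@dataclass
class Relationship:
    """Property graph relationship: it can carry its own properties."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# One entity, several labels giving it context.
reagan = Node("Ronald Reagan", labels={"Person", "Actor", "President"})
office = Node("President of the United States", labels={"Office"})

# The relationship itself records when it applied.
held = Relationship(reagan, "HELD", office, {"from": 1981, "to": 1989})


def nodes_with_label(nodes, label):
    """Return the nodes carrying the given label."""
    return [n for n in nodes if label in n.labels]
```

The triple has nowhere to put the dates or the labels without adding further triples, whereas the property graph attaches them directly to the relationship and the entity. That extra structure is what a query language for property graphs needs syntax for.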

The interesting thing about this is that, wherever you look, vendors are moving towards declarative languages. What’s more, SQL is the foundation upon which they are all building (Cypher uses SQL syntax where that makes sense). The NoSQL movement is finally learning that NoHow makes a lot more sense.