NoSQL Databases

Originally, a NoSQL database was conceived as a database that was not relational and did not support SQL. Practice has shown that people want SQL, so the “No” in NoSQL now stands for “not only”. Thus, a NoSQL database is one that does not store data in relational tables (either row-based or columnar). Unfortunately, this definition is not especially useful: it means that databases such as Adabas or IMS, not to mention object-oriented and XML databases (to give just a couple more examples), are technically NoSQL databases.

Leaving this question aside, it is perhaps better to define what NoSQL databases are rather than what they are not. The database architectures that are commonly thought of as falling under the NoSQL banner are key-value stores, document stores and column stores (not to be confused with column-based relational databases). In addition, graph and RDF databases may be considered NoSQL databases, but they have some very distinct characteristics that differentiate them from other NoSQL products, and Bloor Research treats them as a separate subject in their own right. There are also hybrid databases that share features of different NoSQL models, such as document-graph stores.

There are NoSQL databases targeted at batch-based analytics, real-time analytics and transaction processing, so it is not possible to generalise about relevant use cases, except to say that products aimed at transaction processing will be ACID compliant and the others won’t be. In this context, note also that many NoSQL databases are “eventually consistent”: this is not good enough for true OLTP environments.
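To make the point about eventual consistency concrete, here is a minimal sketch in Python. It is a toy model rather than the behaviour of any particular product: the class name, the two-replica layout and the explicit `replicate()` step are all assumptions made for illustration. It shows how a read served from a lagging replica can return a stale value, which is exactly what a transactional system cannot tolerate.

```python
class EventuallyConsistentStore:
    """Toy two-replica store (hypothetical): writes land on the primary
    and reach the secondary only when replicate() is called."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = []  # writes not yet copied to the secondary

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def read_from_secondary(self, key):
        # A read routed to the secondary may not see recent writes.
        return self.secondary.get(key)

    def replicate(self):
        # Apply the queued writes in order, then clear the queue.
        for key, value in self.pending:
            self.secondary[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("balance:alice", 100)
store.replicate()

store.write("balance:alice", 0)  # the account is emptied on the primary
stale = store.read_from_secondary("balance:alice")  # replica has not caught up

store.replicate()
fresh = store.read_from_secondary("balance:alice")  # replicas have converged
```

Between the second write and the second `replicate()` call, the secondary still reports the old balance. For analytics that may be acceptable; for OLTP it could mean approving a payment against money that is no longer there.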

One feature that characterises what most people think of as NoSQL databases (but excluding a number of graph database products) is that they are designed to scale out rather than up. That is, they run across distributed, usually low cost, clustered environments. If you want more capacity you add new nodes rather than expanding a particular server. Commonly this process involves sharding, which is the way that you distribute data across the nodes in the cluster in order to optimise performance. However, unless additional measures are provided, sharding is only useful if you know in advance how you are going to access the data. Note that some (column-based) relational databases also use sharding: these should not be confused with NoSQL databases.
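The caveat about knowing your access paths in advance can be illustrated with a short sketch. This assumes simple hash-based sharding on a single key, which is only one of several possible schemes; `ShardedStore`, `shard_for` and the sample records are hypothetical names invented for this example and do not correspond to any vendor's API.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy cluster: each shard is an in-memory dict standing in for a node."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, record: dict) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = record

    def get(self, key: str):
        # Lookup by the shard key touches exactly one shard.
        return self.shards[shard_for(key, len(self.shards))].get(key)

    def find_by_field(self, field: str, value):
        # Lookup by any other field has to visit every shard.
        shards_scanned = 0
        matches = []
        for shard in self.shards:
            shards_scanned += 1
            matches.extend(r for r in shard.values() if r.get(field) == value)
        return matches, shards_scanned


store = ShardedStore(num_shards=4)
store.put("user:1", {"name": "Ada", "city": "London"})
store.put("user:2", {"name": "Grace", "city": "New York"})
store.put("user:3", {"name": "Alan", "city": "London"})

by_key = store.get("user:2")
londoners, scanned = store.find_by_field("city", "London")
```

Fetching `user:2` goes straight to one shard, whereas the query by `city` scans all four. Adding a node in this naive scheme would also change `num_shards` and so remap most keys, which is one reason real products layer additional measures (such as secondary indexes or consistent hashing) on top.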

Depending on the type of NoSQL database, there are several potential benefits to be derived from using one. The first is that they run on low-cost hardware. However, the expertise and management costs of administering and programming a NoSQL deployment may exceed those of a conventional environment, to the extent that any savings are more than swallowed up.

Secondly, NoSQL databases are better suited to storing a variety of structured and unstructured data types that traditional relational environments have not supported in the past. However, the major relational vendors are adding things such as support for JSON documents, so this advantage is shrinking and may eventually disappear.

Thirdly, many (not all) NoSQL databases are schema-free. This makes them significantly more flexible when it comes to adding new types of information: you don’t have to change the schema definition.
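As a rough illustration of what schema-free means in practice, the sketch below contrasts a fixed relational table with a schemaless collection. It uses Python's built-in `sqlite3` module for the relational side and a plain list of dictionaries as a stand-in for a document store; the table, field names and helper functions are assumptions for the example, not a real product's interface.

```python
import sqlite3

# Relational side: the schema is declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, email TEXT)")


def insert_relational(row: dict) -> bool:
    """Try to insert a row; return False if it does not fit the schema."""
    columns = ", ".join(row)
    placeholders = ", ".join("?" for _ in row)
    try:
        conn.execute(
            f"INSERT INTO customers ({columns}) VALUES ({placeholders})",
            tuple(row.values()),
        )
        return True
    except sqlite3.OperationalError:
        # Raised here because the table has no such column.
        return False


# Document side: a stand-in collection that accepts any shape of record.
documents = []


def insert_document(doc: dict) -> None:
    documents.append(dict(doc))


existing_shape = {"name": "Ada", "email": "ada@example.com"}
new_shape = {"name": "Grace", "email": "grace@example.com", "loyalty_tier": "gold"}

relational_ok_existing = insert_relational(existing_shape)
relational_ok_new = insert_relational(new_shape)  # fails until ALTER TABLE is run

insert_document(existing_shape)
insert_document(new_shape)  # accepted as-is
```

The relational insert of the new shape is rejected until someone changes the schema definition, while the document collection simply stores the extra field. The trade-off is that the application then has to cope with records that may or may not carry `loyalty_tier`.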

Finally, there are specific advantages related to NoSQL databases that address the transaction processing and real-time analytics markets. In the case of the former, these are akin to the advantages of NewSQL databases: true distributed capability, a much smaller footprint, easier management and so on. With respect to real-time analytics, NoSQL databases simply do something that you can’t do with a conventional database. In practice, these sorts of solutions can be thought of as low-end streaming analytics platforms. They won’t cope with the sort of volumes (millions of events per second) that true event streaming products handle, but they will do very nicely, thank you, for tens of thousands of events per second, which is more than you could expect a comparably priced relational product to achieve.

Perhaps the biggest trend in the NoSQL space is towards Apache Spark and away from Hadoop and MapReduce (though Spark will run with the Hadoop distributed file system, among others). Spark provides significantly better performance (orders of magnitude) for some applications as well as supporting SQL, graph analytics and streaming analytics.

It has been suggested elsewhere that the term “NoSQL” will no longer be a differentiating factor by 2017, as the major database players add more capabilities to their own database products. While the NoSQL tag will no doubt remain, we agree with this view.

More generally, we have now got to the stage where vendors are starting to disappear: either going out of business altogether or being acquired by larger companies, whether to leverage the technology internally or for incorporation into their own product stack.

Everybody and his uncle is playing in the NoSQL space in one way or another but we are already starting to see consolidation. For example, Apple acquired Acunu (Cassandra based) and Experian acquired 4Store (a graph database). In both cases these were for internal use only so these are just two examples of products that have now disappeared. We expect many more to follow.

Solutions

  • Actian
  • Aerospike
  • AWS
  • Cambridge Semantics
  • Couchbase
  • DataStax
  • EsgynDB
  • InfluxData
  • MongoDB
  • N5
  • Neo4j
  • Objectivity
  • Progress
  • Redis Labs
  • Scylla
  • Sparsity
  • TigerGraph

These organisations are also known to offer solutions:

  • Accumulo
  • Berkeley DB
  • Cloudata
  • Cloudera
  • CouchDB
  • Datameer
  • Dynomite
  • Franz Inc
  • GenieDB
  • GlobalsDB
  • Hadapt
  • Hadoop
  • HamsterDB
  • HBase
  • Hortonworks
  • Hypertable
  • IBM
  • InterSystems
  • Lexis Nexis
  • MapR
  • Memgraph
  • MonetDB
  • Ontotext
  • Oracle
  • Pivotal
  • RaptorDB
  • RIAK
  • Scalien
  • Splice Machine
  • Stardog
  • Teradata
  • Tokyo Cabinet
  • Vaticle
  • Voldemort
  • Zettaset

Research

N5 Rumi

Rumi is a software platform that enables enterprises to embed rich, real-time analytical data processing directly into their transactional applications.

Graph Database (2020)

This is Bloor's fourth Market Update in this space, which discusses the state of the graph database market as of early 2020.

Amazon Neptune

In Amazon Neptune, both RDF graphs and Property Graphs are stored in a “quad” representation using a custom data model.

Cambridge Semantics AnzoGraph (2020)

AnzoGraph is a massively parallel graph database that runs on HDFS, NFS and other big data platforms and is ACID compliant.

DataStax Enterprise (DSE) (Graph Engine)

The DSE Graph Engine is a property graph that is built into DSE and leverages DSE’s capabilities for storage, search and analytics.

AllegroGraph (2020)

AllegroGraph from Franz Inc. is a semantic graph database focused on generating semantic knowledge graphs.

Grakn Core and Grakn KGMS (2020)

Grakn consists of a database, an abstraction layer and a knowledge graph, which is used to organise complex networks of data and make them queryable.

MarkLogic Data Hub Service and MarkLogic Server

MarkLogic Server is a multi-model database that can be used to store documents, relational data via tables, rows and columns, and graph data.