Incorta: Not easy to position

Incorta is one of those companies whose product does not fall into an easily defined niche. The company describes it as a “hyperconverged data warehousing platform”, but it isn’t obvious what this actually means. That is mostly because “hyperconverged” isn’t a commonly used adjective, and partly because even “data warehousing” doesn’t mean what it used to.

To take the latter point first: a data warehouse used to be the system of record for supporting analytics. Then the concept of data marts arose, so that what had been called simply a data warehouse came to be known as an enterprise data warehouse (EDW), with the general environment spanning both EDWs and marts referred to as data warehousing. And then there was Hadoop. There are direct parallels here, because both Incorta and Hadoop could be used, at least in theory, as an EDW. In most cases, however, either of them is more likely to run alongside an established EDW than to replace it, not least because of the investment that large organisations have already made in their EDWs.

The parallels with Hadoop are, however, limited, not least because Incorta is essentially about structured data (though it supports CSV files and JSON) and is not a data lake. More particularly, the secret sauce behind Incorta is hyperconvergence. What this means is that when you load data into Incorta from a database (Oracle, Db2, SQL Server, MySQL, Vertica, SAP HANA, Netezza, Teradata or Redshift), a cloud application (Salesforce, NetSuite, ServiceNow and so on) or another source (Kafka, Google BigQuery, Hive), the platform automatically captures all of the relationships that exist within and across those data sources. Exactly how it does this the company is unwilling to say, and it has refrained from filing any patents precisely to prevent anyone else finding out.

Several characteristics of Incorta derive from this. One is that the software automatically recognises and implements the security provisions defined within the source database. A second is that, in effect, Incorta acts as its own data warehouse automation tool: you don’t need to worry about things like defining a warehouse schema because that’s done for you. That in turn means that you get something much closer to a self-service environment, where ad hoc queries don’t get bogged down in IT backlogs. Thirdly, Incorta claims that its use of “direct data mapping”, which is what we are talking about here, means that you get linear scalability on join performance. And that’s a big deal. Actually, it’s a very big deal: you won’t get that from a traditional data warehouse.

There are, of course, some other relevant details, such as highly compressed columnar storage (Parquet), support for Spark, integration with BI tools such as Tableau, elastic scaling (Incorta is cloud-based) and so on. If you are looking for a cloud-based database then Incorta is certainly worth evaluating, not least because it resolves problems with ad hoc query processing that other cloud data warehousing products such as Redshift or Snowflake don’t address well.