Exasol
Update solution on July 1, 2020
Exasol is a massively parallel, shared-nothing, columnar (with compression), in-memory data warehousing solution that is available on-premises, in the cloud (AWS, Azure and GCP) or in hybrid deployments. Three editions are offered: Exasol Community Edition, a free-to-use, single-node solution supporting up to 200GB of raw data (that is, before compression) with community support; Exasol One, which also deploys on a single node but supports up to 1TB of data with standard support; and Exasol Enterprise Cluster, which is unlimited. There is also an ExaCloud private cloud offering.
In its early days the company used its own clustered operating system (based on Linux), but the product now runs on standard versions of Linux and is based on Docker containers. The company provides auto-indexing, auto-parallelisation of processes, automatic hot/cold memory management, a parallel loader, resource management and self-tuning. The last of these claims might be considered controversial, but it is borne out by customer experience. Customer-run benchmarks also bear out the TPC-H results mentioned above, with Exasol offering significantly superior performance versus Amazon Redshift, Snowflake, Oracle, Teradata and others, though it should be borne in mind that these are for specific use cases.
Figure 1 – Example of Exasol virtual schemas
A notable feature is the product’s use of virtual schemas, as illustrated in Figure 1, to support virtualised queries across multiple data sources. There are two things to note about this capability. Firstly, it is genuinely data virtualisation rather than data federation, in that it supports push-down query optimisation, with the database optimiser being aware of how to orchestrate the push-down. This is relatively rare: a lot of database vendors are adding query federation (without push-down) but not many support full virtualisation. Secondly, the range of third-party sources supported is more extensive than is typically offered by rivals, which normally limit themselves to Amazon S3 and Hadoop and don’t support environments such as MongoDB and Redis. It is also worth noting that there are facilities to create your own virtual schemas for other sources you may wish to support. These virtual schemas can also be used for loading data into Exasol.
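The distinction between federation and push-down can be illustrated with a minimal sketch. It does not use Exasol or its virtual schema API: an in-memory SQLite database stands in for a remote source, and the `orders` table and both function names are hypothetical. The point is only that push-down lets the source do the filtering and aggregation, so far fewer rows have to be transferred.

```python
import sqlite3

# Stand-in for a remote data source (an assumption for illustration only;
# a real deployment would be querying an external system).
remote = sqlite3.connect(":memory:")
remote.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
remote.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, "EU" if i % 4 == 0 else "US", float(i)) for i in range(1, 1001)],
)


def federated_sum(conn, region):
    """Federation without push-down: fetch every row, then filter and sum locally."""
    rows = conn.execute("SELECT region, amount FROM orders").fetchall()
    total = sum(amount for r, amount in rows if r == region)
    return total, len(rows)  # len(rows) = rows transferred from the source


def pushed_down_sum(conn, region):
    """Push-down: the filter and aggregate run at the source; one row comes back."""
    rows = conn.execute(
        "SELECT SUM(amount) FROM orders WHERE region = ?", (region,)
    ).fetchall()
    return rows[0][0], len(rows)


fed_total, fed_rows = federated_sum(remote, "EU")
pd_total, pd_rows = pushed_down_sum(remote, "EU")
print("same answer:", fed_total == pd_total)
print("rows transferred:", fed_rows, "versus", pd_rows)
```

Both approaches return the same total, but the first moves the whole table across the wire while the second moves a single aggregated row.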
Customer Quotes
“We are an extremely data-driven company and Exasol was a game changer for us. Queries that took hours now complete in seconds. People gain more trust in data when using it on a daily basis. Today, almost every department is relying on Exasol.”
Revolut
“With Exasol you can run anything. It handles MicroStrategy’s SQL with ease and its extensibility is impressive, we use it to run Python, Java and C++ right there in the database.”
Badoo
Figure 2 – The Exasol architecture
Figure 2 provides more explicit detail of the Exasol architecture. As far as data science is concerned, note the various languages shown. There is also an Apache Spark integrator and support for TensorFlow as well as open source integration for other languages such as Scala. All data science (and analytic) calculations are performed in-database.
Various geospatial capabilities are provided. Not shown is the support for time-series, for which the company provides a number of capabilities, including generic functions for complex windowing, cumulative sums, moving averages and so forth. These capabilities will be especially important within Internet of Things environments.
Also not shown is support for streaming services such as Kafka.
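To give a flavour of the time-series windowing functions mentioned above, the following sketch computes a cumulative sum and a three-point moving average using standard SQL window functions. It runs against an in-memory SQLite database (version 3.25 or later) purely as a stand-in, not against Exasol; the `readings` table and its values are made up for illustration.

```python
import sqlite3

# In-memory SQLite used as a stand-in engine (an assumption, not Exasol).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ts INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0), (5, 50.0)],
)

query = """
SELECT ts,
       -- cumulative sum over all rows up to and including the current one
       SUM(value) OVER (ORDER BY ts
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total,
       -- moving average over the current row and the two before it
       AVG(value) OVER (ORDER BY ts
                        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
FROM readings
ORDER BY ts
"""
results = conn.execute(query).fetchall()
for ts, running_total, moving_avg in results:
    print(ts, running_total, moving_avg)
```

For sensor-style data, the same pattern would typically add a `PARTITION BY` clause (for example, per device) so that each series is windowed independently.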
From a more technical perspective, two notable features are planned for release in mid-2020. Firstly, the company will be offering compute scalability separately from storage scalability, as an option for cloud-based deployments. Secondly, it will be introducing support for graphics processing units (GPUs) to support machine learning algorithms that can benefit from this technology.
There are several reasons why you might choose Exasol. One is the product’s performance. Another is the broader sweep of capabilities when it comes to supporting third-party storage environments, which is more extensive than most other products we have reviewed. A third reflects the product’s strengths in supporting both machine learning and conventional business intelligence and analytics (all in-database), not least through its support for time-series and geospatial processing. While the latter capabilities are by no means unique, they are lacking in a number of competing products.
In addition to these technical benefits, Exasol also claims significant cost of ownership benefits. These are based partly on performance and compression (you need less hardware to get the performance you need); partly on the self-tuning nature of the database, which reduces administrative costs; and, not least, on the fact that Exasol licenses its product by volume rather than by usage.
The Bottom Line
We have a lot of time for Exasol and we believe that it should be better known than it is. It is difficult to imagine a use case for which you should not at least be considering Exasol.