Vertica
Analyst Coverage: Philip Howard and Daniel Howard
Vertica was founded by Turing Award winner Michael Stonebraker and Andrew Palmer. It is now part of Micro Focus. Most recently, Vertica acquired Full 360 – a long-time strategic partner – to better support the (public) cloud and facilitate the creation of Vertica Accelerator, the company’s managed service offering.
Vertica operates as an independent entity within Micro Focus, with its own salesforce and support organisation, but leverages its parent company’s global reach and presence. Vertica is also embedded within a number of Micro Focus products, as well as in products from third-party vendors including Domo, GoodData, Sky IT Group, Cerner, and others.
Vertica Analytics Platform (2020)
Last Updated: 1st July 2020
The architecture of Vertica is illustrated in Figure 1, though it needs some explanation. In particular, Vertica is available in two modes: Enterprise Mode and Eon Mode. The former provides a conventional, tightly coupled environment for deployment on-premises, in the cloud (AWS, Azure and GCP) or in hybrid environments. Eon Mode, on the other hand, is available on-premises or in an AWS cloud (Google is in beta and Azure is planned) and separates compute from storage, thanks to an intermediate caching layer. This cache is essentially a mechanism for overcoming network latencies and, in on-premises deployments where this is an issue, you may wish to turn the cache off to improve performance. Eon Mode also relies on Amazon S3 storage, in conjunction with Vertica’s read-optimised storage model (as opposed to its write-optimised storage). For on-premises implementations of Eon Mode, Vertica has partnered with Pure Storage to use that company’s Fast Object Storage on FlashBlade, making it the only advanced analytics and data warehouse platform that separates compute from storage for on-premises data centres.
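To make the role of the caching layer concrete, the following is a minimal conceptual sketch in Python of a compute node that reads through a small local cache placed in front of shared object storage. It is not Vertica code: the class names `ObjectStore` and `CachedNode`, the `use_cache` switch, and the least-recently-used eviction policy are all our own assumptions, chosen only to illustrate the idea.

```python
from collections import OrderedDict


class ObjectStore:
    """Hypothetical stand-in for shared object storage such as an S3 bucket."""

    def __init__(self, objects):
        self._objects = dict(objects)
        self.reads = 0  # number of round trips made to the shared store

    def get(self, key):
        self.reads += 1
        return self._objects[key]


class CachedNode:
    """Hypothetical compute node with a local LRU cache in front of the store."""

    def __init__(self, store, capacity, use_cache=True):
        self.store = store
        self.capacity = capacity
        self.use_cache = use_cache  # assumption: the cache can be switched off
        self._cache = OrderedDict()

    def read(self, key):
        if not self.use_cache:
            # Cache disabled: every read goes straight to shared storage.
            return self.store.get(key)
        if key in self._cache:
            # Cache hit: no network round trip; mark the entry as recently used.
            self._cache.move_to_end(key)
            return self._cache[key]
        # Cache miss: fetch from shared storage and keep a local copy.
        value = self.store.get(key)
        self._cache[key] = value
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)  # evict the least recently used entry
        return value


store = ObjectStore({"a": b"1", "b": b"2", "c": b"3"})
node = CachedNode(store, capacity=2)
node.read("a")
node.read("a")      # served locally, so the store is contacted only once
print(store.reads)
```

Because compute nodes hold only cached copies in this sketch, they can be added or removed without moving the data held in shared storage, which is the practical point of separating the two.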
Customer Quotes
“It only takes one to two minutes to generate reports in Vertica, instead of three hours previously. It’s huge to be able to move that time up, to have that information about what happened yesterday and to better track sales.”
GUESS
“Currently, it (Zoined) is a strictly Cloud-based product running on Amazon Web Services, however… Vertica allows us to be multi-cloud and to offer Google, Azure, and on-premise hosting. That flexibility will be essential as we grow.”
Zoined
Regardless of the mode, Vertica is a massively parallel, columnar database with advanced compression capabilities. It supports workload isolation and uses projections to speed up the performance of frequently run queries. As illustrated in Figure 1, Vertica can also use external tables to act as a SQL engine for querying data held in other sources. Thus, you might deploy Vertica simply as a data lake engine, or you might implement it as a conventional data warehouse with federated query capabilities. Note, however, that this is data federation rather than data virtualisation, as there is no push-down optimisation of the kind you would expect from a data virtualisation vendor. Note also that projections do not apply to these external tables.
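As a rough illustration of why columnar storage, compression and projections work well together, the Python sketch below stores a small table column by column, builds a sorted copy of it (our simplified stand-in for a projection) and applies run-length encoding to one column. The function names and the sample data are hypothetical, and the encoding shown is a simplification rather than a description of Vertica’s internals.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def build_projection(rows, columns, sort_by):
    """Return a column-oriented copy of the chosen columns, sorted by one of them."""
    ordered = sorted(rows, key=lambda row: row[sort_by])
    return {name: [row[name] for row in ordered] for name in columns}


# Hypothetical sample data, in arrival order.
rows = [
    {"region": "EU", "sales": 10},
    {"region": "US", "sales": 5},
    {"region": "EU", "sales": 7},
    {"region": "US", "sales": 3},
]

unsorted_region = [row["region"] for row in rows]
projection = build_projection(rows, columns=["region", "sales"], sort_by="region")

print(run_length_encode(unsorted_region))       # four runs of length one
print(run_length_encode(projection["region"]))  # two runs of length two
```

Sorting groups identical values together, so the sorted column compresses into fewer runs, and a query that filters or groups by region can scan those runs instead of every row.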
Going beyond traditional data warehousing capabilities, Vertica supports in-database machine learning and analytics, along with support for both R and Python. Rather than list all the relevant features here, we provide a snapshot of supported capabilities in Figure 2, though it is not complete. It does not mention, for example, that Vertica supports geospatial data types or that you can create, train and deploy K-means clustering models. It is also worth noting that although Figure 2 shows that you can sample data sets using Vertica, this is merely an option. Indeed, the company regards it as a differentiator that you will rarely need to sample the data when using Vertica.
Two additional resources that are worth discussing are the Vertica Academy and Vertica Advisor. The former is a free e-learning platform (with certifications) while the latter is a tool that, amongst other things, uses machine learning to monitor and advise on query performance. It is currently only available as a part of support engagements, but the company plans to introduce further automation into the tool and embed it within the product, with capabilities that extend to cluster health and database configuration as well as various performance characteristics. While on the subject of roadmap items, it is also noteworthy that Vertica intends to introduce model management capabilities, along with support for PMML (Predictive Model Markup Language).
There are several notable reasons why you might choose Vertica. The company would claim that performance is one of these, though this will depend on the use case. We like the choice between a traditional architecture and one that separates compute from storage, and we especially like the fact that the latter is available both on-premises and in the cloud. We also appreciate that Vertica can be deployed as a data warehouse, as a SQL engine, and/or as a federation engine spanning multiple (types of) data stores. Another major strength is the in-database analytics and the depth of support for machine learning.
Also notable is Vertica’s success as an embedded database within OEM environments. This success is a testament not only to its capabilities, but also to the robustness of the product and its ease of use.
The Bottom Line
Vertica exists within a crowded market. While it has significant strengths as detailed in the previous section, none of these – with the exception of on-premises storage/compute separation – is unique. It is the combination of all of these factors that makes Vertica attractive.
Vertica Data Warehouse and Analytics Platform (2022)
Last Updated: 6th June 2022
Mutable Award: Gold 2022
Vertica is a unified analytics platform built on top of a SQL data warehouse. It offers flexible deployment options, a choice between two distinct operational architectures, extensive support for the cloud, and built-in machine learning capabilities. It is designed to handle complex use cases, large quantities of data, and high-performance requirements, and it competes across all industry verticals (although telco, finance, and tech are given particular emphasis).
Moreover, Vertica can be deployed as a self-managed or fully managed solution in on-premises, cloud, and hybrid environments, and it offers a choice of consumption models and service metrics (either usage-based or committed spend for the former, and either per TB or per node for the latter). Vertica Accelerator, in particular, is a fully managed, fully featured Vertica deployment that runs from within your own AWS account.
Customer Quotes
“It only takes one to two minutes to generate reports in Vertica, instead of three hours previously. It’s huge to be able to move that time up, to have that information about what happened yesterday and to better track sales.”
GUESS
Vertica comes in two distinct operational architectures, or modes: Enterprise Mode and Eon Mode. The former provides a conventional, tightly coupled environment for deployment on-premises, in the cloud, or in hybrid environments. Eon Mode, on the other hand, is available on-premises or in an AWS, GCP or Azure cloud, and separates compute from storage thanks to an intermediate caching layer. This cache is essentially a mechanism for overcoming network latencies and, in on-premises deployments where this is an issue, you may wish to turn the cache off to improve performance. Notably, Eon Mode generally relies on third-party cloud object storage (such as Amazon S3 or Google Cloud Storage) in conjunction with its own read-optimised storage model (as opposed to its write-optimised storage). For on-premises implementations of Eon Mode, Vertica has partnered with Dell/EMC, NetApp, Pure Storage, MinIO, H3C, and Scality to use their object storage offerings, making it one of the few – if not the only – advanced analytics and data warehouse platforms to separate compute from storage for on-premises data centres.
Regardless of the mode, Vertica is a massively parallel columnar database with advanced compression capabilities at its core. It supports workload isolation, uses projections to speed up the performance of frequently run queries, and leverages elastic autoscaling to handle dynamic workloads. It provides in-database machine learning and analytics (potentially via interactive workflows), along with support for R and Python. Support for Python, in particular, is enhanced via the VerticaPy library. It can also work with PMML and Jupyter notebooks. Bidirectional connectors for Spark and Kafka are available, and the company has partnered with Confluent in support of the latter. In addition, more than 650 built-in analytics functions are provided out of the box.
Further capabilities are listed in Figure 2, though even this does not paint a complete picture. It doesn’t mention, for example, that Vertica supports geospatial data types, or that you can create, train and deploy K-means clustering models. It is also worth noting that although Figure 2 shows that you can sample data sets using Vertica, this is merely an option. Indeed, the company regards it as a differentiator that you will rarely need to sample the data when using Vertica.
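For readers unfamiliar with what training and deploying a K-means model involves, here is a minimal, generic sketch in Python using one-dimensional data. It shows the technique itself (Lloyd’s algorithm) rather than Vertica’s in-database functions, which are not reproduced here; the function names `kmeans_1d` and `predict`, the sample points and the starting centroids are our own assumptions.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Train: run Lloyd's algorithm from caller-supplied starting centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(point - centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break  # converged: assignments can no longer change
        centroids = updated
    return centroids


def predict(point, centroids):
    """Deploy: return the index of the cluster a new value falls into."""
    return min(range(len(centroids)), key=lambda i: abs(point - centroids[i]))


# Hypothetical data with two obvious groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
model = kmeans_1d(points, centroids=[0.0, 5.0])

print(model)                # the trained centroids
print(predict(9.5, model))  # cluster index for a previously unseen value
```

The attraction of doing this in-database is that both steps run where the data already lives, so neither the training set nor the rows being scored need to be exported.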
Two additional resources worth discussing are the Vertica Academy and Vertica Advisor. The former is a free e-learning platform (with certifications) while the latter is a tool that, amongst other things, uses machine learning to monitor and advise on query performance (although it is currently only available as part of support engagements).
There are several reasons why you might want to employ Vertica as a data warehouse and analytics solution. Performance and scalability, for starters, although these will vary by use case. Cloud (and cloud migration) support is another, as is the flexibility and degree of choice offered during deployment. We are also told that Vertica boasts a relatively small footprint for large-scale deployments, resulting in a platform that’s easy to manage, even at scale. The product’s extensive in-database analytics and the depth of support it offers for machine learning form another major strength: in effect, it allows you to construct a complete data science workflow within your data warehouse.
Vertica’s success as an embedded database within OEM environments is also notable. The fact that Vertica has had significant success in this area is a testament not only to its capabilities, but also to the robustness of the product and its ease of use.
The Bottom Line
Vertica offers a data warehouse and analytics platform that can be deployed in numerous different ways. Although its extremely flexible deployment methodology is perhaps its most obvious differentiator – with Vertica Accelerator, in particular, standing out – you would be remiss not to take additional note of its extensive analytics capabilities, multi-cloud support, and high performance.