Vertica Update

Content Copyright © 2022 Bloor. All Rights Reserved.
Also posted on: Bloor blogs


Vertica is currently a part of Micro Focus, a UK public company with annual revenues of just under $3 billion and 11,000 employees. However, in August it was announced that Micro Focus itself would be acquired by OpenText, a Canadian company (listed on the NASDAQ stock market) with $3.5 billion annual revenues and 14,500 employees. The deal is expected to close in Q1 2023.

OpenText has Magellan Data Discovery, a visualization tool, so there may be some synergies with Vertica, which already partners with other analytic tool vendors like Tableau.

Vertica is a columnar database, so it is designed for analytic use cases. It separates compute from storage and can run either on-premise or on public clouds. As well as normal structured data, Vertica supports geospatial data, time series, and Kafka integration for real-time data loading. It can work with S3 storage directly, and it supports Python, built-in machine learning, and Predictive Model Markup Language (PMML). It also has an SDK and supports in-database execution of C++ code.

Data lake customers can use Vertica to access data in Parquet and ORC files, which avoids moving large amounts of data but still enables it to be queried. You can even execute a join between your database and the data lake.

The newest software release is Vertica 12. This version broadens Vertica deployment options to include hybrid, containerized, or cloud deployments, while allowing deployments to move and change as business requirements change, rather than restricting customers to only cloud, only containerized, or even only a specific cloud, as many other databases do. Vertica Accelerator is a DBaaS solution with all the functionality of Vertica that is quick to set up for new customers. It provides the convenience of SaaS with the high performance and optimization options of Vertica and no hardware markup, to keep costs reasonable and predictable.

Snowflake, Amazon Redshift and Google BigQuery all feature in Vertica’s competitive landscape. A recent win against Snowflake was reportedly because Vertica showed up to 10x faster performance when the number of concurrent users was high. Snowflake is easy to install, but expensive due to the need to keep adding servers when concurrency is high, and it offers no other optimization options. Google BigQuery also has cost/performance issues, and licenses for dev and test cost extra. By contrast, Vertica charges only for the production cluster. Dev and test clusters are assumed and included in the license.

Vertica is best suited for complex analytic workloads. Redshift is a tough competitor as it is genuinely fast but is only a SQL database, without the support for various semi-structured, geospatial, and time series data types. If you want a broader solution then you may have to add other products or services like Amazon SageMaker or Athena, each at an additional cost, to match Vertica's functionality.

Customer examples

Vertica has over a thousand customers. Around 40% of Vertica’s business is via OEM, e.g. Cisco embeds it, as do some others.

An example is Climate LLC (now acquired by Bayer), a digital agricultural company that analyses weather and soil for farmers to help increase crop yield. They integrate weather data, drone footage, soil analysis, etc. This customer started out using Postgres but hit scale issues. Many of the unusual data types had to be converted by hand, while Vertica allows automated onboarding of these semi-structured and time series formats. They use Vertica on AWS Cloud.

Another customer deployment is an AIOps application in Asia Pacific, at a company called EOITek that does analytics for banks in China. They use Vertica on-premise for fraud detection, analysing network operations. They initially used Hadoop, Spark, and ElasticSearch but found that Vertica was 10 times faster. The customer has a substantial 15 terabytes a day of log data, and Vertica has up to a 10:1 data compression ratio. Just reading the data off the disk on Spark was prohibitively slow. The customer went from using 50 nodes per client company with Spark to 10 nodes with Vertica. In another case, a customer had 278 Spark nodes and Vertica did the whole query suite on 9 nodes.
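The figures quoted above can be sanity-checked with some simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes the best-case 10:1 compression ratio applies uniformly to all 15 terabytes, which the "up to" in the quoted figure does not guarantee.

```python
def compressed_tb_per_day(raw_tb_per_day: float, compression_ratio: float) -> float:
    """Daily on-disk footprint after compression (assumes a uniform ratio)."""
    return raw_tb_per_day / compression_ratio


def node_reduction_factor(nodes_before: int, nodes_after: int) -> float:
    """How many times fewer nodes are needed after the migration."""
    return nodes_before / nodes_after


if __name__ == "__main__":
    # 15 TB/day of logs at the best-case 10:1 ratio
    print(compressed_tb_per_day(15, 10))
    # 50 Spark nodes per client company down to 10 Vertica nodes
    print(node_reduction_factor(50, 10))
    # 278 Spark nodes down to 9 Vertica nodes
    print(round(node_reduction_factor(278, 9), 1))
```

On these assumptions the log data would occupy about 1.5 terabytes a day on disk, the first customer needed five times fewer nodes, and the second roughly 31 times fewer.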

Another case study is Nimble Storage, another AIOps use case. They replaced an earlier open-source database with Vertica and achieved 5x data compression. The main advantage for them was being able to proactively solve over 85% of support cases, making their highly skilled, expensive-to-hire, and hard-to-find support engineers three times as productive. A third-party study showed that Nimble Storage had a USD 33 million benefit from Vertica. The startup was purchased by HPE for over a billion USD, and the HPE InfoSight application based on Vertica helps customers do predictive maintenance to improve uptime.

On database administration, some customers want a lot of control but other customers want things to be easy. This can be something of a tug of war. Self-managed Vertica is quite flexible but is also arguably a bit complex to administer. Vertica now has a management console user interface to simplify most administrative tasks. For those customers who want the management simplicity of a Database as a Service (DBaaS), Vertica now has the Vertica Accelerator SaaS deployment, which makes things even simpler.

Now you can spin up nodes in the cloud easily, or even automatically up to guardrails you define, to bring in extra processing for peak usage, and quickly spin nodes down when not needed to save money. Most Vertica customers are now hybrid rather than either pure on-premise or pure cloud. Since 2018, almost all customers have started on the cloud. Vertica estimates that about half of all customers are on the cloud now, but some customers insist on on-premise for various reasons, including regulatory compliance, security, and cost savings.
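The idea of scaling automatically "up to guardrails you define" can be illustrated with a small sketch. This is not Vertica's API or its actual scaling logic: the function, its parameters, and the notion of a fixed per-node capacity are all assumptions made for illustration.

```python
import math


def target_node_count(current_load: float, capacity_per_node: float,
                      min_nodes: int, max_nodes: int) -> int:
    """Nodes needed for the load, clamped to operator-defined guardrails.

    Hypothetical helper: load and capacity are in the same arbitrary unit,
    for example concurrent queries.
    """
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    needed = math.ceil(current_load / capacity_per_node)
    # Never drop below the floor or exceed the ceiling, whatever the load
    return max(min_nodes, min(needed, max_nodes))
```

With guardrails of 3 to 8 nodes and 100 units of capacity per node, a peak load of 950 would be capped at 8 nodes, while a quiet period with a load of 120 would scale back down to the floor of 3.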

Overall, Vertica is a mature columnar database that offers high performance for large, demanding use cases. It is well worth considering as a candidate if that is the situation that you face.