What was originally Relational Technologies Incorporated, founded in 1980, became Ingres Corporation in 1989. It was acquired by the ASK Group in 1990, which was itself acquired by CA in 1994. Ingres was spun out of CA in 2005 and was renamed Actian in 2011. In 2018 an 80% stake in the company was acquired by HCL Technologies, which is headquartered in India but has offices around the globe. HCL has more than 120,000 employees.
Actian, which operates separately from HCL, also has a significant history of acquisitions including VectorWise (relational database) in 2011, Versant (object-oriented database) in 2012, and ParAccel (data warehousing) and Pervasive Software (databases and data integration software) in 2013.
The company’s current focus is on “hybrid” data management, by which it means both hybrid (not just structured) data and hybrid (cloud and on-premises) data management. It has four major products: Actian Avalanche, which is a cloud-based data warehouse that leverages the Actian Vector database; Actian Zen, which is an edge-focused database based on what was previously Pervasive PSQL (and, before that, Btrieve); Actian DataConnect, which is a data integration offering based on Pervasive Dataflow and other Pervasive capabilities; and Actian X, which is a transactional database with analytic capabilities aimed at the HTAP market. Actian X is, in effect, the latest release of Ingres, but it also includes the Actian Vector columnar database engine to support analytic queries. In addition to these four major offerings, the company continues to market and support the OpenROAD 4GL.
Actian Analytics Platform
Last Updated: 18th December 2014
1. Be able to ingest all sorts of data—whether structured, unstructured or machine generated—and from whatever source is appropriate, whether inside or outside the organisation.
2. Be able to prepare data for analysis efficiently. This is particularly important where a discovery process (the sort of thing that data scientists do) is required, because data preparation can take as much as 80% of this time.
3. Have a platform for the initial discovery process and data science analytics.
4. Be able to then conduct low latency analytics that are time-sensitive using results from the discovery process. Requirements 3 and 4 may or may not be in the same place.
5. Once you have performed your analytics you need to be able to distribute the results to users and applications that require the information produced by your analyses.
Actian delivers capabilities for requirements 1 and 2 through the DataConnect offering and for 1, 2 and 3 via Actian DataFlow; for 4 it has three offerings, which may be used separately or in conjunction. Most commonly, you would use either Vector or Matrix for low latency analytics, depending on the sophistication of the analytics workload and your scalability requirements. Or, if you have a combination of traditional and complex analytics workloads, then you might want to deploy both, depending on the scale at which you are operating. And, last but not least, Vector in Hadoop would be used to provide high performance SQL access natively in Hadoop to data and analytic results stored in HDFS.
Actian has both an extensive channel marketing program (primarily for its historic database products) and a significant direct sales force. The company is not limited to any particular vertical sector or geography. DataConnect has a particular emphasis on integration with cloud-based (SaaS) applications but this is by no means uncommon today.
We doubt whether even Actian knows exactly how many companies are using its technologies. This is because Ingres and PSQL in particular are widely embedded by OEM partners and ISVs. The total number of users (many of which are no doubt unaware that they are using these products) will certainly run into tens of thousands and could easily be in six figures.
There are three elements to the Actian Analytics Platform: data integration, data preparation and what is sometimes referred to as the Extended Data Warehouse (XDW).
A traditional data warehousing environment has been considered to consist of a conventional data warehouse plus data marts. The key element that has changed with big data and the Internet of Things is the requirement for a discovery platform where large amounts of data can be explored for interesting insights but in a relatively inexpensive manner. In the case of the Actian Analytics Platform this would be its new Hadoop SQL edition offering, which is a combination of DataConnect for high performance connectivity, DataFlow for visual data science and analytics, and the implementation of Vector on HDFS (the storage system for Hadoop) via YARN. Vector is a column-based analytic database for single server implementations that owes its performance advantages (it currently holds the TPC-H benchmark record for the fastest non-clustered database solution) to the fact that it uses vector processing that exploits CPU parallelism. What Actian has now done is to implement this on top of HDFS so that it will run on low cost Hadoop clustered hardware. Initial benchmarks suggest that Vector in Hadoop is more than an order of magnitude faster than Impala running on Hadoop.
Once interesting nuggets of information are discovered the typical process is that relevant data will be loaded into a high performance analytics database for low latency analytic processing and operational business intelligence. Here Actian offers Vector, Matrix and SPARQLverse. Vector we have already discussed. Matrix (formerly ParAccel) is an MPP (massively parallel processing) column-based database. In technical terms the product’s biggest strength is its optimiser, which has a number of advanced capabilities (for example, for processing correlated sub-queries). It is worth noting that Matrix is what powers Amazon Redshift. SPARQLverse is an MPP-based graph database running on top of HDFS developed by SPARQL City (of which Actian is a major shareholder).
For data preparation, blending and enrichment, and visual discovery analytics and data science, the company offers Actian DataFlow (previously Pervasive DataRush), which is a massively parallel engine (it is extremely fast) that allows you to profile, cleanse and run other data preparation tasks, and/or execute data mining algorithms on the fly: that is, while loading data. It is worth commenting that the market for data preparation and for analysing data while in motion is burgeoning and rapidly evolving into a sub-market of its own (several new and existing vendors are introducing products in this area) but Actian got here first.
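To make the idea of profiling "on the fly" concrete, the following is a minimal sketch in Python; it is not Actian DataFlow code. It assumes rows arrive as dictionaries, and the function name `profile_while_loading` and the statistics it gathers are hypothetical, chosen purely for illustration. The point is simply that the profile is built in the same single pass that hands rows on to the loader.

```python
from typing import Dict, Iterable, Iterator, Optional

Row = Dict[str, Optional[str]]


def profile_while_loading(rows: Iterable[Row], stats: Dict[str, int]) -> Iterator[Row]:
    """Pass rows through unchanged while counting rows and empty values.

    Hypothetical helper: `stats` is updated as a side effect, so no second
    pass over the data is needed once loading has finished.
    """
    for row in rows:
        stats["rows"] = stats.get("rows", 0) + 1
        for column, value in row.items():
            if value is None or value == "":
                key = f"nulls:{column}"
                stats[key] = stats.get(key, 0) + 1
        yield row  # the downstream loader consumes the row here


# Usage: the "load" step is just list() in this sketch.
stats: Dict[str, int] = {}
source = [{"a": "1", "b": None}, {"a": "", "b": "x"}]
loaded = list(profile_while_loading(source, stats))
```

Because the function is a generator, the statistics are only complete once the consumer has drained it, which is the trade-off of profiling in motion rather than at rest.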
Finally, Actian DataConnect not only includes the data profiling and data quality capabilities that may be instantiated on Actian DataFlow but it also provides the data integration (ETL: extract, transform and load) requirements necessary to ingest and output data from and to both external and internal sources and targets. It is noteworthy that Pervasive, prior to its acquisition, had a history of developing connectors to a very wide range of sources and targets and, over the last few years in particular, it has focused on developing adapters to connect to cloud-based and SaaS (software as a service) providers. It has a particularly close relationship with Salesforce.com.
Actian provides conventional training, consulting and support services. Some of its partners also offer consulting services.
Last Updated: 1st July 2020
Mutable Award: One to Watch 2020
The most recently introduced of Actian’s database products is Actian Avalanche, which is a hybrid cloud/on-premises, columnar (with compression) data warehouse offering provided as a managed service when in the cloud. It is ANSI SQL 2016 compliant and currently runs on the Azure platform and AWS. From a go-to-market perspective the company’s main message is that you can migrate – see Figure 1 – both your data and your applications to Avalanche from your current environment at your own pace. This is backed up by the fact that compute is separated from storage (on Azure only at present) and that the analytics engine (which is based on Actian’s Vector database technology) is completely compatible across environments.
The company’s migration offering is supported by a toolkit, which includes migration capabilities for tables, views, users and so forth as well as SQL conversion functionality. The company is targeting existing legacy on-premise data warehouse appliances and offers automated tools for migrating from IBM Netezza, Oracle Exadata, and Teradata. In this context, and more generally, it is also worth noting the connectivity and integration capabilities provided by the Pervasive assets that Actian acquired in 2013.
“We provisioned Actian Avalanche without any manual database tuning – to see what its on-chip caching and smart compression could do. Actian Avalanche performed 2x to 200x faster than Oracle. It performed so well that we did not have to look at competing solutions.”
Jean-Francois Rompais, Head of IT Architecture, Kiabi
“Our past experiences with relational databases often involved major limitations that prevented us from meeting our customers’ big data needs. Actian provides the only solution to combine the exceptional performance and manageability we require with the affordability our customers demand.”
Clarence Rozario, Senior Product Manager, Zoho
Figure 2 illustrates the Actian Avalanche architecture. However, this will take some explanation. For example, DataFlow is a high-performance integration engine that was originally developed by Pervasive, while Zen (which was previously PSQL and before that Btrieve) is used in Internet of Things environments as an edge or mobile database, hence the gateway provided. In this context, it is worth commenting that time-series data is not currently supported by Avalanche, though it is supported in Zen. Similarly, geospatial data has not yet been implemented in Avalanche though it is supported in the Actian X database (which is targeted at hybrid transactional and analytic environments).
Vector is the database engine, and it has some notable characteristics. For instance, it exploits Intel’s vector instruction set (hence the Vector name) to process more data elements per instruction (SIMD: single instruction, multiple data) and optimises for L1 and L2 cache. Indeed, there has been considerable academic work involved in the development of Vector, which was created initially at CWI, the Dutch National Research Institute for Mathematics and Computer Science, and the technology incorporates patented algorithms for optimising the use of memory. A further patented technology known as Positional Delta Trees enables real-time updates without impacting query performance. For what it is worth, benchmarks suggest that Avalanche has significant performance benefits compared to other cloud data warehouses.
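The execution style described here, working on a block of column values at a time rather than one row at a time, can be sketched as follows. This is an illustration only: Python does not itself issue SIMD instructions, and the batch size, the function names and the filter-and-sum query are all assumptions made for the example rather than anything taken from Vector.

```python
from typing import Iterator, List

VECTOR_SIZE = 1024  # hypothetical batch size, small enough to stay in CPU cache


def batches(column: List[float], size: int = VECTOR_SIZE) -> Iterator[List[float]]:
    """Yield the column in fixed-size slices ("vectors")."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


def sum_where_greater(column: List[float], threshold: float) -> float:
    """Filter and aggregate one vector at a time instead of one row at a time."""
    total = 0.0
    for vector in batches(column):
        # One tight loop over a small, contiguous block of a single column:
        # in a compiled engine this is the loop shape that maps onto SIMD.
        total += sum(value for value in vector if value > threshold)
    return total


column = [float(i) for i in range(5000)]
result = sum_where_greater(column, 4997.0)
```

In a compiled engine the per-vector loop is where the benefit arises, since the same operation is applied to many adjacent values that are already in cache.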
The warehouse understands Spark datatypes and the optimiser treats third-party storage such as Amazon S3 and Hadoop as external tables with the ability to push-down predicates so that Avalanche supports full data virtualisation across these sources rather than mere query federation. User defined functions are available to support machine learning and there is support for R, Scala, Python, and all the usual business intelligence and analytics tools, and TensorFlow will be supported later this year. The ingestion of JSON documents is supported and the company plans to introduce features to optimise performance for the analysis of these.
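Predicate push-down is easiest to appreciate by counting how many rows leave the external source. The sketch below is a toy model under stated assumptions: `ExternalSource` is a hypothetical stand-in for remote storage, not an Avalanche, S3 or Hadoop API, and the rows and predicate are invented for the example.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


class ExternalSource:
    """Hypothetical remote table that records how many rows it ships."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = list(rows)
        self.rows_shipped = 0

    def scan(self, predicate: Optional[Predicate] = None) -> List[Row]:
        # If a predicate is supplied, it is evaluated at the source.
        out = [r for r in self._rows if predicate is None or predicate(r)]
        self.rows_shipped += len(out)
        return out


def query_without_pushdown(source: ExternalSource, predicate: Predicate) -> List[Row]:
    # Fetch everything, then filter locally.
    return [r for r in source.scan() if predicate(r)]


def query_with_pushdown(source: ExternalSource, predicate: Predicate) -> List[Row]:
    # Hand the filter to the source so that only matching rows travel.
    return source.scan(predicate)


rows: List[Row] = [{"region": "EU", "amount": i} for i in range(100)]
rows += [{"region": "US", "amount": i} for i in range(900)]


def is_eu(row: Row) -> bool:
    return row["region"] == "EU"


plain, pushed = ExternalSource(rows), ExternalSource(rows)
result_plain = query_without_pushdown(plain, is_eu)
result_pushed = query_with_pushdown(pushed, is_eu)
```

Both queries return the same rows; the difference lies in `rows_shipped`, which in this sketch is the full table without push-down and only the matching rows with it.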
Avalanche is a relatively new offering, having only been released in March 2019. As such it lacks some features that we would like to see, notably geospatial and time-series capabilities, which will be important for some IoT applications. That said, this sort of functionality is on the roadmap and is expected to be released in the second half of 2020. Leaving that aside, the product shows significant promise. The company has a significant history in the database market and knows what it is talking about when it comes to performance, database optimisation, administration and management (Actian FlexPath cloud resource management), disaster recovery, high availability, security (multi-factor authentication) and so forth.
We especially like the approach Actian has taken towards migration from legacy data warehousing implementations, both in the tools that the company makes available to ease this process, and in the way that the hybrid nature of the environment will allow you to migrate to the cloud at your own pace.
The Bottom Line
Actian has always had excellent technology. It has, however, struggled with marketing and visibility. With HCL fully committed to Actian Avalanche, and the ability to leverage that company’s worldwide salesforce, there is every possibility of Avalanche being recognised as a market-leading offering within the cloud/hybrid data warehousing space.
Actian DataConnect / AvalancheConnect
Last Updated: 4th February 2021
Mutable Award: One to watch 2020
Actian DataConnect is a (data) integration offering the intent of which is to allow anyone (that is, multiple personas) to connect anything, anywhere (cloud, on-premises, hybrid), at any time. There are three different options: DataConnect Inside, which is targeted primarily at partners who want to embed connectivity within their own applications; DataConnect Integration Platform, which offers traditional ETL (extract, transform and load) capabilities as well as ELT and other integration options; and DataConnect iPaaS, which is powered by Actian Avalanche to provide access to cloud-based resources.
Avalanche Connect includes Actian DataConnect connectors and the two share a single management console. It is available as a managed service with support for Kafka streaming, and connectivity to semi-structured data such as JSON and RESTful web services, without coding. Its features are illustrated in Figure 1. Actian does not provide change data capture (CDC) itself but instead partners with HVR for this purpose. Note the support for various third-party machine learning technologies though Actian is somewhat late to the party with respect to the use of machine learning within DataConnect, with the automation that derives from it very much on the company’s roadmap, as opposed to something currently within the product.
“With Actian, we turned it on and it worked. After months of switching over to NetSuite, the Actian solution took only 8 hours and we were up and running smoothly without errors. All processes are now automated and data flows instantaneously between platforms.”
“We deal with data from disparate insurance clients, the data types are diverse. Files range from small and manageable to complex and large… we needed to pay special attention to the integration process because the more sources, the more complicated the process. Our data model needed to… provide a consistent data structure for any downstream process… The power of Actian DataConnect enabled us to manage all this very efficiently.”
Hannover Life Reassurance Company of America
DataConnect has four main components, as can be seen in Figure 2. What this doesn’t mention is Actian’s UniversalConnect, which is patented technology for developing and implementing connectors, either those that are available directly from Actian (more than 200 of them) or that you can develop yourself.
Integration Manager is being enhanced (2021) to support containerised deployments via Kubernetes. Its capabilities are illustrated in Figure 3. In addition there is Actian Studio, which is an Eclipse-based environment for developers, while the company should also have released Web Designer by the time this report is published. This is a browser-based no-code interface intended for use by citizen developers that will not require IT input. Finally, Actian plans (2021) the release of an actionable metadata repository with support for streaming data.
Actian DataConnect has been available, under a variety of names, for more than twenty years. It has the sort of reliability that you would expect from a product with this sort of pedigree. It is also one of the few products in this space that has focused, at least in part, on providing embedded capability. Historically – and this does not appear to have changed – it has had significant total cost of ownership benefits when compared to many of its competitors. However, data integration is a mature technology and DataConnect has languished somewhat over the last few years without significant new development. But it is now clear that that has changed and that Actian is putting significant efforts into extending DataConnect’s functionality and reach. This is especially associated with the launch, earlier in 2020, of Actian Avalanche.
In this context it is worth commenting on the potential of Actian Avalanche with Avalanche Connect, which will allow its data integration capabilities, as well as the functionality of the forthcoming Actian DataFlow, to leverage the elastic scaling offered by Avalanche. Moreover, as far as we know, this will represent the only cloud-based data warehouse on the market that also includes its own integration capabilities built into the environment: something that must be appealing to potential users.
The Bottom Line
Actian DataConnect, and particularly Avalanche Connect, have significant potential. The former is a comprehensive and competitive data integration product. Though it appears to lack any remarkable feature or functionality, Actian DataConnect has a strong following of ISVs that embed the technology, including ADP, Xactly and FICO. The company’s future plans for extending further with its integration hub service, which has been proven in its Healthcare Claims and Service exchanges, make the company a definite “one to watch”.
Last Updated: 28th June 2019
Actian X is a hybrid database that combines what used to be known as Ingres and what was previously VectorWise. It is SMP (symmetric multi-processing) based with parallel processing of the data based on a configurable number of CPUs. Its architecture is illustrated in Figure 1 where the X100 query engine is derived from VectorWise.
As one might expect from a product with its pedigree, the Ingres part of this equation provides ACID guarantees and support for two-phase commit within transactional processing environments. The X100 engine, on the other hand, exploits Intel’s vector instruction set (hence the Vector name) to process more data elements per instruction (SIMD: single instruction, multiple data). It is ANSI SQL 2003 compliant and maintains ACID compliance. Columnar storage exploits compression capabilities to minimise storage requirements.
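To illustrate why columnar storage compresses well, the sketch below applies run-length encoding to a single column. Run-length encoding is used purely as a well-known example: the text does not say which compression schemes the X100 engine employs, and the function names and sample data here are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(column: List[str]) -> List[Tuple[str, int]]:
    """Collapse each run of consecutive equal values into (value, run_length)."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def rle_decode(pairs: List[Tuple[str, int]]) -> List[str]:
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


# Values of one column sit next to each other, so repeats form long runs.
region_column = ["EU"] * 4 + ["US"] * 3 + ["EU"]
encoded = rle_encode(region_column)
```

Stored row by row, the same values would be interleaved with those of other columns and the runs would be lost, which is why this kind of scheme is associated with column stores in particular.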
The “Query Processing” element shown in Figure 1 is aware of where data is held and directs queries, or parts of queries, to the appropriate data store. Synchronisation between the two data stores is either defined through rules using stored procedures, or you can use conventional replication technology if latency is not an issue.
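The routing decision described above can be pictured as a catalogue lookup. The following is a deliberately small sketch and not a description of how Actian X implements it: the catalogue contents, the store names and the `route` function are all hypothetical.

```python
from typing import Dict, List

# Hypothetical catalogue recording which store holds each table.
CATALOG: Dict[str, str] = {
    "orders": "row_store",             # transactional (Ingres-style) tables
    "order_history": "column_store",   # analytic (X100-style) tables
}


def route(tables: List[str], catalog: Dict[str, str] = CATALOG) -> Dict[str, List[str]]:
    """Group the tables referenced by a query according to the store that holds them."""
    plan: Dict[str, List[str]] = {}
    for table in tables:
        plan.setdefault(catalog[table], []).append(table)
    return plan


plan = route(["orders", "order_history"])
```

A query touching both tables yields a plan with two entries, one per store, which corresponds to the "parts of queries" being directed to different engines.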
One notable feature of Actian X is its support for a patented technology known as Positional Delta Trees. The following is an extract from the original academic research paper: “our goal is that read-only queries always see the latest database state yet are not (significantly) slowed down by the update processing. To this end, we propose the Positional Delta Tree (PDT), that is designed to minimize the overhead of on-the-fly merging of differential updates into (index) scans on stale disk-based data.” In other words, this is about improved query performance regardless of whatever updates are being made, while preserving consistency. It is also symptomatic of Actian’s approach, in that it leverages a number of patented technologies to maximise performance. In particular, there has been considerable academic work involved in the development of Vector, which was created initially at CWI, the Dutch National Research Institute for Mathematics and Computer Science. As another example, Actian X (the X100 engine) incorporates patented algorithms for optimising the use of memory.
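The idea in the quoted extract, merging differential updates into a scan of stale data, can be sketched in a few lines. This is a heavily simplified illustration and not the patented structure: a real Positional Delta Tree is a tree keyed by position so that lookups remain cheap as updates accumulate, whereas this sketch uses flat dictionaries. The class and method names are hypothetical.

```python
from typing import Dict, Iterator, List, Set


class DeltaStore:
    """Immutable stable column plus positional deltas merged at scan time."""

    def __init__(self, stable: List[int]) -> None:
        self.stable = list(stable)                # never modified in place
        self.deleted: Set[int] = set()            # stable positions to skip
        self.modified: Dict[int, int] = {}        # stable position -> new value
        self.inserted: Dict[int, List[int]] = {}  # values placed before a position

    def delete(self, pos: int) -> None:
        self.deleted.add(pos)

    def modify(self, pos: int, value: int) -> None:
        self.modified[pos] = value

    def insert_before(self, pos: int, value: int) -> None:
        # pos == len(stable) means "append after the last stable value".
        self.inserted.setdefault(pos, []).append(value)

    def scan(self) -> Iterator[int]:
        """Yield the latest state by merging deltas into the stable scan."""
        for pos in range(len(self.stable) + 1):
            yield from self.inserted.get(pos, [])
            if pos < len(self.stable) and pos not in self.deleted:
                yield self.modified.get(pos, self.stable[pos])


store = DeltaStore([10, 20, 30, 40])
store.delete(1)             # remove 20
store.modify(2, 35)         # 30 becomes 35
store.insert_before(0, 5)   # new first value
store.insert_before(4, 50)  # new last value
latest = list(store.scan())
```

Readers see the latest state through `scan`, while the stable data is left untouched, so updates never require the read-optimised storage to be rewritten immediately.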
The broader ecosystem in which Actian X operates is illustrated in Figure 2. Not shown is support for Kafka and for PMML (predictive modelling mark-up language), though TensorFlow is not supported. Also not shown in this diagram is the fact that there is a managed cloud service available for storing and managing Actian X backups; there are a number of geospatial capabilities, including 3D support for R-Tree indexes; and there are database health monitoring capabilities. Also included with Actian X is DataConnect for Actian X, which provides a development engine for designing and testing integrations, plus a production engine for deployment purposes.
Ingres has a long-established pedigree for supporting mission critical transactional environments, while VectorWise has proven high-performance characteristics in supporting analytics, so we have no question marks over the performance and capabilities of the two acting in combination. And the integration tools and other functions provided by Actian are attractive. For existing Ingres users wanting to add analytic capabilities to their existing environment, Actian X is an obvious choice.
Non-Ingres users fall into two camps: existing users of other relational database systems and greenfield opportunities. We do not envisage that Actian is targeting the former. For greenfield environments, it is important to bear in mind that Actian X is an SMP-based system that scales up rather than scales out. It is not likely, therefore, to be cost-effective for the very largest hybrid environments where you may have hundreds of terabytes of data to process. Where that is not an issue then Actian X may well be worth consideration. If the company wants to address the extreme end of this market then it will need to extend the use of Vector within Actian X to Vector Cluster.
The Bottom Line
If you are an existing Ingres user then adopting Actian X to support hybrid analytic and transactional processing within the same environment should be a trivial decision. It is also a strong contender for mid-sized and smaller deployments, especially where SQL and traditional relational approaches are preferred.