Trillium enters the data discovery/preparation market

Trillium Software has announced two new products: Trillium Prepare and Trillium Refine. Both are based on a partnership with UNIFi, a data discovery/preparation vendor that will feature in my forthcoming (updated) Market Update on data preparation. In fact, Trillium Prepare will simply be a white-labelled version of UNIFi, installed and supported by Trillium. Trillium Refine, however, is a superset of Trillium Prepare and goes some way beyond it.

UNIFi, as I have said, is a self-service data discovery and preparation platform targeted at business users. It runs on Hadoop and can use MapReduce, Spark or both. It has profiling capabilities, a built-in recommendation engine and machine learning to guide the business analyst, and the variety of other features you would expect in a data preparation product, as well as data discovery features that you would not. However, its data quality capabilities are limited compared to what Trillium offers natively in Hadoop: as one would expect, Trillium has a much broader and deeper set of such capabilities, including matching, parsing and sophisticated enrichment. These are what have been combined with UNIFi to create Trillium Refine, an integrated data discovery, preparation and quality solution.

With Trillium Prepare, users can discover, access, profile and prepare data using an interface designed to be business friendly. Data is imported into Hadoop and processed behind the scenes, and users can easily export joined data sets to their business intelligence platform of choice. Trillium Refine goes further, allowing users to ensure the accuracy and completeness of prepared data before analysing it in BI and analytics tools. As part of their workflows, users can deploy data quality projects (cleansing, parsing, matching, global address validation and so forth) with a simple click within the interface. Because Trillium’s quality projects also run natively in Hadoop, no additional movement of the data is necessary. Once finished, users still have the option to export the data directly to a front-end analytical or visualisation tool such as Qlik or Tableau.

This is an interesting approach. Clearly, Trillium wanted to get into the market quickly and this is a good way to do it. Other data quality vendors have either already moved into this space or are planning to do so. However, they have had to build out from their existing capabilities, which means that some have been relatively slow in coming to market and/or have relatively limited capabilities. Moreover, UNIFi is a much more complete product than many others in the market: it has widespread connectivity options, it offers data discovery as well as data preparation, and it is suitable for use by data scientists (who can combine the built-in expression builder with external capabilities using tools such as MLlib or R). Add in Trillium’s own inherent capabilities and you have products that are several steps ahead of most (if not all) of the competition.