Gathr Data Integration

Solution updated on October 3, 2022

Gathr is a unified self-service data pipeline platform, and more specifically, an extensible and universal platform for creating, managing, and deploying data pipelines. Its capabilities include data integration, processing, ingestion, cleansing, transformation, blending, loading, preparation, analytics, and more. It also supports multiple deployment modes:

  1. SaaS: Gathr is offered as hosted software as a service and licensed using a low-commitment, pay-as-you-go model based on compute consumption. Moreover, anyone can register and start using it in only a few minutes.
  2. Cloud: Gathr can be hosted in your cloud account, and supports leading cloud platforms like AWS, Azure, and Google.
  3. On-premises: Gathr can be hosted in an on-premises environment on enterprise servers or virtual machines leveraging existing infrastructure.

Customer Quotes

“Gathr helped us power ‘in-the-moment’ actionable insights from massive volumes of complex operational and digital interaction data.”
United Airlines

“Gathr helped us unlock business use cases for upselling, cross-selling, and customer retention with speed and scale. With Gathr, we brought all data sources together, built pipelines in a true self-service way, and retired legacy data platforms.”
Truist

Gathr helps you collect data from multiple sources and transform it to create a single source of truth. This data can then be used for accurate, high-quality reporting and analytics.

Fig 1 – Gathr data pipeline builder

More specifically, it enables you to assemble, debug, and deploy data pipelines using a low-code, drag-and-drop visual interface, as shown in Figure 1. It automatically infers schema information, and is thus able to present an automated, dynamic view of the schema and outputs of your pipeline as you design it. A variety of pre-built pipeline components are provided out of the box, and you can modify these or add your own as necessary. Gathr pipelines have a self-healing capability and will automatically recover from failures when possible. Pipeline and data set management are also provided, complete with version control (either within the platform itself or delegated to Git) and data profiling.
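To make the idea of schema inference concrete, the following is a minimal, generic sketch in Python. It is not Gathr's implementation or API, which this report does not describe; the function name `infer_schema` and the sample records are hypothetical and exist only for illustration. It assumes the simplest possible approach: scan a sample of records and note which value types appear in each column.

```python
from typing import Any


def infer_schema(records: list[dict[str, Any]]) -> dict[str, str]:
    """Infer a column -> type-name mapping from a sample of records.

    Hypothetical helper for illustration only; not part of any Gathr API.
    """
    observed: dict[str, set[str]] = {}
    for record in records:
        for column, value in record.items():
            # Collect every Python type name seen for this column.
            observed.setdefault(column, set()).add(type(value).__name__)
    # A column with one observed type keeps that type; otherwise it is "mixed".
    return {
        column: next(iter(types)) if len(types) == 1 else "mixed"
        for column, types in observed.items()
    }


# Made-up sample data: "amount" appears as both a float and an int.
sample = [
    {"id": 1, "amount": 9.5},
    {"id": 2, "amount": 12, "note": "rush"},
]
schema = infer_schema(sample)
```

A production tool would typically go further, for example by widening an int/float mix to a single numeric type or by sampling rather than scanning everything, but the principle of deriving the schema from the data rather than asking the user to declare it is the same.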

The product offers a wizard-driven interface for data integration, as well as a drag-and-drop transformation builder with over 300 pre-built ETL functions. For simple ETL jobs, these will likely prove sufficient. For anything more complex, Gathr also provides a full-blown low-code environment for building ETL, ELT and Reverse ETL processes, in a similar manner to its data pipelines.

Gathr supports batch, micro-batch and streaming ingestion as part of data integration and can incorporate data classification into the ingestion process. Its transformation view (provided as part of its transformation builder) highlights all unique column values for you to operate on, as well as offering automated inspection functionality. ETL and other data integration jobs can be scheduled or triggered by a particular event, as you prefer, and lineage information for all executed jobs is available. Moreover, a built-in error search is available for each job.

The product also boasts data lineage, metadata management, data monitoring, machine learning, alerting, and change data capture (CDC) capabilities, with CDC supported on MySQL, MSSQL, PostgreSQL, Oracle, and MongoDB. CDC in particular can be leveraged alongside data integration to capture and process data changes in real time. Finally, the company offers more than thirty connectors to various third-party and open-source products, and there is also a migration utility available to help you move over to Gathr from various legacy ETL platforms.
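For readers unfamiliar with CDC, the sketch below illustrates the general technique in Python: a stream of insert, update, and delete events from a source table is replayed against a target copy so that the copy stays in step with the source. This is a generic illustration under stated assumptions, not a description of how Gathr implements CDC; the event format (`op`, `key`, `row`) and the function names are hypothetical.

```python
from typing import Any

Table = dict[int, dict[str, Any]]


def apply_change(table: Table, event: dict[str, Any]) -> None:
    """Apply one change event to an in-memory table keyed by primary key.

    The event shape is an assumption made for this example only.
    """
    op, key = event["op"], event["key"]
    if op in ("insert", "update"):
        # Upsert: merge the changed columns into the existing row, if any.
        table[key] = {**table.get(key, {}), **event["row"]}
    elif op == "delete":
        # Deleting a key that is already absent is treated as a no-op.
        table.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")


def replay(events: list[dict[str, Any]]) -> Table:
    """Build the target table by applying events in the order they occurred."""
    table: Table = {}
    for event in events:
        apply_change(table, event)
    return table


# Made-up change events, as they might be read from a database log.
events = [
    {"op": "insert", "key": 1, "row": {"name": "Ada", "tier": "basic"}},
    {"op": "insert", "key": 2, "row": {"name": "Lin", "tier": "basic"}},
    {"op": "update", "key": 1, "row": {"tier": "gold"}},
    {"op": "delete", "key": 2},
]
replica = replay(events)
```

Because only the changes are shipped, rather than a full extract of the table, the target can be kept current with low latency, which is what makes CDC a natural companion to streaming data integration.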

There are several reasons to care about Gathr as a data integration solution and as a data pipeline platform. Its analytic capabilities (especially streaming analytics) are well developed, including significant support for machine learning. It leverages a low-code, wizard-driven, highly visual development style that enables self-service and ease of use, with pre-built applications available for common use cases. It makes extensive use of open-source technologies, thus providing many of the attendant benefits with few of the traditional downsides (such as lack of enterprise support). It is lightweight, quick to deploy, and offers good time-to-value. At the same time, it is highly (and automatically) scalable, meaning that you can start small and scale up as necessary. Finally, built-in orchestration allows multiple pipelines to be run one after the other based on preconfigured rules and triggers.

The Bottom Line

Gathr is a relatively broad platform-based solution that caters to data integration, pipelining, and engineering use cases in equal measure. Its support for (and use of) both batch and streaming analytics, as well as its ability to customise and use open-source technologies, are particularly appealing.
