In the data quality jungle something is stirring

Content Copyright © 2008 Bloor. All Rights Reserved.

I don’t think many people would count Pitney Bowes among the big beasts of the data quality jungle. However, that could be about to change.

First, a word about the company: Pitney Bowes Inc (PBI) is well known for hardware in the mailroom. It’s a $6.5bn company. Its software subsidiary, Pitney Bowes Business Insight (PBBI), is nearly a $500m company that comprises what used to be Group 1 (customer data quality), Sagent (an ETL vendor) and MapInfo (location analytics and geocoding). From a pure size perspective there is therefore no question that PBBI belongs alongside the big hitters in the data quality space.

However, there is a difference between pure size and market presence, and in PBBI’s case the gap is down to history. Group 1 was very focused on CRM (and therefore limited to name and address matching) and on the US market, which meant that it didn’t even support European postcode address files (PAF). Moreover, its emphasis on supporting such things as mailing campaigns as opposed to, say, the transfer of data into a data warehouse meant that it had no need for data profiling. So, you had a company with no profiling, no capabilities for such things as complex product cleansing and almost no exposure (and not much more capability) outside North America.

But that is how things used to be. Now the company incorporates third-party data profiling capabilities and, in the latest release, there is a new module called Monitor Plus, also sourced from a third-party partner, which provides facilities specifically to support data governance and is aimed at data stewards. In addition, the company has a partnership with Silver Creek for those needing data cleansing for complex data elements such as products. And, of course, the company supports PAF files worldwide and has offices around the globe. Even more interesting, the company has, for some time, had the ability to enrich data with geospatial data in the United States but, through the acquisition of MapInfo, it has extended this to Canada and Australia in the latest release and expects to cover most of Western Europe within the next six months.

I have mentioned “the latest release” a couple of times: it is version 5.6 of what the company calls the Customer Data Quality Platform (CDQP; personally, I think they should drop the “customer”). This was released towards the end of November. However, before I discuss some of its interesting features it is worth noting the CDQP Universal Name Module (by the way, the whole product is modular so you just invest in what you need). The Universal Name Module is in the same category of products as IBM’s Global Name Recognition and Informatica’s recently acquired Identity Systems. As far as I know, none of the other vendors in the data quality space has such a capability.

This brings me on to the Open Parser, which has been introduced with version 5.6. This is part of the company’s Data Normalization Module and has been designed to allow you to introduce domain-specific and culturally specific business rules that apply to names, addresses or other types of data. If we take the example of the Universal Name Module then this would allow you, say, to take account of the fact that in Spanish-speaking countries there is the concept of both paternal and maternal surnames. The parser includes a pattern grammar, debug and trace capabilities, and pre-built domain definitions and templates.
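To make the idea of a culturally specific parsing rule a little more concrete, here is a minimal Python sketch. It is not PBBI’s Open Parser and does not use its pattern grammar, which I have not reproduced here; the names `ParsedName`, `RULES` and `parse_name` are hypothetical, and the Spanish rule is deliberately naive (it ignores particles such as “de la” and assumes the last two words are the surnames).

```python
from dataclasses import dataclass


@dataclass
class ParsedName:
    given: str
    surname: str            # family name (paternal surname under the Spanish rule)
    maternal_surname: str   # empty when the rule has no such concept


def parse_default(full_name: str) -> ParsedName:
    """Naive default rule: the last word is the surname, the rest are given names."""
    tokens = full_name.split()
    if not tokens:
        raise ValueError("empty name")
    if len(tokens) == 1:
        return ParsedName(given=tokens[0], surname="", maternal_surname="")
    return ParsedName(" ".join(tokens[:-1]), tokens[-1], "")


def parse_spanish(full_name: str) -> ParsedName:
    """Naive Spanish rule: the last two words are the paternal and maternal surnames."""
    tokens = full_name.split()
    if len(tokens) < 3:
        # Too few words to hold both surnames, so fall back to the default rule.
        return parse_default(full_name)
    return ParsedName(" ".join(tokens[:-2]), tokens[-2], tokens[-1])


# Hypothetical rule table keyed by a culture code; anything unknown uses the default.
RULES = {"es": parse_spanish}


def parse_name(full_name: str, culture: str) -> ParsedName:
    return RULES.get(culture, parse_default)(full_name)


if __name__ == "__main__":
    print(parse_name("Gabriel García Márquez", "es"))
    print(parse_name("John Ronald Smith", "en"))
```

The point of the rule table is simply that the same input can be parsed differently depending on the culture you declare for it, which is the kind of behaviour the paragraph above describes; a real implementation would need far richer rules than this.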

Two other points are worth mentioning. The first is the product’s CRM connectors. These can be embedded within SAP or Siebel applications to prevent the entry of bad data. This sort of approach is by no means unique to PBBI but is important nonetheless. The second is that PBBI also offers a software-as-a-service approach to licensing (the whole product is based on SOA). In particular, you can mix and match between modules you license directly and those that PBBI hosts for you. So, for example, you might call the Enterprise Geocoding Module as a service but otherwise run various modules in-house. This seems like a cool idea.

So, all in all, I have to say that I am impressed. PBBI can clearly offer capabilities that many of its competitors cannot and I see no reason why it shouldn’t be in there mixing it like Tarzan.