Test Data Management

Last Updated: 12th April 2024
Analyst Coverage: Daniel Howard

What is it?

Traditionally, testing and quality assurance teams have created test data by copying the live database. However, the average Global 2000 company has seven such copies, which is expensive in terms of license fees, hardware, and running costs. A cheaper option is to take subsets of the database instead of full copies. However, without sophisticated tools that ensure the subset is representative of the database as a whole, you cannot be sure of covering all the testing scenarios that might apply. Thus there is a trade-off between cost and quality of testing.

The second problem that assails the test data environment is that you need to put as little workload onto DBAs as possible; otherwise testing (and therefore the development environment as a whole) will be less agile than it needs to be. Developers frequently see Operations as an obstacle to obtaining test data, while DBAs all too frequently see Development as a nuisance. DevOps is a generalised approach to improving collaboration across these environments, while test data management is a specific technology designed to achieve this and, at the same time, to support an agile development environment where testing is conducted early and often.

What does it do?

Test data management aims to square the circle of providing fully representative data in right-sized datasets (you may need a differently sized subset for different types of test) with minimal impact on the database administrator. There are two methods in general use for generating test data: either you take a representative subset of the data or you generate a synthetic set of data. The latter can be achieved by sub-setting the data and then repeatedly applying data masking techniques, while the former relies on having profiled the source data using a data profiling and discovery tool.
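As a rough illustration of what "representative" can mean for a subset, the sketch below samples the same fraction from each group of a profiled column, so that rare values are not lost. The function name, the `orders` data and the `status` column are all hypothetical, and commercial tools do considerably more (for example, preserving referential integrity across tables).

```python
import random
from collections import defaultdict

def stratified_subset(rows, key, fraction, seed=0):
    """Sample `fraction` of the rows in each group of `key`, keeping at least one per group."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    subset = []
    for members in groups.values():
        k = max(1, round(len(members) * fraction))
        subset.extend(rng.sample(members, k))
    return subset

# Hypothetical "live" table: 90 shipped orders and 10 refunded orders.
orders = [{"id": i, "status": "shipped" if i % 10 else "refunded"} for i in range(1, 101)]
sample = stratified_subset(orders, key="status", fraction=0.1)
```

A naive random 10% sample of the same table could easily contain no refunded orders at all, which is the coverage risk that unrepresentative subsets introduce.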

The advantage of a completely synthetic approach is that you don’t touch the live data at all, other than for the original profiling, and therefore it is very quick and easy to generate new test data sets without having to go to operations for assistance. Thus this is a particularly suitable approach for agile requirements.
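The idea of profiling once and then generating without touching live data again can be sketched as follows. This is a minimal, assumed example for a single categorical column: `profile_column`, `generate_column` and the country values are hypothetical names, and real products profile many columns, data types and cross-column relationships.

```python
import random
from collections import Counter

def profile_column(values):
    """Record the relative frequency of each distinct value in a column."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}

def generate_column(profile, n, seed=0):
    """Draw n synthetic values that follow the profiled frequencies."""
    rng = random.Random(seed)
    values = list(profile)
    weights = [profile[v] for v in values]
    return rng.choices(values, weights=weights, k=n)

# The live data is read once, for profiling only.
live_countries = ["UK", "UK", "UK", "DE", "FR", "UK", "DE", "UK"]
country_profile = profile_column(live_countries)

# Any number of test sets can then be generated from the profile alone.
synthetic = generate_column(country_profile, n=1000)
```

Because only the profile is needed after the first step, a tester could regenerate data sets of any size without asking operations for another extract.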

Test data management solutions will also include data masking capabilities, so that personally identifiable and other sensitive data can be discovered and masked in an appropriate fashion (this is really a governance issue). It should be noted, however, that masking is not necessary if you are generating completely synthetic data.
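One common masking technique is deterministic pseudonymisation, sketched below with a keyed hash so that the same input always maps to the same masked value and joins between tables still work. This is an assumption about how masking might be done, not a description of any vendor's product; the key, field names and sample rows are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real key would be stored and managed securely.
SECRET_KEY = b"example-key-not-for-production"

def mask_value(value, prefix="user"):
    """Replace a sensitive string with a stable, keyed pseudonym."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:12]}"

def mask_rows(rows, sensitive_fields):
    """Return copies of the rows with the sensitive fields masked."""
    return [
        {k: mask_value(v, prefix=k) if k in sensitive_fields else v for k, v in row.items()}
        for row in rows
    ]

customers = [
    {"id": 1, "email": "alice@example.com", "plan": "gold"},
    {"id": 2, "email": "bob@example.com", "plan": "free"},
]
masked = mask_rows(customers, sensitive_fields={"email"})
```

Discovering which fields are sensitive in the first place is a separate step, which this sketch assumes has already been done.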

Who should care?

Those in charge of testing teams and quality control will be the most interested, but this is also relevant to compliance officers (especially when development is to be outsourced) because the data used for testing is either synthetic or masked.

In addition, development teams adopting an agile methodology should care, because agile development is not much use without agile testing, and you can’t have agile testing if you don’t also have agile test data.

Emerging trends

While test data management has been around for some years, it is only in this decade that it has really come to the fore. In our view, the most likely trend going forward is the merger of test data management with service virtualisation to further speed up testing processes. Indeed, partnerships and acquisitions are already taking place within this sector to enable exactly this.

One noticeable fissure in the market is between those companies providing test data management from the perspective of developers (integrating with service virtualisation, testing tools, code coverage and so on), as exemplified by Grid-Tools, and those that offer a more data-centric approach, as typified by Informatica. In practice, nearly all vendors are in the latter camp, which potentially gives Grid-Tools an advantage.

Vendor Landscape

Informatica acquired Applimation, IBM acquired Green Hat (a service virtualisation provider) and Grid-Tools has extended its portfolio to include service virtualisation. Grid-Tools has also partnered with a number of the service virtualisation vendors. New entrants into the field include Rever and Delphix. Delphix provides a virtualised environment for SQL Server and Oracle, and it works with, rather than provides, data masking.

The big trend, however, is towards synthetic data generation. It used to be that only Grid-Tools offered this, but now GenRocket has emerged, Rever has introduced a test data management product (SEAL) that also includes data masking, and Informatica has added synthetic data generation. We expect IBM to follow suit in due course.

The next step for vendors will be to introduce something comparable to Grid-Tools’ test data warehouse. Informatica has announced that it will do so later in 2014.

Commentary

What’s happening in test data management?

Keysight brings gen AI to the testing world

DevOpsQA issues a 60-hours challenge

Platform.sh

The advanced use of GAI

Accelario Test Data Management and Database Virtualisation

The goal of software testing in healthcare

Curiosity about Testing

2023 is the year to Get Real About The Metaverse

IRI Test Data Generation

IRI Voracity and Test Design Automation

Accelerating Software Quality: Machine Learning and Artificial Intelli...

Mainframe programming and the modern developer

A quiet revolution

Total cost of ownership

The mutable mainframe

Synergies with the Mutable Business

Testing should not be a roadblock

Testing and impact analysis

Traceability in testing

Automating reusability

Responsive automation

CA Technologies acquires Grid-Tools

Tackling the storage provisioning nightmare and business clamour to sa...

Solutions

These organisations are also known to offer solutions:

Net2000
OpenText
Oracle
Original Software
Polarion
Rever
Synthesized

Research

Mage Data Test Data Management

Redgate Test Data Manager

K2view Test Data Management

Windocks (2024)

Curiosity Software Test Data Automation (2024)

DATPROF (2024)

Broadcom Test Data Manager (2024)

Test Data Management in IRI Voracity