
Figure 2 - Editing a masking rule in Data Masker
Data Masker (shown in Figure 2) is a rules-based static data masking product that masks data at a rate on the order of millions of rows per hour. It keeps masked data credible by ensuring that correlated values (for instance, age and date of birth) remain consistent with one another. It does this through substitution rules and correlated data sets that contain candidate replacement values. Both are available out of the box or can be user defined, and masking can be applied across databases or even across database instances. Data Masker can also generate synthetic data in a limited capacity, which is most useful when production data is either unavailable or incomplete.
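To make the idea of correlated masking concrete, the following is a minimal Python sketch, not Data Masker's implementation. It assumes a hypothetical row with `date_of_birth` and `age` fields: the date of birth is shifted by a random offset, and the age is then recomputed from the masked date so that the two values still agree. The function names and the `max_shift_days` parameter are illustrative assumptions.

```python
import random
from datetime import date, timedelta


def mask_date_of_birth(dob: date, rng: random.Random, max_shift_days: int = 365) -> date:
    """Replace a date of birth with a nearby date (hypothetical masking rule)."""
    shift = rng.randint(-max_shift_days, max_shift_days)
    return dob + timedelta(days=shift)


def age_on(dob: date, today: date) -> int:
    """Whole years elapsed between dob and today."""
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    return today.year - dob.year - (0 if had_birthday else 1)


def mask_row(row: dict, rng: random.Random, today: date) -> dict:
    """Mask the date of birth, then derive age from the masked value.

    Deriving the dependent column from the masked one is what keeps the
    pair consistent; masking each column independently would not.
    """
    masked = dict(row)  # leave the input row untouched
    masked["date_of_birth"] = mask_date_of_birth(row["date_of_birth"], rng)
    masked["age"] = age_on(masked["date_of_birth"], today)
    return masked


if __name__ == "__main__":
    rng = random.Random()  # unseeded on purpose: the shift should not be reproducible
    original = {"name": "Alice", "date_of_birth": date(1980, 5, 17), "age": 44}
    print(mask_row(original, rng, date.today()))
```

The design choice worth noting is the order of operations: the dependent value is always derived from the already masked value, never masked on its own.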
Masking in Data Masker is irreversible, always retains relational integrity, and can mask primary or foreign keys without a join operation. It automatically generates a report whenever a masking rule is run, making the masking process fully auditable, and allows you to either apply your masking rules immediately or export them for use elsewhere (in SQL Clone or SQL Data Catalog, for example).
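One general way to mask key columns while preserving relational integrity, and without joining tables, is to apply the same deterministic one-way mapping to the primary key column and to every foreign key column that references it. The sketch below illustrates that technique with a keyed hash (HMAC); it is an assumption made for illustration and should not be read as how Data Masker works internally. The table contents, the `mask_key` helper, and the secret are all hypothetical.

```python
import hashlib
import hmac


def mask_key(value: str, secret: bytes) -> str:
    """Map a key value to a pseudonym using a keyed one-way hash.

    The same input always yields the same output, so parent and child
    tables can be masked independently and still line up afterwards.
    The digest is truncated for readability, which makes collisions
    possible in principle; a real system would need to handle that.
    """
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "K" + digest[:12]


def mask_column(rows: list, column: str, secret: bytes) -> list:
    """Return copies of the rows with one column replaced by its pseudonym."""
    return [{**row, column: mask_key(row[column], secret)} for row in rows]


# Hypothetical sample data: orders.customer_id references customers.customer_id.
customers = [
    {"customer_id": "C1001", "name": "Alice"},
    {"customer_id": "C1002", "name": "Bob"},
]
orders = [
    {"order_id": 1, "customer_id": "C1001"},
    {"order_id": 2, "customer_id": "C1002"},
    {"order_id": 3, "customer_id": "C1001"},
]

if __name__ == "__main__":
    secret = b"replace-with-a-random-secret"  # discard after masking to prevent re-derivation
    print(mask_column(customers, "customer_id", secret))
    print(mask_column(orders, "customer_id", secret))
```

Each table is processed on its own, yet every masked foreign key still points at a masked primary key, because both went through the same mapping.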
Data Masker provides some basic sensitive data discovery functions, but in practice most of this capability sits in SQL Data Catalog, which allows you to define your own taxonomy of classifications and match your data against it using an extensible library of built-in pattern matching rules. You can then attach actions to these classifications in order to, for instance, automatically mask data that is classified as sensitive. To that end, compatibility with HIPAA, GDPR, and other regulations is provided out of the box, as is integration with various test data management processes.
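The classify-then-act workflow described above can be sketched in a few lines of Python. This is a conceptual illustration only: the classification names, the regular expressions, and the action table are hypothetical and are not taken from SQL Data Catalog's rule library or API.

```python
import re

# Hypothetical taxonomy: classification name -> pattern matched against column names.
RULES = [
    ("Email address", re.compile(r"e[-_]?mail", re.IGNORECASE)),
    ("Date of birth", re.compile(r"(^|_)(dob|date_of_birth|birth_?date)($|_)", re.IGNORECASE)),
    ("Payment card", re.compile(r"card_?(number|no|num)", re.IGNORECASE)),
    ("Country", re.compile(r"country", re.IGNORECASE)),
]

# Hypothetical actions attached to each classification.
ACTIONS = {
    "Email address": "mask",
    "Date of birth": "mask",
    "Payment card": "mask",
    "Country": "keep",
}


def classify(column_name: str) -> list:
    """Return every classification whose pattern matches the column name."""
    return [tag for tag, pattern in RULES if pattern.search(column_name)]


def columns_to_mask(column_names: list) -> list:
    """Return the columns that carry at least one classification with a 'mask' action."""
    return [
        name
        for name in column_names
        if any(ACTIONS.get(tag) == "mask" for tag in classify(name))
    ]


if __name__ == "__main__":
    schema = ["customer_email", "dob", "country_code", "order_total"]
    for column in schema:
        print(column, classify(column))
    print(columns_to_mask(schema))
```

Matching on column names alone is the simplest possible rule; a fuller approach would also sample column contents, which this sketch leaves out.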
SQL Clone allows you to create and centrally manage images and virtualised clones of your production data. Images are complete point-in-time copies of a database, taken from either a live server or a backup. As they are often quite large (since they are complete copies) they are usually stored centrally. You can modify your image during its creation using either SQL scripts or sets of masking rules exported from Data Masker.
Clones, on the other hand, are derived from an image and only store the differences between themselves and the image they were derived from. As a result, they are small in size (usually less than a hundred megabytes) and can be created very quickly, wherever they are needed. For test data management, this means that your testers can provision a masked clone on their local machines whenever they need test data, without having to wait on an administrator. Team management, permissions, and self-service features are provided to this end, as is integration with Git. PowerShell-driven automation is also available for image creation and management. By leveraging SQL Data Catalog, you can even distribute clones (and thus test data) as an automated part of your business processes.
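The relationship between an image and its clones resembles a copy-on-write scheme, which the following Python sketch models in miniature. It is a conceptual illustration under that assumption and does not describe SQL Clone's actual storage mechanism; the `Image` and `Clone` classes and the notion of numbered "pages" are hypothetical.

```python
class Image:
    """A read-only, point-in-time copy of a database, modelled as pages."""

    def __init__(self, pages: dict):
        self._pages = dict(pages)  # private copy; never modified afterwards

    def read(self, page_id: int) -> str:
        return self._pages[page_id]


class Clone:
    """A writable view over an image that stores only its own changes."""

    def __init__(self, image: Image):
        self._image = image
        self._changed = {}  # page_id -> data written through this clone

    def read(self, page_id: int) -> str:
        # Prefer the clone's own version; fall back to the shared image.
        if page_id in self._changed:
            return self._changed[page_id]
        return self._image.read(page_id)

    def write(self, page_id: int, data: str) -> None:
        # Writes never touch the image, so other clones are unaffected.
        self._changed[page_id] = data

    def stored_pages(self) -> int:
        """Number of pages this clone holds itself (its storage footprint)."""
        return len(self._changed)


if __name__ == "__main__":
    image = Image({1: "customers", 2: "orders", 3: "invoices"})
    tester_a = Clone(image)
    tester_b = Clone(image)
    tester_a.write(2, "orders (edited by tester A)")
    print(tester_a.read(2), "|", tester_b.read(2))
    print(tester_a.stored_pages(), tester_b.stored_pages())
```

Creating a clone here costs almost nothing because no data is copied up front, which mirrors why clones can be small and quick to provision while the image stays large and central.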