DataSunrise Sensitive Data Discovery (2025)
Updated on October 24, 2025
DataSunrise is a Seattle-based company founded in 2015. It also has representation in Europe and the Asia-Pacific. It provides a suite of data and database security products that collectively offer solutions for automated data compliance, data discovery, data masking, real-time activity monitoring, threat/vulnerability assessment, and more.
The company has partnerships with a number of technology vendors, including IBM, Amazon, Microsoft, Cloudera, Databricks, and Google Cloud. It does not focus on any particular vertical: its customers span a variety of sectors, including financial services, government, e-commerce, healthcare, and others, and it has clients worldwide.
DataSunrise provides sensitive data discovery as part of its more general platform for end-to-end data protection and security. Among other things, this platform provides several different ways to protect any sensitive data that it discovers, including (but not limited to) both static and dynamic data masking. It can operate on structured, unstructured, and semi-structured data (including text files, documents, and images), it is available both in-cloud and on-premises, and it features a range of compatibility options for databases, LLMs, and so on. When deployed using AWS, Azure, or Google Cloud, it also benefits from high availability, autoscaling, and failover functionality.
Moreover, discovery in a cloud context benefits from the platform’s Data Security Posture Management (DSPM), which automatically locates your in-cloud data assets, preconfigures ready-to-use data protection and compliance measures, and provides optimised deployment suggestions by examining your existing environment. You can subsequently discover sensitive data using the platform’s contextual search functionality. Beyond the cloud, contextual search can also be used to discover sensitive data in databases and local file systems.
DataSunrise’s contextual search allows you to discover and classify sensitive data against a variety of information types (such as credit card information) and/or security standards (such as GDPR), which are composed of multiple information types that have been grouped together. A number of both are provided out of the box, covering personal data, financial information, medical records, addresses, and internet-related data, but you can also define your own as necessary. Information types contain attribute filters that are used to match data against them (see Figure 1). Typically, these filters compare column data to regular expressions, a form of pattern matching that is ideal for identifying sensitive data that takes specific formats, such as social security numbers, credit card numbers, and passport details.
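To make the regular expression approach concrete, the following is a minimal Python sketch of pattern-based classification. It is not DataSunrise's implementation or API: the information type names, the patterns, and the classify_value function are all hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical information types, each with a single regex attribute filter.
# These patterns are illustrative and are not DataSunrise's built-in definitions.
INFORMATION_TYPES = {
    "us_ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "credit_card": re.compile(r"(?:\d{4}[ -]?){3}\d{4}"),
}


def classify_value(value):
    """Return the name of every information type whose pattern matches the whole value."""
    return [
        name
        for name, pattern in INFORMATION_TYPES.items()
        if pattern.fullmatch(value)
    ]
```

Under these assumptions, a value such as "123-45-6789" would be classified as us_ssn, while free text that fits neither pattern would return an empty list.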

The contextual search itself provides several options for identifying sensitive data, including selection strategy (how many rows to examine for sensitive data), match strategy (whether to match using all available attribute filters or just the first one), and matching threshold (how sure you need to be to identify a match), as shown in Figure 2. Searches can be executed manually or scheduled to run automatically, and their results are displayed in a dashboard that includes a dynamic results table. Notably, you can create new security, masking, and auditing rules directly from this table. You can also export your results in a discovery report, with various display options (for example, column name aliasing).
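The interplay between these three options can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's actual logic: the function name, its parameters, and the interpretation of each option (sampling the first N rows, requiring every filter versus only the first, and treating the threshold as a fraction of matching rows) are all hypothetical.

```python
import re


def column_is_sensitive(values, filters, sample_size=100, match_all=False, threshold=0.8):
    """Decide whether a column looks sensitive.

    sample_size: how many rows to examine (assumed selection strategy).
    match_all:   require every filter to match, or only the first (assumed match strategy).
    threshold:   minimum fraction of sampled rows that must match.
    """
    sample = values[:sample_size]
    active = filters if match_all else filters[:1]
    if not sample or not active:
        return False
    # A row counts as a hit only if every active filter matches the whole value.
    hits = sum(1 for v in sample if all(f.fullmatch(v) for f in active))
    return hits / len(sample) >= threshold
```

For example, a column in which three of four sampled values look like social security numbers would be flagged at a threshold of 0.7 but not at 0.8, which shows how the threshold trades false positives against missed columns.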

This process can be applied to both structured and unstructured data. Named entity recognition driven by natural language processing (NLP) is available, and Amazon Workspace and/or DataSunrise Tesseract can be used to examine images and extract any text they contain (which can then be classified). Notably, the software allows you to introspect SQL code (stored procedures) as part of its discovery process. You can also build custom matching rules using Lua scripts and/or no-code tooling, and various validation methods are provided to assist with discovery by verifying structural correctness (or, more accurately, identifying the lack thereof). DataSunrise also offers Data-Inspired Security, an additional measure for identifying and protecting sensitive data on a continual, automated basis. At present, this feature is only available for MySQL, MSSQL, and PostgreSQL-like databases.
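As an illustration of what structural validation can add on top of pattern matching, the sketch below applies the Luhn checksum, a widely used check for payment card numbers. Whether DataSunrise uses this particular check is an assumption; it is shown here only as a familiar example of the technique, and the function name is hypothetical.

```python
def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```

A check of this kind lets a discovery process reject strings that merely look like card numbers: a 16-digit value that matches the pattern but fails the checksum is structurally incorrect and unlikely to be a real card number.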
In addition, DataSunrise can discover the relationships between tables containing sensitive data. This is achieved through activity monitoring, the identification of primary/foreign key relationships and constraints in the table’s definition, and/or by analysing already executed queries against the relevant database. This means that once you have discovered that a piece of data is sensitive, you can very quickly find any corresponding sensitive data in any related tables. In particular, this can be used to find all information related to a specific entity and create a virtual report to that effect, which is beneficial for fulfilling DSARs (Data Subject Access Requests).
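The foreign key part of this idea can be sketched with Python's standard sqlite3 module. This is a simplified illustration, not DataSunrise's method: it covers only declared foreign keys (not activity monitoring or query analysis), it uses SQLite rather than any of the databases named in this report, and the related_tables function is hypothetical.

```python
import sqlite3


def related_tables(conn, table):
    """Return sorted names of tables linked to `table` by a foreign key, in either direction."""
    tables = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    ]
    related = set()
    for name in tables:
        # PRAGMA foreign_key_list yields one row per foreign key column;
        # index 2 of each row is the referenced (parent) table.
        for fk in conn.execute(f'PRAGMA foreign_key_list("{name}")'):
            if name == table:
                related.add(fk[2])
            elif fk[2] == table:
                related.add(name)
    return sorted(related)
```

Given a customers table flagged as sensitive and an orders table whose customer_id column references it, calling related_tables(conn, "customers") would return the orders table as a candidate for further inspection.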
DataSunrise can be deployed in several different operating modes: in proxy mode, it is placed between the database clients and the database server, disabling direct access; in sniffer mode, it receives mirrored database traffic via a network switch; in agent mode, it utilises an agent that sits between the database and its users to intercept all read/write packets and pass them on to the platform; and, finally, it can operate by reading database native audit logs. Agent mode is currently only available for Oracle and PostgreSQL, although we are told that MSSQL and Sybase compatibility is forthcoming. All that said, proxy mode is of primary interest for sensitive data discovery, as the other modes are, at least at the moment, limited to data auditing.
Having discovered your sensitive data, DataSunrise offers significant capabilities for protecting it, in the form of automatically applied security, auditing, and masking rules. Both static and dynamic masking are available, with the latter applying to stored procedures and database functions in addition to SQL queries. Synthetic data generation is also available, primarily to facilitate test data management. Further security features, including real-time activity monitoring, threat detection, and role-based access control (that integrates with dynamic data masking) are also available, among other things. Notably, DataSunrise is starting to provide support for vector databases, as well as broader generative AI security. For example, it can be used to monitor prompts sent to LLMs and other AI tools and redact any sensitive information they contain.
DataSunrise is, first and foremost, a data security offering. From a sensitive data discovery perspective, its core strengths are that it provides a robust selection of complementary data security functionality, an extensive set of tools for protecting your sensitive data, and various data compliance capabilities on top of that. Its wide-ranging database (and to a lesser extent, LLM) support is worth mentioning, too, as are its ability to introspect SQL code and its support for relationship discovery, all of which are uncommon (though not unique) capabilities within the space. In short, there is a lot to like about it if you are in the market for data security in addition to sensitive data discovery.
While DataSunrise offers some advantages purely as a sensitive data discovery offering, its main appeal will be as a formidable data security platform which can also fulfil your discovery needs. This is not at all a bad place for it to be.