
What is Data Masking?

Native masking is shown on the left, while DataSunrise masking is displayed on the right. The key advantage of DataSunrise’s approach is its simple and uniform masking procedure, which is consistent across all supported data storage systems. This image shows data masked in red based on user access. DataSunrise can also mask data based on multiple criteria including SQL schema parameters (column names and types), accessing application, access time, and data content.

Data masking, also known as data obfuscation, is the process of replacing sensitive information with realistic but inauthentic data. Its primary purpose is to protect confidential information, such as personal data, stored in proprietary databases. However, effective masking strikes a balance between security and utility, ensuring that the obfuscated data remains suitable for essential corporate activities like software testing and application development.

 


 

Masking proves invaluable in scenarios such as:

  • A company needs to give outsourced or third-party IT companies access to its databases. When masking data, it is important that the result looks consistent and realistic, so that hackers and other malicious actors believe they are dealing with genuine data.
  • A company needs to mitigate operator errors. Companies usually trust their employees to make sound, secure decisions, yet many breaches result from operator error. If the data is masked, the consequences of such errors are far less severe. In addition, not every database operation requires entirely real, accurate data.
  • A company runs data-driven testing.

In this article we are going to look more closely at static masking, dynamic masking and in-place masking.

 

Examples of Masked Data

In the example below you can see how the Card column looked before masking:

SQL> select * from scott.emp;

    EMPNO ENAME     JOB            MGR  HIREDATE CARD      
--------- --------- ---------- ------- --------- -------------------
        1 SMITH     CLERK            0 17-DEC-80 4024-0071-8423-6700
        2 SCOTT     SALESMAN         0 20-FEB-01 4485-4392-7160-9980
        3 JONES     ANALYST          0 08-JUN-95 6011-0551-9875-8094
        4 ADAMS     MANAGER          1 23-MAY-87 5340-8760-4225-7182

4 rows selected.

And after masking:

SQL> select * from scott.emp;

    EMPNO ENAME     JOB            MGR  HIREDATE CARD      
--------- --------- ---------- ------- --------- -------------------
        1 SMITH     CLERK            0 17-DEC-80 XXXX-XXXX-XXXX-6700
        2 SCOTT     SALESMAN         0 20-FEB-01 XXXX-XXXX-XXXX-9980
        3 JONES     ANALYST          0 08-JUN-95 XXXX-XXXX-XXXX-8094
        4 ADAMS     MANAGER          1 23-MAY-87 XXXX-XXXX-XXXX-7182

4 rows selected.
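
The transformation shown above can be sketched in a few lines of Python. This is only an illustration of the idea, not how DataSunrise implements it; the function name mask_card and its parameters are hypothetical.

```python
def mask_card(card: str, visible: int = 4, mask_char: str = "X") -> str:
    """Mask every digit except the last `visible` ones, keeping separators in place."""
    total_digits = sum(ch.isdigit() for ch in card)
    to_mask = max(total_digits - visible, 0)
    out = []
    for ch in card:
        if ch.isdigit() and to_mask > 0:
            out.append(mask_char)  # hide a leading digit
            to_mask -= 1
        else:
            out.append(ch)         # keep separators and the trailing digits
    return "".join(out)


print(mask_card("4024-0071-8423-6700"))  # XXXX-XXXX-XXXX-6700
```

Because separators are left untouched, the masked value keeps the original length and layout, so applications that validate the format still accept it.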

DataSunrise lets you apply different masking methods to each field. You can choose from preset options or create custom masking rules for specific data types. Format-preserving masking maintains data structure while protecting sensitive information. This ensures masked data remains usable and retains its statistical properties.

Masking Method | Original Data | Masked Data
Credit card masking | 4111 1111 1111 1111 | 4111 **** **** 1111
Email masking | [email protected] | j***e@e*****e.com
URL masking | https://www.example.com/user/profile | https://www.******.com/****/******
Phone numbers masking | +1 (555) 123-4567 | +1 (***) ***-4567
Random IPv4 address masking | 192.168.1.1 | 203.45.169.78
Random Date/Datetime with constant year for string column types | 2023-05-15 | 2023-11-28
Random Date/Datetime and Time from interval for string column type | 2023-05-15 14:30:00 | 2024-02-19 09:45:32
Masking by empty, NULL, substring value | Sensitive Information | NULL
Masking by fixed and random values | John Doe | Anonymous User 7392
Masking using a custom function | Secret123! | S****t1**!
Mask first and last chars of strings | Password | *asswor*
Masking any sensitive data in a plain text | My SSN is 123-45-6789 and my DOB is 01/15/1980 | My SSN is XXX-XX-XXXX and my DOB is XX/XX/XXXX
Masking by values from predefined dictionaries | John Smith, Software Engineer, New York | Ahmet Yılmaz, Data Analyst, Chicago
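
Two of the methods listed above, email masking and masking the first and last characters of a string, are simple enough to sketch in Python. The helper names (mask_middle, mask_edges, mask_email) and the sample address are assumptions made for illustration; they are not DataSunrise functions, and the exact number of mask characters a real product emits may differ.

```python
def mask_middle(s: str, mask_char: str = "*") -> str:
    """Keep the first and last character and mask everything in between."""
    if len(s) <= 2:
        return mask_char * len(s)  # too short to reveal anything safely
    return s[0] + mask_char * (len(s) - 2) + s[-1]


def mask_edges(s: str, mask_char: str = "*") -> str:
    """Mask only the first and last character of a string."""
    if len(s) <= 2:
        return mask_char * len(s)
    return mask_char + s[1:-1] + mask_char


def mask_email(email: str) -> str:
    """Mask the local part and the domain name, leaving '@' and the top-level domain."""
    local, at, domain = email.partition("@")
    if not at:
        raise ValueError("not an email address")
    name, dot, tld = domain.rpartition(".")
    return f"{mask_middle(local)}@{mask_middle(name)}{dot}{tld}"


print(mask_edges("Password"))          # *asswor*
print(mask_email("jane@example.com"))  # j**e@e*****e.com
```

Both helpers preserve the length and overall shape of the input, which is the property format-preserving masking relies on.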

 

Data Masking Steps

When it comes to practical implementation, you need a strategy that works within your organization. The steps below make masking effective:

  • Find your sensitive data. The first step is to locate and identify data that may be sensitive and require protection. It is best to use a dedicated automated tool for this, such as DataSunrise sensitive data discovery, which takes table relations into account.
  • Analyze the situation. At this stage the data security team should understand where the sensitive data is, who needs access to it, and who does not. Role-based access can help here: each role determines whether its members see the original or the masked data.
  • Apply masking. In very large organizations it is not realistic to assume that a single masking tool can be used across the entire company. You may need different masking types instead.
  • Test masking results. This is the final step in the process. Quality assurance and testing are required to ensure that the masking configuration produces the required results.
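
The first step, finding sensitive data, can be approximated by matching sampled column values against known patterns. The sketch below is a deliberately simple illustration with hypothetical names (PATTERNS, classify_columns) and a hypothetical 80% threshold; it does not reflect how DataSunrise discovery works and ignores table relations entirely.

```python
import re

# Hypothetical patterns for a few common kinds of sensitive values.
PATTERNS = {
    "card": re.compile(r"\d{4}[- ]\d{4}[- ]\d{4}[- ]\d{4}"),
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}


def classify_columns(rows, threshold=0.8):
    """Return {column: label} for columns whose non-empty values mostly match a pattern.

    `rows` is a list of dicts, for example a sample of rows fetched from a table.
    """
    findings = {}
    columns = list(rows[0].keys()) if rows else []
    for col in columns:
        values = [str(r[col]) for r in rows if r.get(col) not in (None, "")]
        if not values:
            continue
        for label, pattern in PATTERNS.items():
            hits = sum(bool(pattern.fullmatch(v)) for v in values)
            if hits / len(values) >= threshold:
                findings[col] = label
                break
    return findings


sample = [
    {"ENAME": "SMITH", "CARD": "4024-0071-8423-6700"},
    {"ENAME": "SCOTT", "CARD": "4485-4392-7160-9980"},
]
print(classify_columns(sample))  # {'CARD': 'card'}
```

The same function can support the final testing step: after masking, no column should still be classified as containing real card numbers or similar values.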

 

Data Masking Types

For more detailed information on the masking types and their implementations using both native and third-party solutions, please visit our YouTube channel and explore our comprehensive masking playlist.

Dynamic Masking

Dynamic masking masks data at the moment a query is made to a database containing real private data. It works by modifying either the query or the response. The data is masked on the fly, that is, without being saved to intermediate storage.
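
As a rough illustration of masking the response, the Python sketch below wraps query execution against an in-memory SQLite database and masks the card column for unprivileged roles. The role names, the fetch helper, and the policy tables are assumptions made for this example. A real product typically sits between the client and the database as a proxy, which this sketch does not attempt to model.

```python
import sqlite3

# Hypothetical policy: roles that may see real data, and a masker per sensitive column.
PRIVILEGED_ROLES = {"dba"}


def mask_card(card):
    return "XXXX-XXXX-XXXX-" + card[-4:]


MASKED_COLUMNS = {"card": mask_card}


def fetch(conn, sql, role):
    """Run the query and mask sensitive columns in the response for unprivileged roles."""
    cur = conn.execute(sql)
    names = [d[0].lower() for d in cur.description]
    rows = cur.fetchall()
    if role in PRIVILEGED_ROLES:
        return rows
    masked = []
    for row in rows:
        masked.append(tuple(
            MASKED_COLUMNS[name](value) if name in MASKED_COLUMNS and value is not None else value
            for name, value in zip(names, row)
        ))
    return masked


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, ename TEXT, card TEXT)")
conn.execute("INSERT INTO emp VALUES (1, 'SMITH', '4024-0071-8423-6700')")

print(fetch(conn, "SELECT * FROM emp", role="analyst"))  # masked card
print(fetch(conn, "SELECT * FROM emp", role="dba"))      # real card
```

The stored data never changes; only the result returned to the caller differs. Note that this sketch identifies columns by result name, so a query that aliases the column would bypass it, which is one reason production tools also analyze or rewrite the query itself.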

Static Masking

As the name suggests, static masking produces a permanently masked copy of the data. Database administrators duplicate the content of the production database into a test environment and replace the sensitive values in that copy with fake data. The organization can then share the masked copy with third-party contractors and other external parties, while the original sensitive data stays in the production database. Static masking works well for that purpose, but it can be a serious limitation for applications that need real data from production databases.
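
A minimal sketch of this flow in Python, using two in-memory SQLite databases as stand-ins for the production and test environments. The table layout and the names static_mask, prod_db, and test_db are hypothetical; here the values are masked while they are copied, so real card numbers never reach the test database.

```python
import sqlite3


def mask_card(card):
    return None if card is None else "XXXX-XXXX-XXXX-" + card[-4:]


def static_mask(prod, test):
    """Copy the emp table from prod to test, masking the card column on the way."""
    test.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, card TEXT)")
    rows = prod.execute("SELECT empno, ename, card FROM emp").fetchall()
    test.executemany(
        "INSERT INTO emp VALUES (?, ?, ?)",
        [(empno, ename, mask_card(card)) for empno, ename, card in rows],
    )
    test.commit()


prod_db = sqlite3.connect(":memory:")
prod_db.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, card TEXT)")
prod_db.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, "SMITH", "4024-0071-8423-6700"), (2, "SCOTT", "4485-4392-7160-9980")],
)
prod_db.commit()

test_db = sqlite3.connect(":memory:")
static_mask(prod_db, test_db)
print(test_db.execute("SELECT * FROM emp").fetchall())
```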

In-Place Masking

In-place masking, like static masking, creates test data based on real production data. The process usually consists of three main steps:

  1. Copying production data as is to a test database.
  2. Removing redundant test data to decrease data storage volume and speed up testing processes.
  3. Replacing all PII in the test database with masked values. This step is the in-place masking itself.

How the production data is copied is outside the scope of in-place masking itself. It can be, for example, an ETL procedure or a backup and restore of the production database. What matters is that in-place masking is applied to a copy of the production database in order to mask the PII it contains.
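
The three steps can be sketched with Python's sqlite3 module. This is an illustration under stated assumptions: the copy is made with SQLite's backup API as a stand-in for whatever ETL or backup procedure is actually used, the rule for "redundant" rows (empno > 2) is arbitrary, and mask_card is a hypothetical masking function.

```python
import sqlite3


def mask_card(card):
    return None if card is None else "XXXX-XXXX-XXXX-" + card[-4:]


prod_db = sqlite3.connect(":memory:")
prod_db.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, card TEXT)")
prod_db.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        (1, "SMITH", "4024-0071-8423-6700"),
        (2, "SCOTT", "4485-4392-7160-9980"),
        (3, "JONES", "6011-0551-9875-8094"),
    ],
)
prod_db.commit()

test_db = sqlite3.connect(":memory:")

# Step 1: copy production data as is.
prod_db.backup(test_db)

# Step 2: remove rows the tests do not need (arbitrary rule for this sketch).
test_db.execute("DELETE FROM emp WHERE empno > 2")

# Step 3: overwrite the PII in the copy, in place.
test_db.create_function("mask_card", 1, mask_card)
test_db.execute("UPDATE emp SET card = mask_card(card)")
test_db.commit()

print(test_db.execute("SELECT * FROM emp").fetchall())
```

Unlike the static approach, the test database briefly holds real data between steps 1 and 3, so access to it should stay restricted until the update has completed.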

 

Conditions Data Masking Should Meet

As mentioned earlier, masked data has to remain meaningful at several levels:

  1. The data has to remain meaningful and valid for the application logic.
  2. The data must undergo enough changes so that it can’t be reverse-engineered.
  3. The obfuscated data should remain consistent across multiple databases within an organization when each database contains the specific data element being masked.
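
One common way to meet the third condition is deterministic masking: the same input always maps to the same replacement, so a value masked in one database matches its masked form in another. The Python sketch below uses a keyed hash (HMAC-SHA256) to pick a replacement from a small dictionary. The dictionary, the key handling, and the function name pseudonym are assumptions for illustration, not a description of any product's algorithm.

```python
import hashlib
import hmac

# Hypothetical replacement dictionary; a real one would be much larger.
REPLACEMENT_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Morgan", "Casey"]


def pseudonym(value: str, key: bytes) -> str:
    """Map a value to a dictionary entry, deterministically for a given key."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(REPLACEMENT_NAMES)
    return REPLACEMENT_NAMES[index]


key = b"example-key-keep-the-real-one-secret"  # placeholder key for the sketch

# The same input yields the same replacement wherever the same key is used.
print(pseudonym("SMITH", key) == pseudonym("SMITH", key))  # True
```

The key must stay secret; otherwise an attacker could recompute the mapping for guessed inputs, which would undermine the second condition. With a dictionary this small, different inputs will often collide, so uniqueness constraints need a larger dictionary or a different scheme.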

 

Data Masking with DataSunrise

Masking is a crucial feature of any data security solution. We’re proud to offer DataSunrise masking capabilities, which provide one of the easiest-to-use yet most robust and full-fledged masking solutions on the market. In the picture below, you can see the masking setup for an email field. There are dozens of masking types available. You simply select the database and the data to mask (or the unstructured data location), set the type of masking, and your data is ready to pass regulatory compliance checks.

Data Masking in DataSunrise - Setup for masking type

 

Conclusion

DataSunrise offers static and dynamic data masking to protect your data, including masking of XML, JSON, CSV, and unstructured text on Amazon S3. In addition, data discovery with table relations is a valuable complementary tool for protecting your data. Our security suite protects data in your databases both in the cloud and on-premises. Try all of our capabilities to make sure everything is under your control.
