Static Data Masking for Amazon Aurora

Introduction

As businesses increasingly rely on cloud databases like Amazon Aurora, the need for robust data security measures grows. One crucial technique in this realm is static data masking, which helps organizations safeguard confidential data while still providing realistic testing environments. According to a recent study by Verizon, 64% of all compromised data is personal information. This statistic underscores the importance of implementing strong data protection measures, including static data masking.

What is Static Data Masking?

Static data masking is a data security technique that creates a replica of a production database with sensitive information replaced by realistic but fictitious data. This approach allows organizations to use masked data for testing, development, and analytics without exposing actual confidential information.
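The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, the column names, and the helper functions mask_name, mask_email, and build_masked_replica are hypothetical and are not part of Aurora or DataSunrise.

```python
import copy

# Hypothetical production rows; the values are invented for illustration.
production = [
    {"id": 1, "first_name": "Jane", "email": "jane.doe@example.com"},
    {"id": 2, "first_name": "Rahul", "email": "rahul@example.org"},
]


def mask_name(name: str) -> str:
    """Keep the first character and replace the rest with asterisks."""
    return name[:1] + "*" * max(len(name) - 1, 0)


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the local part."""
    local, sep, domain = email.partition("@")
    if not sep:
        return "***"  # no '@': mask everything rather than leak the value
    return local[:1] + "***@" + domain


def build_masked_replica(rows):
    """Return a masked copy of the rows; the originals are left untouched."""
    replica = copy.deepcopy(rows)
    for row in replica:
        row["first_name"] = mask_name(row["first_name"])
        row["email"] = mask_email(row["email"])
    return replica


masked = build_masked_replica(production)
```

The masking happens once, when the replica is built, so anyone working with the replica never sees the original values. That is what distinguishes static masking from masking applied at query time.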

Key benefits of static data masking include:

  1. Enhanced data security
  2. Compliance with data protection regulations
  3. Reduced risk of data breaches
  4. Improved testing accuracy

Amazon Aurora Capabilities for Data Masking

Test Data

create table MOCK_DATA (
id INT,
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(50),
phone VARCHAR(50)
);
insert into MOCK_DATA (id, first_name, last_name, email, phone) values (1, 'Alica', 'Collyer', '[email protected]', '676-612-4979');
…
insert into MOCK_DATA (id, first_name, last_name, email, phone) values (10, 'Nevsa', 'Justun', '[email protected]', '997-928-5900');

Amazon Aurora itself doesn’t have built-in transformation or masking rules. Instead, you’ll need to implement masking logic using SQL queries or functions. The examples below use PostgreSQL syntax, so they apply to Aurora PostgreSQL. Here are some practical approaches (both dynamic and static masking):

SQL Queries

Use SQL to create masked versions of your data. For example:

-- Keep the first letter of the name and the last four digits of the phone
SELECT
  id,
  CONCAT(LEFT(first_name, 1), REPEAT('*', LENGTH(first_name) - 1)) AS masked_name,
  CONCAT('***-***-', RIGHT(phone, 4)) AS masked_phone
FROM mock_data;

User-Defined Functions

Create custom functions for more complex masking. You can call them in queries or when inserting data into a static masked table:

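CREATE OR REPLACE FUNCTION mask_email(email VARCHAR(255))
RETURNS VARCHAR(255) AS $$
BEGIN
  -- No '@' found: mask everything rather than return the raw value
  IF POSITION('@' IN email) = 0 THEN
    RETURN '***';
  END IF;
  -- Keep the first character and the domain, hide the rest of the local part
  RETURN CONCAT(LEFT(email, 1), '***', SUBSTRING(email FROM POSITION('@' IN email)));
END;
$$ LANGUAGE plpgsql;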

SELECT mask_email('[email protected]') AS masked_email;

SELECT id, mask_email(email) AS masked_email 
FROM MOCK_DATA 
LIMIT 5;

These methods implement dynamic data masking directly within Aurora: the stored data is unchanged and is masked only in query results. They need no external transformation rules, which makes them simple to apply to Aurora databases.

Copy Table

To implement static data masking in Aurora PostgreSQL, you can copy the data into a new table and mask it during the copy:

-- Create a new table with the same structure as the original
CREATE TABLE masked_MOCK_DATA (
    id INT,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(50),
    phone VARCHAR(50)
);

-- Insert masked data into the new table
INSERT INTO masked_MOCK_DATA
SELECT 
    id,
    CONCAT(LEFT(first_name, 1), REPEAT('*', LENGTH(first_name) - 1)) AS first_name,
    CONCAT(LEFT(last_name, 1), REPEAT('*', LENGTH(last_name) - 1)) AS last_name,
    CONCAT(LEFT(email, 2), '****', SUBSTRING(email FROM POSITION('@' IN email))) AS email,
    CONCAT('(***) ***-', RIGHT(REPLACE(REPLACE(REPLACE(phone, '(', ''), ')', ''), '-', ''), 4)) AS phone
FROM MOCK_DATA;

To view a sample of the newly masked data, execute the following query:

SELECT * FROM masked_MOCK_DATA LIMIT 10;

For more advanced or automated masking, you might consider using third-party tools like DataSunrise that integrate with Aurora and provide additional masking capabilities.

Setting Up Static Masking Tasks in DataSunrise

DataSunrise offers a user-friendly interface for setting up static data masking tasks for Amazon Aurora. Here’s a step-by-step guide:

  1. Create Aurora Instance in DataSunrise
  2. Navigate to the Data Masking module
  3. Create a new Static Masking Task (SMTaskAurora in the figure below)
  4. Select the source and target databases
  5. Choose the tables (mock_data in the example below) and columns to mask (last_name, email, phone and ip_address)
  6. Apply masking method (e.g., substitution, shuffling, format-preserving encryption)
  7. Schedule the task execution (Manual by default)
  8. Run the task and verify the results
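To make two of the masking methods named in the steps above more concrete, here is a minimal Python sketch of substitution and shuffling. It does not show how DataSunrise implements them; the functions substitute and shuffle_column, the FAKE_LAST_NAMES list, and the third sample row are assumptions made for illustration. Format-preserving encryption is left out because it needs a dedicated cryptographic library.

```python
import random

# Hypothetical substitution dictionary; a real task would use a larger one.
FAKE_LAST_NAMES = ["Smith", "Jones", "Brown", "Taylor", "Wilson"]


def substitute(values, replacements, seed=0):
    """Substitution: replace each value with one drawn from a fake list."""
    rng = random.Random(seed)
    return [rng.choice(replacements) for _ in values]


def shuffle_column(values, seed=0):
    """Shuffling: keep the real values but reassign them to other rows."""
    rng = random.Random(seed)
    shuffled = list(values)  # copy, so the input column is not modified
    rng.shuffle(shuffled)
    return shuffled


# The first two values come from the test data; the third is invented.
last_names = ["Collyer", "Justun", "Example"]
phones = ["676-612-4979", "997-928-5900", "555-010-0000"]

masked_last_names = substitute(last_names, FAKE_LAST_NAMES)
masked_phones = shuffle_column(phones)
```

Substitution removes the real values entirely. Shuffling keeps the column's real value distribution but breaks the link between a value and its row. On small columns a shuffle can leave a value in its original position, so it is usually a weaker choice for highly sensitive fields.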

In DBeaver, you can now query the masked data from the target database.

Tracking Execution Results

After setting up a static masking task, it’s crucial to monitor its execution and verify the results. DataSunrise provides comprehensive logging and reporting features for this purpose:

  1. Check the task execution status in the DataSunrise dashboard
  2. Review detailed logs for any errors or warnings
  3. Compare sample data from source and target databases
  4. Generate reports on masked columns and data distribution
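The comparison of source and target samples in the list above can also be scripted. The sketch below is an assumption about how such a check might look, not a DataSunrise feature: verify_masking is a hypothetical helper that takes rows already fetched from both databases as dictionaries.

```python
def verify_masking(source_rows, target_rows, masked_columns, key="id"):
    """Compare source and target rows and return a list of problems found."""
    problems = []
    target_by_key = {row[key]: row for row in target_rows}
    for src in source_rows:
        tgt = target_by_key.get(src[key])
        if tgt is None:
            problems.append(f"row {src[key]}: missing from target")
            continue
        for column, value in src.items():
            if column in masked_columns:
                # A masked column must not carry the original value over.
                if value is not None and tgt.get(column) == value:
                    problems.append(f"row {src[key]}: {column} is unmasked")
            elif tgt.get(column) != value:
                # Every other column should be copied unchanged.
                problems.append(f"row {src[key]}: {column} changed")
    return problems


source = [{"id": 1, "last_name": "Collyer", "phone": "676-612-4979"}]
target = [{"id": 1, "last_name": "C******", "phone": "(***) ***-4979"}]

issues = verify_masking(source, target, masked_columns={"last_name", "phone"})
```

An empty list means every sampled row exists in the target, every masked column differs from the source, and every other column matches. A check like this only catches values copied over verbatim; it does not prove that a masked value cannot be reversed.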

Data-Driven Application Testing Approaches

When it comes to data-driven application testing, two main approaches are available:

1. Testing with Masked Data

This approach uses static data masking to create a realistic test environment with anonymized production data. It’s ideal for maintaining data relationships and distribution while protecting sensitive information.

2. Testing with Synthetic Data

Synthetic data is artificially generated to mimic the characteristics of real data. This approach offers more flexibility but may not fully represent all edge cases present in production data.

Both methods have their merits, and the choice depends on specific testing requirements and data sensitivity levels.
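For comparison with masked data, a synthetic data set can be produced without touching production at all. The following is a minimal sketch using only the Python standard library; the name lists, the synthetic_rows function, and the example.com addresses are invented for illustration, and the columns simply follow the MOCK_DATA table used earlier.

```python
import random

# Hypothetical value pools; none of these values come from real data.
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Casey"]
LAST_NAMES = ["Smith", "Jones", "Brown", "Wilson", "Clark"]


def synthetic_rows(count, seed=42):
    """Generate rows shaped like MOCK_DATA from random, made-up values."""
    rng = random.Random(seed)  # seeded so test runs are repeatable
    rows = []
    for row_id in range(1, count + 1):
        first = rng.choice(FIRST_NAMES)
        last = rng.choice(LAST_NAMES)
        phone = "{:03d}-{:03d}-{:04d}".format(
            rng.randint(200, 999), rng.randint(0, 999), rng.randint(0, 9999)
        )
        rows.append(
            {
                "id": row_id,
                "first_name": first,
                "last_name": last,
                "email": f"{first}.{last}@example.com".lower(),
                "phone": phone,
            }
        )
    return rows


rows = synthetic_rows(5)
```

A generator this simple illustrates the trade-off described above: it is safe and flexible, but it will only contain the edge cases someone thought to encode, whereas masked production data inherits them from the real system.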

Best Practices for Static Data Masking in Amazon Aurora

To maximize the effectiveness of static data masking for Amazon Aurora, consider these best practices:

  1. Identify all sensitive data elements across your database
  2. Choose appropriate masking techniques for each data type
  3. Maintain data consistency across related tables
  4. Regularly update masking rules to address new data types or regulations
  5. Combine static masking with dynamic masking for comprehensive protection
  6. Implement strict access controls for masked databases
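One way to keep masked values consistent across related tables, as recommended in the list above, is deterministic masking: the same input always produces the same masked output, so joins still work after masking. The sketch below uses a keyed hash (HMAC) from the Python standard library. The pseudonymize function, the SECRET_KEY value, and the customers and orders tables are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a secrets manager.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymize(value: str, prefix: str = "user") -> str:
    """Map a value to a stable pseudonym: same input, same output."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return f"{prefix}_{digest.hexdigest()[:12]}"


# Two related tables that share an email value (invented data).
customers = [{"id": 1, "email": "a.collyer@example.com"}]
orders = [{"order_id": 100, "customer_email": "a.collyer@example.com"}]

masked_customers = [{**c, "email": pseudonymize(c["email"])} for c in customers]
masked_orders = [
    {**o, "customer_email": pseudonymize(o["customer_email"])} for o in orders
]
```

Because both tables are masked with the same key, the masked email in masked_customers still matches the one in masked_orders. Using a keyed hash rather than a plain hash means someone without the key cannot rebuild the mapping by hashing guessed values.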

Conclusion

Static data masking for Amazon Aurora is a crucial technique for protecting sensitive data while enabling effective testing and development processes. By leveraging tools like DataSunrise, organizations can implement robust masking strategies that balance data utility with security and compliance requirements.

As data breaches continue to pose significant risks, implementing strong data protection measures, including static data masking, is no longer optional—it’s a necessity for responsible data management.

DataSunrise offers cutting-edge tools for database security, including audit, data discovery, and advanced masking capabilities. Our user-friendly interface makes it easy to implement comprehensive data protection strategies for Amazon Aurora and other database platforms. Visit our website for an online demo and to explore how we can help secure your valuable data assets.
