Sensitive data protection
Dataphin supports sensitive data detection and data masking. You can combine these features with permission management to build a complete system for data protection.
Data classification and categorization
Dataphin lets you manage data classification and categorization. It includes built-in classifications for common personal information. You can also define custom standards for your enterprise.
Dataphin's data classification supports multiple levels. It also includes built-in detection features and methods to help automate sensitive data detection.
You can use data categorization in data masking, permission requests, and data downloads. This lets you apply different control policies based on the data category.
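As a rough illustration of category-based control, the sketch below maps a security level to a set of controls for masking, permission requests, and downloads. The level names, the `ControlPolicy` fields, and the `policy_for` helper are all hypothetical and are not taken from Dataphin; they only show the idea of looking up a policy by level.

```python
from dataclasses import dataclass
from enum import IntEnum


class SecurityLevel(IntEnum):
    # Hypothetical levels; an enterprise would define its own standard.
    PUBLIC = 1
    INTERNAL = 2
    SENSITIVE = 3
    HIGHLY_SENSITIVE = 4


@dataclass(frozen=True)
class ControlPolicy:
    mask_on_query: bool      # mask values in query results
    approval_required: bool  # permission requests need approval
    download_allowed: bool   # data may be downloaded


# Hypothetical mapping from level to controls.
POLICIES = {
    SecurityLevel.PUBLIC: ControlPolicy(False, False, True),
    SecurityLevel.INTERNAL: ControlPolicy(False, True, True),
    SecurityLevel.SENSITIVE: ControlPolicy(True, True, False),
    SecurityLevel.HIGHLY_SENSITIVE: ControlPolicy(True, True, False),
}


def policy_for(level: SecurityLevel) -> ControlPolicy:
    """Return the control policy configured for a security level."""
    return POLICIES[level]
```

Keeping the mapping in one table means a change to the standard, such as tightening download rules for one level, is made in a single place.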
Sensitive data detection
Dataphin generates a sensitive data checklist using two methods: automatic detection and manual tagging.
Automatic detection
Dataphin uses detection rules to automatically detect sensitive data. Each rule defines a scan scope, such as projects and tables, and a detection method, such as matching on field content or field names. Scanning with these rules generates a list of sensitive data. Automatic detection supports both scheduled full scans and real-time incremental scans, which helps you detect sensitive data quickly and comprehensively.
Manual tagging
For data that you know is sensitive, you can use manual tagging to mark it. Manual tagging methods include specifying data in the UI, uploading an Excel file, and tagging using data standards or data modeling.
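The automatic detection approach described above can be sketched as a small rule engine that checks field names and sampled field content. This is a minimal illustration, not Dataphin's implementation: the `DetectionRule` class, the `scan_table` function, the regular expressions, and the 0.8 match threshold are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DetectionRule:
    """Hypothetical rule: match a field by its name or by sampled content."""
    category: str
    name_pattern: Optional[str] = None
    content_pattern: Optional[str] = None
    min_match_ratio: float = 0.8  # share of sampled values that must match


# Illustrative rules only; the patterns are deliberately simple.
RULES = [
    DetectionRule("phone_number", r"(phone|mobile)", r"1\d{10}"),
    DetectionRule("email", r"e?mail", r"[^@\s]+@[^@\s]+\.[^@\s]+"),
]


def rule_matches(rule: DetectionRule, field_name: str, samples: List[str]) -> bool:
    # Detection by field name.
    if rule.name_pattern and re.search(rule.name_pattern, field_name, re.IGNORECASE):
        return True
    # Detection by field content: enough sampled values must match fully.
    if rule.content_pattern and samples:
        hits = sum(1 for v in samples if re.fullmatch(rule.content_pattern, v))
        return hits / len(samples) >= rule.min_match_ratio
    return False


def scan_table(table: str, columns: Dict[str, List[str]]) -> List[dict]:
    """Return one finding per field that matches a rule (first match wins)."""
    findings = []
    for field_name, samples in columns.items():
        for rule in RULES:
            if rule_matches(rule, field_name, samples):
                findings.append(
                    {"table": table, "field": field_name, "category": rule.category}
                )
                break
    return findings
```

Calling `scan_table("orders", {"contact": ["13800138000"], "user_email": ["a@b.com"], "note": ["hello"]})` would flag `contact` by content and `user_email` by name, and leave `note` out of the list.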
Sensitive data protection
After you detect sensitive data, Dataphin helps you protect it with data masking and encryption. This is useful in the following scenarios:
Masking for sensitive data queries
When you view protected data using features such as ad hoc analysis, code tasks, or data preview, the sensitive data is masked. This masking is based on your configured policy and prevents sensitive data from being exposed.
Masking when writing data from production to development environments
When you write data from a production environment to a development environment for testing, the data is automatically masked according to your configured rules. This prevents sensitive data from entering the development environment.
Encryption and decryption for sensitive data integration
In scenarios such as migrating data to the cloud or exchanging data, you may need to protect data in transit. You can use the encryption and decryption features of Dataphin data integration to protect your data. Only users who have permission on the key can decrypt and view the data. This offers enhanced protection for sensitive information.
Dataphin provides two main solutions for sensitive data protection:
Masking solution
This solution uses redaction and hash masking to protect sensitive data. For example, a name such as "Zhang San" can be displayed as "Zhang *", or a phone number can be replaced with its MD5 hash. This prevents the original values from being exposed. Because masked data cannot be restored to its original value, this solution is suitable for ad hoc queries but not for data exchange.
Encryption and decryption solution
Dataphin can encrypt or decrypt sensitive data during data integration. It supports common algorithms, such as AES, RSA, and SM4. Dataphin also provides unified permission management for encryption and decryption keys. This ensures data security during data exchange.
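The two masking techniques named in the masking solution, redaction and hash masking, can be illustrated with a short sketch. The function names `redact_name` and `hash_mask` are hypothetical, and the redaction rule (keep the first part of the name, replace the rest with one mask character) is an assumption chosen to reproduce the "Zhang San" example; Dataphin's actual masking algorithms are configured in the product.

```python
import hashlib


def redact_name(name: str, mask_char: str = "*") -> str:
    """Keep the first whitespace-separated part and mask the rest."""
    parts = name.split()
    if not parts:
        return name
    if len(parts) == 1:
        # Single-token name: keep the first character only.
        return parts[0][:1] + mask_char
    return f"{parts[0]} {mask_char}"


def hash_mask(value: str) -> str:
    """Replace a value with the hex digest of its MD5 hash."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


print(redact_name("Zhang San"))  # Zhang *
print(hash_mask("13800138000"))  # 32-character hexadecimal string
```

Both functions are one-way: the redacted name drops characters and the digest cannot be decoded back into the input, which is why masked values are not suitable where the recipient needs the original data.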
Integration with the permission system
After you classify and categorize your data, you can use the results in the permission management system. This further enhances the protection of your sensitive data.
Data permission requests: When you request data permissions, you can select fields based on their security level. This lets you request access only to data at the level you need. During the approval process, the system highlights any sensitive data in the request. This helps ensure a compliant approval process.
Permission audit: When you audit permissions, you can filter by sensitive data. This lets you review the current permission status and any operations performed on that data.
Approval policy: When creating an approval policy, you can define different rules based on whether the data is sensitive.