Configure data classification rules for outbound files


To prevent business losses caused by employees sending out sensitive files, use the outbound file detection feature of SASE (Secure Access Service Edge) to detect and control these transfers in real time. By configuring data classification detection rules, you can identify and manage data leakage risks. This document describes how to configure these detection rules for outbound files.

Prerequisites

  • You have purchased the Office Data Protection Edition of SASE for internet access security. For more information, see Billing and Get started.

  • The SASE client installed on your corporate endpoints is version 4.3.1 or later.

Configuration methods

You can configure data classification detection rules for outbound files in one of the following three ways, depending on your business needs. Built-in rules and AI-recommended rules reduce configuration effort, while custom rules let you match criteria specific to your organization.

  • Built-in rules: SASE provides built-in data classification detection rules for common enterprise file types. You can select these rules when creating an outbound policy to efficiently manage and protect sensitive data.

  • Custom rules: You can create custom rules based on multiple dimensions, including file content, file names, file name extensions, and data sources.

  • AI recommendation library: You can add detection rules from the AI recommendation library directly to your configured data categories, which greatly simplifies the configuration process.

    Warning

    To use detection rules from the AI recommendation library, you must first complete asset mapping. A large model then analyzes the mapped files and generates detection rules. For more information, see Create an asset mapping task.

Configure a custom rule

Step 1: Create data elements

You can configure data elements using multiple dimensions, such as sensitive word libraries for file content (like dictionaries and regular expressions), file name extensions, and data sources.

  1. Log on to the Secure Access Service Edge console.

  2. In the left-side navigation pane, choose Data Protection > Data Classification.

  3. On the Data Classification page, click the Data Elements tab. Configure data elements on the tabs described below. Each entry lists the tab name, a description, and the available actions.

    Dictionaries and Regular Expressions

    Configure a sensitive word library. You can use a dictionary or regular expression to detect file content.

    Create a sensitive word library:

    1. Click Create Sensitive Word Library.

    2. In the Create Sensitive Word Library panel, configure the following parameters and click OK.

      1. Name: Enter a name for the sensitive word library.

      2. Type: Configure a dictionary or regular expression to validate file content.

        • Dictionary: Customize the dictionary content. You can add multiple entries, separated by commas (,), and press Enter to confirm each entry.

        • Regular Expression: Enter a custom regular expression. For example, ([A-Za-z0-9]+) matches one or more uppercase letters, lowercase letters, or digits. After you configure the regular expression, click Test Regular Expression and enter sample text to validate the expression.

    Other actions:

    • Filter data by criteria such as type and data source.

    • In the Actions column, click Delete to remove a library that is not associated with any rules.

      Important

      If a library is associated with a detection rule, you must first disassociate it from the rule before you can delete the library.

    Data Types

    SASE provides several built-in intelligent algorithm classifications. When you configure a detection rule, you can assign an algorithm classification. SASE then uses the selected classification and file type for efficient and accurate detection of file content.

    In the Associated Rules column, you can view the detection rules that are configured with the algorithm classification.

    Data Levels

    SASE provides several built-in intelligent algorithm levels. When you configure a detection rule, you can assign an algorithm level. SASE then uses the selected algorithm level, general definitions of data sensitivity, and the amount of sensitive data for efficient and accurate detection of file content.

    In the Associated Rules column, you can view the detection rules that are configured with the algorithm level.

    File Name Extensions

    SASE provides built-in file name extensions. You can also define custom file name extensions to detect files.

    Add a file name extension:

    1. Click Add File Extension.

    2. In the Add File Extension panel, enter a file name extension and click OK.

    Other actions:

    • Filter data by data source.

    • In the Actions column, click Delete to remove a custom file name extension.

    Data Source

    Add web applications and code repositories as data sources. When users send files that were downloaded from these sources, the system automatically triggers the detection process. This helps you monitor the flow of sensitive data and ensure compliance with security policies.

    Add an application:

    1. Click Create Application.

    2. In the Add Data Source panel, configure the following parameters:

      • Web Applications

        • Application Name: Enter a name for the application.

        • Application Address: Enter the URL and file path. Click Add to enter multiple application addresses. For example:

          • URL: www.aliyun.com/api/file

          • Path: /api/file

      • Code Repository

        • Repository Name: Enter a name for the repository.

        • Git Repository URL: Enter the Git repository URL.

  4. After you complete the configuration, click OK.
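Before you save a sensitive word library, it can help to try the dictionary entries and the regular expression locally against sample text. The following Python sketch is illustrative only: the helper names `parse_dictionary`, `count_dictionary_hits`, and `find_regex_matches` are hypothetical, case-insensitive substring matching is an assumption, and the SASE detection engine may match differently. The pattern is the example expression from the Regular Expression parameter above.

```python
import re


def parse_dictionary(raw: str) -> list[str]:
    """Split a comma-separated dictionary string into clean entries."""
    return [entry.strip() for entry in raw.split(",") if entry.strip()]


def count_dictionary_hits(entries: list[str], text: str) -> int:
    """Count how many dictionary entries appear in the text.

    Assumption: matching is a case-insensitive substring search.
    """
    lowered = text.lower()
    return sum(1 for entry in entries if entry.lower() in lowered)


def find_regex_matches(pattern: str, text: str) -> list[str]:
    """Return every non-overlapping match of the pattern in the text."""
    return [m.group(0) for m in re.finditer(pattern, text)]


entries = parse_dictionary("salary, payroll, bonus")
sample = "Q3 payroll and salary report"

print(count_dictionary_hits(entries, sample))
print(find_regex_matches(r"([A-Za-z0-9]+)", sample))
```

The console's Test Regular Expression button remains the authoritative check. A local test like this is only a quick way to catch an expression that is too broad, such as the example pattern, which matches every word in the sample.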

Step 2: Create a custom detection rule

SASE provides default data classification detection rules for common file types that you can use when you configure outbound file policies. You can also create custom detection rules based on your business requirements and validate them against files reported by the asset mapping feature to ensure accuracy and applicability.

Create a detection rule

  1. Log on to the Secure Access Service Edge console.

  2. In the left-side navigation pane, choose Data Protection > Data Classification.

  3. On the Data Classification page, click the Identification Rules tab.

  4. In the Data Category area on the left, click Create, and then click Create Category.

  5. In the Create Category dialog box, enter a category name and click OK.

  6. To the right of the data category you created, click Create Rule Group to create a detection rule for this data category.

  7. In the Create Group panel, configure the following parameters. Then, click OK.

    Parameter

    Description

    Rule Name

    The name of the detection rule. The name must be 2 to 32 characters long and can contain Chinese characters, letters, digits, hyphens (-), and underscores (_).

    Data Category

    Select the data category for the group.

    Sensitivity Level

    The sensitivity level of the file. Valid values:

    • L4: Confidential Data

      Includes sensitive personal information of customers; macro-level feature data, predictive data, or credit data aggregated across one or more departments; and communication records of personnel involved in major management decisions, investments, and financing. Unauthorized disclosure is strictly prohibited because it would severely impact the business or create systemic risks, potentially leading to major legal liabilities.

    • L3: Secret/Private Data

      Includes customer information generated during business operations and business data aggregated at the department level. Unauthorized disclosure could harm the enterprise, its customers, or employees, and may result in financial, commercial, or reputational losses, as well as legal liabilities.

    • L2: Internal Data

      Includes company data and customer information that can be accessed only by employees or third parties who have signed a non-disclosure agreement, or information that the owner has agreed to disclose to a specific group. Unauthorized disclosure might cause minor negative impacts on customers, business operations, or employees.

    • L1: Public Data

      Includes data that is already publicly accessible or that a customer has designated for public release. Its public dissemination poses no security or legal issues.

    Rule Configuration

    Configure the conditions for the sensitive data detection rule.

    In the rule configuration section, click + Add Condition to add a rule condition, or click + Add Group to add a condition group. For each condition, select a field (such as Data source), an operator (such as Include Any), and a match value (such as RDS database).

    For example, a rule configured as "File name includes salary" triggers detection if a file name contains the word "salary".

    We recommend configuring multiple conditions to ensure your policy accurately matches file content based on your business needs. You can set the logical relationship between multiple conditions or groups to AND or OR.
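The way conditions and condition groups combine with AND or OR can be sketched as a small tree evaluator. This is a minimal illustration, not how SASE implements rules: the `Condition` and `Group` classes, the dictionary shape used for a file, and the example field values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Union

# Hypothetical representation of a file: a plain dict of attributes.
FileInfo = dict


@dataclass
class Condition:
    """A single rule condition: a described predicate over a file."""
    description: str
    predicate: Callable[[FileInfo], bool]

    def matches(self, file: FileInfo) -> bool:
        return self.predicate(file)


@dataclass
class Group:
    """Conditions or nested groups joined by one logical relationship."""
    logic: str  # "AND" or "OR"
    members: list[Union["Condition", "Group"]] = field(default_factory=list)

    def matches(self, file: FileInfo) -> bool:
        results = (member.matches(file) for member in self.members)
        return all(results) if self.logic == "AND" else any(results)


# File name includes "salary" AND (data source is RDS database OR file is encrypted)
rule = Group("AND", [
    Condition("File name includes salary",
              lambda f: "salary" in f["name"].lower()),
    Group("OR", [
        Condition("Data source includes any: RDS database",
                  lambda f: f["source"] == "RDS database"),
        Condition("File is encrypted", lambda f: f["encrypted"]),
    ]),
])

hit = {"name": "salary_2024.xlsx", "source": "RDS database", "encrypted": False}
miss = {"name": "notes.txt", "source": "RDS database", "encrypted": False}
print(rule.matches(hit), rule.matches(miss))
```

Nesting a group inside a group is what makes a mixed expression such as "A AND (B OR C)" possible. A flat list of conditions supports only one logical relationship at a time.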

Detection rule parameters

File name

| Option | Logical operator | Content |
| --- | --- | --- |
| Keyword | Include All, Include Any, Not Include | Enter the text to detect. |
| Dictionary | Include All, Not Include | Select a dictionary from the Data Elements > Dictionaries and Regular Expressions tab and set the match count. |
| Regular expression | Include All, Not Include | Select a regular expression from the Data Elements > Dictionaries and Regular Expressions tab and set the match count. |

File content

| Option | Logical operator | Content |
| --- | --- | --- |
| Keyword | Include All, Include Any, Not Include | Enter the text to detect. |
| Dictionary | Include All, Not Include | Select a dictionary from the Data Elements > Dictionaries and Regular Expressions tab and set the match count. |
| Regular expression | Include All, Not Include | Select a regular expression from the Data Elements > Dictionaries and Regular Expressions tab and set the match count. |
| Algorithm recommended classification | Include All, Include Any, Not Include | Select a built-in recommended algorithm classification from the Data Elements > Data Types tab. |
| Algorithm recommended level | Include Any, Not Include | Select a built-in recommended algorithm level from the Data Elements > Data Levels tab. |
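The three keyword operators and the match count can be read as follows. This Python sketch shows one plausible interpretation only. It assumes that Not Include means none of the keywords appear, that matching is a case-insensitive substring search, and that the match count is a minimum number of hits. None of these details are confirmed by this document, and the function names are hypothetical.

```python
import re


def keyword_match(operator: str, keywords: list[str], text: str) -> bool:
    """Evaluate a keyword condition against a file name or file content."""
    lowered = text.lower()
    hits = [kw.lower() in lowered for kw in keywords]
    if operator == "Include All":
        return all(hits)
    if operator == "Include Any":
        return any(hits)
    if operator == "Not Include":
        return not any(hits)
    raise ValueError(f"Unknown operator: {operator}")


def regex_meets_count(pattern: str, text: str, min_count: int) -> bool:
    """Assumed meaning of 'match count': at least min_count regex matches."""
    return len(re.findall(pattern, text)) >= min_count


text = "Employee salary and bonus summary"
print(keyword_match("Include All", ["salary", "bonus"], text))
print(keyword_match("Include Any", ["payroll", "bonus"], text))
print(keyword_match("Not Include", ["salary"], text))

ids = "IDs: 123456, 654321, 12"
print(regex_meets_count(r"\b\d{6}\b", ids, 2))
```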

Data source

| Logical operator | Content |
| --- | --- |
| Include Any, Not Include | Select data source applications based on the application type. You can select multiple applications. |

  • Instant Messaging Application: Includes Lark, DingTalk, WeCom, WeChat, QQ, and others.

  • Web Applications: Select an application that you configured on the Data Elements > Data Source tab.

File type

| Option | Logical operator | Content |
| --- | --- | --- |
| File format | Include Any, Not Include | Select common file formats. You can select multiple formats. |
| File name extensions | Include Any, Not Include | Select a file name extension that you configured on the Data Elements > File Name Extensions tab. |

File size

| Logical operator | Content |
| --- | --- |
| Greater Than or Equal To, Less Than or Equal To, Within [A,B] | Enter a detection range for the file size. |
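The file size operators map to simple comparisons. The sketch below assumes that sizes are compared in bytes and that Within [A,B] is a closed interval, as the square brackets suggest. The function name `file_size_matches` and the `MB` constant are hypothetical, and the unit that the console expects is not stated in this document.

```python
MB = 1024 * 1024  # assumed unit helper; the console may use a different unit


def file_size_matches(operator: str, size_bytes: int, *bounds: int) -> bool:
    """Evaluate a file size condition using one bound, or two for a range."""
    if operator == "Greater Than or Equal To":
        return size_bytes >= bounds[0]
    if operator == "Less Than or Equal To":
        return size_bytes <= bounds[0]
    if operator == "Within [A,B]":
        low, high = bounds
        return low <= size_bytes <= high  # both endpoints included
    raise ValueError(f"Unknown operator: {operator}")


print(file_size_matches("Greater Than or Equal To", 5 * MB, 1 * MB))
print(file_size_matches("Within [A,B]", 10 * MB, 1 * MB, 10 * MB))
print(file_size_matches("Less Than or Equal To", 5 * MB, 1 * MB))
```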

File encryption

| Option | Content |
| --- | --- |
| Encrypted | Select Yes or No. |

Verify a detection rule

For each created detection rule, you can select a verification method based on your business needs. The rule verification feature is designed to test the accuracy and applicability of a detection rule and does not affect the actual use of the rule.

  1. To verify a detection rule, click Verify in the Rule Verification section.

  2. In the Data Verification dialog box, select a Verification Method and click OK.

    • Verify Based on Files Used for Generating Rule: Validates the rule against the latest files that the large model analyzed in the asset mapping feature.

      Warning

      To use this verification method, you must first enable the large model to analyze the reported files in the asset mapping feature. For more information, see Intelligent rule generation.

    • Verify Based on Most Recently Detected Files: Validates the rule against the latest files reported by the asset mapping feature.

    SASE validates files based on your selection and the configured rule. After the verification is complete, a table lists the files that match the rule conditions, including the file name, format type, file size, and discovery time.

  3. To confirm the accuracy of the rule, click Preview in the Actions column to view the details of a matched file.
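Conceptually, verification is a dry run: the rule is evaluated against a set of already reported files and the matches are listed, without affecting how the rule behaves in use. The sketch below mimics that idea with invented sample data. The file records, the `verify_rule` function, and the newest-first ordering are all hypothetical and do not reflect the SASE API or its actual output.

```python
from datetime import datetime

# Hypothetical records standing in for files reported by asset mapping.
reported_files = [
    {"name": "salary_2024.xlsx", "format": "xlsx", "size": 48_213,
     "discovered": datetime(2024, 5, 2, 9, 30)},
    {"name": "roadmap.pptx", "format": "pptx", "size": 1_204_877,
     "discovered": datetime(2024, 5, 3, 14, 5)},
    {"name": "Salary-review.docx", "format": "docx", "size": 20_480,
     "discovered": datetime(2024, 5, 4, 11, 0)},
]


def verify_rule(rule, files):
    """Return the files that satisfy the rule, newest first."""
    matched = [f for f in files if rule(f)]
    return sorted(matched, key=lambda f: f["discovered"], reverse=True)


def name_includes_salary(file) -> bool:
    """Example rule: 'File name includes salary' (case-insensitive)."""
    return "salary" in file["name"].lower()


for f in verify_rule(name_includes_salary, reported_files):
    print(f"{f['name']:<22}{f['format']:<6}{f['size']:>10}  "
          f"{f['discovered']:%Y-%m-%d %H:%M}")
```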

Other operations

For custom detection rules or rules generated by the AI recommendation library, you can perform operations such as editing, enabling, or disabling them. You can also create sub-rules under existing detection rules for more fine-grained management and flexible configuration.

  • Edit: Click Edit Group Information to view and modify a configured detection rule.

  • Enable/Disable: Click the Rule Status switch to enable or disable the detection rule.

  • Create, edit, or copy: Click the More icon (⋮) to the right of a category node and select New Rule, Edit, or Copy.

Configure rules with the AI recommendation library

SASE maps files on endpoints to generate an asset map. It then uses a large model to analyze the mapped files and automatically generate detection rules based on file types. You can enable these recommended rules from the AI recommendation library and add them to built-in or custom data categories to simplify the configuration process.

  1. On the Data Classification page, click the Intelligent Recommendation Library tab.

  2. On the Intelligent Recommendation Library tab, select an intelligently generated detection rule and click Enable Recommended Rule.

  3. In the Enable Recommended Rule dialog box, select a data category for the detection rule. After the rule is enabled, you can edit or reclassify it within the selected data category.

Related documents