Create quality rules in batches

Last updated: 2026-01-18 19:17:47

Creating quality rules in batches lets you apply uniform rules to multiple monitored objects. You can set alerts for exceptions and monitor objects in real time. This topic describes how to create quality rules in batches.

Prerequisites

The data tables or metrics must be published to the production environment. For more information, see Manage publishing tasks.

Permissions

  • Super administrators and quality administrators can create quality rules in batches, create and delete exception archiving tables, and configure score weights.

  • Quality owners can create quality rules, create and delete exception archiving tables, and configure score weights for the monitored objects they own.

  • Quality owners and regular users must have read permissions on data tables and data sources. To request permissions, see Request, renew, and return table permissions and Request data source permissions.

  • The supported operations vary for different objects. For more information, see Quality rule operation permissions.

Note

You can configure exception archiving tables and score weights only for Dataphin tables and global data tables.

Validation rule overview

When a data table is checked against a quality rule, the outcome depends on the rule strength. If a weak monitoring rule is triggered, the system sends an alert message so that you can find and handle the exception promptly. If a strong monitoring rule is triggered, the system sends the same alert and also automatically blocks the node where the table resides, which prevents dirty data from flowing to downstream nodes.
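
The difference between the two rule strengths can be summarized in a short sketch. The following Python is illustrative only: the names `RuleStrength`, `CheckOutcome`, and `handle_failed_check` are hypothetical and are not part of any Dataphin API. The sketch restates the documented behavior, including the case described under Rule Strength where a strong rule has no downstream task nodes to block.

```python
from dataclasses import dataclass
from enum import Enum


class RuleStrength(Enum):
    WEAK = "weak"
    STRONG = "strong"


@dataclass
class CheckOutcome:
    alert_sent: bool
    downstream_blocked: bool


def handle_failed_check(strength: RuleStrength, has_downstream: bool) -> CheckOutcome:
    """Model what happens when a quality rule check fails (hypothetical helper)."""
    # Both weak and strong rules send an alert when the check fails.
    # Only a strong rule blocks, and only when there are downstream nodes to block.
    blocked = strength is RuleStrength.STRONG and has_downstream
    return CheckOutcome(alert_sent=True, downstream_blocked=blocked)
```

For example, `handle_failed_check(RuleStrength.STRONG, has_downstream=False)` yields an alert without blocking, which matches the periodic quality scheduling case.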

Add quality rules in batches

Adding quality rules in batches lets you configure the same quality rule for different objects, which improves configuration efficiency. You can configure rules at the table or field level. The configuration method is similar for different monitored objects, with the main difference being the object selection process. The following procedure uses a Dataphin table as an example.

  1. On the Dataphin home page, in the top menu bar, click Administration > Data Quality.

  2. In the left navigation pane, click Quality Rule. On the Quality Rule page, click Add Quality Rule in the upper-right corner. Alternatively, click the image icon and select Add by Monitored Object.

  3. On the Add Quality Rule page, configure the parameters.

    1. Basic information

      Parameter

      Description

      Rule Name

      A custom name for the quality rule. The name can be up to 256 characters long. After you select monitored objects, you can adjust the name for each object individually.

      Rule Strength

      Weak Rule and Strong Rule are supported.

      • Weak Rule: If you select Weak Rule, an alert is triggered when the quality rule check fails, but downstream task nodes are not blocked.

      • Strong Rule: If you select Strong Rule, an alert is triggered when the quality rule check fails. If there are downstream task nodes (scheduled by code check or task trigger), they are blocked to prevent data pollution. If there are no downstream task nodes (such as for periodic quality scheduling), only an alert is triggered.

      Description

      A custom description for the quality rule. The description can be up to 128 characters long.

      Configuration Method

      Create from Template and Sql are supported.

      • Create from Template: Quickly create quality rules using general system templates or custom business templates.

        • System Template: The built-in parameters are configurable. This is suitable for creating general rules.

        • Custom Template: The parameters are preset and do not need to be configured. This is generally used for creating rules that contain business logic.

      • Sql: Use SQL to flexibly define custom quality monitoring rules. This is suitable for flexible and complex scenarios. Only custom SQL templates support creating quality rules in batches.

      Note

      Data sources or real-time metatables cannot be configured.

      Rule Template

      The supported rule templates vary depending on the monitored object. For more information about templates, see Template types.

      • Dataphin tables and Global data tables support the following rule templates: Completeness, Uniqueness, Timeliness, Validity, Consistency, Stability, and Sql.

        • Completeness: Includes Null Value Validation and Empty String Validation.

        • Uniqueness: Includes Uniqueness Validation, Field group count validation, and Duplicate Value Count Validation.

        • Timeliness: Includes Time Comparison With Expression, Time Interval Comparison, and Time Interval Comparison In Two Tables.

        • Validity: Includes Column Format Validation, Column Length Validation, Column Value Domain Validation, Reference Table Validation, and Standard Reference Table Validation (requires the Data Standard module).

        • Consistency: Includes Columns Value Consistency Validation, Columns Statistical Consistency Validation, Single-field business logic consistency comparison, Columns In Two Tables Value Consistency Validation, Columns In Two Tables Statistical Consistency Validation, Columns In Two Tables Processing Logic Consistency Validation, and Cross-Source Columns Statistical Consistency Validation.

        • Stability: Includes Table Stability Validation, Table Volatility Validation, Column Stability Validation, and Column Volatility Validation.

        • Custom SQL: Contains information about rules created from the custom SQL rule template.

      • Metrics support the Uniqueness and Stability rule templates.

        • Uniqueness: Includes Field group count validation and Duplicate Value Count Validation.

        • Stability: Includes Column Stability Validation and Column Volatility Validation.

      • Data sources support the Stability rule template.

        • Connectivity monitoring: Monitors and sends alerts for connectivity changes. A data source configured in Dataphin may fail to connect due to network changes, incorrect usernames or passwords, or other reasons, which can cause nodes to report errors.

        • Table schema change: Monitors and sends alerts for schema changes. Changes in the schema of an ancestor table, such as renaming, deleting, or adding or removing fields, can cause downstream nodes to report errors. Only some data sources support table schema change monitoring rules. For more information, see Data sources supported by Dataphin.

      • Real-time metatables support the Consistency and Stability rule templates.

        • Consistency: Includes Stream-Batch Comparison and Real-time cross-ingest endpoint comparison.

        • Stability: Includes Real-time statistical value detection.

      Rule Type

      The rule type depends on the template. It is the most basic property of a template and can be used for descriptions and filtering.

      Object Filter

      You can filter monitored objects based on different conditions.

      • Dataphin table: Filter data tables by project (for physical tables) or domain (for logical tables), resource owner, and table type. You can select up to 100 projects or domains.

      • Global data table: Filter data tables by data source type, data source, and DB/Schema. For supported data sources, see Data sources supported by Dataphin. If a data source cannot connect to the Dataphin cluster, you must first perform metadata acquisition before you can configure quality monitoring rules. For supported data sources, see Create and manage metadata acquisition tasks.

      • Metric: Filter metrics by data domain and logical aggregate table.

      • Data source: Filter data sources by data source type. You can select any data source within Dataphin to create quality monitoring rules. All supported data sources can be tested for connectivity, but only some support table schema change monitoring rules. For more information, see Data sources supported by Dataphin.

      • Real-time metatable: Filter real-time metatables by project.

      Object Selection

      • Field object: To configure field-level monitoring rules, first select the data tables to monitor based on table name, table owner, or quality owner. Then, select the specific fields to monitor.

      • Table object: You can configure table-level monitoring rules when the rule template for a data table is set to Stability-Table Stability Validation or Stability-Table Volatility Validation, or when the rule template for a data source is set to Stability-Table schema change monitoring. You can select the data tables to configure based on table name, table owner, or quality owner.

    2. Click Next.

      If you click Cancel, the configured quality rules will not be added.

    3. Rule configuration (This step is not required for data sources. Proceed to the next step.)

      Parameter

      Description

      Reference Table

      The data table selected in the Object Selection step. The rule details are configured based on the fields of this table. For example, Table A has fields `id` and `name`. Table B has fields `id` and `age`. Table C has fields `name` and `age`. If Table A is the reference table and `id` is the validation field, Table B passes validation, but Table C fails.

      Note
      • A reference table must be configured when the monitored object is a data table or a real-time metatable and the rule template uses a complex configuration (that is, other fields are required in addition to the validation field).

      • When you need to configure comparison fields in batches and the fields differ across tables, a reference table provides a quick selection method.

      • Scenario: Use batch configuration for similar or identical requirements. If the requirements are completely different, using a reference table will cause an error during the validation in step three.

      Template Configuration

      When you select a quality rule template, its configuration information is displayed. To modify the configuration, see Quality rule templates.

      Rule Configuration

      The rule configuration varies depending on the selected rule template.

      Special configurations are as follows:

      • Validation table data filtering: This is disabled by default. If you enable it, you can configure filter conditions, partition filters, or regular data filters for the validation table. The filter conditions are appended directly to the validation SQL. If the validation table requires a partition filter, we recommend that you configure a partition filter expression in the scheduling configuration. After configuration, the validation partition will be the minimum granularity for viewing quality reports.

      • When the rule template is set to Consistency/Columns In Two Tables Statistical Consistency Validation or Consistency/Cross-Source Columns Statistical Consistency Validation, you can enable Comparison table data filtering. If enabled, you can configure filter conditions, partition filters, or regular data filters for the comparison table. The filter conditions are appended directly to the validation SQL.

      Validation Configuration

      • After a data quality rule is validated, the result is compared with the exception validation configuration. If the conditions are met, the validation fails, and subsequent processes such as alerting are triggered.

      • The available metrics for exception validation are determined by the template and its configuration. Multiple AND/OR conditions are supported. We recommend using fewer than three conditions in your actual configuration.

        For more information, see Validation configuration.

      Archiving Configuration

      This is disabled by default. If you enable it, you can archive abnormal data to a file or an archived table. After a quality check, you can download and analyze the archived abnormal data. For more information, see Exception archiving.

      Note

      Only Dataphin tables and global data tables support exception archiving.

      Business Property Configuration

      The standards for filling in business properties depend on the configuration of the quality rule properties. For example, if the value type for the rule owner field is custom input and the property field length is 256, you can enter up to 256 characters for this property value when creating a quality rule. For more information, see Property information.

      Quality Score Configuration

      The quality check results are evaluated using a quality score to help you understand the data quality. For more information about the configuration, see Quality score configuration.

      Note

      This is supported only for Dataphin tables and global data tables.

    4. Click Next.

    5. Object details configuration

      • Field-level details configuration

        You can view the field validation information for the selected data tables. You can also modify the rule name, quality owner, and quality score weight (only for Dataphin tables and global data tables). Additionally, you can edit the quality rule, delete the validation object, modify quality owners in batches, and edit quality score weights in batches for the validation object.

        • Modify Quality Owner: Modifies the quality owner of a selected monitored object. You can also click Manage Quality Owners in Batches to append or modify them.

          Note

          If the current validation object is already a monitored object, the quality owner that you configure here will overwrite the existing quality owner of that monitored object after the batch rule is successfully created.

          • Append: A maximum of 20 quality owners can be added.

          • Modify: Replace all existing quality owners with the new selection. You can select up to 20 owners.

        • Modify Quality Score Weight: Configure the quality score weight for the quality rule. This weight is used to calculate the quality score of the monitored object. You can set an integer from 1 to 10.

        • Edit: Modify the rule configuration, validation configuration, business property configuration, and quality score configuration of the rule.

        • Delete: Delete the validation object.

      • Table-level details configuration

        You can view the verification details for the selected data table and modify its rule name, quality owner, and quality score weight. You can modify the quality score weight only for Dataphin tables and global data tables. You can also edit quality rules, delete verification objects, batch modify quality owners, and batch edit quality score weights. Table-level configuration is the same as field-level configuration. For more information, see Field-level details configuration.

    6. Click Add Rule to complete the configuration.
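
The reference table example from the Rule configuration step (Table A with `id` and `name`, Table B with `id` and `age`, Table C with `name` and `age`) can be reproduced with a small check. This is a sketch under the assumption that validation amounts to confirming that each selected table contains the fields chosen from the reference table. The function name `tables_missing_fields` and the data layout are hypothetical and do not reflect how Dataphin implements the check.

```python
def tables_missing_fields(
    tables: dict[str, set[str]], required_fields: set[str]
) -> dict[str, set[str]]:
    """Return, for each table that fails, the required fields it lacks."""
    missing = {}
    for name, fields in tables.items():
        lacking = required_fields - fields
        if lacking:
            missing[name] = lacking
    return missing


tables = {
    "table_a": {"id", "name"},  # reference table
    "table_b": {"id", "age"},
    "table_c": {"name", "age"},
}

# The validation field `id` is taken from the reference table.
result = tables_missing_fields(tables, {"id"})
print(result)  # {'table_c': {'id'}}
```

Table B passes because it has `id`, while Table C is reported because it does not, which is the outcome described for the reference table.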

List of batch-added quality rules

Note

After a quality rule passes a trial run and is enabled, you must save it.

After you create quality rules in batches, you can edit them, run a trial run, configure schedules, and delete them in the rule configuration list.

Area

Description

Filter and search area

You can perform a quick search by object or rule name. You can also filter quality rules by Trial Run Failed, None, Inactive, or Not Scheduled.

List area

Displays the Object Name, Rule Name, Data Table/Data Domain, Trial Run Status, Effective Status, Quality Owner, and Schedule Type for each rule configuration.

Effective Status: We recommend that you run a trial before enabling a rule. Enable the rule only if the trial run is successful to avoid blocking online nodes with incorrect rules.

  • After a rule is enabled, it runs automatically according to its schedule.

  • After a rule is disabled, it does not run automatically but can be run manually.

Operations area

You can perform the following operations: View, Edit, Scan Configuration, Trial Run, Change Quality Owner, and Delete.

  • View: View the details of the rule configuration.

  • Edit: After editing a rule, you must run a trial again.

  • Scan Configuration: In the dialog box, you can filter schedules by type or search for a schedule by name. You can also edit schedules. For more information, see Create a schedule.

    Note
    • When you configure schedules in batches, ensure that the validation scope expressions of the selected resource tables are consistent. If the configured partition is inconsistent with the actual table partition, an error may occur.

    • When you configure schedules in batches for non-partitioned tables, the validation scope expression is not saved. Only the scheduling configuration is saved.

  • Trial Run: You can run a trial for a rule using an Existing Schedule or a Custom Validation Scope. After the trial run, click the image icon to View Trial Run Log.

    Note

    When you run trials in batches, we recommend that you select tables with the same partition. The partition information is passed directly for execution. If the partitions are inconsistent, an error may occur.

  • Change Quality Owner: Select the quality owner to whom you want to transfer ownership and click OK.

  • Delete: Deleting this quality rule object will delete all quality rules under it. This operation cannot be undone. Proceed with caution.

Batch operations area

  • Trial Run: You can run trials for rules in batches using an Existing Schedule or a Custom Validation Scope. After the trial run, click the image icon to View Trial Run Log.

    Note

    When you run trials in batches, we recommend that you select tables with the same partition. The partition information is passed directly for execution. If the partitions are inconsistent, an error may occur.

  • Scheduling Configuration: In this dialog box, you can filter schedules by type or search for them by name. You can also edit schedules and batch configure them for quality rules. For more information, see Create a schedule.

    Note
    • When you configure schedules in batches, ensure that the validation scope expressions of the selected resource tables are consistent. If the configured validation scope is inconsistent with the actual validation scope of the table, an error may occur.

    • When you configure schedules in batches for non-partitioned tables, the validation scope expression is not saved. Only the scheduling configuration is saved.

  • Enable: Enable the effective status for quality rule objects in batches. After a rule is enabled, it runs automatically according to its schedule.

  • Disable: Disable the effective status for quality rule objects in batches. After a rule is disabled, it does not run automatically but can be run manually.

  • Manage Quality Owners in Batches: Append or modify the quality owners of selected monitored objects in batches.

    • Append: You cannot add more quality owners if the list already contains 20.

    • Modify: Replace all quality owners in the current list with the ones specified this time. You can select up to 20 owners.

  • Delete: Delete quality rule objects in batches. This operation cannot be undone. Proceed with caution.
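
The Append and Modify options for quality owners differ in whether existing owners are kept. A minimal sketch of the two behaviors follows, assuming that the limit of 20 applies to the resulting owner list and that duplicate owners are ignored. Both assumptions, along with the function names and the error handling, are illustrative and not Dataphin's actual implementation.

```python
MAX_QUALITY_OWNERS = 20  # limit stated in this topic


def append_owners(current: list[str], new: list[str]) -> list[str]:
    """Keep existing owners and add new ones (duplicates skipped by assumption)."""
    merged = list(current)
    for owner in new:
        if owner not in merged:
            merged.append(owner)
    if len(merged) > MAX_QUALITY_OWNERS:
        raise ValueError(f"at most {MAX_QUALITY_OWNERS} quality owners are allowed")
    return merged


def modify_owners(current: list[str], new: list[str]) -> list[str]:
    """Replace all existing owners with the new selection."""
    # `current` is intentionally discarded: Modify overwrites the whole list.
    unique = list(dict.fromkeys(new))  # drop duplicates, keep order
    if len(unique) > MAX_QUALITY_OWNERS:
        raise ValueError(f"at most {MAX_QUALITY_OWNERS} quality owners are allowed")
    return unique
```

With these definitions, appending to a list that already holds 20 owners raises an error, whereas Modify succeeds as long as the new selection itself has 20 or fewer owners.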

What to do next

In the quality rule list, configure the schedule and click Finish. You can then find the rule on the Dataphin table rule list page. For more information, see Manage the monitored object list.
