Create metric quality rules

Last updated: 2026-05-22 06:07:59

Dataphin allows you to create quality rules to validate metrics and simplify metric quality monitoring. This topic describes how to configure these rules.

Prerequisites

To configure a quality rule, you must first add a monitoring object. For more information, see Add and manage monitoring objects.

Permissions

  • Super administrators, quality administrators, users with a custom global role that grants the Quality Rules-Manage permission, users with a custom project role that grants the Project Quality Management-Quality Rule Management permission for the table's project, and metric business owners can configure scheduling and alerts for quality rules.

  • Quality owners and regular users must also have read permissions on logical table fields. To apply for permissions, see Apply for, renew, and return table permissions.

  • Supported operations vary by object. For more information, see Operation permissions on quality rules.

Rule checks

When a metric is checked against a quality rule, if a weak rule is triggered, the system sends an alert. If a strong rule is triggered, the system automatically interrupts the task that populates the table to prevent dirty data from flowing to downstream tasks. An alert is also sent.
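
The difference between the two rule strengths can be sketched as follows. This is a minimal illustration of the behavior described in this topic, not Dataphin's implementation. The names `RuleStrength` and `handle_failed_check` are hypothetical, and the `has_downstream_tasks` flag reflects the Rule Strength parameter description, which states that a strong rule blocks tasks only when downstream tasks exist.

```python
from enum import Enum


class RuleStrength(Enum):
    WEAK = "weak"
    STRONG = "strong"


def handle_failed_check(strength: RuleStrength, has_downstream_tasks: bool) -> list[str]:
    """Return the actions taken when a quality rule check fails.

    Hypothetical model: every failed check sends an alert, and a strong
    rule additionally blocks downstream tasks when there are any.
    """
    actions = ["send_alert"]
    if strength is RuleStrength.STRONG and has_downstream_tasks:
        actions.append("block_downstream_tasks")
    return actions
```

In this model, a weak rule never blocks anything, which is why a trial run is recommended before a strong rule is enabled.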

Trial run vs. run

A trial run and a normal run differ in how they are executed and where their results are displayed. A trial run is a simulated execution of a quality rule to check its correctness and behavior. The results of a trial run are not displayed in the quality report. A run checks a quality rule at a specified time, and its results are output to the quality report for users to view and analyze.

Quality rule configuration

  1. On the Dataphin homepage, choose Governance > Data Quality in the top navigation bar.

  2. In the left-side navigation pane, click Quality Rules. On the Metrics page, click the name of the target object to go to the Quality Rule Details page and configure the quality rule.

  3. On the Quality Rule Details page, click Create Quality Rule.

  4. In the Create Quality Rule dialog box, configure the parameters.

    Parameter

    Description

    Basic Information

    Rule Name

    A custom name for the quality rule.

    Rule Strength

    You can set the strength to weak rule or strong rule.

    • If you select weak rule, the system sends an alert but does not block downstream tasks when the quality rule check fails.

    • If you select strong rule, an alert is triggered when the quality rule check fails. If there are downstream tasks (such as code check scheduling or task-triggered scheduling), the rule also blocks them to prevent dirty data from spreading. If there are no downstream tasks (such as periodic quality scheduling), only an alert is sent.

    Description

    A custom description for the quality rule. The description can be up to 128 characters long.

    Configuration Method

    • Create from Template: Quickly create a quality rule by using a general system template or a custom business template.

      • System Template: Suitable for creating general rules. The template's built-in parameters are configurable.

      • Custom Template: Suitable for creating rules with business logic. The template parameters are pre-configured and do not need to be set.

    • Custom SQL: Defines a quality rule using custom SQL, which is suitable for complex scenarios.

    Rule Template

    Select a rule template from the drop-down list. Options include Uniqueness, Stability, and Custom SQL.

    • Uniqueness: Includes Field Group Count Check and Duplicate Field Value Count Check.

    • Stability: Includes Field Stability Check and Field Fluctuation Check.

    • Custom SQL: Includes Custom Statistical Metric Check.

    For more information, see Quality rule template types.

    Rule Type

    The rule type is associated with the template and serves as its most basic attribute. It can be used in descriptions and for filtering.

    Rule Configuration

    Rule Configuration

    If you set Rule Template to Uniqueness, the following parameters are available.

    • Field Group Count Check/Duplicate Field Value Count Check:

      • Check Table Data Filtering: Disabled by default. If enabled, you can configure filter conditions for the checked table, such as partition filters or regular data filters. Dataphin appends this filter condition to the check query. If the table requires partition filtering, we recommend that you configure a partition expression in the scheduling configuration. This allows you to view the quality report at the granularity of the check partition. Enter the data filtering content. Examples:

        id = 12 -- For a single table

        T1.id=12 and T2.name = "John" -- For two tables

    If you set Rule Template to Stability, the following parameters are available.

    • Field Stability Check/Field Fluctuation Check:

      • Statistical Method: We recommend that you select a statistical method based on your business scenario.

      • Check Table Data Filtering: Disabled by default. If enabled, you can configure filter conditions for the checked table, such as partition filters or regular data filters. Dataphin appends this filter condition to the check query. If the table requires partition filtering, we recommend that you configure a partition expression in the scheduling configuration. This allows you to view the quality report at the granularity of the check partition. Enter the data filtering content. Examples:

        id = 12 -- For a single table

        T1.id=12 and T2.name = "John" -- For two tables

    If you set Rule Template to Custom SQL, the following parameters are available.

    • Custom Statistical Metric Check:

      • Custom SQL: Supports SELECT query statements. The query object must include the primary table. Example:

        select sum(sale) from tableA where ds=${bizdate};

    Check Configuration

    Rule Check

    • After a data quality rule is checked, the result is compared with the anomaly check configuration. If the conditions are met, the check fails and triggers subsequent processes such as alerts.

    • The template and configuration determine the available metrics for an anomaly check. Multiple conditions can be combined by using AND or OR operators. We recommend that you configure fewer than three conditions.

    For more information, see Metric check configuration.

    Business Attribute Configuration

    Attribute Information

    The quality rule's attribute configuration determines the standards for entering business attributes. Examples:

    • The value type for the field mapped to the responsible department is an enumeration (multi-select). The available values are Big Data Department, Business Department, and Technology Department. When you create a quality rule, this attribute is shown as a multi-select drop-down list with these values.

    • The value type for the field mapped to the rule owner is custom input, and the maximum field length is 256 characters. When you create a quality rule, you can enter up to 256 characters for this attribute.

    If the input method for an attribute field is Range, configure it as follows:

    Range: This is typically used for continuous numerical or date values. You can choose from the operators >, >=, <, and <=. For more information about attribute configurations, see Create and manage quality rule attributes.

    Scheduling Attribute Configuration

    Scheduling Method

    You can select an existing schedule. If you have not yet decided on a scheduling method, you can create the quality rule first and configure scheduling later. To create a new schedule, see Create a schedule.

  5. Click Save to complete the rule configuration.

    You can click Preview SQL to compare the current configuration with the last saved version and view SQL changes.

    Note
    • The SQL preview is unavailable if key information is incomplete.

    • The SQL preview on the left shows the last saved configuration. If no configuration was saved, it is empty. The SQL preview on the right shows the current configuration.

    Rule Configuration List

    On the rule configuration list page, you can view information about the configured metric rules and perform operations such as view, edit, trial run, run, and delete.


    Area

    Description

    Filter and Search Area

    Allows you to quickly search by object or rule name.

    Allows you to filter by rule type, rule template, rule strength, trial run status, and effective status.

    Note

    If a business attribute of the quality rule is configured to be searchable or filterable and is enabled, you can search or filter by that attribute.

    List Area

    Displays the object type/name, rule name/ID, Trial Run Status, Effective Status, Rule Type, Rule Template, Rule Strength, Scheduling Type, and related knowledge base document information. Click the icon before the Refresh button to select the columns to display in the rule list.

    • Effective Status: We recommend that you perform a trial run before enabling a rule. Enable only the rules that pass the trial run. This practice helps prevent incorrect rules from blocking production tasks.

      • After you enable a rule, it runs automatically according to its schedule.

      • After you disable a rule, it no longer runs automatically but can be run manually.

    • Related knowledge base document: Click View Details to see the knowledge base information associated with the rule. This includes the table name, check object, rule, and related knowledge base document. You can also search for, view, edit, or delete the knowledge base document. For more information, see View the knowledge base.

    Operations Area

    You can perform operations such as view, clone, edit, trial run, run, configure scheduling, associate a knowledge base document, and delete.

    • View: View the details of the rule configuration.

    • Clone: Quickly clone a rule.

    • Edit: After editing a rule, you must perform a new trial run.

    • Trial run: You can perform a trial run for a rule by using an Existing Schedule or a Custom Check Range. After the trial run, you can click the icon to view the trial run logs.

    • Run: You can run a rule by using an Existing Schedule or a Custom Check Range. After the run, you can view the check result in Check Records.

    • Configure Scheduling: In the dialog box, you can filter schedules by scheduling type or search by schedule name. You can also edit the schedule.

    • Associate knowledge base document: After you associate a rule with a knowledge base document, you can view the associated information in Quality Rules and the Governance Workbench. You can select an unassociated knowledge base document. To create one, see Create and manage the knowledge base.

    • Delete: Deleting a quality rule object also deletes all of its quality rules. This action cannot be undone. Proceed with caution.

    Batch Operations Area

    You can perform batch operations such as trial run, run, configure scheduling, enable, disable, modify business attributes, associate a knowledge base document, and delete.

    • Trial run: Perform a batch trial run for rules by using an Existing Schedule or a Custom Check Range. After the trial run, you can click the icon to view the trial run logs.

    • Run: Perform a batch run for rules by using an Existing Schedule or a Custom Check Range. After the run, you can view the check result in Check Records.

    • Configure Scheduling: In the dialog box, you can filter schedules by scheduling type or search by schedule name. You can also edit the schedule to configure it for multiple quality rules in a batch. Only the selected rules that are editable on the Quality Rules list page can be modified.

    • Enable: After you enable rules in a batch, they run automatically according to their schedules. Only the selected rules that are editable on the Quality Rules list page can be enabled.

    • Disable: After you disable rules in a batch, they no longer run automatically but can be run manually. Only the selected rules that are editable on the Quality Rules list page can be disabled.

    • Modify Business Attributes: You can modify business attributes in a batch if their corresponding field value type is single-select or multi-select.

      • If the field value type is multi-select, you can append or modify attribute values.

      • If the field value type is single-select, you can directly modify attribute values.

    • Associate knowledge base document: After you associate rules with a knowledge base document, you can view the associated information in Quality Rules and the Governance Workbench. You can associate a knowledge base document with monitoring objects in a batch. To create a knowledge base document, see Create and manage the knowledge base.

    • Delete: You can delete quality rule objects in a batch. This action cannot be undone. Proceed with caution. Only the selected rules that are editable on the Quality Rules list page can be deleted.
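
As a recap of the Check Configuration parameters in step 4, the anomaly check compares the metric values computed by a rule with one or more conditions that are joined by AND or OR. The sketch below models that evaluation in Python. The `Condition` type, the metric names, and the thresholds are hypothetical and only illustrate the documented behavior.

```python
import operator
from dataclasses import dataclass

# Comparison operators that a condition may use (illustrative set).
OPERATORS = {
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
    "==": operator.eq, "!=": operator.ne,
}


@dataclass(frozen=True)
class Condition:
    metric: str       # name of a statistical metric produced by the rule
    op: str           # key of OPERATORS
    threshold: float

    def is_met(self, metrics: dict[str, float]) -> bool:
        return OPERATORS[self.op](metrics[self.metric], self.threshold)


def check_fails(conditions: list[Condition], metrics: dict[str, float],
                combine: str = "AND") -> bool:
    """Return True when the anomaly conditions are met, that is, the check fails."""
    if combine not in ("AND", "OR"):
        raise ValueError("combine must be 'AND' or 'OR'")
    results = [c.is_met(metrics) for c in conditions]
    return all(results) if combine == "AND" else any(results)


# Hypothetical metrics: three duplicate values and a 5% fluctuation.
conditions = [Condition("duplicate_count", ">", 0),
              Condition("fluctuation_rate", ">=", 0.2)]
metrics = {"duplicate_count": 3, "fluctuation_rate": 0.05}
print(check_fails(conditions, metrics, "AND"))  # False
print(check_fails(conditions, metrics, "OR"))   # True
```

Keeping the number of conditions small, as recommended above, makes it easier to reason about which condition caused a failed check.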

Create schedule

Note
  • When you configure a schedule for a rule, you can quickly base it on an existing schedule. Each table can have a maximum of 20 scheduling rules.

  • A single rule can have a maximum of 10 schedules.

  • If schedule configurations are identical, they are automatically deduplicated.

  • The check range acts as a filter in the check statement to control the scope of each quality check. The check range also serves as the basic unit for downstream processes like the quality report, which is viewed at the granularity of the check range.

  1. On the Quality Rule Details page, click the Scheduling Configuration tab, and then click Create Schedule to open the Create Schedule dialog box.

  2. In the Create Schedule dialog box, configure the parameters.

    Parameter

    Description

    Schedule Name

    A custom name for the schedule.

    Scheduling Type

    You can select time-based scheduling, Data Update-triggered Scheduling, or Fixed Task-triggered Scheduling.

    • Time-based scheduling: Performs periodic quality checks on data based on a set schedule. This is suitable for scenarios where data is generated at relatively fixed times.

      Scheduling Cycle: Running a quality rule consumes computing resources. We recommend that you avoid running multiple quality rules concurrently to prevent interference with production tasks. The available cycles are Day, Week, Month, Hour, and Minute.

      If the system time zone (the time zone in the User Center) differs from the scheduling time zone (the time zone configured in Management Center > System Settings > Basic Settings), the rule runs based on the system time zone.

    • Data Update-triggered Scheduling: Whenever a code task runs, the system determines whether the task has updated the specified check range for the current table. This is suitable for tables whose modification tasks are not fixed or for critical tables that require monitoring on every change.

      Note

      We recommend that you set the check range to the partitions updated by the task. For non-partitioned tables, the entire table is checked. The system automatically detects all data changes and performs checks, ensuring that no changes are missed.

    • Fixed Task-triggered Scheduling: Runs the configured quality rule before a specified task runs or after it runs successfully. You can trigger the rule based on tasks from the following node types: SQL, Offline Pipeline, Python, Shell, Virtual, Dlink, and Database SQL. This is suitable for tables whose modification tasks are fixed.

      Note
      • You can select only production environment tasks for fixed task triggers. If a strong rule is configured and the scheduled task check fails, it may affect production tasks. Proceed with caution based on your business requirements.

      • The supported engine type is MaxCompute.

      • Trigger Time: Select when the quality check is triggered. You can choose to Trigger After All Tasks Run Successfully, Trigger After Each Task Runs Successfully, or Trigger Before Each Task Runs.

      • Triggering Task: The following roles can select a task node from a production project to trigger the task. You can also search by the node's output name.

        • Users who are project administrators for Prod/Basic projects, have the Ops system role for a Prod project, have the Developer system role for a Basic project, or have a custom project role with the Project Quality Management-Quality Rule Management permission in a Prod/Basic project can select task nodes from a production project.

        • Users with a custom global role that grants the Quality Rules-Manage permission can select task nodes from all production projects.

        Note

        If you set Trigger Time to Trigger After All Tasks Run Successfully, we recommend that you select tasks with the same scheduling cycle to avoid delayed rule execution and quality check results.

    Scheduling Conditions

    Disabled by default. If enabled, the system checks whether the scheduling conditions are met before running the rule. The schedule runs only if the conditions are met. Otherwise, the current schedule is skipped.

    • Business Date/Execution Date: You can configure a date condition for any scheduling type: time-based scheduling (business date only; the execution date is not supported), Data Update-triggered Scheduling, or Fixed Task-triggered Scheduling. You can choose Common Calendar or Custom Calendar. To learn how to create a custom calendar, see Create and manage public calendars.

      • If you select Common Calendar, the available conditions are Month, Day of Week, and Date.

      • If you select Custom Calendar, the available conditions are Date Type and Tag.

    • Instance Type: If you select Data Update-triggered Scheduling or Fixed Task-triggered Scheduling, you can configure the instance type. Options include Periodic Instance, Backfill Instance, and Manual Instance.

    Note
    • You must configure at least one rule. To add a rule, click + Add Rule.

    • You can configure a maximum of 10 scheduling conditions.

    • The relationship between scheduling conditions can be set to AND or OR.

    Check Range

    If the Scheduling Type is time-based scheduling or Fixed Task-triggered Scheduling, you can set the Check Range to Custom Check Range. If the Scheduling Type is Data Update-triggered Scheduling, you can set the Check Range to Task-updated Partition or Custom Check Range.

    • Task-updated Partition: If the triggering task updates a partition, the quality check runs directly on that partition.

      Note
      • In dynamic partition scenarios, the partition may not be resolved, and no quality check is performed.

      • Fluctuation check rules, such as checking partition size, row count, or field statistics, require a specific partition and do not support the task-updated partition check range.

      • If a non-partitioned table is updated, the entire table is checked.

    • Custom Check Range: For scenarios where the partition cannot be resolved, you can use a custom check range to specify the partition expression based on the business date or execution date.

      • Check range expression: A drop-down list where you can also type. You can directly enter the range to be checked, such as ds='${yyyyMMdd}'. You can also select a built-in partition expression and modify it to quickly configure the setting. For more information, see Built-in partition expressions.

        Note
        • If there are multiple conditions, you can connect them with and or or, such as province="Zhejiang" and ds<=${yyyyMMdd}.

        • If a filter condition is configured in the quality rule, the partition expression and the filter condition are combined with an AND operator.

        • The partition expression supports full table scans.

          Note: A full table scan consumes significant resources, and some scans are not supported. We recommend that you configure a partition expression to avoid full table scans.

      • Check Range Estimate: Defaults to the current business date.

  3. Click OK to finish the scheduling configuration.
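
To illustrate how a custom check range scopes a quality check, the sketch below substitutes a business date into a check range expression and combines the result with the rule's filter condition by using AND, as described in the notes above. The function names are hypothetical, only the `${yyyyMMdd}` placeholder is handled, and the built-in partition expressions are not covered.

```python
from datetime import date


def render_check_range(expression: str, business_date: date) -> str:
    """Replace the ${yyyyMMdd} placeholder with the business date."""
    return expression.replace("${yyyyMMdd}", business_date.strftime("%Y%m%d"))


def build_where_clause(check_range: str, rule_filter: str = "") -> str:
    """Combine the rendered check range with the rule's filter by using AND."""
    parts = [p for p in (check_range, rule_filter) if p]
    return " AND ".join(f"({p})" for p in parts)


where = build_where_clause(
    render_check_range("ds='${yyyyMMdd}'", date(2026, 5, 21)),
    "id = 12",
)
print(where)  # (ds='20260521') AND (id = 12)
```

Wrapping each part in parentheses keeps the combination correct even when the filter condition itself contains an OR operator.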

Scheduling Configuration List

After you create a schedule, you can view, edit, clone, and delete it in the scheduling configuration list.


Area

Description

Filter and Search Area

Allows you to quickly search by schedule name.

Allows you to filter by time-based scheduling, Data Update-triggered Scheduling, or Fixed Task-triggered Scheduling.

List Area

Displays the Schedule Name, Scheduling Type, Last Modifier, and Last Modified Time for each schedule.

Operations Area

You can edit, clone, or delete a schedule.

  • Edit: You can modify a configured schedule.

    Important

    All rule configurations that reference this schedule will be updated automatically. Proceed with caution.

  • Clone: Quickly duplicate a schedule configuration.

  • Delete: Schedules that are referenced by a rule configuration cannot be deleted.

Alert configuration

You can configure different alert methods for different rules to distinguish between them. For example, you can configure phone call alerts for strong rule failures and SMS alerts for weak rule failures. If a rule matches multiple alert configurations, you can set a policy to determine which alert takes effect.

Note

You can create up to 20 alert configurations for a single monitoring object.

  1. On the Quality Rule Details page, click the Alert Configuration tab, and then click Create Alert Configuration to open the Create Alert Configuration dialog box.


  2. In the Create Alert Configuration dialog box, configure the parameters.

    Parameter

    Description

    Coverage

    You can select All Rules, All strong rules, All weak rules, or Custom.

    Note
    • For a single monitoring object, you can create one alert configuration for each of the All Rules, All strong rules, and All weak rules scopes. New rules automatically match the corresponding alert based on their strength. To change one of these alert configurations, you must edit the existing configuration.

    • For a custom scope, you can select up to 200 of the configured rules under the current monitoring object.

    Alert Configuration Name

    The alert configuration name must be unique per monitoring object and can be up to 256 characters long.

    Alert Recipient

    Configure alert recipients and alert methods. You must select at least one alert recipient and one alert method.

    • Alert Recipient: You can select Custom, On-call schedule, or Quality Owner.

      You can configure up to 5 custom alert recipients and up to 3 on-call schedules.

    • Alert Method: You can select Phone, Email, SMS, or DingTalk.

  3. Click OK to finish the alert configuration.

Alert Configuration List

After you configure an alert, you can sort, edit, and delete it in the alert configuration list.


Area

Description

① Sorting Area

Configure the alert policy for when a quality rule matches multiple alert configurations:

  • The first matched alert configuration takes effect: If you select this policy, only the first alert configuration that the rule matches takes effect. All other configurations are ignored. To change the order of the configured alerts, click Sort Rules. Then drag the icon in front of an alert configuration name, or use the icons in the Operations column to move items. From left to right, the icons are Move to Top and Move to Bottom. After reordering, click Finish Sorting.

  • All matched alert configurations take effect: All alert configurations in the list are effective for the quality rules under the current monitoring object.

    For example, if you configure multiple alert configurations and select this option, the system consolidates alerts based on the alert method, alert recipient, and alert rule. In special cases, if multiple alert configurations target the same recipient, alerts are consolidated based on a merge policy.

    Note

    Alert merging is not supported for an on-call schedule.

② List Area

Displays the name, effective scope, specific recipients for each alert type, and their corresponding alert methods.

Effective Scope: For custom alerts, you can view the configured object name and rule name. If the rule is deleted, the object name cannot be viewed. We recommend that you update the alert configuration.

③ Operations Area

You can edit and delete configured alerts.

  • Edit: Modify a configured alert. If you modify the alert recipient or alert method, notify the relevant personnel promptly to avoid missing business alerts.

  • Delete: After deletion, this alert configuration will no longer apply to the rules it matches. Proceed with caution.
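
The two alert policies can be sketched as follows. This is a simplified model under stated assumptions: `AlertConfig`, `matches`, and `effective_alerts` are hypothetical names, the list order stands for the order set in the sorting area, and the merge policy for alerts that target the same recipient is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertConfig:
    name: str
    coverage: str                       # "all", "strong", "weak", or "custom"
    rule_ids: frozenset = frozenset()   # used only when coverage == "custom"

    def matches(self, rule_id: str, strength: str) -> bool:
        if self.coverage == "all":
            return True
        if self.coverage in ("strong", "weak"):
            return self.coverage == strength
        return rule_id in self.rule_ids


def effective_alerts(configs: list[AlertConfig], rule_id: str, strength: str,
                     first_match_only: bool) -> list[str]:
    """Return the names of the alert configurations that take effect for a rule."""
    matched = [c.name for c in configs if c.matches(rule_id, strength)]
    return matched[:1] if first_match_only else matched


configs = [
    AlertConfig("phone-for-strong", "strong"),
    AlertConfig("sms-for-all", "all"),
    AlertConfig("custom-orders", "custom", frozenset({"r1"})),
]
print(effective_alerts(configs, "r1", "strong", first_match_only=True))
# ['phone-for-strong']
print(effective_alerts(configs, "r2", "weak", first_match_only=False))
# ['sms-for-all']
```

Under the first-match policy, the order of the list decides which configuration wins, which is why the sorting area matters only for that policy.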

View quality report

Click Quality Report to view the Rule Check Overview and Rule Check Details for the current quality rule.

  • You can quickly filter the check details by anomaly result, partition time, or by rule or object name keywords.

  • In the Operations column of the rule check details list, click the corresponding icon to view the rule check details for the quality rule.

  • In the Operations column of the rule check details list, click the corresponding icon to view the execution log for the quality rule.

Quality rule permissions

  1. Click Permission Management and configure Viewable Details to specify which members can view check record details, quality rule details, and the quality report.

    Viewable Details: You can select All Members or Only members with quality management permissions for the current object.

  2. Click OK to finish the permission management configuration.

Next steps

After you configure the quality rules, you can view them on the Quality Rules list page. For more information, see Manage the monitoring object list.
