Create a real-time metatable rule

This topic describes how to create a quality rule for a real-time metatable.

Prerequisites

To configure a quality rule, you must first add a monitoring object. For more information, see "Add a monitoring object" in Add and manage monitoring objects.

Permissions

  • Super administrators, quality administrators, custom global roles with the Quality Rule-Manage permission, custom project roles with the Project Quality Management-Quality Rule Management permission for the project that contains the metatable, and real-time metatable owners can configure settings such as scheduling and alerts for quality rules.

  • Quality owners and regular users must also have read permissions on the real-time metatable. To apply for permissions on a real-time metatable, see Apply for, renew, and revoke table permissions.

  • The supported operations and permissions vary by object. For more information, see .

Trial run vs. run

The difference between a trial run and a regular run is the execution method and where the results are displayed. A trial run is a simulated execution of a quality rule to verify its correctness and performance. The results of a trial run are not displayed in the quality report. A run checks a quality rule within a specific time window, and its results are output to the quality report.

Metatable rule types

Rule type

Description

Statistical trend monitoring

Verifies data values and data change trends.

Real-time multi-link comparison

In business-critical scenarios, you can use real-time dual-link or triple-link quality rules to monitor data. If an anomaly occurs, operations staff can promptly switch to a backup data source. Real-time multi-link comparison rules can monitor issues such as data backlogs and statistical deviations.

Real-time and offline data reconciliation

When real-time and offline data use the same statistical logic, a real-time and offline data reconciliation rule can detect discrepancies between them. A significant discrepancy may indicate a data quality issue.
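
As a rough illustration of the reconciliation idea above, the following sketch compares a real-time aggregate with its offline counterpart and flags the pair when the relative difference exceeds a threshold. The function names, the 5% threshold, and the sample values are assumptions made for illustration; the metrics and thresholds that Dataphin actually evaluates are set in the rule template and check configuration.

```python
def relative_discrepancy(realtime_value: float, offline_value: float) -> float:
    """Return |realtime - offline| relative to the offline value (used as the baseline)."""
    if offline_value == 0:
        # Avoid division by zero: two zeros agree, anything else is a full mismatch.
        return 0.0 if realtime_value == 0 else float("inf")
    return abs(realtime_value - offline_value) / abs(offline_value)


def is_anomalous(realtime_value: float, offline_value: float, threshold: float = 0.05) -> bool:
    """Flag the pair when the relative discrepancy exceeds the (hypothetical) threshold."""
    return relative_discrepancy(realtime_value, offline_value) > threshold


# Hypothetical daily order counts computed with the same statistical logic.
print(is_anomalous(realtime_value=9_400, offline_value=10_000))  # 6% apart
print(is_anomalous(realtime_value=9_900, offline_value=10_000))  # 1% apart
```

Using the offline value as the baseline is a design choice of this sketch, on the assumption that the offline result is the more complete of the two.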

Quality rule configuration

  1. On the Dataphin homepage, choose Governance > Data Quality from the top navigation bar.

  2. In the left navigation pane, click Quality Rules. On the Real-time Metatable page, click the name of the target object to go to the Quality Rule Details page and configure the quality rule.

  3. On the Quality Rule Details page, click Create Quality Rule.

  4. In the Create Quality Rule dialog box, configure the parameters.

    Parameter

    Description

    Basic Information

    Rule Name

    Enter a custom name for the quality rule.

    Rule Severity

    Supports Weak Rule and Strong Rule.

    • If you select Weak Rule, the system triggers an alert when the quality rule check fails but does not block downstream task nodes.

    • If you select Strong Rule, the system triggers an alert when the quality rule check fails. If downstream tasks exist (for example, with code check scheduling or task-triggered scheduling), the system also blocks them to prevent dirty data from spreading. If no downstream tasks exist (for example, with periodic quality scheduling), only an alert is triggered.

    Description

    Enter a description for the quality rule. The description can be up to 128 characters long.

    Rule Template

    You can select a consistency or stability rule template.

    • Consistency: Includes Real-time and Offline Comparison and Real-time Pipeline Comparison.

    • Stability: Includes Real-time statistics detection.

    For more information, see Real-time metatable template types.

    Rule Type

    The rule type is determined by the template and serves as a basic attribute for description and filtering.

    Rule Configuration

    Rule Configuration

    Configure the rules based on the selected Rule Template. For more information, see Offline link comparison parameter configuration and Multi-link comparison parameter configuration.

    Check Configuration

    Rule Check

    • After a quality rule runs, its result is compared against the anomaly check configuration. If the result matches the configured conditions, the check fails and triggers subsequent processes, such as alerts.

    • The available metrics for anomaly checks are determined by the template and its configuration. Multiple conditions can be combined by using AND/OR logic, but we recommend using no more than three conditions.

    For more information, see Check configuration details.

    Business Attribute Configuration

    Attribute Information

    The entry format for business attributes depends on how you configure the quality rule attributes. For example, if the field for the managing department is configured as a multi-select enumeration with options such as Big Data Dept, Business Dept, and Tech Dept, the attribute in the rule creation form is a multi-select drop-down list with these options.

    If the field for the rule owner is configured as custom text input with a length limit of 256 characters, you can enter up to 256 characters for this attribute when you create the rule.

    If an attribute field is configured as a Range, enter a range when you create the rule. Ranges are commonly used for continuous numerical or date values, and you can select from four operators: >, >=, <, and <=. For more information about configuring properties, see Create and manage quality rule properties.

    Scheduling Attribute Configuration

    Scheduling Method

    You can select a pre-configured schedule. If you have not yet decided on a scheduling method, you can configure it after you create the quality rule. To create a new schedule, see Create a schedule.

  5. Click OK to finish configuring the quality rule.

    You can click Preview SQL to view the differences between your current configuration and the last saved version. This helps you track SQL changes.

    Note
    • The Preview SQL button is disabled if required information is missing.

    • The left pane shows a preview of the SQL for the last saved configuration. If no configuration was saved, this pane is empty. The right pane shows a preview of the SQL for the current configuration.
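
The interaction between Check Configuration and Rule Severity described above can be sketched as follows. This is a minimal model under stated assumptions, not Dataphin's implementation: the `Condition` class, the metric names `backlog_seconds` and `deviation_ratio`, and the action strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metrics = Dict[str, float]


@dataclass
class Condition:
    """One anomaly condition: it matches when predicate(metric value) is true."""
    metric: str
    predicate: Callable[[float], bool]


def matches(conditions: List[Condition], metrics: Metrics, logic: str = "AND") -> bool:
    """Combine condition results with AND/OR; a match means the check fails."""
    if not conditions:
        return False  # nothing configured, so nothing can match
    results = [c.predicate(metrics[c.metric]) for c in conditions]
    return all(results) if logic == "AND" else any(results)


def handle_result(check_failed: bool, severity: str, has_downstream: bool) -> List[str]:
    """Return the actions taken for a finished check, following the severity rules above."""
    if not check_failed:
        return []
    actions = ["alert"]
    # Only a strong rule with downstream tasks blocks anything.
    if severity == "strong" and has_downstream:
        actions.append("block_downstream")
    return actions


# Hypothetical metrics for a multi-link comparison rule.
conditions = [
    Condition("backlog_seconds", lambda v: v > 300),
    Condition("deviation_ratio", lambda v: v > 0.1),
]
failed = matches(conditions, {"backlog_seconds": 420, "deviation_ratio": 0.02}, logic="OR")
print(handle_result(failed, severity="strong", has_downstream=True))
```

With AND logic the same metrics would not match, because only the backlog condition is met. This is one reason to keep the number of combined conditions small.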

Rule configuration list

On the rule configuration list page, you can view information about configured metatable rules and perform operations such as view, edit, trial run, run, and delete.

image

Area

Description

Filter and search area

You can search by object or rule name.

You can filter by Rule Type, Rule Template, Rule Severity, Trial Run Status, and Effective Status.

Note

If a business attribute is configured as searchable or filterable and is enabled, you can search or filter by that attribute.

List area

This area displays the Object Type/Name, Rule Name/ID, Trial Run Status, Effective Status, Rule Type, Rule Template, Rule Severity, Schedule Type, and related knowledge base document information. Click the image icon next to the Refresh button to select the columns to display in the rule list.

  • Effective Status: We recommend performing a trial run before you enable a rule. To prevent faulty rules from blocking production tasks, enable a rule only after it passes the trial run.

    • When a rule is enabled, it runs automatically according to its configured schedule.

    • When a rule is disabled, it no longer runs automatically but can still be run manually.

  • Related knowledge base documents: Click View Details to view information from the knowledge base associated with the rule. This includes the table name, check object, rule, and related documents. You can also search, view, edit, and delete knowledge base documents. For more information, see View the knowledge base.

Operations area

You can view, clone, edit, perform a trial run, run, configure scheduling, associate knowledge base documents, and delete rules.

  • View: View the rule configuration details.

  • Clone: Quickly clone a rule.

  • Edit: After you edit a rule, you must perform a new trial run.

  • Trial run: After a trial run, click the image icon to View trial run logs.

  • Run: After the run is complete, you can view the validation results in Validation Records.

  • Configure Scheduling: In the dialog box, you can filter by schedule type or search by schedule name. You can also edit the schedule.

  • Associate knowledge base documents: After a rule is associated with a knowledge base, you can view the associated knowledge in the quality rule and governance workbench. You can select an unassociated knowledge base. To create one, see Create and manage a knowledge base.

  • Delete: Deleting a quality rule object also deletes all rules under it. This action is irreversible.

Bulk operations area

You can perform bulk operations, including trial run, run, configure scheduling, enable, disable, modify business attributes, associate knowledge base documents, and delete.

  • Trial run: Perform trial runs for multiple rules in bulk. After the trial runs, click the image icon to View trial run logs.

  • Run: Runs rules in batches. You can view the validation results in Validation Records.

  • Configure Scheduling: In the dialog box, you can filter by schedule type or search for schedules by name. You can also edit schedules and configure schedules for quality rules in batches. You can only modify selected rules that are editable on the quality rule list page.

  • Enable: After you enable the effective status for multiple rules, the selected rules run automatically according to their configured schedules. You can only enable rules that are editable on the quality rule list page.

  • Disable: After you disable the effective status for multiple rules, the selected rules no longer run automatically, but you can still run them manually. You can only disable rules that are editable on the quality rule list page.

  • Modify Business Attributes: You can bulk-modify business attributes when the corresponding attribute field is a single-select or multi-select type.

    • If the attribute is a multi-select type, you can append or overwrite attribute values.

    • If the attribute is a single-select type, you can directly overwrite the attribute value.

  • Associate knowledge base documents: After a rule is associated with a knowledge base, you can view the associated knowledge in the quality rule and governance workbench. You can configure a knowledge base for multiple monitoring objects in bulk. To create one, see Create and manage a knowledge base.

  • Delete: You can delete multiple quality rule objects in bulk. This action cannot be undone, so proceed with caution. You can only delete selected rules that are editable on the quality rule list page.
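
The append and overwrite behavior of Modify Business Attributes can be illustrated with a small sketch. The rule dictionaries, the `department` field, and the `bulk_modify` helper are hypothetical; the sketch assumes that a list value stands for a multi-select attribute and a plain string for a single-select attribute.

```python
from typing import Dict, List, Union

AttrValue = Union[str, List[str]]


def bulk_modify(rules: List[Dict[str, AttrValue]], field: str, new_value: AttrValue,
                mode: str = "overwrite") -> None:
    """Apply a single-select or multi-select attribute change to every rule in place."""
    for rule in rules:
        if isinstance(new_value, list):  # multi-select attribute
            current = list(rule.get(field, []))
            if mode == "append":
                # Keep existing values and add only those not already present.
                rule[field] = current + [v for v in new_value if v not in current]
            else:
                rule[field] = list(new_value)
        else:  # single-select attribute: overwrite is the only option
            rule[field] = new_value


rules = [{"department": ["Big Data Dept"]}, {"department": ["Tech Dept"]}]
bulk_modify(rules, "department", ["Tech Dept", "Business Dept"], mode="append")
print(rules)
```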

Create a schedule

Note
  • When configuring a schedule for a rule, you can select from the existing schedules for the current table. Each table can have a maximum of 20 schedules.

  • A single rule can be associated with a maximum of 10 schedules.

  • Identical schedule configurations are automatically deduplicated.
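
The limits in the note above can be modeled as follows. This is only a sketch: the `Schedule` fields and the `TableSchedules` class are hypothetical, and it assumes that two schedules count as identical when all of their configuration fields are equal.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

MAX_SCHEDULES_PER_TABLE = 20  # limit stated in the note above
MAX_SCHEDULES_PER_RULE = 10   # limit stated in the note above


@dataclass(frozen=True)
class Schedule:
    """Hypothetical, hashable schedule configuration; equal configurations compare equal."""
    schedule_type: str  # for example "periodic"
    cycle: str          # for example "Day"
    time: str           # for example "02:00"


@dataclass
class TableSchedules:
    schedules: List[Schedule] = field(default_factory=list)
    by_rule: Dict[str, Set[Schedule]] = field(default_factory=dict)

    def add(self, schedule: Schedule) -> Schedule:
        """Add a schedule to the table, reusing an identical one if it already exists."""
        if schedule in self.schedules:
            return schedule  # deduplicated: nothing new is stored
        if len(self.schedules) >= MAX_SCHEDULES_PER_TABLE:
            raise ValueError("a table can have at most 20 schedules")
        self.schedules.append(schedule)
        return schedule

    def attach(self, rule_id: str, schedule: Schedule) -> None:
        """Associate a rule with a schedule, enforcing the per-rule limit."""
        attached = self.by_rule.setdefault(rule_id, set())
        if schedule not in attached and len(attached) >= MAX_SCHEDULES_PER_RULE:
            raise ValueError("a rule can be associated with at most 10 schedules")
        attached.add(self.add(schedule))


table = TableSchedules()
table.attach("rule_1", Schedule("periodic", "Day", "02:00"))
table.attach("rule_2", Schedule("periodic", "Day", "02:00"))  # identical configuration
print(len(table.schedules))  # both rules share one stored schedule
```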

  1. On the Quality Rule Details page, click the Scheduling Configuration tab, and then click Create Schedule to open the Create Schedule dialog box.

  2. In the Create Schedule dialog box, configure the parameters.

    Parameter

    Description

    Schedule Name

    Enter a custom name for the schedule.

    Schedule Type

    You can select periodic scheduling, data update-triggered scheduling, or fixed task-triggered scheduling.

    • Periodic scheduling: Runs data quality checks on a fixed, recurring schedule. This is suitable for scenarios where data is generated at predictable times.

      Scheduling cycle: Running quality rules consumes computing resources. We recommend that you avoid running multiple quality rules concurrently so that the normal operation of your production tasks is not affected. The scheduling cycle includes five types: Day, Week, Month, Hour, and Minute.

      If the system time zone (your user center's time zone) differs from the scheduling time zone (configured in Management Center > System Settings > Basic Settings), the rule executes based on the system time zone.

    • Data update-triggered scheduling: When any code task runs, the system determines whether the run updated the specified check scope for the current table and, if so, triggers the quality rule. Use this for tables with unpredictable update tasks or for critical tables that require monitoring after every change.

      Note

      We recommend that you set the check scope to the partitions that are updated by the task. For non-partitioned tables, the entire table is checked. The system automatically detects all data changes and performs checks, which helps prevent omissions.

    • Fixed task-triggered scheduling: Executes the configured quality rule after a specified task succeeds, or before it runs. This scheduling can be triggered by tasks of the following types: SQL, Offline Pipeline, Python, Shell, Virtual, Dlink, and database SQL nodes. This is suitable for tables with fixed update tasks.

      Note
      • Only production environment tasks can be selected for fixed task-triggered scheduling. If the rule is a strong rule, a check failure can impact online tasks. Use this feature with caution based on your business needs.

      • Supported engine types include MaxCompute.

      • Trigger Condition: Select the trigger condition for the quality check. You can select Trigger after all tasks run successfully, Trigger after each successful run of a task, or Trigger before each run of a task.

      • Trigger Task: Select the production task node that triggers the quality check. You can also search by node output name. Which task nodes you can select depends on your role:

        • You can select task nodes in a production project if you are a project administrator of a Prod or Basic project, have the O&M system role for a Prod project, have the developer system role for a Basic project, or have a custom project role with the Project Quality Management-Quality Rule Management permission in a Prod or Basic project.

        • Custom global roles that have the Quality Rule-Manage permission for Prod/Basic projects can select task nodes in all production projects.

          Note

          When you select Trigger after all tasks run successfully as the trigger condition, select trigger tasks that share the same scheduling cycle as the rule. Otherwise, differing scheduling cycles can delay rule execution and the quality check results.

    Scheduling Conditions

    Disabled by default. When enabled, the system first verifies the scheduling conditions before scheduling the quality rule. The schedule proceeds only if the conditions are met; otherwise, the current schedule is skipped.

    • Business Date/Execution Date: You can configure date conditions for periodic scheduling, data update-triggered scheduling, and fixed task-triggered scheduling. Periodic scheduling supports only Business Date, not Execution Date. You can select Standard Calendar or Custom Calendar. For information about how to create a custom calendar, see Create and manage public calendars.

      • If you select Standard Calendar, the available conditions are Month, Day of week, and Date, as shown in the following figure.

        image

      • If you select Custom Calendar, you can select conditions for Date Type and Tag. The following figure is an example:

        image

    • Instance Type: If you select data update-triggered scheduling or fixed task-triggered scheduling, you can select periodic instance, backfill instance, or manual instance, as shown in the following figure:

      image

    Note
    • When scheduling conditions are enabled, configure at least one condition. To add a condition, click + Add Rule.

    • You can configure a maximum of 10 scheduling conditions.

    • The relationship between scheduling conditions can be set to AND or OR.

  3. Click OK to finish configuring the schedule.
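
The way Scheduling Conditions gate a run can be sketched as follows for the Standard Calendar case. The `CalendarCondition` class and its field names are assumptions for illustration; the sketch covers only Month, Day of week, and Date, and does not model Custom Calendar or Instance Type.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Set

MAX_CONDITIONS = 10  # limit stated in the note above


@dataclass
class CalendarCondition:
    """Hypothetical Standard Calendar condition; a field left as None is not checked."""
    months: Optional[Set[int]] = None         # 1-12
    weekdays: Optional[Set[int]] = None       # ISO weekday: 1 = Monday ... 7 = Sunday
    days_of_month: Optional[Set[int]] = None  # 1-31

    def is_met(self, business_date: date) -> bool:
        return (
            (self.months is None or business_date.month in self.months)
            and (self.weekdays is None or business_date.isoweekday() in self.weekdays)
            and (self.days_of_month is None or business_date.day in self.days_of_month)
        )


def should_schedule(conditions: List[CalendarCondition], business_date: date,
                    relation: str = "AND") -> bool:
    """Return True if the rule should be scheduled; False means this run is skipped."""
    if not 1 <= len(conditions) <= MAX_CONDITIONS:
        raise ValueError("configure between 1 and 10 scheduling conditions")
    results = [c.is_met(business_date) for c in conditions]
    return all(results) if relation == "AND" else any(results)


working_days = CalendarCondition(weekdays={1, 2, 3, 4, 5})
first_of_month = CalendarCondition(days_of_month={1})

# 2024-06-01 is a Saturday and the first day of the month.
print(should_schedule([working_days, first_of_month], date(2024, 6, 1), relation="AND"))
print(should_schedule([working_days, first_of_month], date(2024, 6, 1), relation="OR"))
```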

Scheduling configuration list

You can view, edit, clone, and delete created schedules from the scheduling configuration list.

image.png

Area

Description

① Filter and search area

You can search by schedule name.

You can filter by periodic scheduling, data update-triggered scheduling, or fixed task-triggered scheduling.

List area

The scheduling configuration list displays the Schedule Name, Schedule Type, Last Updated By, and Last Updated Time.

Operations area

You can edit, clone, and delete schedules.

  • Edit: Modify a configured schedule.

    Important

    All rule configurations that reference this schedule are changed simultaneously. Proceed with caution.

  • Clone: Quickly duplicate a schedule configuration.

  • Delete: You cannot delete a schedule that is referenced by a rule configuration.

Alert configuration

You can configure different alert methods for different rules to distinguish them. For example, you can configure phone call alerts for strong rule failures and SMS alerts for weak rule failures. If a rule triggers multiple alert configurations, you can set a policy to determine which alert takes precedence.

Note

You can create a maximum of 20 alert configurations for a single monitoring object.

  1. On the Quality Rule Details page, click the Alert Configuration tab, and then click Create Alert Configuration to open the Create Alert Configuration dialog box.

  2. In the Create Alert Configuration dialog box, configure the parameters.

    Parameter

    Description

    Scope

    You can select All Rules, All Strong Rules, All Weak Rules, or Custom.

    Note
    • For a single monitoring object, you can configure one alert configuration for each of the All rules, All strong rules, and All weak rules scopes. New rules automatically use the corresponding alert configuration based on their severity. To change one of these configurations, you must edit the existing one.

    • The Custom scope allows you to select up to 200 of the configured rules for the current monitoring object.

    Alert Configuration Name

    The alert configuration name must be unique within a single monitoring object and can be up to 256 characters long.

    Alert Recipient

    Configure the alert recipients and methods. You must select at least one recipient and one method.

    • Alert Recipient: You can select Custom, On-call schedule, or Quality Owner as the recipient type.

      You can configure up to 5 custom alert recipients and up to 3 on-call schedules.

    • Alert Method: You can select Phone, Email, SMS, or DingTalk as the delivery method.

  3. Click OK to finish configuring the alert.
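
The constraints listed in the parameter table above can be collected into a simple pre-submission check. This sketch is illustrative only: the `AlertConfigDraft` class and its field names are hypothetical, and it does not check name uniqueness or the one-per-scope restriction, both of which depend on the other configurations of the monitoring object.

```python
from dataclasses import dataclass, field
from typing import List

ALLOWED_METHODS = {"Phone", "Email", "SMS", "DingTalk"}


@dataclass
class AlertConfigDraft:
    name: str
    scope: str  # "All Rules", "All Strong Rules", "All Weak Rules", or "Custom"
    methods: List[str]
    custom_recipients: List[str] = field(default_factory=list)
    on_call_schedules: List[str] = field(default_factory=list)
    notify_quality_owner: bool = False
    custom_rule_ids: List[str] = field(default_factory=list)  # used only with the Custom scope


def validate(draft: AlertConfigDraft) -> List[str]:
    """Return a list of problems; an empty list means the draft passes these checks."""
    problems = []
    if not draft.name or len(draft.name) > 256:
        problems.append("name must be 1-256 characters")
    if len(draft.custom_recipients) > 5:
        problems.append("at most 5 custom recipients")
    if len(draft.on_call_schedules) > 3:
        problems.append("at most 3 on-call schedules")
    if not (draft.custom_recipients or draft.on_call_schedules or draft.notify_quality_owner):
        problems.append("select at least one recipient")
    if not draft.methods or not set(draft.methods) <= ALLOWED_METHODS:
        problems.append("select at least one supported alert method")
    if draft.scope == "Custom" and len(draft.custom_rule_ids) > 200:
        problems.append("Custom scope allows at most 200 rules")
    return problems


ok = AlertConfigDraft("strong-rule-phone", "All Strong Rules", ["Phone"], notify_quality_owner=True)
print(validate(ok))
```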

Alert configuration list

After you configure an alert, you can sort, edit, and delete it from the alert configuration list.

image.png

Area

Description

① Sorting area

Configure the alert policy for when a quality rule matches multiple alert configurations:

  • The first matched alert configuration takes effect: If you select this policy, only the first matched alert configuration is triggered, and all others are ignored. You can then reorder the alert configurations. Click Sort Rules, and then drag the icon next to the alert configuration name to reorder. You can also use the icons in the Operations column to move an item to the top or bottom. After you adjust the order, click Finish Sorting.

    image.png

  • All alert configurations take effect: All alert configurations in the current list are effective for the quality rules under the current monitoring object.

    For example, if you have configured multiple alert configurations and selected this option, the system merges alerts based on the alert method, recipient, and rule. Specifically, if the recipient is the same person and the recipient types are Custom and Quality Owner, the alert messages are merged according to the merge policy.

    Note

    On-call schedules do not support alert merging.

② List area

This area displays the name of the alert configuration, its scope, the specific recipients for each alert type, and the corresponding alert delivery methods.

Scope: For custom alerts, you can view the configured object name and rule name. If a rule is deleted, its name is no longer displayed. In that case, update the alert configuration accordingly.

③ Operations area

You can edit and delete your configured alerts.

  • Edit: Modify a configured alert. If you change the alert recipients or methods, notify the affected personnel to avoid missing important business alerts.

  • Delete: After an alert configuration is deleted, the rules it matched no longer trigger this alert. Proceed with caution.
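
The two alert policies can be contrasted with a short sketch. It rests on several assumptions: the `ScopedAlert` class and the scope strings are hypothetical, list order stands for the sort order in the alert configuration list, and merging is approximated by removing duplicate (recipient, method) pairs for a single rule. On-call schedules, which do not support merging, are not modeled.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class ScopedAlert:
    name: str
    scope: str  # "all", "strong", "weak", or "custom"
    recipients: List[str]
    methods: List[str]
    rule_ids: Tuple[str, ...] = ()  # used only when scope == "custom"

    def matches(self, rule_id: str, severity: str) -> bool:
        if self.scope == "custom":
            return rule_id in self.rule_ids
        return self.scope in ("all", severity)


def resolve_alerts(configs: List[ScopedAlert], rule_id: str, severity: str,
                   policy: str = "first_match") -> Set[Tuple[str, str]]:
    """Return the (recipient, method) pairs to notify for a failed rule.

    Building a set collapses duplicate (recipient, method) pairs into one alert.
    """
    matched = [c for c in configs if c.matches(rule_id, severity)]
    if policy == "first_match":
        matched = matched[:1]  # all later matches are ignored
    return {(r, m) for c in matched for r in c.recipients for m in c.methods}


configs = [
    ScopedAlert("critical", "custom", ["alice"], ["Phone"], rule_ids=("rule_7",)),
    ScopedAlert("strong-default", "strong", ["alice", "bob"], ["Phone", "SMS"]),
]
print(sorted(resolve_alerts(configs, "rule_7", "strong", policy="first_match")))
print(sorted(resolve_alerts(configs, "rule_7", "strong", policy="all")))
```

Under the first-match policy, the order of `configs` decides the outcome, which is why the list can be re-sorted only when that policy is selected.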

Quality report

Click Quality Report to view the Rule Check Overview and Rule Check Details for the current quality rule.

  • You can quickly filter check details by anomaly result, partition time, or a keyword in the rule or object name.

  • In the Operations column of the rule check details list, click the image icon to view the quality rule check details.

  • In the Operations column of the rule check details list, click the image icon to view the quality rule execution log.

Permission management

  1. Click Permission Management and configure View Details to specify which members can view check record details, quality rule details, and quality reports.

    Viewable by: Select All members or Only members with quality management permissions for the current object.

  2. Click OK to save the permission configuration.

Next steps

After configuring the quality rule, you can view it on the real-time metatable rule list. For more information, see Manage the monitoring object list.