Create data quality rules for global data tables to streamline quality monitoring.
Prerequisites
You must add a monitoring object before you can configure a quality rule. For instructions, see Add and manage monitoring objects.
Permissions
-
Super administrators, quality administrators, and users assigned a custom global role with the quality rule-manage permission can configure scheduling, alarms, exception archive tables, and scoring weights for quality rules.
-
Quality owners can configure scheduling, alarms, exception archive tables, and scoring weights for quality rules on their monitoring objects.
-
Quality owners and standard users must also have read permission on the global data table's data source. To request this permission, see Request data source permission.
-
The supported operation permissions vary by object. For more information, see Quality rule operation permissions.
Validation rules
When a table is evaluated against a quality rule, if a weak monitoring rule is triggered, the system sends an alert. If a strong monitoring rule is triggered, the system stops the task to prevent dirty data from flowing downstream and sends an alert.
Trial run vs. run
A trial run simulates the execution of a quality rule to verify that it works as expected. Its results do not appear in the quality report. A scheduled run executes the quality rule at a specified time and records its results in the quality report.
Configure quality rules
-
On the Dataphin homepage, choose governance > data quality from the top navigation bar.
-
In the left-side navigation pane, click quality rule. On the global data table page, click the name of the target object to go to the Quality Rule Details page.
-
On the Quality Rule Details page, click Create Quality Rule.
-
In the Create Quality Rule dialog box, configure the parameters.
Parameter
Description
Basic information
Rule Name
A custom name for the quality rule. The name can contain up to 256 characters.
Rule strength
The available options are weak rule and strong rule.
-
Weak rule: If you select Weak rule, the system raises an alarm but does not block downstream task nodes when the quality rule verification fails.
-
Strong rule: If you select Strong rule, an alarm is triggered when the quality rule check returns an abnormal result. If there are downstream tasks (such as in code check scheduling or task-triggered scheduling), the downstream tasks are blocked to prevent the spread of dirty data. If there are no downstream tasks (such as in periodic quality scheduling), only an alarm is triggered.
Description
A custom description for the quality rule. The description can contain up to 128 characters.
Configuration Method
-
Create from Template: Use a general-purpose system template or a custom business template to quickly create a quality rule.
-
System Template: The template provides configurable built-in parameters. This method is suitable for creating general-purpose rules.
-
Custom Template: The template provides preset parameters that require no further configuration. This method is typically used for creating rules that involve business logic.
-
-
custom SQL: You can flexibly define custom quality monitoring rules by using SQL. This method is suitable for flexible and complex scenarios.
Rule template
Select a rule template from the drop-down list. Options include completeness, uniqueness, timeliness, validity, consistency, stability, and custom SQL.
-
completeness: Includes Field Null Value Verification and Field Empty String Verification.
-
uniqueness: Includes Field Uniqueness Verification, Field Group Count Verification, and Field Duplicate Value Count Verification.
-
timeliness: Includes Time Function Comparison, Single-Table Time Field Comparison, and Two-Table Time Field Comparison.
-
validity: Includes Field Format Verification, Field Length Verification, Field Value Range Verification, Code Table Reference Comparison, and Data Standard Code Table Reference Comparison (requires the Data Standard module).
-
consistency: Includes Single-Table Field Value Consistency Comparison, Single-Table Field Statistic Consistency Comparison, Single-Field Business Logic Consistency Comparison, Two-Table Field Value Consistency Comparison, Two-Table Field Statistic Consistency Comparison, Two-Table Field Business Logic Consistency Comparison, and Cross-Source Two-Table Field Statistic Consistency Comparison.
-
stability: Includes Table Stability Verification, Table Volatility Verification, Field Stability Verification, and Field Volatility Verification.
-
custom SQL: Includes Custom Statistic Verification and Custom Data Detail Verification.
For more information, see Template type descriptions.
Rule Type
The rule type is determined by the selected template. You can use it for descriptions and filtering.
Monitoring granularity
When the configuration method is custom SQL, you can set the monitoring granularity to either Entire Table or a specific field.
Template configuration
Template Information
When you select a quality rule template, its configuration information is displayed. To modify the configuration, go to Quality rule templates.
Rule configuration
Rule Configuration
The configuration options depend on the selected rule template. For more information, see Data table parameter configuration.
-
Filter Verified Table Data: Disabled by default. If enabled, you can configure filter conditions for the verified table, such as partition filters or general data filters. The filter conditions are directly appended to the verification SQL. If the verified table requires partition filtering, we recommend configuring a partition expression in the scheduling configuration. This sets the verified partition as the minimum granularity for viewing the quality report.
-
When the rule template is consistency/Two-Table Field Statistic Consistency Comparison or consistency/Cross-Source Two-Table Field Statistic Consistency Comparison, you can enable Filter Compared Table Data. If enabled, you can configure filter conditions for the compared table, such as partition filters or general data filters. The filter conditions are directly appended to the verification SQL.
Verification configuration
Rule Verification
-
The system compares the verification result against the configured exception conditions. If the result meets these conditions, the verification fails, triggering subsequent processes such as alarms.
-
The metrics available for exception verification depend on the template and its configuration. You can use multiple conditions with AND/OR logic. We recommend using fewer than three conditions in your actual configuration.
For more information, see Verification configuration descriptions.
Archive Configuration
Exception archiving
By default, this feature is disabled. You can enable it to archive abnormal data to a file or table. After a quality verification, you can download and analyze the archived abnormal data.
-
Archive Mode supports Archive Only Anomalous Fields and Archive Full Records.
-
Archive only abnormal fields: Deduplicates and archives only the current monitored field. This is applicable to cases where a single field is sufficient to identify abnormal data.
-
Archive complete record: Archives the entire record that contains the anomalous data. This is suitable for situations where the complete record is required to locate the anomalous data. Note: Because archiving complete records significantly increases the amount of archived data, we recommend that you only archive anomalous fields under normal circumstances.
-
-
Archive Location supports Default file server and Exception archive table. If you have not created an exception archive table, you can click Manage Exception Archive Tables to create one. For more information, see Add an exception archive table.
-
Default file server: The system file server configured during Dataphin deployment. You can then download the exception data that is archived for each verification from the Validation Record page. When you use the default file server, a maximum of 100 exception data entries are archived for each verification, which makes it suitable for verifying small volumes of data.
-
Abnormal data archive table: If you want to store more abnormal data or consolidate abnormal data from different validation records for comparative analysis, you can specify your own archive table. Each quality rule can record a maximum of 10,000 abnormal data entries per run. In addition to quickly downloading the abnormal data from a single validation on the validation record page, you can also directly access the archive table and customize its lifecycle for greater flexibility.
Note-
You can download the consolidated exception data that is generated by all rules for the current run. The download is limited to 10,000 records. If you want to view more data, archive the data to the specified exception archive table and then access the table directly.
-
An exception archive table must meet specific format requirements. Otherwise, an error may occur when you write data, which affects its use. For more information, see Add an exception archive table.
-
-
Business property configuration
Property Information
The input method for business properties depends on the quality rule property configuration. For example, if a property's value type is set to an enumeration (multi-select) with options such as Big Data Department, Business Department, and Technology Department, a multi-select drop-down list with these options is displayed when you create a quality rule.
If a property's value type is set to custom input with a length of 256, you can enter up to 256 characters for that property when creating a quality rule.
If the input method for a property is Range, configure it as follows:
Range: This is typically used for continuous numerical or date ranges. You can use the operators >, >=, <, and <=. For more information about property configuration, see Create and manage quality rule properties.
Scheduling property configuration
Scheduling method
Select a pre-configured schedule. If you have not decided on a scheduling method, you can configure it after creating the quality rule. To create a new schedule, see Create a schedule.
Quality score configuration
Scoring method
Valid values are Quality verification status and Data conformity rate.
-
Quality verification status: Scores the rule based on the verification status of its last successful run. A passing verification receives 100 points, while a failing one receives 0 points.
-
Data conformity rate: Uses the proportion of normal data (the conformity rate) from the last successful run as the score. For example, if the data format validity is 80%, the quality score is 80.
The supported scoring methods vary by rule template. The following templates support only the Quality verification status scoring method:
-
In the uniqueness category: Field Group Count Verification and Field Duplicate Value Count Verification.
-
In the consistency category: Single-Table Field Statistic Consistency Comparison and Cross-Source Two-Table Field Statistic Consistency Comparison.
-
All templates in the stability category.
-
In the custom SQL category: Custom Statistic Verification.
Quality score weight
The weight of the rule's quality score. This value is used to calculate the total quality score of the monitored object. You can select an integer from 1 to 10.
-
-
Click OK to save the rule configuration.
You can click Preview SQL to compare the current configuration with the last saved version and review any SQL changes.
Note-
Preview SQL is unavailable if key information is incomplete.
-
The left pane shows the SQL preview of the last saved configuration. It is empty if no configuration was saved. The right pane shows the SQL preview of the current configuration.
-
Rule configuration list
The rule configuration list displays configured data table rules and supports operations such as view, edit, test run, run, and delete.

|
Area |
Description |
|
① Filter and search area |
You can quickly search by object name or rule name. You can filter by rule type, rule template, rule strength, test run status, and activation status. Note
If a quality rule's business properties are configured to be searchable or filterable and are enabled, you can search or filter by those properties. |
|
② List area |
This area displays the object type/name, rule name/ID, test run status, activation status, rule type, rule template, rule strength, scheduling type, and related knowledge base document information. Click the
|
|
③ Actions area |
You can perform the following operations: View, Clone, Edit, Test Run, Run, Configure Scheduling, Associate Knowledge Base Document, Quality Score Configuration, and Delete.
|
|
④ Bulk actions area |
You can perform the following operations in bulk: Test Run, Run, Configure Scheduling, Enable, Disable, Modify Business Properties, Associate Knowledge Base Document, Quality Score Configuration, Export Rules, and Delete.
|
Create a schedule
-
When configuring a schedule for a quality rule, you can quickly apply an existing schedule configuration. Each table supports a maximum of 20 schedules.
-
You can configure up to 10 schedules for a single quality rule.
-
The system automatically deduplicates identical schedule configurations.
-
The verification scope acts as a filter condition in the quality verification statement to control the data range for each check. This scope also defines the minimum granularity for viewing downstream components, such as quality reports.
-
On the Quality Rule Details page, click the Schedule Configuration tab, and then click Create Schedule to open the Create Schedule dialog box.
-
In the Create Schedule dialog box, configure the following parameters.
Parameter
Description
Schedule name
A custom name for the schedule. The name can contain up to 64 characters.
Schedule type
Supports Time-based schedule and Task-triggered schedule.
-
Time-based schedule: Performs data quality checks periodically at a configured time. This is suitable for scenarios where data is generated at fixed intervals.
Scheduling Cycle: Running quality rules consumes computing resources. To avoid affecting your production tasks, do not run multiple quality rules at the same time. The scheduling cycle includes five types: Day, Week, Month, Hour, and Minute.
If the system time zone (your time zone in the user center) differs from the schedule time zone (configured in Management Center > System Settings > Basic Settings), the rule runs according to the system time zone.
-
Task-triggered schedule: Runs the configured quality rule after a specified task succeeds or before it runs. You can select tasks of the following node types to trigger the schedule: SQL, Offline Pipeline, Python, Shell, Virtual, Dlink, and Database SQL. This schedule type is suitable for cases where table modification tasks are fixed.
Note-
You can only select tasks in the production environment as triggers. If the quality rule is configured as a strong rule, a verification failure may affect production tasks. Proceed with caution based on your business needs.
-
Supported engine types: MaxCompute.
-
Trigger Condition: Select when to trigger the quality check. Valid values: Trigger after all tasks are successful, Trigger after each task is successful, and Trigger before each task runs.
-
Triggering Task: The following roles can select a task node in a production project as the trigger. You can also search for the node by its output name.
-
The following roles can select a task node in a production project: project administrators of Prod/Basic projects, users with the O&M system role in Prod projects, users with the developer system role in Basic projects, and users with a custom project role that has the Project Quality Management-Quality Rule Management permission in Prod/Basic projects.
-
Users with a custom global role that has the Quality Rule-Management permission can select task nodes in any production project.
NoteIf you set Trigger Condition to Trigger after all tasks are successful, select trigger tasks that have the same schedule cycle. This prevents delays in rule execution and the generation of quality check results.
-
-
Schedule condition
Disabled by default. If enabled, the system checks whether the schedule conditions are met before a scheduled run. The quality rule runs only if the conditions are met. Otherwise, the current run is skipped.
-
Business Date/Execution Date: If you set the schedule type to Time-based schedule (time-based schedules do not support execution dates) or Task-triggered schedule, you can configure the date. You can select a standard calendar or a custom calendar. For information about how to customize a calendar, see Create and manage public calendars.
-
If you select General Calendar, you can set conditions based on Month, Week, or Date. For example:

-
If you select Custom Calendar, you can set conditions based on Date Type or Tag. For example:

-
-
Instance type: If you select Task-triggered schedule as the schedule type, you can select Periodic instance, Backfill instance, or Manual instance. As shown in the following figure:

Note-
You must configure at least one condition. To add a condition, click + Add Rule.
-
You can configure a maximum of 10 conditions.
-
You can set the logical relationship between conditions to AND or OR.
Verification scope expression
An editable drop-down list where you can define the scope of data to be verified. You can enter a custom expression, such as
ds='${yyyyMMdd}', or select and modify a built-in partition expression for faster configuration. For more information about built-in expressions, see Built-in partition expression types.Note-
If you use multiple conditions, connect them with and or or, for example,
province="Zhejiang" and ds<=${yyyyMMdd}. -
If a filter condition is also configured in the quality rule, it is combined with the verification scope expression by using an AND operator.
-
The verification scope expression supports full table scans.
Note: Full table scans consume significant resources and are not supported in some scenarios. Configure a partition expression to avoid full table scans.
Default scope date
Defaults to the current business date.
-
-
Click OK to complete the schedule configuration.
Schedule configuration list
After you create a schedule, you can view, edit, clone, or delete it from the schedule configuration list.

|
Area |
Description |
|
① Filter and search |
Search for schedules by name. Filter by Time-based schedule or Task-triggered schedule. |
|
② List |
Displays the Schedule Name, Schedule Type, Last Updated By, and Last Update Time for each schedule configuration. |
|
③ Actions |
You can edit, clone, or delete a schedule.
|
Alert configuration
Configure notification methods per rule to route alerts based on severity. For example, set up phone alerts for strong rule exceptions and SMS alerts for weak rule exceptions. If a quality rule matches multiple alert configurations, define an activation policy to determine which one takes effect.
You can create a maximum of 20 alert configurations for a single monitoring object.
-
On the Quality Rule Details page, click the Alert Configuration tab, and then click the Create Alert Configuration button. The Create Alert Configuration dialog box appears.
-
In the Create Alert Configuration dialog box, configure the parameters.
Parameter
Description
Scope
Select All rules, All strong rules, All weak rules, or Custom.
Note-
For each monitoring object, you can create one alert configuration for each of the following scopes: All rules, All strong rules, and All weak rules. New rules are automatically matched to the corresponding alert configuration based on their strength. To change one of these configurations, you must edit the existing one.
-
If you select Custom, you can select up to 200 existing rules associated with the current monitoring object.
Alert configuration name
The name must be unique within the monitoring object and cannot exceed 256 characters.
Alert recipient
Specify at least one alert recipient and one notification method.
-
Alert recipient: Select the recipient type. Supported types include Custom, on-call schedule, and quality owner.
You can configure up to five custom alert recipients and up to three on-call schedules.
-
Notification method: Select one or more notification methods, such as Phone, Email, SMS, DingTalk.
-
-
Click OK.
Alert configuration list
After you create alert configurations, you can sort, edit, and delete them in the list.

|
Area |
Description |
|
① Policy area |
Configure the activation policy for when a quality rule matches multiple alert configurations:
|
|
② List area |
Displays the name, effective scope, recipients, and notification methods for each alert configuration. Effective scope: For configurations with a custom scope, you can view the names of the associated objects and rules. If a rule is deleted, its name is no longer displayed. In this case, we recommend that you update the alert configuration. |
|
③ Actions area |
Use these actions to edit or delete alert configurations.
|
Add an exception archive table
An exception archive table stores records that fail quality rules.
-
On the Quality Rule Details page, click the Exception Archiving tab, and then click +Add Exception Archive Table to open the Add Exception Archive Table dialog box.
-
In the Add Exception Archive Table dialog box, configure the parameters.
For Add Method, you can choose New Table or Select Existing Table. This process adds special quality validation fields, ensuring that exception data is archived in a separate table instead of being written to the original data table.
-
New Table: You can specify a custom table name. The default name is current_table_name_exception_data. After you add the table, the system creates a new table in the same database or data source. The name can contain letters, digits, underscores (_), and periods (.), and can be up to 128 characters long.
-
If the monitored table is a physical table, the system creates the exception archive table in the same project as the monitored table.
-
If the monitored table is a dimension logical table or a fact logical table, the system creates the exception archive table in the same project by default. You can also manually specify a project within the monitored table's module, such as
projectA.table_name. -
If the monitored table is a summary logical table, we recommend that you specify a project name within the same module for the archive table. Otherwise, the system automatically creates the archive table in a project within the monitored table's module.
-
The exception archive table must include all fields from the monitored table, as well as the quality validation fields. The script format is as follows:
create table Current_table_name_exception_data (dataphin_quality_tenant_id varchar(64) comment 'tenant ID' , dataphin_quality_rule_id varchar(64) comment 'quality rule ID', dataphin_quality_rule_name varchar(256) comment 'quality rule name', dataphin_quality_column_name varchar(1024) comment 'The name of the validated field', dataphin_quality_watch_task_id varchar(128) comment 'The task ID of the monitored object', dataphin_quality_rule_task_id varchar(64) comment 'rule task ID', dataphin_quality_validate_time varchar(64) comment 'The time of the quality check', dataphin_quality_archive_mode varchar(32) comment 'The exception archive mode, ONLY_ERROR_FIELD/FULL_RECORD', dataphin_quality_error_data string comment 'The exception data', ljba_id bigint comment 'The primary key of ljba', ljb_id bigint comment 'The primary key of ljb', col_tinyint tinyint comment 'The field type is TINYINT, lowercase.', col_tinyint_02 tinyint comment '2', col_smallint smallint comment 'The field type is SMALLINT, lowercase.', col_smallint_02 smallint comment '4', col_int int comment 'The field type is INT, lowercase.', col_int_02 int comment '6', col_bigint bigint comment 'The field type is BIGINT, lowercase.', col_bigint_02 bigint comment '8', col_float float comment 'The field type is FLOAT, lowercase.', col_float_02 float comment '10', col_double double comment 'The field type is DOUBLE, lowercase.', col_double_02 double comment '11', col_decimal decimal(38,18) comment 'The field type is DECIMAL(38,18), lowercase.', col_decimal_02 decimal(38,18) comment '12', col_varchar varchar(500) comment 'The field type is VARCHAR(500), lowercase.', col_varchar_02 varchar(500) comment '13', col_char char(10) comment 'The field type is CHAR(10), lowercase.', col_char_02 char(10) comment '14', col_string string comment 'The field type is STRING, lowercase.', col_string_02 string comment '15', col_date date comment 'The field type is DATE, lowercase.', col_date_02 date comment '16', col_datetime datetime comment 'The field type is DATETIME, lowercase.', col_datetime_02 datetime comment '17', col_timestmap timestamp comment 'The field type is TIMESTAMP, lowercase.', col_timestmap_02 timestamp comment '18', col_boolean boolean comment 'The field type is BOOLEAN, lowercase.', col_boolean_02 boolean comment '19', col_binary binary comment 'The field type is BINARY, lowercase.', col_binary_02 binary comment '20', col_array array<int> comment 'The field type is ARRAY<int>, lowercase.', col_array_02 array<string> comment '21', col_map map<string,string> comment 'The field type is MAP<string, string>, lowercase.', col_map_02 map<string,int> comment '22', ds string comment 'Date partition, yyyyMMdd' ) partitioned by (dataphin_quality_validate_date string comment 'The validation date (partition field)');
-
-
Select Existing Table: You can select a table from the same project or data source. The selected table must include all fields from the monitored table and the quality validation fields. You can click View Exception Archive Table DDL to view the required CREATE TABLE statement. The script format is as follows:
create table Current_table_name_exception_data (dataphin_quality_tenant_id varchar(64) comment 'tenant ID' , dataphin_quality_rule_id varchar(64) comment 'quality rule ID', dataphin_quality_rule_name varchar(256) comment 'quality rule name', dataphin_quality_column_name varchar(1024) comment 'The name of the validated field', dataphin_quality_watch_task_id varchar(128) comment 'The task ID of the monitored object', dataphin_quality_rule_task_id varchar(64) comment 'rule task ID', dataphin_quality_validate_time varchar(64) comment 'The time of the quality check', dataphin_quality_archive_mode varchar(32) comment 'The exception archive mode, ONLY_ERROR_FIELD/FULL_RECORD', dataphin_quality_error_data string comment 'The exception data', ljba_id bigint comment 'The primary key of ljba', ljb_id bigint comment 'The primary key of ljb', col_tinyint tinyint comment 'The field type is TINYINT, lowercase.', col_tinyint_02 tinyint comment '2', col_smallint smallint comment 'The field type is SMALLINT, lowercase.', col_smallint_02 smallint comment '4', col_int int comment 'The field type is INT, lowercase.', col_int_02 int comment '6', col_bigint bigint comment 'The field type is BIGINT, lowercase.', col_bigint_02 bigint comment '8', col_float float comment 'The field type is FLOAT, lowercase.', col_float_02 float comment '10', col_double double comment 'The field type is DOUBLE, lowercase.', col_double_02 double comment '11', col_decimal decimal(38,18) comment 'The field type is DECIMAL(38,18), lowercase.', col_decimal_02 decimal(38,18) comment '12', col_varchar varchar(500) comment 'The field type is VARCHAR(500), lowercase.', col_varchar_02 varchar(500) comment '13', col_char char(10) comment 'The field type is CHAR(10), lowercase.', col_char_02 char(10) comment '14', col_string string comment 'The field type is STRING, lowercase.', col_string_02 string comment '15', col_date date comment 'The field type is DATE, lowercase.', col_date_02 date comment '16', col_datetime datetime comment 'The field type is DATETIME, lowercase.', col_datetime_02 datetime comment '17', col_timestmap timestamp comment 'The field type is TIMESTAMP, lowercase.', col_timestmap_02 timestamp comment '18', col_boolean boolean comment 'The field type is BOOLEAN, lowercase.', col_boolean_02 boolean comment '19', col_binary binary comment 'The field type is BINARY, lowercase.', col_binary_02 binary comment '20', col_array array<int> comment 'The field type is ARRAY<int>, lowercase.', col_array_02 array<string> comment '21', col_map map<string,string> comment 'The field type is MAP<string, string>, lowercase.', col_map_02 map<string,int> comment '22', ds string comment 'Date partition, yyyyMMdd' ) partitioned by (dataphin_quality_validate_date string comment 'The validation date (partition field)');
-
-
Click OK to add the exception archive table.
You can select Set as active archive table automatically after creation. Selecting this option automatically sets the table as the archive destination for subsequent quality rules.
Exception archive table list
After a table is successfully added, the system sets the first one in the list as the active archive table by default. You can click the name of an exception archive table to view its schema. You can also set other tables as the active archive table or delete them.
-
Set as Active Archive Table: The active archive table receives exception data from all quality rules for the monitored object that are configured to use a custom archive table.
-
Delete: This action only removes the reference to the exception archive table and does not delete the physical table itself. You can re-add the reference later.
Quality report
Click quality report to view the rule validation overview and rule validation details for the current quality rule.
-
Filter the rule validation details by exception result, partition time, or a keyword in the rule or object name.
-
In the operations column, click the
icon to view the rule validation details. -
In the operations column, click the
icon to view the execution log.
Quality rule permission management
-
Click Permission Management and configure the Can view details permission to specify which members can view validation record details, quality rule details, and quality reports.
Can view details: You can select All members or Only members with quality management permissions for the current object.
-
Click OK to complete the permission management configuration.
Next steps
After you configure the quality rules, you can view them on the rule list page of the global data table. For more information, see or Manage the monitored object list.
