Create a Dataphin table quality rule

更新时间: 2026-06-23 13:36:59

Dataphin supports creating quality rules to validate data tables and simplify quality monitoring. This guide covers rule creation, scheduling, alerts, exception archiving, and quality reports.

Prerequisites

To configure a quality rule, you must first add a monitoring object. For more information, see Add a monitoring objectAdd and manage monitoring objects.

Permissions

  • Super administrators, quality administrators, users in custom global roles with the quality rule-management permission, and users in custom project roles with the project quality management-quality rule management permission on the project containing the table can configure scheduling, alerts, exception archive tables, and scoring weights for quality rules.

  • Quality owners can configure scheduling, alerts, exception archive tables, and scoring weights for quality rules on the monitoring objects they manage.

  • Quality owners and standard users must also have read permission on Dataphin tables. To apply for this permission, see Apply for, renew, and return table permissions.

  • Supported operations vary by object. For details, see quality rule operation permission.

Quality rules

You can configure quality rules in two ways: custom configuration and monitoring by referencing data standards, which requires the data standard module.

  • Custom configuration lets you quickly create rules from built-in or custom quality rule templates. It also supports custom SQL for flexible monitoring.

  • Monitoring by referencing data standards uses the quality rules from the data standard mapped to the current asset object. This method helps you enforce standard constraints more effectively.

Validation rules

When a quality rule evaluates a data table, a weak monitoring rule violation sends an alert so you can address the anomaly promptly. A strong monitoring rule violation interrupts the task to prevent dirty data from flowing downstream, and also sends an alert.

Trial run vs. run

A trial run simulates a quality rule to verify that it works as expected. Its results do not appear in the quality report. A run evaluates the quality rule during a specific period and adds its results to the quality report.

Quality rule configuration

  1. On the Dataphin homepage, click Governance > Data Quality in the top navigation bar.

  2. In the left-side navigation pane, click Quality Rules. On the Dataphin Table page, click the name of the target object to open the Quality Rule Details page and configure the quality rule.

    • Custom Configuration

      1. Hover over the data table, point to Create Quality Rule and select Custom Configuration, or click Create Quality Rule to open the Create Quality Rule dialog box.

      2. In the Create Quality Rule dialog box, configure the following parameters.

        Parameter

        Description

        Basic Information

        Rule Name

        A custom name for the quality rule. The name can contain up to 256 characters.

        Rule Strength

        You can select a weak rule or a strong rule.

        • Weak rule: If you select Weak rule, an alert is raised but downstream task nodes are not blocked when the quality rule check fails.

        • Strong rule: If you select Strong rule, an alert is triggered when a quality rule check fails. If downstream tasks exist (such as in code check scheduling and task-triggered scheduling), they are also blocked to prevent the spread of contaminated data. If no downstream tasks exist (such as in periodic quality scheduling), only an alert is triggered.

        Description

        A custom description for the quality rule. The description can contain up to 128 characters.

        Configuration Method

        • Create from Template: Quickly create a quality rule by using a generic system template or a custom business template.

          • System Template: Suitable for creating generic rules. You can configure the built-in parameters of the template.

          • Custom Template: Typically used to create rules that contain business logic. The template has pre-configured parameters.

        • Custom SQL: Flexibly define quality monitoring rules by using SQL. This method is suitable for complex scenarios.

        Rule Template

        Select a rule template from the drop-down list. Options include Completeness, Uniqueness, Timeliness, Validity, Consistency, Stability, and Custom SQL.

        • Completeness: Includes Field Null Value Check and Field Empty String Check.

        • Uniqueness: Includes Field Uniqueness Check, Field Group Count Check, and Field Duplicate Value Count Check.

        • Timeliness: Includes Time Function Comparison, Single-table Time Field Comparison, and Two-table Time Field Comparison.

        • Validity: Includes Field Format Check, Field Length Check, Field Value Range Check, Code Table Reference Comparison, and Data Standard Code Table Reference Comparison (requires the Data Standard module).

        • Consistency: Includes Single-Table Field Value Consistency Comparison, Single-Table Field Statistical Value Consistency Comparison, Single-Field Business Logic Consistency Comparison, Two-Table Field Value Consistency Comparison, Two-Table Field Statistical Value Consistency Comparison, Two-Table Field Business Logic Consistency Comparison, and Cross-Source Two-Table Field Statistical Value Consistency Comparison.

        • Stability: Includes Table Stability Check, Table Volatility Check, Field Stability Check, and Field Volatility Check.

        • Custom SQL: Includes Custom Statistical Metric Check and Custom Data Detail Check.

        For more information, see Template type description.

        Rule Type

        The rule type is determined by the template and serves as its most basic attribute, used for description and filtering.

        Monitoring Granularity

        When the configuration method is Custom SQL, you can configure the monitoring granularity. You can monitor the Full Table or a specific Field.

        Template Configuration

        Template Information

        When you select a quality rule template, its configuration information appears. To modify this information, go to Quality rule template.

        Rule Configuration

        Rule Configuration

        The rule configuration varies based on the selected rule template. For more information, see Data table parameter configuration.

        Special configurations:

        • Filter Verified Table Data: Disabled by default. If enabled, you can configure filter conditions for the verified table, such as partition filters or standard data filters. The filter conditions are appended directly to the verification SQL. If the verified table requires partition filtering, we recommend you configure a partition expression in the schedule settings. After configuration, the verification partition becomes the most granular level for viewing the quality report.

        • When the rule template is Consistency/Two-table Field Statistical Value Consistency Comparison or Consistency/Cross-source Two-table Field Statistical Value Comparison, you can choose to enable Filter Compared Table Data. If enabled, you can configure filter conditions for the comparison table, such as partition or standard data filters. These conditions are appended directly to the verification SQL.

        Verification Configuration

        Rule Verification

        • After the system verifies the data quality rule, it compares the result with the anomaly check configuration. If the conditions are met, the verification fails, which triggers alerts and other subsequent processes.

        • The available metrics for the anomaly check depend on the template and its configuration. The check supports multiple AND/OR conditions. We recommend using fewer than three conditions.

        For more information, see Verification configuration description.

        Archive Configuration

        Anomaly Archiving

        This feature is Off by default. When turned On, you can archive anomalous data to a file or table. After a quality check, you can download and analyze the archived data.

        • Archive Mode supports Archive Anomaly Fields Only and Archive Full Records.

          • Archive Anomaly Fields Only: This mode deduplicates and archives only the monitored fields, which is suitable when a single field is enough to identify the anomalous data.

          • Archive Full Records: Archives the entire record that contains the anomalous data. This is suitable when the full record is required to locate the anomalous data.

            Note

            Archiving full records creates a significantly larger volume of archived data. We recommend using the Archive Anomaly Fields Only mode in most cases.

        • Archive Location supports Default File Server and Anomaly Data Archive Table. If no anomaly archive table exists, click Manage Anomaly Archive Tables to create one. For more information, see Add an anomaly archive table.

          • Default File Server: Refers to the system file server configured during Dataphin deployment. You can later download the archived anomalous data for each check from the Verification Records page. When using the default file server, a maximum of 100 anomalous data entries are archived per check, making it suitable for scenarios with small data volumes.

          • Anomaly Data Archive Table: If you want to store more anomalous data or consolidate data from different checks for later analysis, we recommend specifying an archive table. Each quality rule can record up to 10,000 anomalous data entries per run. In addition to downloading data for a single check from the Verification Records page, you can also access the archive table directly and define a custom table lifecycle for greater flexibility.

            Note
            • You can download a summary of all anomalous data generated by all rules in the current run, up to a limit of 10,000 entries. To view more data, we recommend archiving it to a specified anomaly archive table and accessing the table directly.

            • The anomaly archive table must meet specific format requirements to prevent data writing errors and ensure proper functionality. For more information, see Add an anomaly archive table.

        Business Attribute Configuration

        Attribute Information

        The input format for business attributes depends on the quality rule attribute settings. For example, if the value type for the managing department field is an enumerated list (multi-select) with options like Big Data Department, Business Department, and Technology Department, then this attribute appears as a multi-select drop-down list with those options when you create a quality rule.

        If the value type for the rule owner field is custom input with a field length of 256, you can enter up to 256 characters for this attribute when creating the rule.

        If the input method for an attribute field is Range, configure it as follows:

        Range: Typically used for continuous numerical or date value ranges. You can select one of four operators: >, >=, <, and <=. For more information about attribute configuration, see Create and manage quality rule attributes.

        Scheduling Attribute Configuration

        Scheduling Method

        Select an existing schedule. If you have not decided on a scheduling method, you can create the quality rule first and configure it later. To create a new schedule, see Create a schedule.

        Quality score configuration

        Scoring Method

        Two scoring methods are available: quality verification status and data qualification ratio.

        • Quality Verification Status: This method assigns scores based on the verification status of the rule's last successful execution. A successful verification receives 100 points, while a failed verification receives 0 points.

        • Data Qualification Ratio: The score is the percentage of valid data from the rule's last successful execution. For example, if the data format validity is 80%, the quality score is 80.

        The supported scoring methods vary by rule template. The following templates only support the Quality Verification Status method:

        • Field Group Count Check and Field Duplicate Value Count Check in the Uniqueness category.

        • Single-table Field Statistical Value Consistency Comparison and Cross-source Two-table Field Statistical Value Comparison in the Consistency category.

        • The Stability category.

        • Custom Statistical Metric Check in the Custom SQL category.

        Quality score weight

        The weight of the rule's score, which is used to calculate the monitored object's overall quality score. You can select an integer from 1 to 10.

      3. Click OK to finish configuring the custom rule.

        You can click Preview SQL to compare the current configuration with the last saved version and view any SQL changes.

        Note
        • The Preview SQL feature is unavailable if you have not entered all key information.

        • The left pane shows a preview of the SQL from the last saved configuration. This pane is empty if you have not saved a configuration. The right pane shows a preview of the SQL for the current configuration.

  • Reference a data standard for monitoring

    1. Hover over the data table, point to Create Quality Rule, and select Reference Data Standard for Monitoring.

    2. In the Reference Data Standard for Monitoring dialog box, select the data standard rules that you want to reference. You can filter the rules by template (Validity, Uniqueness, Completeness, and Stability) or search by object name.

      In this dialog box, you can modify the rule name and enable or disable its status. You can also click a standard code to view standard details, or click the image icon in the Actions column to view the quality rule.

      Note

      After a rule is referenced, you cannot modify its detailed configuration. However, you can configure its schedule and change settings like the rule strength.

    3. Click Add Selected Rules to finish referencing the data standard rules.

Rule configuration list

After a quality rule is created, you can view, edit, dry run, run, and delete it in the rule configuration list.

image

Area

Description

Filter and search area

Allows you to search by object or rule name.

You can filter by rule type, rule template, rule strength, dry run status, effective status, and rule source.

Note

If a business attribute for a quality rule is configured to be searchable or filterable and is enabled, you can search or filter by that attribute.

List area

Displays the object type/name, rule name/ID, dry run status, effective status, rule type, rule template, rule strength, schedule type, and related knowledge base document information. Click the image icon before the Refresh button to select the columns to display in the list.

  • If the rule references a data standard, you can click the image icon next to the rule name to view the standard details.

  • If the rule is a standard constraint for a development field, it is marked with an image icon after the rule name.

  • Effective Status: We recommend performing a dry run before enabling a rule. Enable a rule only after a successful dry run to avoid blocking online tasks.

    • After you enable a rule, it runs automatically based on its configured schedule.

    • After you disable a rule, it does not run automatically, but you can still run it manually.

  • Related knowledge base documents: Click View Details to see the knowledge base information associated with the rule. This includes the table name, verification object, rule, and related documents. You can also search for, view, edit, and delete knowledge base documents. For more information, see View a knowledge base.

Operations area

You can view, clone, edit, dry run, run, configure a schedule, associate a knowledge base document, configure a quality score, and delete a rule.

  • View: View the rule configuration details.

  • Clone: Quickly duplicate a rule.

  • Edit: After you edit a rule, you must dry run it again. The editable information varies depending on how the quality rule was created. For more information, see Quality rule editing instructions.

  • Dry Run: You can dry run the rule by using an Existing Schedule or a Custom Verification Scope. After the dry run, you can click the image icon to View Dry Run Log.

  • Run: You can run the rule by using an Existing Schedule or a Custom Verification Scope. After the run, you can view the results in Verification Records.

  • Configure Schedule: In the dialog box, you can filter by schedule type or search by schedule name. You can also edit the schedule.

  • Associate Knowledge Base Document: After a rule is associated with a knowledge document, you can view the associated knowledge in the Quality Rules and Governance Workbench. You can select an unassociated knowledge document to create an association. To create a new document, see Create and manage a knowledge base.

  • Quality score configuration: Modify the scoring method and quality score weight for the quality rule.

    Important

    The quality score weight is used to calculate the overall quality score of the monitored object. Modifying it affects the quality score result. Proceed with caution.

  • Delete: Deleting this quality rule object also deletes all quality rules under it. This action cannot be undone. Proceed with caution.

Batch operations area

You can perform batch operations, including dry run, run, configure schedule, enable, disable, modify business attributes, associate knowledge base documents, configure quality scores, export rules, and delete.

  • Dry Run: Dry run multiple rules in a batch by using an Existing Schedule or a Custom Verification Scope. After the dry run, you can click the image icon to View Dry Run Log.

  • Run: Run multiple rules in a batch by using an Existing Schedule or a Custom Verification Scope. After the run, you can view the results in Verification Records.

    Note

    When running rules in a batch, we recommend selecting tables with the same partition. The partition information is passed directly for execution. If partitions differ, errors may occur.

  • Configure Schedule: Configure schedules for multiple quality rules in a batch. In the dialog box, you can filter by schedule type or search by schedule name. You can also edit schedules. You can only modify selected rules that are editable on the Quality Rules list page.

  • Enable: After you enable multiple rules, they run automatically based on their configured schedules. You can only enable selected rules that are editable on the Quality Rules list page.

  • Disable: After you disable multiple rules, they no longer run automatically, but you can still run them manually. You can only disable selected rules that are editable on the Quality Rules list page.

  • Modify Business Attributes: You can modify business attributes in a batch if the corresponding attribute field is a single-select or multi-select type.

    • If the value type is multi-select, you can append or modify attribute values.

    • If the value type is single-select, you can directly modify the attribute value.

  • Associate Knowledge Base Document: After a rule is associated with a knowledge document, you can view the associated knowledge in the Quality Rules and Governance Workbench. You can associate knowledge documents with monitored objects in a batch. To create a new document, see Create and manage a knowledge base.

  • Quality score configuration: You can modify the scoring method and quality score weight for multiple rules in a batch.

    Important

    The quality score weight is used to calculate the overall quality score of the monitored object. Modifying it affects the quality score result. Proceed with caution.

  • Export Rules: Export the selected custom SQL quality rules under the current monitored object that you have permission to view.

  • Delete: You can delete quality rule objects in a batch. This action cannot be undone, so proceed with caution. You can only delete rules you have permission to edit.

Create a schedule

Note
  • When configuring a schedule for a quality rule, you can quickly base it on an existing schedule. A maximum of 20 schedules can be configured per table.

  • A single quality rule can have up to 10 schedules.

  • Schedules with identical configurations are automatically deduplicated.

  • For a Hologres partitioned table, it is recommended to use a schedule triggered by a fixed task.

  • The check scope defines the data to be checked by serving as a filter condition in the quality check statement. Downstream components, such as quality reports, also use the check scope as their basic unit of granularity.

  1. On the Quality Rule Details page, click the Scheduling Configuration tab, and then click Create Schedule to open the Create Schedule dialog box.

  2. In the Create Schedule dialog box, configure the parameters.

    Parameter

    Description

    Schedule name

    A custom name for the schedule, up to 64 characters in length.

    Schedule type

    Supports time-based schedule, data update-triggered schedule, and schedule triggered by fixed task.

    • Time-based schedule: Runs quality checks periodically based on the configured time. This is suitable for scenarios where data is generated at relatively fixed times.

      • Scheduling cycle: Running a quality rule consumes computing resources. Avoid running multiple quality rules concurrently to prevent disrupting production tasks. The scheduling cycle supports five types: Day, Week, Month, Hour, and Minute.

        If the system time zone (the time zone in the user center) is different from the schedule time zone (the time zone configured in Management Center > System Settings > Basic Settings), the rule runs based on the system time zone.

      • Fill in Recommended Time: You can click Fill in Recommended Time, and the system will recommend a time based on the average completion time of the task that generates the current table.

    • Data update-triggered schedule: When a code-based task runs, the system checks if its execution updates the data within the check scope of the current table. This schedule type is ideal for tables modified by various tasks.

      Note

      We recommend setting the check scope to the partitions updated by the task. For a non-partitioned table, the quality check runs on the full table. The system automatically identifies and checks all data changes to ensure no changes are missed.

    • Schedule triggered by fixed task: Runs the configured quality rule after a specified task completes successfully or before it starts. You can select tasks of the following engine types to trigger the schedule: SQL, offline pipeline, Python, Shell, Virtual, Dlink, and database SQL nodes. This is suitable for scenarios where the table is modified by a fixed task.

      Note
      • You can only select tasks from the production environment to trigger the schedule. If you set a strong rule and the scheduled quality check fails, it might affect online tasks. Use this feature with caution based on your business needs.

      • Supported engine types: MaxCompute.

      • Trigger timing: Specifies when the quality check is triggered. You can select Trigger after all tasks are successfully run, Trigger after each task is successfully run, or Trigger before each task is run.

      • Trigger task: The following roles can select a task node from a production project as the trigger task. You can search by node output name or select from the Recommended Tasks or All Tasks lists.

        • The following roles can select task nodes from the production project: project administrators of Prod/Basic projects, operations system roles of Prod projects, developer system roles of Basic projects, and custom project roles with Project Quality Management-Quality Rule Management permissions in Prod/Basic projects.

        • Custom global roles with Quality Rule-Management permissions in Prod/Basic projects can select task nodes from all production projects.

        • Recommended Tasks: Displays lineage tasks where the current table is an output, and tasks where the node output name matches the format plate_name/project_name.table_name. This corresponds to the tasks shown in Asset Catalog-Asset Details-Output Information.

        • All Tasks: Displays all production tasks for which the current user has operations permissions.

          Note
          • If you set the trigger timing to 'Trigger after all tasks are successfully run', we recommend selecting trigger tasks that have the same scheduling cycle. This helps prevent rule execution delays and ensures timely quality check results.

          • If you set the trigger timing to 'Trigger before each task is run', the Recommended Tasks list shows lineage tasks where the current table is an input.

    Scheduling condition

    When enabled, the system checks if the scheduling condition is met before executing the schedule. The schedule runs only if the condition is met; otherwise, the system skips the current execution.

    • Business date/Execution date: If the schedule type is time-based schedule (which does not support execution date), data update-triggered schedule, or schedule triggered by fixed task, you can configure a date condition. You can choose a General Calendar or a Custom Calendar. To learn how to create a custom calendar, see Create and manage public calendars.

      • If you select General Calendar, you can set conditions based on Month, Day of week, or Date. For example:

        image

      • If you select Custom Calendar, you can set conditions based on Date Type or Tag. For example:

        image

    • Instance type: If the schedule type is data update-triggered schedule or schedule triggered by fixed task, you can configure the instance type. Options are scheduled instance, backfill instance, and manual instance. For example:

      image

    Note
    • You must configure at least one condition. To add a condition, click the +Add Rule button.

    • You can configure a maximum of 10 scheduling conditions.

    • The logical relationship between scheduling conditions can be set to AND or OR.

    Check scope

    If the schedule type is time-based schedule or schedule triggered by fixed task, you can select Custom check scope. If the schedule type is data update-triggered schedule, you can select Partitions updated by task

    or Custom check scope.

    • Partitions updated by task: If the task updates partitions, the quality check runs on those updated partitions.

      Note
      • In dynamic partition scenarios, the system might not be able to resolve partitions, and no quality check will be performed.

      • Volatility check rules (such as checking partition size, row count, or field statistics) require a specific partition and do not support the 'Partitions updated by task' check scope.

      • For non-partitioned tables, a quality check is performed on the full table if any data is updated.

    • Custom check scope: For scenarios where partitions cannot be resolved, you can use a custom check scope and specify the partition expression based on the business date or execution date.

      • Check scope expression: An editable drop-down list where you can enter the check scope, such as ds='${yyyyMMdd}'. You can also select and modify a built-in partition expression to help you configure it quickly. For details about partition expressions, see Built-in partition expression types.

        Note
        • If there are multiple conditions, you can connect them with and or or, such as province="Zhejiang" and ds<=${yyyyMMdd}.

        • A filter condition configured in the quality rule is combined with the check scope expression using AND. During the quality check, the system applies both conditions to filter the data.

        • The check scope expression supports a full table scan.

          Note: A full table scan consumes significant resources and is not supported by all engines. Configure a partition expression to avoid full table scans.

      • Check scope preview: By default, the preview is for the current business date.

  3. Click OK to complete the scheduling configuration.

Scheduling configuration list

After creating a schedule, you can view, edit, clone, and delete it in the scheduling configuration list.

image

Area

Description

Filter and search area

Allows you to search by schedule name.

You can filter by time-based schedule, data update-triggered schedule, and schedule triggered by fixed task.

List area

Displays the Schedule name, Schedule type, Last modified by, and Last modified time for each scheduling configuration.

Operations area

You can edit, clone, or delete a schedule.

  • Edit: Lets you modify the configured schedule information.

    Important

    Editing this schedule affects all quality rules that use it. Proceed with caution.

  • Clone: Quickly duplicate a schedule.

  • Delete: You cannot delete a schedule that is currently used by a quality rule.

Alert configuration

Configure separate alert settings for different rules to distinguish notifications. For example, use phone alerts for strong-rule anomalies and SMS alerts for weak-rule anomalies. If a rule matches multiple alert configurations, set an effective policy to determine which one applies.

Note

You can create a maximum of 20 alert configurations for each monitored object.

  1. On the Quality Rule Details page, click the Alert configuration tab, and then click Create Alert Configuration. The Create Alert Configuration dialog box appears.

  2. In the Create Alert Configuration dialog box, configure the parameters.

    Parameter

    Description

    Scope

    Select All Rules, All Strong Rules, All Weak Rules, or Custom.

    Note
    • For each monitored object, you can create one alert configuration for each of these three scopes: All Rules, All Strong Rules, and All Weak Rules. The system automatically matches new rules to the corresponding alert configuration based on their strength. To change one of these configurations, you must edit the existing one.

    • The Custom scope lets you select up to 200 existing rules configured for the current monitored object.

    Alert configuration name

    The name must be unique for each monitored object and cannot exceed 256 characters.

    Alert recipient

    Configure the alert recipients and notification methods. You must select at least one alert recipient and one notification method.

    • Alert recipient: You can select from three types of alert recipients: Custom, On-call Schedule, and Quality Owner.

      You can configure up to 5 custom alert recipients and up to 3 on-call schedules.

    • Notification method: Supported methods include phone, email, SMS, DingTalk.

  3. Click OK to save the alert configuration.

Alert configuration list

After you create an alert configuration, you can sort, edit, or delete it in the list.

image

No.

Description

① Sorting area

Configure the effective policy that applies when a quality rule matches multiple alert configurations:

  • The first matched alert configuration takes effect: When you select this policy, only the first matching alert configuration applies, and all others are ignored. You can reorder the alert configurations to set their priority. Click Sort Rules. You can then drag the image.png icon next to an alert configuration name or use the icons in the Operations column to move it. From left to right, the icons are: Move to Top and Move to Bottom. After reordering, click Finish Sorting to save the new order.

    image

  • All alert configurations take effect: All alert configurations in the list apply to the quality rules for the current monitored object.

    For example, if you have multiple configurations and select this option, the system merges alerts based on the notification method, alert recipient, and quality rule. Specifically, if a custom recipient and a quality owner are the same person, their alerts are merged according to the alert merging policy.

    Note

    Alert merging is not supported for on-call schedules.

② List area

Displays the name, effective scope, specific recipients, and notification method for each alert configuration.

Effective scope: For alert configurations with a custom scope, you can view the names of the configured monitored objects and rules. If a rule is deleted, its name is no longer displayed. In this case, update the alert configuration.

③ Operations area

You can edit or delete your alert configurations.

  • Edit: Modify the settings of an existing alert configuration. If you change the alert recipients or notification methods, notify the affected personnel to avoid missed business alerts.

  • Delete: After you delete an alert configuration, it no longer applies to its associated rules. Proceed with caution.

Add an exception archive table

An exception archive table stores records that fail quality rule checks, enabling later analysis.

  1. On the Quality Rule Details page, click the Exception Archiving tab, and click + Add Exception Archive Table to open the Add Exception Archive Table dialog box.

  2. In the Add Exception Archive Table dialog box, configure the parameters.

    The Method supports two options: Create table and Select existing table. Special quality check fields are added, and archived abnormal data is not written to the original data table.

    • Create Table: You can specify a custom table name. The table must be in the same project or workspace as the monitored table. By default, the table name is current_table_name_exception_data. The table is created in the same database or data source. The name can contain letters, digits, underscores (_), and periods (.), and can be up to 128 characters long.

      • If the monitored table is a physical table, the exception archive table is created in the same project as the monitored table.

      • If the monitored table is a dimension logical table or a fact logical table, the exception archive table is created in the same project by default. You can also manually specify a project within the monitored table's workspace, such as projectA.table_name.

      • If the monitored table is a summary logical table, specify a project name within the same workspace for the exception archive table. Otherwise, the system automatically archives the table to a project in the monitored table's workspace.

      • The exception archive table must contain all fields from the monitored table, and the required quality check fields. The script format is as follows:

        create table Current_table_name_exception_data
         (dataphin_quality_tenant_id      varchar(64)   comment 'Tenant ID' , 
          dataphin_quality_rule_id        varchar(64)   comment 'Quality rule ID', 
          dataphin_quality_rule_name      varchar(256)  comment 'Quality rule name', 
          dataphin_quality_column_name    varchar(1024) comment 'Check field name', 
          dataphin_quality_watch_task_id  varchar(128)  comment 'Monitored object task ID', 
          dataphin_quality_rule_task_id   varchar(64)   comment 'Rule task ID', 
          dataphin_quality_validate_time  varchar(64)   comment 'Quality check time', 
          dataphin_quality_archive_mode   varchar(32)   comment 'Exception archive mode, ONLY_ERROR_FIELD/FULL_RECORD', 
          dataphin_quality_error_data     string        comment 'Exception data', 
          ljba_id                         bigint        comment  'ljba_primary key', 
          ljb_id                          bigint        comment  'ljb_primary key', 
          col_tinyint                     tinyint       comment 'Data type is TINYINT and in lowercase',
          col_tinyint_02                  tinyint       comment '2',
          col_smallint                    smallint      comment 'Data type is SMALLINT and in lowercase',
          col_smallint_02                 smallint      comment '4',
          col_int                         int           comment 'Data type is INT and in lowercase',
          col_int_02                      int           comment '6',
          col_bigint                      bigint        comment 'Data type is BIGINT and in lowercase',
          col_bigint_02                   bigint        comment '8',
          col_float                       float         comment 'Data type is FLOAT and in lowercase',
          col_float_02                    float         comment '10',
          col_double                      double        comment 'Data type is DOUBLE and in lowercase',
          col_double_02                   double        comment '11',
          col_decimal                     decimal(38,18) comment 'Data type is DECIMAL(38,18) and in lowercase',
          col_decimal_02                  decimal(38,18) comment '12',
          col_varchar                     varchar(500)   comment 'Data type is VARCHAR(500) and in lowercase',
          col_varchar_02                  varchar(500)   comment '13',
          col_char                        char(10)       comment 'Data type is CHAR(10) and in lowercase',
          col_char_02                     char(10)       comment '14',
          col_string                      string         comment 'Data type is STRING and in lowercase',
          col_string_02                   string         comment '15',
          col_date                        date           comment 'Data type is DATE and in lowercase',
          col_date_02                     date           comment '16',
          col_datetime                    datetime       comment 'Data type is DATETIME and in lowercase',
          col_datetime_02                 datetime       comment '17',
          col_timestmap                   timestamp      comment 'Data type is TIMESTAMP and in lowercase',
          col_timestmap_02                timestamp      comment '18',
          col_boolean                     boolean        comment 'Data type is BOOLEAN and in lowercase',
          col_boolean_02                  boolean        comment '19',
          col_binary                      binary         comment 'Data type is BINARY and in lowercase',
          col_binary_02                   binary         comment '20',
          col_array                       array<int>     comment 'Data type is ARRAY<int> and in lowercase',
          col_array_02                    array<string>  comment '21',
          col_map                         map<string,string>  comment 'Data type is MAP<string, string> and in lowercase',
          col_map_02                      map<string,int>     comment '22',
          ds                              string              comment 'Date partition, yyyyMMdd'
         ) 
        partitioned by 
        (dataphin_quality_validate_date string comment 'Check date (partition field)');
    • Select Existing Table: You can select a table from the same project or data source. The exception archive table must contain all fields from the monitored table, and the required quality check fields. You can click View Exception Archive Table DDL to view the DDL statement. The script format is as follows:

      create table Current_table_name_exception_data
       (dataphin_quality_tenant_id      varchar(64)   comment 'Tenant ID' , 
        dataphin_quality_rule_id        varchar(64)   comment 'Quality rule ID', 
        dataphin_quality_rule_name      varchar(256)  comment 'Quality rule name', 
        dataphin_quality_column_name    varchar(1024) comment 'Check field name', 
        dataphin_quality_watch_task_id  varchar(128)  comment 'Monitored object task ID', 
        dataphin_quality_rule_task_id   varchar(64)   comment 'Rule task ID', 
        dataphin_quality_validate_time  varchar(64)   comment 'Quality check time', 
        dataphin_quality_archive_mode   varchar(32)   comment 'Exception archive mode, ONLY_ERROR_FIELD/FULL_RECORD', 
        dataphin_quality_error_data     string        comment 'Exception data', 
        ljba_id                         bigint        comment  'ljba_primary key', 
        ljb_id                          bigint        comment  'ljb_primary key', 
        col_tinyint                     tinyint       comment 'Data type is TINYINT and in lowercase',
        col_tinyint_02                  tinyint       comment '2',
        col_smallint                    smallint      comment 'Data type is SMALLINT and in lowercase',
        col_smallint_02                 smallint      comment '4',
        col_int                         int           comment 'Data type is INT and in lowercase',
        col_int_02                      int           comment '6',
        col_bigint                      bigint        comment 'Data type is BIGINT and in lowercase',
        col_bigint_02                   bigint        comment '8',
        col_float                       float         comment 'Data type is FLOAT and in lowercase',
        col_float_02                    float         comment '10',
        col_double                      double        comment 'Data type is DOUBLE and in lowercase',
        col_double_02                   double        comment '11',
        col_decimal                     decimal(38,18) comment 'Data type is DECIMAL(38,18) and in lowercase',
        col_decimal_02                  decimal(38,18) comment '12',
        col_varchar                     varchar(500)   comment 'Data type is VARCHAR(500) and in lowercase',
        col_varchar_02                  varchar(500)   comment '13',
        col_char                        char(10)       comment 'Data type is CHAR(10) and in lowercase',
        col_char_02                     char(10)       comment '14',
        col_string                      string         comment 'Data type is STRING and in lowercase',
        col_string_02                   string         comment '15',
        col_date                        date           comment 'Data type is DATE and in lowercase',
        col_date_02                     date           comment '16',
        col_datetime                    datetime       comment 'Data type is DATETIME and in lowercase',
        col_datetime_02                 datetime       comment '17',
        col_timestmap                   timestamp      comment 'Data type is TIMESTAMP and in lowercase',
        col_timestmap_02                timestamp      comment '18',
        col_boolean                     boolean        comment 'Data type is BOOLEAN and in lowercase',
        col_boolean_02                  boolean        comment '19',
        col_binary                      binary         comment 'Data type is BINARY and in lowercase',
        col_binary_02                   binary         comment '20',
        col_array                       array<int>     comment 'Data type is ARRAY<int> and in lowercase',
        col_array_02                    array<string>  comment '21',
        col_map                         map<string,string>  comment 'Data type is MAP<string, string> and in lowercase',
        col_map_02                      map<string,int>     comment '22',
        ds                              string              comment 'Date partition, yyyyMMdd'
       ) 
      partitioned by 
      (dataphin_quality_validate_date string comment 'Check date (partition field)');
  3. Click OK to add the exception archive table.

    You can select the Set as active archive table after creation checkbox. This ensures the table is automatically selected as the destination for exception data when you create new quality rules.

Exception archive table list

After you add an exception archive table, the system sets it as the active archive table by default. You can click the name of an exception archive table to view its table structure. You can also perform the following operations on other tables in the list.

  • Set as active archive table: When you set a table as the active archive table, any of the object's quality rules configured for custom archiving will send their exception data to this table.

  • Delete: This action removes the reference to the exception archive table but does not delete the physical table. You can re-add the reference later if needed.

Quality report

Click quality report to view the rule check overview and rule check details for the current quality rule.

  • You can quickly filter the rule check details by abnormal results, partition time, or keywords in rule or object names.

  • In the rule check details list, click the image icon in the Actions column to view the rule check details for the rule.

  • In the rule check details list, click the image icon in the Actions column to view the execution log for the rule.

Permission management for quality rules

  1. Click Permission Management to configure the View Details setting. This setting controls who can view the details of validation records, quality rules, and quality reports.

    View Details: You can select All members or Only members with quality management permissions for the current object.

  2. Click OK to save the permission management configuration.

Next steps

After you configure the quality rules, you can view them on the Dataphin Table Rules page. For more information, see or Manage a list of monitoring objects.

上一篇: Create quality rules for a single monitored object 下一篇: Create a global table quality rule
阿里云首页 智能数据建设与治理 Dataphin 相关技术圈