View and manage baseline instances

Timeliness and accuracy are the core quality metrics for data services. You can define scenarios to ensure content quality and define baselines to ensure timeliness based on the importance of your data services. When you create a baseline, you identify the nodes or assets to protect and set a commitment time for them. This ensures timely data output and prioritizes resource allocation. A baseline is a group of nodes or assets with dependencies. This group is prioritized for cluster resources and is protected by strong alert guarantees. This topic describes how to view and manage baseline instances.

Prerequisites

You have purchased the Artificial Intelligence for IT Operations (AIOps) value-added service.

Baseline instance generation time

Baseline instances for the next day are generated between 23:00 and 00:00, after the node instances are generated.

Access a baseline instance

  1. On the Dataphin home page, click Develop in the top menu bar.

  2. Follow the path shown in the following figure to go to the Baseline Instances page.

    image.png

Baseline Instances page

The Baseline Instances page includes a search and filter area and a list of baseline instances. On this page, you can search for and filter baseline instances, refresh the list, view details, view Gantt charts, and set alerts.

image.png

Parameter

Description

Search and filter area

Search for baseline instances by baseline name. You can also use the following quick filters: Data Timestamp (Yesterday), My baselines, Broken baselines, and Unfinished baselines.

  • Data Timestamp: Allows you to quickly filter for baseline instances from the previous day.

  • My baselines: Filters for baseline instances where the current account is the owner.

  • Broken baselines: Filters for baseline instances with a status of Broken.

  • Unfinished baselines: Filters for baseline instances with a run status of Unfinished.

Filter area

Click the image icon to expand the filter area. The filter options include the following:

  • Baseline Status:

    • Safe: The (estimated) completion time of the baseline is on or before the alert time.

    • Warning: The (estimated) completion time of the baseline is after the alert time but on or before the commitment time.

    • Broken: The (estimated) completion time of the baseline is after the commitment time.

    • Other: All instances in the baseline are paused, or it is an empty baseline with no guaranteed nodes (an empty instance generated from an empty baseline).

  • Is Completed: Filters by whether the baseline is completed. You can select Yes or No.

  • Priority: Filters by baseline priority. You can select Highest or High.

  • Baseline Type: Filters by baseline type. You can select Daily Baseline or Empty Baseline.

    Note

    You cannot create empty baselines. If all guaranteed nodes configured for a baseline are unpublished, the baseline automatically becomes an empty baseline.

  • Baseline Owner: Filters by baseline owner.

  • Data Timestamp: Use this to filter results by the data timestamp of the baseline. You can select Today, Yesterday, or Custom Date.

Baseline instance list

Displays information about baseline instances, including Baseline Name, Baseline Status, Is Completed, Alert/Guaranteed Baseline Time, Alert Margin, Latest Instance, Priority, Baseline Type, Owner, and available operations.

  • Baseline instance details: Click the image icon to view the baseline instance details. For more information, see View baseline instance details.

  • Critical path Gantt chart: Click the image icon to view the critical path Gantt chart. For more information, see View the critical path Gantt chart.

  • Related alerts: Click the image icon to view alerts related to the baseline instance. For more information, see Alert events.
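
The Baseline Status rules listed in the filter area can be sketched as a small function. This is only an illustration of the comparisons described on this page, not Dataphin's implementation; the function name `baseline_status` and its parameters are hypothetical, and passing `None` for the completion time is an assumed way to represent the Other case.

```python
from datetime import datetime
from typing import Optional


def baseline_status(
    completion: Optional[datetime],
    alert_time: datetime,
    commitment_time: datetime,
) -> str:
    """Classify a baseline instance from its (estimated) completion time.

    `completion` is None for the "Other" case: all instances are paused,
    or the baseline is empty and has no guaranteed nodes (assumption).
    """
    if completion is None:
        return "Other"
    if completion <= alert_time:
        return "Safe"      # on or before the alert time
    if completion <= commitment_time:
        return "Warning"   # after the alert time, on or before the commitment time
    return "Broken"        # after the commitment time


alert = datetime(2024, 1, 2, 7, 30)
commit = datetime(2024, 1, 2, 8, 0)
print(baseline_status(datetime(2024, 1, 2, 7, 45), alert, commit))
```

With an alert time of 07:30 and a commitment time of 08:00, an estimated completion at 07:45 falls between the two and is classified as Warning.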

View baseline instance details

The baseline instance details page includes the following information: Basic Information, (Estimated) Latest Instance, Current Critical Instance, Critical Path, Historical Runtime Trend, and Alert Events.

Area

Description

Basic Information

Basic information about the baseline instance. Includes Baseline Name, Owner, Priority, Baseline Type, Guaranteed Time, and Alert Time.

image.png

(Estimated) Latest Instance

image.png

The last node on the critical path.

Current Critical Instance

image.png

The highest-level instance on the critical path that has not run successfully.

Critical Path

image.png

Among the multiple paths that affect the baseline's guaranteed nodes, the critical path is the one with the longest runtime. For each entry on the path, this area shows the protected object name, protection details (logical table node), scheduled/estimated output time, project/module, priority, and owner.

You can also open the critical path Gantt chart from this area. For more information, see View the critical path Gantt chart.

Historical Runtime Trend

image.png

A trend graph of the baseline instance's historical runtime.

  • Horizontal axis: Data Timestamp.

  • Vertical axis: Output Time.

  • Data variables: Includes Commitment Time, Alert Time, Average Output Time, and Actual Output Time for each data timestamp.

    • Commitment Time: The latest time by which the baseline nodes are committed to finish running successfully. It is marked by a red dashed line (image.png).

    • Alert Time: This is equal to Commitment Time - Alert Margin. It is marked by an orange dashed line (image.png).

    • Average Output Time: The average of the actual output times on historical data timestamps. It is marked by a green dashed line (image.png).

    • Actual Output Time for each data timestamp: The time when the baseline node ran successfully. It is marked by a blue solid line (image.png).

Note
  • If different data variables have the same value, they are displayed in the following order of priority: Actual Output Time > Commitment Time > Alert Time > Average Output Time.

  • In the historical runtime trend graph, dark dots mark the earliest and latest completion times of the instance's historical runs.
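
The Alert Time formula and the display priority for overlapping values can be illustrated as follows. This is a minimal sketch under stated assumptions: the names `alert_time`, `visible_series`, and `DISPLAY_PRIORITY` are hypothetical, and the code only mirrors the rules given in this section rather than how the chart is actually rendered.

```python
from datetime import datetime, timedelta

# Display order when several data variables share the same value
# (highest priority first), as stated in the note above.
DISPLAY_PRIORITY = [
    "Actual Output Time",
    "Commitment Time",
    "Alert Time",
    "Average Output Time",
]


def alert_time(commitment_time: datetime, alert_margin: timedelta) -> datetime:
    """Alert Time = Commitment Time - Alert Margin."""
    return commitment_time - alert_margin


def visible_series(values: dict) -> dict:
    """Map each distinct value to the highest-priority variable that has it."""
    shown = {}
    for name in DISPLAY_PRIORITY:
        if name in values:
            # setdefault keeps the first (highest-priority) name per value.
            shown.setdefault(values[name], name)
    return shown


commit = datetime(2024, 1, 2, 8, 0)
alert = alert_time(commit, timedelta(minutes=30))
shown = visible_series({
    "Commitment Time": commit,
    "Alert Time": alert,
    "Actual Output Time": alert,  # same value as Alert Time
})
print(shown[alert])
```

In this example the actual output time coincides with the alert time (07:30), so only Actual Output Time is shown at that value.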

Alert Events

image.png

Records of alert events triggered by the current baseline instance. Includes Alerting Node, Alert Cause, Last Alert Time, Recipient, Current Status, and Operation.

  • Alerting Node: Includes event alerting nodes and baseline alerting nodes.

  • Alert Cause: Includes Baseline Warning, Baseline Broken, Node Running Slow, and Node Error.

  • Last Alert Time: The last time an alert message was sent for this event.

  • Recipient: The recipient of the alert for this event.

  • Operation: Click image.png to go to the Alert Center and view the details of the current alert event.

View the critical path Gantt chart

The critical path is the path with the longest duration in a baseline instance. It represents the shortest possible time to complete the baseline instance. The critical path Gantt chart displays the run status of nodes in the baseline instance and their execution time. This chart helps you understand the completion progress of the baseline instance and identify key nodes for effective planning and resource allocation.
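
To make the idea of "the path with the longest duration" concrete, the following sketch finds the longest path that ends at a guaranteed node in a dependency graph. It is an illustration only: the function `critical_path`, the node names, and the runtimes are hypothetical, the graph is assumed to be acyclic, and waiting for scheduled time or throttling is ignored. It should not be read as the algorithm Dataphin uses.

```python
from functools import lru_cache


def critical_path(runtime: dict, upstream: dict, target: str) -> tuple:
    """Return (total_runtime, path) for the longest path ending at `target`.

    runtime:  node name -> runtime in minutes
    upstream: node name -> list of direct upstream node names
    """

    @lru_cache(maxsize=None)
    def longest(node: str) -> tuple:
        best_total, best_path = 0, ()
        for parent in upstream.get(node, []):
            total, path = longest(parent)
            # Keep the upstream branch with the largest accumulated runtime.
            if not best_path or total > best_total:
                best_total, best_path = total, path
        return best_total + runtime[node], best_path + (node,)

    return longest(target)


# Hypothetical baseline: node "d" is the guaranteed node.
runtime = {"a": 10, "b": 30, "c": 5, "d": 20}
upstream = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
print(critical_path(runtime, upstream, "d"))
```

Here the branch through `b` (10 + 30 + 20 = 60 minutes) is longer than the branch through `c` (35 minutes), so `a`, `b`, `d` is the critical path, and 60 minutes is the shortest time in which the baseline can complete.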

image.png

Area

Description

Basic Information

Basic information for the critical path Gantt chart. Includes Baseline Name, Owner, Baseline Status, Commitment Time, Alert Time, and Margin.

You can also adjust the display granularity of the Gantt chart by changing the time interval. You can select 10 minutes or 30 minutes.

Note

If the runtime is short, select the 10-minute interval for a clearer view of the run status.

Critical path Gantt chart

Contains the list of critical path nodes for the current baseline instance, along with the progress time and status information for the instance nodes.

image.png

  • Node list order: Nodes are sorted from top to bottom, starting from the first node on the critical path to the last.

  • Progress time: The chart shows the run progress of the instance nodes on the baseline. The timeline starts when all upstream dependencies of the first instance node on the critical path have run successfully and that node is waiting for its scheduled time or is in another state.

  • Historical average runtime and instance run status transition time ranges:

    • Historical average runtime: Shows the historical average runtime of the current instance node. You can use this as a reference to determine if the current runtime is abnormal. This time range is marked by a light green bar (image.png). You can move the mouse pointer over the bar to view the Recent Mean.

    • Instance run status transition markers: Includes the following time ranges:

      • Time before all upstream nodes run successfully: No node marker is displayed when the current instance node is waiting for all upstream nodes on the critical path to run successfully.

      • Time from when all upstream nodes run successfully to the scheduled time: When all upstream instance nodes on the critical path have run successfully but the current business time has not yet reached the instance's scheduled time, this time range is marked by a purple dashed line and the text Waiting for Time (image.png). You can move the mouse pointer over the marker to view the Total Duration.

      • Time from the scheduled time to when throttling is passed: When all upstream instance nodes on the critical path have run successfully and the scheduled time has been reached, but the instance is being throttled, this time range is marked by a purple dashed line and the text Waiting for Dispatch (image.png). You can move the mouse pointer over the marker to view the Total Duration.

      • Time from when throttling is passed to when the instance is submitted to the compute engine: When all upstream instance nodes on the critical path have run successfully, the scheduled time has been reached, and throttling is passed, this time range is marked by a green bar (image.png). You can move the mouse pointer over the bar to view the instance's Task ID, Owner, Project, Run Status, and Duration.

      • Time from when the operation is paused to when it is resumed: From the time you pause the instance node until you resume it, this time range is marked by a purple dashed line and the text Paused (image.png). You can move the mouse pointer over the marker to view the Total Duration.

        Important
        • If the instance node triggers a "running slow" alert, the system marks the corresponding time point with a small yellow dot. You can move the mouse pointer over the dot to view the Data Timestamp, Node Name (Node ID), Cause, and Occurrence Time.

        • If the instance node triggers a "running error" alert, the system marks the corresponding time point with a small red dot. You can move the mouse pointer over the dot to view the Data Timestamp, Node Name (Node ID), Cause, and Occurrence Time.

        • The alert time is marked by a yellow solid line at the corresponding time point.

        • The broken time is marked by a red solid line at the corresponding time point.

        • If the alert time and the broken time are at the same point, the alert time is overwritten, and only the broken time is marked.
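
The status transition ranges described above can be sketched as a function that splits one instance's timeline into labelled segments. This is a simplified illustration under stated assumptions: the function `gantt_segments` and its four timestamp parameters are hypothetical, the label "Green bar" simply stands for the range drawn as a green bar, and the Paused range and alert markers are left out.

```python
from datetime import datetime
from typing import List, Tuple

Segment = Tuple[str, datetime, datetime]


def gantt_segments(
    upstream_done: datetime,  # all upstream nodes ran successfully
    scheduled: datetime,      # the instance's scheduled time
    dispatched: datetime,     # throttling is passed
    bar_end: datetime,        # end of the range shown as a green bar
) -> List[Segment]:
    """Split one instance's timeline into the ranges shown in the chart."""
    segments: List[Segment] = []
    # Nothing is drawn before all upstream nodes have run successfully.
    if upstream_done < scheduled:
        segments.append(("Waiting for Time", upstream_done, scheduled))
    wait_end = max(upstream_done, scheduled)
    if wait_end < dispatched:
        segments.append(("Waiting for Dispatch", wait_end, dispatched))
    if dispatched < bar_end:
        segments.append(("Green bar", dispatched, bar_end))
    return segments


day = (2024, 1, 2)
segments = gantt_segments(
    datetime(*day, 1, 0),
    datetime(*day, 2, 0),
    datetime(*day, 2, 10),
    datetime(*day, 2, 40),
)
print([label for label, _, _ in segments])
```

In this example the upstream nodes finish at 01:00, one hour before the scheduled time, so the timeline contains a Waiting for Time range, a 10-minute Waiting for Dispatch range, and then the green bar.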