To ensure that your critical tasks finish on time, you can add them to a baseline and set a committed finish time. The system calculates an estimated finish time for baseline tasks based on their run history. If the system predicts that a task may not complete before the committed finish time, it sends an alert. This topic describes how to create and manage baselines.
Background
Smart Baseline detects exceptions that prevent tasks from completing on time and sends alerts in advance. This ensures that important data is generated within the expected timeframe, especially in scenarios with complex dependencies. For more information, see Smart Baseline overview.
After a baseline is created and enabled, it takes effect the next day. You can then go to the Auto Triggered Instances page to view the baseline's execution status.
Limitations
-
Version requirements:
The baseline management feature is available only in DataWorks Standard Edition and later. If you are using an earlier version, upgrade your DataWorks instance before you use this feature. For more information, see Features of DataWorks editions.
-
Permission control:
-
Only an Alibaba Cloud account or a RAM user with the workspace administrator or tenant administrator role can create baselines.
-
Only the tenant administrator and the baseline owner can enable, disable, delete, or modify a baseline.
To grant a user these permissions, assign the required roles to the user. For more information, see Add workspace members and assign roles.
-
Alert notification methods:
DataWorks supports multiple alert notification methods, including Email, SMS, phone calls, DingTalk Chatbot, and Webhook. The following table describes the limitations for each method.
Notification method
Available regions
Available editions
Description
SMS
All regions
Standard Edition and later.
To receive text message alerts in other regions, click the application link to join the "Alibaba Cloud Big Data AI Platform" chat group. Then, scan the QR code below to join the DataWorks product DingTalk group for support. There, you can ask the chatbot for help or contact an on-duty engineer during business hours.

Webhook
All regions
Basic Edition
Supports sending alerts to DingTalk, Lark, and WeCom groups by using group webhooks.
Enterprise Edition
In addition to the features available in Basic Edition, Enterprise Edition supports configuring custom webhooks to receive alerts.
Note: If you need to use a custom webhook, refer to Custom webhooks for smart monitoring for configuration details. After you complete the configuration, contact us for further assistance.
Note: To enable a RAM user to receive alerts by text message or phone call, add them as an alert contact on the Alert Contacts page. When a task fails, DataWorks sends the alert to the associated contacts. For more information, see View and manage alert contacts.
Create a baseline
-
Go to the Operation Center page.
-
In the left-side navigation pane, click .
-
Create the baseline.
-
On the Baselines tab, click Create Baseline.
-
Configure the basic properties of the baseline.
The following table describes the parameters.
Parameter
Description
Baseline Name
A custom name for the baseline.
Belongs work space
Select the workspace to which the nodes you want to monitor belong.
Note: When you configure the Nodes parameter, you can select only nodes and workflows in the workspace that you selected for Belongs work space.
Owner
The owner of the baseline.
Baseline Type
Defines the monitoring cycle for the baseline, which can be daily or hourly.
-
Day-level Baseline: Monitors tasks on a daily basis. This is suitable for daily scheduled tasks.
-
Hour-level Baseline: Monitors tasks on an hourly basis. This is suitable for hourly scheduled tasks.
Nodes
Select the tasks you want to monitor to ensure they complete on time.
If a node already belongs to another baseline, adding it to this baseline moves it from its current baseline to the new one.
-
Node: Enter the name or ID of a node and click the Add button. You can add multiple nodes to the baseline.
-
Workflow: Enter the name or ID of a workflow and click the Add button. By default, all nodes in the workflow are added to the baseline.
Note: After you select a workflow, we recommend adding only its most downstream nodes. Once added, the baseline automatically includes all upstream nodes that affect their data output in its monitoring scope. Adding all nodes in a workflow to a baseline is not recommended.
Priority
Sets the baseline priority, where a larger value indicates higher priority. When scheduling resources are limited, tasks on higher-priority baselines are scheduled first. This priority setting applies to auto triggered instances generated the following day.
Note:
-
MaxCompute nodes:
The baseline priority is mapped to the priority of MaxCompute compute jobs under the following conditions:
-
The priority feature is enabled for the MaxCompute project.
-
The MaxCompute project uses subscription compute resources.
MaxCompute job priority = 9 - DataWorks baseline priority.
-
EMR nodes:
You can map baseline priorities to YARN queue priorities to adjust the final YARN queue priority of a node. This determines whether the node can be preferentially scheduled and executed. For more information, see Configure mappings between baseline priorities and YARN queue priorities.
Once a node is added to a baseline, its priority is set by that baseline. This priority then propagates to all of the node's upstream dependencies.
-
If a node affects the data output of multiple baselines with different priorities, the node's priority is determined by the highest priority among those baselines.
-
Baseline priority does not affect upstream nodes with cross-cycle dependencies.
Estimated Completion Time
DataWorks calculates the baseline's estimated finish time based on the average completion time of its tasks over a historical period (typically the last 10 days). If the estimated finish time is later than the baseline alert time, DataWorks triggers a baseline alert. For information about the alerting mechanism, see Appendix: Baseline alerting mechanism.
Note: If there is insufficient historical data, the system displays a message: The completion time cannot be estimated due to the lack of historical data.
Committed Completion Time
The latest time by which tasks on the baseline must be completed. This is also the deadline for data production. The baseline uses this time to calculate the alert time. You must configure the committed finish time based on the estimated finish time. The alert time (calculated as committed finish time - alert margin threshold) must be later than the tasks' estimated finish time.
Note:
-
The formula for the alert time is alert time = committed finish time - alert margin threshold. An alert is triggered if the system predicts that a task will not be completed by the alert time. For example, if the committed finish time is set to 3:30 and the alert margin threshold is 10 minutes, the alert time is 3:20. A baseline alert is sent if the system predicts the task will not finish by this time.
-
For an hour-level baseline, you must specify which hourly instance (that is, which specific run cycle) requires monitoring and set its latest completion time.
-
Because tasks on a baseline can run longer than 24 hours, you can set the committed finish time within a two-day window, from 00:00 to 47:59. For example, if a task runs for more than a day, you can set the time to 36:00.
Alert Margin Threshold
This threshold determines how early an alert is triggered before the committed finish time. The resulting alert time should be later than the estimated finish time to prevent frequent, unnecessary alerts. We recommend that you configure the alert margin threshold based on the runtime of the tasks on the baseline. For more information, see Configure an appropriate committed finish time and alert margin threshold.
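The relationship between the committed finish time, the alert margin threshold, and the estimated finish time can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not DataWorks code: the function names are hypothetical, and times are assumed to be entered as `HH:MM` strings within the `00:00` to `47:59` window.

```python
def parse_extended_time(text: str) -> int:
    """Convert 'HH:MM' in the 00:00-47:59 window to minutes after 00:00 of day one."""
    hours_str, minutes_str = text.split(":")
    hours, minutes = int(hours_str), int(minutes_str)
    if not (0 <= hours <= 47 and 0 <= minutes <= 59):
        raise ValueError(f"time outside the 00:00-47:59 window: {text}")
    return hours * 60 + minutes


def format_extended_time(total_minutes: int) -> str:
    """Render minutes after 00:00 of day one as 'HH:MM' (hours may exceed 23)."""
    return f"{total_minutes // 60:02d}:{total_minutes % 60:02d}"


def alert_time(committed_finish: str, margin_minutes: int) -> str:
    """alert time = committed finish time - alert margin threshold."""
    result = parse_extended_time(committed_finish) - margin_minutes
    if result < 0:
        raise ValueError("alert margin threshold exceeds the committed finish time")
    return format_extended_time(result)


def is_configuration_sound(estimated_finish: str, committed_finish: str,
                           margin_minutes: int) -> bool:
    """Check that the alert time is later than the estimated finish time."""
    alert = parse_extended_time(alert_time(committed_finish, margin_minutes))
    return alert > parse_extended_time(estimated_finish)


print(alert_time("3:30", 10))                        # committed 3:30, margin 10 minutes
print(is_configuration_sound("03:25", "3:30", 10))   # estimate is later than the alert time
```

With the example from the table (committed finish time 3:30, margin 10 minutes), the first call yields an alert time of 03:20. The second call returns `False`, which is the kind of configuration that would produce frequent alerts.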
-
Configure alerting for the baseline.
Alerting policies for a baseline include baseline alerts, which are triggered when data is predicted to be late, and event alerts, which are triggered by errors or slowdowns affecting the baseline's tasks or their dependencies. Before you configure these settings, we recommend that you understand the alerting mechanism. For more information, see Appendix: Baseline alerting mechanism.
-
Enable alerting.
After you enable alerting, DataWorks checks for conditions that meet the alert rules and sends notifications accordingly.
-
If the system predicts that tasks on the baseline will not complete within the committed time, it sends a baseline alert based on the configured notification method. For more information, see Key logic: baseline alerts.
-
If a baseline task or its upstream dependencies encounter an error, or if a task on the critical path slows down, the system sends an event alert based on the configured notification method. You can view the list of existing events on the Events page in DataWorks. For more information, see Manage events.
-
Select notification methods.
After you enable alerting, you can select notification methods as needed. We recommend that you configure both baseline alerts and event alerts for critical tasks.
Important:
-
Baseline and event alerts are triggered only for auto triggered instances with a business date of yesterday or the day before yesterday. Instances outside this range are not monitored for alerts.
-
If you cannot receive alerts, see What do I do if I cannot receive alerts after I configure alert settings in Operation Center?
Baseline alert
Parameter
Description
Enable Alerting
Enables or disables alerting for this baseline.
Note: If you disable alerting, the baseline does not generate any alerts. However, if the baseline is enabled, baseline instances are still generated and the priority setting remains in effect.
Alert Notification Method
-
Supports sending alerts by Email, SMS, or Phone call to the baseline owner, the on-duty engineer in the shift schedule, or specified recipients. To configure a shift schedule, see Shift schedule.
-
Supports sending alerts to other applications such as DingTalk, WeCom, and Lark by using a DingTalk Chatbot or a Webhook. To configure a DingTalk chatbot, see Scenario: Send alert notifications to a DingTalk group.
Note:
-
You can use Check Contact Information or Send Test Message to verify that alerts can be sent correctly.
-
Phone call alerts are available only in DataWorks Professional Edition and later.
-
If you select phone call alerts, DataWorks rate-limits calls to avoid a burst of calls in a short period. A user receives at most one alert call every 20 minutes. Additional alerts are downgraded to text messages.
Maximum Alerts
The maximum number of alerts that can be sent. After this limit is reached, no more alerts are generated.
Minimum Alert Interval
The minimum time interval between two consecutive alerts.
Alerting Do-Not-Disturb Period
If you set a do-not-disturb period, the system does not send alerts during this time.
For example, if the do-not-disturb period for a task is set from 00:00 to 08:00, baseline and event alerts are not triggered during this period. If the event is still in an abnormal state at 08:00, an alert is sent.
Event alert
Parameter
Description
Event Type
Defines the event types that trigger an alert. They include:
-
Error: A task within the baseline monitoring scope fails to run.
-
Slow: The current runtime of a task within the baseline monitoring scope is significantly longer than its average runtime over a past period.
Alert Notification Method
-
Supports sending alerts by Email, SMS, or Phone call to the task owner, the on-duty engineer in the shift schedule, or specified recipients. To configure a shift schedule, see Shift schedule.
-
Supports sending alerts to other applications such as DingTalk, WeCom, and Lark by using a DingTalk Chatbot or a Webhook. To configure a DingTalk chatbot, see Scenario: Send alert notifications to a DingTalk group.
Note:
-
You can use Check Contact Information or Send Test Message to verify that alerts can be sent correctly.
-
Phone call alerts are available only in DataWorks Professional Edition and later.
-
If you select phone call alerts, DataWorks rate-limits calls to avoid a burst of calls in a short period. A user receives at most one alert call every 20 minutes. Additional alerts are downgraded to text messages.
Maximum Alerts
The maximum number of alerts that can be sent. After this limit is reached, no more alerts are generated.
Minimum Alert Interval
The minimum time interval between two consecutive alerts.
Alerting Do-Not-Disturb Period
If you set a do-not-disturb period, the system does not send alerts during this time.
For example, if the do-not-disturb period for a task is set from 00:00 to 08:00, baseline and event alerts are not triggered during this period. If the event is still in an abnormal state at 08:00, an alert is sent.
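The combined effect of Maximum Alerts, Minimum Alert Interval, and Alerting Do-Not-Disturb Period can be pictured with the following Python sketch. It is a simplified model under stated assumptions, not the DataWorks implementation: the function and parameter names are hypothetical, and the do-not-disturb window is assumed to include its start time and exclude its end time, which matches the 08:00 example above.

```python
from datetime import datetime, time, timedelta
from typing import Optional


def in_do_not_disturb(now: datetime, start: time, end: time) -> bool:
    """Return True if `now` falls in the [start, end) do-not-disturb window."""
    current = now.time()
    if start <= end:
        return start <= current < end
    # The window wraps past midnight, for example 22:00 to 06:00.
    return current >= start or current < end


def should_send_alert(now: datetime, sent_count: int, last_sent: Optional[datetime],
                      max_alerts: int, min_interval: timedelta,
                      dnd_start: time, dnd_end: time) -> bool:
    """Decide whether one more alert may be sent for a still-abnormal baseline or event."""
    if sent_count >= max_alerts:
        return False  # Maximum Alerts reached: no more alerts are generated.
    if in_do_not_disturb(now, dnd_start, dnd_end):
        return False  # Inside the do-not-disturb period.
    if last_sent is not None and now - last_sent < min_interval:
        return False  # Minimum Alert Interval has not elapsed yet.
    return True


quiet_start, quiet_end = time(0, 0), time(8, 0)
half_hour = timedelta(minutes=30)
print(should_send_alert(datetime(2024, 1, 1, 7, 59), 0, None, 3, half_hour, quiet_start, quiet_end))
print(should_send_alert(datetime(2024, 1, 1, 8, 0), 0, None, 3, half_hour, quiet_start, quiet_end))
```

In this model the 07:59 check is suppressed and the 08:00 check passes, so an event that is still abnormal when the quiet period ends produces an alert.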
-
Click OK to create the baseline.
Note: If you disable alerting, the baseline does not generate any alerts. However, if the baseline is enabled, baseline instances are still generated and the priority setting remains in effect.
-
Log on to the DataWorks console. After you select a region, click in the left-side navigation pane. In the drop-down list, select the desired workspace and click Operation Center.
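The priority rules described for the Priority parameter in this section can be summarized in a short Python sketch. The function names and the boolean flags are hypothetical and exist only for illustration. The sketch does not check the valid range of baseline priorities, because this topic does not state it.

```python
from typing import Iterable, Optional


def maxcompute_job_priority(baseline_priority: int, priority_enabled: bool,
                            uses_subscription: bool) -> Optional[int]:
    """Map a baseline priority to a MaxCompute job priority.

    The mapping applies only if the priority feature is enabled for the
    MaxCompute project and the project uses subscription compute resources.
    Returns None when the mapping does not apply.
    """
    if not (priority_enabled and uses_subscription):
        return None
    return 9 - baseline_priority


def effective_node_priority(baseline_priorities: Iterable[int]) -> int:
    """A node that affects several baselines takes the highest of their priorities."""
    return max(baseline_priorities)


print(maxcompute_job_priority(8, priority_enabled=True, uses_subscription=True))
print(effective_node_priority([3, 8, 5]))
```

Because a larger baseline value means higher priority while the mapped value is `9 - baseline priority`, a higher baseline priority yields a smaller MaxCompute job priority number.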
Add nodes to a baseline
A node can belong to only one baseline at a time. Adding a node that is already in Baseline A to Baseline B will move it to Baseline B.
If an enabled baseline has no nodes, it becomes an empty baseline and generates empty baseline instances. For more information about empty baselines, see Why is the status of my baseline displayed as Empty Baseline on the Baseline Instances page?
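The membership rule can be modeled as a mapping from each node to exactly one baseline. The following Python sketch uses a hypothetical `BaselineRegistry` class to show how adding a node to a second baseline moves it, and how that can leave the first baseline empty. It is an illustration only and does not reflect how DataWorks stores baselines.

```python
from typing import Dict, Optional, Set


class BaselineRegistry:
    """Toy model: each node belongs to at most one baseline at a time."""

    def __init__(self) -> None:
        self._baselines: Set[str] = set()
        self._baseline_of_node: Dict[str, str] = {}

    def create_baseline(self, name: str) -> None:
        self._baselines.add(name)

    def add_node(self, node: str, baseline: str) -> Optional[str]:
        """Add a node to a baseline and return the baseline it was moved from, if any."""
        if baseline not in self._baselines:
            raise KeyError(f"unknown baseline: {baseline}")
        previous = self._baseline_of_node.get(node)
        self._baseline_of_node[node] = baseline  # Overwrites any earlier membership.
        return previous if previous != baseline else None

    def nodes_in(self, baseline: str) -> Set[str]:
        return {n for n, b in self._baseline_of_node.items() if b == baseline}

    def is_empty(self, baseline: str) -> bool:
        return not self.nodes_in(baseline)


registry = BaselineRegistry()
registry.create_baseline("Baseline A")
registry.create_baseline("Baseline B")
registry.add_node("node_1", "Baseline A")
moved_from = registry.add_node("node_1", "Baseline B")
print(moved_from, registry.is_empty("Baseline A"))
```

After the second `add_node` call, `node_1` belongs only to Baseline B, and Baseline A has no nodes left. If Baseline A is still enabled, it corresponds to the empty baseline case described above.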
You can add nodes to a baseline in one of the following two ways:
-
Go to the Baselines page and click Create Baseline to add nodes.
-
Go to the Auto Triggered Node page and choose for a specific task.
Note: This method only allows you to create a new baseline for the selected tasks. You cannot use it to add tasks to an existing baseline.
-
Add a single node to a baseline
In the Actions column of the target auto triggered task, click .
-
Add multiple nodes to a baseline
Select multiple auto triggered tasks and, in the menu bar at the bottom, click .
Manage baselines
On the Baselines page, you can filter baselines by criteria such as Owner, Workspace, Baseline Name, and Priority, and perform the following operations:
-
View Details: View the basic information of the baseline tasks.
-
Modify Baseline: Modify the baseline information as needed.
-
View Change Records: View the historical changes of the baseline.
-
Enable or Disable Baseline: Controls whether the baseline is active. An enabled baseline generates a new baseline instance daily. You can view the daily baseline details on the Baseline Instances panel.
-
Delete Baseline: Delete the baseline as needed.
Appendix: Baseline alerting mechanism
Baseline alerting is a notification service for baselines that are enabled and have alerting turned on. You can configure the Alert Margin Threshold and Committed Completion Time for a baseline based on its Estimated Completion Time. DataWorks calculates a baseline's estimated finish time based on the historical average runtime (typically the last 10 days) of the tasks it monitors. The system then monitors the tasks based on their actual running status. If the system predicts that a task on the baseline cannot be completed by the alert time (committed finish time - alert margin threshold), it sends a baseline alert to the configured recipients.
Improperly configured alert margin thresholds and committed finish times can lead to unexpected alerts. For more information, see Configure an appropriate committed finish time and alert margin threshold.
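The prediction described above can be approximated with a plain average. The following Python sketch is a simplified assumption about the mechanism, not the actual DataWorks algorithm: it expresses completion times as minutes after 00:00, averages the most recent 10 runs, and compares the result with the alert time. The function names are hypothetical, and treating missing history as "no alert" is also an assumption.

```python
from statistics import mean
from typing import Optional, Sequence


def estimated_finish_minutes(history: Sequence[float], window: int = 10) -> Optional[float]:
    """Average the most recent `window` completion times (minutes after 00:00).

    Returns None if there is no history, mirroring the case in which the
    completion time cannot be estimated.
    """
    recent = list(history)[-window:]
    if not recent:
        return None
    return mean(recent)


def should_alert(history: Sequence[float], committed_finish: float, margin: float) -> bool:
    """Alert if the estimated finish time is later than the alert time."""
    estimate = estimated_finish_minutes(history)
    if estimate is None:
        return False  # Assumption: no estimate, so no prediction-based alert.
    alert_time = committed_finish - margin
    return estimate > alert_time


# Committed finish time 3:30 (210 minutes), alert margin threshold 10 minutes.
print(should_alert([190, 200, 210], committed_finish=210, margin=10))
print(should_alert([200, 210, 220], committed_finish=210, margin=10))
```

In the first call the average equals the alert time of 200 minutes, so no alert is raised. In the second call the average of 210 minutes is later than the alert time, so the sketch reports an alert.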
-
Baseline alert policy before a task runs:
Note: Before a daily task runs, the baseline system calculates the average completion time of tasks within its monitoring scope over the past 10 days. If it predicts that a task will not complete by the alert time, it immediately sends a baseline alert to the configured recipients. In scenarios where task dependencies are complex and frequently change, baselines can help you detect issues and receive early warnings.
-
If the estimated finish time of a baseline task, calculated from its average completion time over the past 10 days, is later than the baseline alert time, the platform triggers a baseline warning. You can view the calculated estimated finish time on the Baselines page. For more information, see Create a baseline.
-
If the estimated finish time of an upstream task, calculated from its average completion time over the past 10 days, is later than the baseline alert time, the platform triggers a baseline warning.
-
Baseline alert policy while a task is running:
A baseline warning is triggered if the actual completion time of a task on the baseline is later than the baseline alert time.
Next steps
After you create a baseline, you can perform the following operations:
-
View baseline instances: An enabled baseline generates an instance every day. You can view baseline run details on the Baseline Instances page.
-
EMR node: Configure mappings between baseline priorities and YARN queue priorities to adjust the final YARN queue priority of an EMR node. This determines whether the node is prioritized for scheduling and execution.
-
View baseline operation records: View a history of all operations performed on your baselines in Operation Center.