Instance run diagnosis

Last updated: 2026-05-22 06:08:48

The execution of a recurring instance or backfill instance is influenced by multiple factors in addition to its scheduled time. These include upstream task status, resource availability, and throttling rules. To address this, Dataphin provides the instance run diagnosis feature to analyze an instance's execution process and its entire dependency chain. When an instance does not run as expected, you can use this feature to quickly identify the problem.

Limitations

  • Run diagnosis is available only for offline recurring instances and backfill instances, including script, detail and summary table, and extraction instances. It does not support real-time instances (including real-time computing and real-time integration) or manual instances.

  • For detail and summary table instances, analysis is available only at the materialized node level, not at the field level.

Overview

In O&M, different colors and icons indicate an instance's running status, with each representing a stage in the run process. You can use these indicators to determine an instance's current stage or to identify why it has not run. The instance running statuses and the overall run process are as follows:

Status (each status is shown with its own icon in O&M):

  • Not Running

  • Waiting for Scheduled Time

  • Throttling

  • Waiting for Scheduling Resources

  • Running

  • Succeeded

  • Failed

[Figure: run process diagram]

An instance's successful execution depends on multiple factors, including its upstream dependencies, scheduled time, available resources, and its own execution status. When an instance fails or remains in a single running status for an extended period, you can use the run diagnosis feature to analyze the issue based on the following checks:

  • Upstream dependency: Checks the running status of upstream instances. The current instance runs only after all its upstream instances successfully complete and meet the dependency policy. If an upstream instance's status does not meet the policy, it will block the current instance. You can view the upstream dependency diagnosis results to investigate the cause of failure.

  • Scheduled time: Checks if the instance's scheduled time has been reached.

  • Throttling rule: You can view the throttling rules triggered by the current instance and the list of instances already dispatched in the current queue.

  • Scheduling resource: You can view how long the instance has been waiting for scheduling resources and the full list of instances currently using resources in its resource group. You can then act on the provided diagnostic suggestions.

  • Instance execution: You can view the instance run result and execution logs.
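The order in which these checks apply can be sketched as a simple pipeline. The following Python snippet is only an illustration of the sequence described in this topic, not a Dataphin API: the names `CHECKS` and `first_blocking_check` are hypothetical, and the input dictionary stands in for diagnosis results that you would read from the Run Diagnosis tab.

```python
from typing import Dict, Optional

# Check order as described in this topic. The identifiers are illustrative
# assumptions, not names exposed by Dataphin.
CHECKS = [
    "Upstream dependency",
    "Scheduled time",
    "Throttling rule",
    "Scheduling resource",
    "Instance execution",
]


def first_blocking_check(passed: Dict[str, bool]) -> Optional[str]:
    """Return the first check that has not passed, or None if all passed.

    A check that is missing from `passed` is treated as not yet passed.
    """
    for name in CHECKS:
        if not passed.get(name, False):
            return name
    return None


# Example: the first two checks passed, so the instance is held at throttling.
blocked_at = first_blocking_check(
    {"Upstream dependency": True, "Scheduled time": True}
)
```

Reading the checks in this order mirrors how you would troubleshoot in the console: start with the earliest check that has not passed, because later checks are not reached until it does.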

Accessing run diagnosis

  1. In the top menu bar of the Dataphin homepage, choose Develop > O&M.

  2. In the navigation pane on the left, choose Instance O&M > Recurring Instance or Backfill Instance. Run diagnosis does not support manual instances.

  3. On the Recurring Instance or Backfill Instance page, click the name of the target instance. Below the DAG on the right, click View Node Details.

    The following figure uses the Recurring Instance page as an example.

  4. On the node details page, click the Run Diagnosis tab.

Upstream dependency

The upstream dependency diagnosis shows the result from the instance's last run and the current status of its upstream instances. The current instance proceeds to the next check only when all upstream instances have completed. You can use the diagnosis result to investigate the cause of any failures. An instance that has already passed the upstream dependency diagnosis in its last run is not re-diagnosed. To refresh the last run's result or the status of upstream instances, click the Refresh icon.

  • If the last run of the instance succeeded and it was not a forced rerun, the diagnosis result is Passed.

    • Last run: Shows the running status and the completion time of the last run.

      Note: An instance is scheduled only after all its upstream dependencies have been satisfied.

    • Current diagnosis result: Shows the diagnosis result.

      • Scheduling Type: Includes dry run, Normal Run, and suspended running. If the current instance's scheduling type is suspended running, you must resume scheduling for it to run.

      • Root blocking node: Shows the highest-level node in the dependency chain that is preventing the current node from running. Instances that have passed the upstream diagnosis do not have a root blocking node.

      • Direct Upstream List: Shows the list of direct upstream dependencies. You can search by node name, node ID, or instance ID, and filter by running status or owner.

  • If an instance has not started running and its scheduling is not suspended, the scheduling type in the diagnosis result is Normal Run. You can follow the prompts to focus on the root blocking node. Resolving the issue with the blocking node allows the current node to run. The current instance will be scheduled only after all its upstream instances run successfully.

  • If the instance is currently in a suspended running state, the diagnosis stops, and the result is suspended running.

  • A forced rerun bypasses the upstream dependency check. If the instance's last run was a forced rerun, the diagnosis result is Skipped.
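To make the idea of a root blocking node concrete, the following Python sketch walks a dependency graph upstream from a blocked instance. It rests on an assumption that this topic does not confirm: a root blocking node is taken to be an upstream instance that has not succeeded while all of its own upstream instances have. The function name, the `upstream` mapping, and the status strings are hypothetical and do not reflect how Dataphin computes the result internally.

```python
from typing import Dict, List


def root_blocking_nodes(
    instance: str,
    upstream: Dict[str, List[str]],
    status: Dict[str, str],
) -> List[str]:
    """Return unfinished ancestors whose own upstream instances all succeeded.

    `upstream` maps an instance to its direct upstream instances, and
    `status` maps an instance to its running status (assumed labels).
    """
    roots = set()
    seen = set()
    stack = [instance]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        # Direct upstream instances that have not succeeded block this node.
        blocking = [p for p in upstream.get(node, []) if status.get(p) != "Succeeded"]
        if not blocking and node != instance:
            # Nothing further upstream is blocking, so this node is a root cause.
            roots.add(node)
        stack.extend(blocking)
    return sorted(roots)


# Example graph: d depends on b and c, and both depend on a.
example_upstream = {"d": ["b", "c"], "b": ["a"], "c": ["a"], "a": []}
example_status = {"a": "Failed", "b": "Not Running", "c": "Not Running"}
roots = root_blocking_nodes("d", example_upstream, example_status)
```

In this example, `b` and `c` are blocked only because `a` failed, so `a` is the node to fix first. When every upstream instance has succeeded, the function returns an empty list, which matches the statement that instances that passed the upstream diagnosis have no root blocking node.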

Scheduled time

This check determines if the instance has reached its scheduled time. The result displayed is for the most recent run. An instance is scheduled for execution only after its scheduled time is reached. Otherwise, it remains in the Waiting for Scheduled Time state. To refresh the diagnosis result, click the Refresh icon.

  • If the instance has not yet reached its scheduled time and its scheduling is not suspended, the diagnosis result is Waiting for Scheduled Time. To run the instance earlier, you can perform a forced rerun after ensuring it will not affect downstream data quality.

  • If the instance is in a suspended running state because its scheduling is suspended, the diagnosis result is Suspended. To run it, click resume scheduling.

  • If the instance reached its scheduled time in its last run and it was not a forced rerun, the diagnosis result is Passed.

  • A forced rerun bypasses the scheduled time check and starts immediately. If the last run of the instance was a forced rerun, the diagnosis result is Skipped.
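The four outcomes above can be summarized as a small decision function. This Python sketch is illustrative only: the function name and parameters are hypothetical, and the order in which the forced rerun and suspended conditions are evaluated is an assumption, because this topic does not state which one takes precedence.

```python
from datetime import datetime


def scheduled_time_result(
    now: datetime,
    scheduled_time: datetime,
    suspended: bool = False,
    forced_rerun: bool = False,
) -> str:
    """Map an instance's state to a scheduled time diagnosis result."""
    if forced_rerun:
        # A forced rerun bypasses the scheduled time check.
        return "Skipped"
    if suspended:
        # Scheduling must be resumed before the instance can run.
        return "Suspended"
    if now < scheduled_time:
        return "Waiting for Scheduled Time"
    return "Passed"


# Example: checked one hour before the scheduled time.
result = scheduled_time_result(
    now=datetime(2026, 5, 22, 1, 0),
    scheduled_time=datetime(2026, 5, 22, 2, 0),
)
```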

Throttling rule

If you have purchased the intelligent O&M add-on feature, you can configure throttling rules. For instructions, see Throttling Configuration.

All instances are checked against throttling rules. After passing the upstream dependency and scheduled time checks, an instance must also satisfy all matching throttling rules before it is dispatched to the resource scheduling system. To refresh the diagnosis result, click the Refresh icon.

  • If the instance's last run passed the upstream dependency and scheduled time checks and also satisfied all matching throttling rules, the diagnosis result is Passed.

  • If the instance is currently being throttled while waiting to be dispatched and its scheduling is not suspended, the diagnosis result is Throttling. The duration it has been waiting is also displayed.

    • Blocking rule: Shows the name of the throttling rule that the current instance has triggered. You can click the rule name to view its details.

    • Dispatched instance list: Lists the instances that have already been dispatched in the queue of the triggered throttling rule. You can search or filter dispatched instances by name or ID.

  • If the instance is in a suspended running state because its scheduling is suspended, the diagnosis result is Suspended. You must resume scheduling for the instance before it can be dispatched for resource scheduling.

Scheduling resource

Instances that use shared running resources are generally less affected by scheduling resource availability. However, instances that use dedicated resources must wait for sufficient idle resources in their assigned resource group before they can be scheduled. Otherwise, their status will be Waiting for Scheduling Resources. To refresh the diagnosis result, click the Refresh icon.

  • If sufficient allocatable idle resources were available in the instance's resource group during its last run, and its scheduling is not suspended, the diagnosis result is Passed.

  • If there are not enough idle scheduling resources available for the current instance, the diagnosis result is Waiting for Scheduling Resources. The scheduling resource diagnosis page shows the Resource wait time, a diagnostic Suggestion, and a list of Resource occupying instances. You can act on the suggestions and use the list of Resource occupying instances to free up resources for the current instance.

Instance execution

Only instances that have reached the execution stage are shown on the Instance Execution diagnosis page. This page displays the Run result and run log. If the Run result is Failed, you can use the run log to troubleshoot the issue. To refresh the diagnosis result, click the Refresh icon.

Click Open Run Log to navigate to the Run Log page. The run log contains error messages, performance diagnostics, and error codes. For more information about performance diagnostics, see Diagnose offline integration task performance.
