Configure scheduling dependencies for offline tasks
Scheduling dependencies between nodes define the execution order of your data workflows. Properly configured dependencies ensure that each node runs in the correct order and delivers timely, accurate data.
Background
A scheduling dependency defines the upstream-downstream relationship between nodes. A downstream task runs only after its upstream node completes successfully, ensuring that the downstream task retrieves data only after the upstream tables are generated. This prevents errors caused by accessing unavailable data.
Procedure
-
In the top menu bar of the Dataphin homepage, choose Plan > Data Development.
-
On the Data Development page, select a project from the top menu bar. If you are using the Dev-Prod mode, you must also select an environment.
-
In the navigation pane on the left, choose Data Processing > Compute Task.
-
In the compute task list, click the target compute task to open its configuration tab.
-
In the right-side pane, click Properties to open the Properties panel. In the Scheduling dependencies section, configure the following parameters.
-
Upstream dependencies
-
Auto parse
-
For SQL tasks, click Auto parse to let Dataphin analyze the task code, identify upstream tasks and output tables, and add the parsed dependencies to the upstream dependency list. You can view, edit, or delete the parsed dependencies.
Note:
-
If an input table is generated by multiple tasks, Dataphin automatically adds all of those tasks as upstream dependencies.
-
For all parsed dependencies, the dependency period is set to Current cycle by default.
-
If the code references a project variable or does not specify a project, the system resolves the dependency to the production project name by default to ensure scheduling stability. For example, if the development project is named onedata_dev and the production project is named onedata:
-
If the code contains select * from s_order, the scheduling dependency is parsed as onedata.s_order.
-
If the code contains select * from ${onedata}.s_order, the scheduling dependency is parsed as onedata.s_order.
-
If the code contains select * from onedata.s_order, the scheduling dependency is parsed as onedata.s_order.
-
If the code contains select * from onedata_dev.s_order, the scheduling dependency is parsed as onedata_dev.s_order.
-
-
-
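The project resolution rules in the examples above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not Dataphin's actual parser: the function name resolve_dependency and its prod_project parameter are hypothetical, and the sketch assumes a table reference has at most one project qualifier.

```python
import re

# Hypothetical helper that mirrors the documented examples only.
_PROJECT_VARIABLE = re.compile(r"^\$\{\w+\}$")

def resolve_dependency(table_ref: str, prod_project: str) -> str:
    """Return the project-qualified name a table reference resolves to."""
    project, sep, table = table_ref.rpartition(".")
    if not sep:
        # No project specified: resolve to the production project.
        return f"{prod_project}.{table_ref}"
    if _PROJECT_VARIABLE.match(project):
        # Project variable such as ${onedata}: also resolves to production.
        return f"{prod_project}.{table}"
    # An explicit project name (production or development) is kept as written.
    return f"{project}.{table}"

for ref in ["s_order", "${onedata}.s_order", "onedata.s_order", "onedata_dev.s_order"]:
    print(ref, "->", resolve_dependency(ref, prod_project="onedata"))
```

Only an explicitly written development project name (onedata_dev in this example) keeps the dependency on the development project.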
Add root node
If a task has no upstream dependencies, click Add root node to set a root node as its upstream dependency.
Note: Each tenant or enterprise is initialized with a virtual root node whose name starts with virtual_root_node.
-
Add previous cycle of this node
This option creates a self-dependency, where the task's execution depends on the success of its instance from the previous cycle (for example, the previous day or N hours ago).
-
Add dependency
If Auto parse cannot resolve the scheduling dependencies, or the parsed result does not match your requirements, click +Add dependency to add the node's upstream dependencies manually.
-
Add dependency - Physical node
Select one or more physical nodes from the node list. You can search or filter by This project, Project, Node type, Node name, or Output table name.
-
Add dependency - Logical table node
Select one or more logical table nodes from the node list. You can search and filter by Logical table type, Business segment, and Logical table name.
To depend on specific fields in a logical table instead of the entire table, click the icon in the Dependent fields column to view and select the required fields.
When you add a dependency, the Dependency period and Dependency policy are automatically set to recommended values. To modify these settings, click the edit icon in the dependency list.
-
Dependency period: The time range for the scheduled execution of the upstream task instance. Typically, this is the current day, covering the range [00:00~24:00).
-
Dependency policy: If multiple instances can exist within a dependency period, you must specify a dependency policy. You also need to select whether the dependency is met when the upstream task has Succeeded, Failed, or Completed (Succeeded or Failed). The default selection is Succeeded. For default policies for cross-cycle dependencies, see Appendix 2: Default policies for cross-cycle dependencies.
Important: If there is only one instance in the dependency period, you can select any policy. To ensure compatibility with potential changes to the upstream task's schedule, only relative path policies are supported.
-
Succeeded: The downstream task runs only if the upstream task succeeds. If the upstream task fails, the downstream task does not run.
-
Failed: The downstream task runs only if the upstream task fails after all automatic retries. If the upstream task succeeds, the downstream task does not run.
-
Completed (Succeeded or Failed): The downstream task runs whether the upstream task succeeds or fails after all automatic retries.
-
-
-
-
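The three completion conditions described above (Succeeded, Failed, and Completed) amount to a small decision rule. The following Python sketch illustrates it. The enum and function names are hypothetical, and upstream_succeeded is assumed to be the final outcome of the upstream instance after all automatic retries.

```python
from enum import Enum

class UpstreamCondition(Enum):
    """Hypothetical names for the three documented options."""
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"
    COMPLETED = "Completed (Succeeded or Failed)"

def downstream_can_run(condition: UpstreamCondition, upstream_succeeded: bool) -> bool:
    """Decide whether the dependency is met once the upstream instance has finished.

    upstream_succeeded is the final result after all automatic retries.
    """
    if condition is UpstreamCondition.SUCCEEDED:
        return upstream_succeeded
    if condition is UpstreamCondition.FAILED:
        return not upstream_succeeded
    # COMPLETED: the dependency is met whether the upstream succeeded or failed.
    return True

print(downstream_can_run(UpstreamCondition.SUCCEEDED, upstream_succeeded=False))
```

With the default condition, Succeeded, a failed upstream instance blocks the downstream task.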
Output of this node
The system automatically generates an output name for the node. To add multiple output names, click Auto-generate output name.
Important: The system uses output names to build the scheduling dependency graph. Output names are generated automatically, and modifying them manually is not recommended.
-
-
Click OK to save the scheduling dependency configuration.
Preview dependency period and policy
-
Click Properties for the target offline compute task. In the Properties panel, navigate to the Scheduling dependencies section.
-
In the Upstream dependencies list of the Scheduling dependencies section, click the icon in the Actions column for the target dependency.
-
In the Edit dependency dialog box, you can view information such as the dependent node name, dependency period, dependency policy, and a preview of the node dependency period.
-
Dependency period: The time range for the scheduled execution of the upstream task instance. Typically, this is the current day, covering the range [00:00~24:00).
-
Dependency policy: If multiple instances can exist within a dependency period, you must specify a dependency policy. If there is only one instance, you can select any policy. To ensure compatibility with potential changes to the upstream task's schedule, only relative path policies are supported.
-
Node dependency period preview: View the instance list for the current node and the selected upstream node for a specific business date.

Section
Description
① Instance list of the selected upstream node
-
Business date: This date is determined by the Dependency period and the selected business date for the current node.
-
If the dependency period is Current cycle, the business date is the same as the current node's business date.
-
If the dependency period is Previous cycle, the business date is one day before the current node's business date.
-
If the dependency period is N days ago, the business date is N days before the current node's business date.
-
If the dependency period is Last 24 hours and the instances span two business dates, the business date is displayed as {yyyy-MM-dd ~ yyyy-MM-dd}.
-
-
Instance list: Shows the total number of instances for the selected upstream node on a given business date.
-
If the total number of instances on the business date is less than or equal to 5, the instance list displays all instances.
-
If the total number of instances on the business date is greater than 5, you can click Expand all to view all instances.
-
If the dependent upstream instance is the first or last instance in its list, the UI displays the first instance and the last instance.
-
If a dependent upstream instance is not the first or last instance in its list, the UI displays the first instance, the dependent instance, and the last instance.
-
-
Instances are displayed sequentially in the format Instance n ({Instance scheduled time}), where n starts from 1.
-
② Instance list of the current node
The total number of instances for the current node on the selected business date.
If the total number of instances on the business date is less than or equal to 5, the list displays all instances. If the total number of instances is greater than 5, the list displays only the first instance and the last instance. You can click Expand all to view all instances. The first instance (Instance 1) is selected by default. You can click an instance to select a different one.
Instances are displayed sequentially in the format Instance n ({Instance scheduled time}), where n starts from 1.
③ Connector line from the selected instance on the right to its dependent instance on the left
-
If the Dependency policy is First instance, Last instance, Nearest instance backward, or Nearest instance forward, a single line connects the selected instance on the right (current node) to a single instance on the left (upstream node).
-
If the Dependency policy is All instances, all instances on the left (upstream node) are selected. Connecting lines show that the selected instance on the right depends on all instances on the left.
-
-
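The preview rules above can be illustrated with a short Python sketch. It covers two things: how the upstream business date follows from the dependency period, and which instances appear in the upstream list before Expand all is clicked. Both function names are hypothetical. The sketch assumes a single dependent instance, and it does not model the Last 24 hours period, which can span two business dates.

```python
from datetime import date, timedelta
from typing import List, Optional

def upstream_business_date(current: date, period: str, n_days: int = 0) -> date:
    """Business date of the upstream instances for a single-date dependency period."""
    # "Last 24 hours" is deliberately left out: it may cover two business dates.
    offsets = {"Current cycle": 0, "Previous cycle": 1, "N days ago": n_days}
    return current - timedelta(days=offsets[period])

def collapsed_instances(total: int, dependent: Optional[int] = None) -> List[int]:
    """1-based instance numbers shown in the collapsed list."""
    if total <= 5:
        # Five or fewer instances: everything is displayed.
        return list(range(1, total + 1))
    shown = {1, total}  # The first and last instances are always displayed.
    if dependent is not None:
        shown.add(dependent)  # No effect if it is already the first or last one.
    return sorted(shown)

print(upstream_business_date(date(2024, 3, 1), "Previous cycle"))
print(collapsed_instances(24, dependent=7))
```

Passing dependent=None corresponds to the list of the current node, which displays only the first and last instances when there are more than five.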
Appendix 1: Default dependency periods and policies
| Current node cycle | Upstream node cycle | Upstream self-dependency | Default dependency period | Default dependency policy |
| --- | --- | --- | --- | --- |
| Daily/Weekly/Monthly | Daily | Yes/No | Current cycle | Last instance |
| Daily/Weekly/Monthly | Hourly/Minutely | No | Current cycle | All instances |
| Daily/Weekly/Monthly | Hourly/Minutely | Yes | Current cycle | Last instance |
| Monthly/Weekly/Daily/Hourly/Minutely | Monthly/Weekly | Yes | Current cycle | Last instance |
| Monthly/Weekly/Daily/Hourly/Minutely | Monthly/Weekly | No | Current cycle | Last instance |
| Hourly/Minutely | Daily | Yes/No | Current cycle | Last instance |
| Hourly/Minutely | Hourly/Minutely | Yes/No | Current cycle | Last instance |
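Read as a rule, the Appendix 1 table has a single exception: All instances is the default only when a daily, weekly, or monthly node depends on an hourly or minutely upstream node that has no self-dependency. The following Python sketch encodes that reading. The function and constant names are hypothetical, and the rule is derived from the table rather than from any Dataphin API.

```python
LOW_FREQUENCY = {"Daily", "Weekly", "Monthly"}
HIGH_FREQUENCY = {"Hourly", "Minutely"}

def default_dependency(current_cycle: str, upstream_cycle: str,
                       upstream_self_dependent: bool) -> tuple:
    """Return (default dependency period, default dependency policy)."""
    needs_all_instances = (
        current_cycle in LOW_FREQUENCY
        and upstream_cycle in HIGH_FREQUENCY
        and not upstream_self_dependent
    )
    policy = "All instances" if needs_all_instances else "Last instance"
    # Every row of the table uses Current cycle as the default period.
    return ("Current cycle", policy)

print(default_dependency("Daily", "Hourly", upstream_self_dependent=False))
```

One way to read the exception: when the upstream node depends on its own previous cycle, its last instance can succeed only after the earlier ones have, so waiting for the last instance is sufficient.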
Appendix 2: Default policies for cross-cycle dependencies
In the following table, - indicates that the parameter is not applicable.
| Current node cycle | Upstream node | Upstream node cycle | Upstream self-dependency | Default dependency period |
| --- | --- | --- | --- | --- |
| Monthly | Current node (self-dependency) | - | - | Previous cycle |
| Weekly | Current node (self-dependency) | - | - | Previous cycle |
| Daily | Current node (self-dependency) | - | - | Previous cycle |
| Hourly | Current node (self-dependency) | - | - | Last 24 hours |
| Minutely | Current node (self-dependency) | - | - | Last 24 hours |
| Daily/Weekly/Monthly | Other nodes | Daily | - | Current cycle |
| Daily/Weekly/Monthly | Other nodes | Hourly/Minutely | No | Current cycle |
| Daily/Weekly/Monthly | Other nodes | Hourly/Minutely | Yes | Current cycle |
| Monthly/Weekly/Daily/Hourly/Minutely | Other nodes | Monthly/Weekly | Yes | Current cycle |
| Monthly/Weekly/Daily/Hourly/Minutely | Other nodes | Monthly | No | Current cycle |
| Monthly/Weekly/Daily/Hourly/Minutely | Other nodes | Weekly | No | Current cycle |
| Hourly/Minutely | Other nodes | Daily | - | Current cycle |
| Hourly/Minutely | Other nodes | Hourly/Minutely | - | Current cycle |
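The Appendix 2 table can be summarized as follows: a self-dependency defaults to Previous cycle for monthly, weekly, and daily nodes and to Last 24 hours for hourly and minutely nodes, while a dependency on any other node defaults to Current cycle. The following Python sketch expresses that summary. The function name is hypothetical, and the logic is inferred from the table only.

```python
def default_cross_cycle_period(current_cycle: str, is_self_dependency: bool) -> str:
    """Default dependency period, following the Appendix 2 table."""
    if not is_self_dependency:
        # Dependencies on other nodes default to the current cycle in every row.
        return "Current cycle"
    if current_cycle in {"Hourly", "Minutely"}:
        return "Last 24 hours"
    # Monthly, weekly, and daily nodes depend on their previous cycle.
    return "Previous cycle"

for cycle in ["Monthly", "Daily", "Hourly"]:
    print(cycle, "->", default_cross_cycle_period(cycle, is_self_dependency=True))
```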