Configure upstream dependencies for logical tables
Scheduling dependencies control the execution order of dimension and fact logical table tasks, ensuring that each node in a business workflow runs in the correct sequence and that business data is produced accurately and on time.
Procedure
-
In the top navigation bar of the Dataphin homepage, choose R&D > Data Development.
-
On the Data Development page, select a project from the top navigation bar. In Dev-Prod mode, also select an environment.
-
In the left navigation pane, choose Standardized Modeling > Dimension Logical Table or Fact Logical Table. In the Logical Table list, click the name of the target logical table.
-
On the logical table tab, click Scheduling Configuration in the top menu bar to open the Scheduling Configuration panel.
-
In the Scheduling properties section, configure the upstream dependencies for the logical table.
-
Upstream dependencies
-
Auto-parse
-
Click Auto-parse. Dataphin analyzes the table's computing logic to identify upstream tasks and output tables, and adds all discovered dependencies to the upstream dependency list. You can then view, edit, or delete the parsed dependencies.
Note:
-
If an auto-parsed input table is associated with multiple output tasks, Dataphin sets all of them as upstream dependencies by default.
-
For all parsed dependencies, the dependency period is set to the current period by default.
-
Add root node
If a task has no upstream dependencies, click Add Root Node to add the system's virtual root node as a dependency.
Note: Each tenant or enterprise has a virtual root node whose name starts with virtual_root_node.
-
Add previous cycle of the current node
Creates a self-dependency: the task's current run depends on the successful completion of its previous cycle (for example, the previous day or previous n hours).
-
Add dependency
If Auto-parse cannot resolve scheduling dependencies or the generated configuration does not match your requirements, click +Add Dependency to manually add upstream dependencies.
-
Add dependency - physical node
Select one or more physical nodes from the node list. You can search and filter by Current Project, project, node type, node name, or output table name.
-
Add dependency - logical table node
Select one or more logical table nodes from the node list. You can search and filter by logical table type, business category, and logical table name.
To depend on specific fields rather than the entire logical table, click the icon in the Dependent fields column and select the fields.
When you add a dependency, the Dependency Period and Dependency Policy are automatically set to recommended values. To modify them, click the icon in the dependency list.
-
Dependency period: The time range for the scheduled start time (trigger time) of the upstream task instance, typically the current day [00:00, 24:00).
-
Dependency policy: If multiple instances exist within a dependency period, specify a dependency policy and choose the required upstream task status: Succeeded, Failed, or Finished (Succeeded or Failed). The default is Succeeded. For default policies for cross-cycle dependencies, see Appendix: Default cross-cycle dependency policies.
Important: If there is only one instance, the dependency policy can be set to any option. To ensure compatibility with potential changes to an upstream task's scheduling settings, only relative-path policies are supported.
-
Succeeded: The downstream task can run only after the upstream task succeeds. If the upstream task fails, the downstream task does not run.
-
Failed: The downstream task can run only after the upstream task ultimately fails (after the final automatic retry). If the upstream task succeeds, the downstream task does not run.
-
Finished (Succeeded or Failed): The downstream task can run after the upstream task succeeds or ultimately fails (after the final automatic retry).
-
Output of this node
The system automatically generates an output name for the node. To add more output names, click Auto-generate Output Name.
Important: Output names are used to build the scheduling dependency graph. They are generated automatically, and manual modification is not recommended.
-
Click OK to complete the scheduling dependency configuration.
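The dependency period and dependency policy configured in this procedure can be pictured with a short Python sketch. This is only an illustration of the semantics described above, not Dataphin's implementation: the names `Status`, `Policy`, `current_day_period`, `instances_in_period`, and `dependency_satisfied`, and the dictionary layout of an instance, are all hypothetical.

```python
from datetime import datetime, timedelta
from enum import Enum


class Status(Enum):
    """Hypothetical upstream instance states used for illustration."""
    SUCCEEDED = "succeeded"
    FAILED = "failed"    # final failure, after the last automatic retry
    RUNNING = "running"


class Policy(Enum):
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"
    FINISHED = "Finished (Succeeded or Failed)"


# Upstream statuses that satisfy each dependency policy.
ACCEPTED = {
    Policy.SUCCEEDED: {Status.SUCCEEDED},
    Policy.FAILED: {Status.FAILED},
    Policy.FINISHED: {Status.SUCCEEDED, Status.FAILED},
}


def current_day_period(day):
    """Return the half-open window [00:00, 24:00) of the given day."""
    start = datetime(day.year, day.month, day.day)
    return start, start + timedelta(days=1)


def instances_in_period(instances, period):
    """Keep upstream instances whose scheduled start time is in the window."""
    start, end = period
    return [i for i in instances if start <= i["scheduled_at"] < end]


def dependency_satisfied(instance, policy):
    """True if the upstream instance status meets the chosen policy."""
    return instance["status"] in ACCEPTED[policy]
```

Because the window is half-open, an upstream instance scheduled at exactly 00:00 of the next day belongs to the next day's period. An instance that is still running satisfies none of the three policies, so the downstream task keeps waiting.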
Appendix: Default cross-cycle dependency policies
| Current node cycle | Upstream node | Upstream node cycle | Upstream self-dependency | Default dependency period |
| --- | --- | --- | --- | --- |
| Month | Current node (self-dependency) | - | | Previous cycle (previous month) |
| Week | Current node (self-dependency) | - | | Previous cycle (previous day) |
| Day | Current node (self-dependency) | - | | Previous cycle (previous day) |
| Hour | Current node (self-dependency) | - | | Last 24 hours |
| Minute | Current node (self-dependency) | - | | Last 24 hours |
| Day/Week/Month | Another node | Day | | Current period (current day) |
| Day/Week/Month | Another node | Hour/Minute | No | Current period (current day) |
| Day/Week/Month | Another node | Hour/Minute | Yes | Current period (current day) |
| Month/Week/Day/Hour/Minute | Another node | Month/Week | Yes | Current period (current day) |
| Month/Week/Day/Hour/Minute | Another node | Month | No | Current period (current day) |
| Month/Week/Day/Hour/Minute | Another node | Week | No | Current period (current day) |
| Hour/Minute | Another node | Day | | Current period (current day) |
| Hour/Minute | Another node | Hour/Minute | | Current period (current day) |
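The defaults in this appendix can also be read as a simple lookup. The following Python sketch only mirrors the table as published; the function name `default_dependency_period` and its parameters are hypothetical and are not part of any Dataphin API.

```python
from typing import Optional

CYCLES = {"Month", "Week", "Day", "Hour", "Minute"}

# Defaults for a self-dependency, copied from the appendix table.
SELF_DEPENDENCY_DEFAULTS = {
    "Month": "Previous cycle (previous month)",
    "Week": "Previous cycle (previous day)",
    "Day": "Previous cycle (previous day)",
    "Hour": "Last 24 hours",
    "Minute": "Last 24 hours",
}


def default_dependency_period(current_cycle: str,
                              upstream_cycle: Optional[str] = None) -> str:
    """Return the default dependency period listed in the appendix.

    upstream_cycle=None means the upstream node is the current node itself
    (a self-dependency); otherwise it is the cycle of another upstream node.
    """
    if current_cycle not in CYCLES:
        raise ValueError(f"unknown cycle: {current_cycle}")
    if upstream_cycle is None:
        return SELF_DEPENDENCY_DEFAULTS[current_cycle]
    if upstream_cycle not in CYCLES:
        raise ValueError(f"unknown cycle: {upstream_cycle}")
    # Every "another node" row in the table lists the same default,
    # whether or not the upstream node depends on itself.
    return "Current period (current day)"


print(default_dependency_period("Hour"))
print(default_dependency_period("Day", "Hour"))
```

In this reading, the upstream self-dependency column does not change the default period. It is omitted from the lookup for that reason.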