Offline mode dependency configuration

Last updated: 2026-06-23 10:57:33

Dataphin runs nodes in a business process in the correct order based on their scheduling dependencies, ensuring that business data is delivered on time. This topic describes how to configure offline mode dependencies for stream-batch integrated tasks.

Background information

Scheduling dependencies define the upstream and downstream relationships between nodes. In Dataphin, a downstream task node starts only after its upstream task node has completed successfully. This ensures that each task receives the correct data at runtime. Dataphin checks the running status of the upstream node to determine whether the latest upstream table data is available, preventing the downstream node from reading data before it is ready.
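The ordering rule above can be illustrated with a minimal Python sketch. It is not Dataphin's scheduler. The node names `onedata.dwd_order` and `onedata.ads_order_report` and the helper functions `run_order` and `runnable` are hypothetical, chosen only to show that a downstream node becomes eligible to run once all of its upstream nodes have succeeded.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (assumption for illustration):
# each node maps to the set of upstream nodes it waits for.
dependencies = {
    "onedata.s_order": set(),
    "onedata.dwd_order": {"onedata.s_order"},
    "onedata.ads_order_report": {"onedata.dwd_order"},
}


def run_order(deps):
    """Return an order in which every node appears after all of its upstream nodes."""
    return list(TopologicalSorter(deps).static_order())


def runnable(node, deps, succeeded):
    """A node may start only when every upstream node has completed successfully."""
    return deps[node] <= succeeded


print(run_order(dependencies))
```

In this sketch, `runnable("onedata.ads_order_report", dependencies, {"onedata.s_order"})` is false because `onedata.dwd_order` has not yet succeeded.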

Procedure

  1. Open the offline mode configuration panel. For instructions, see Offline Mode Configuration Entry.

  2. In the Dependency section of the offline mode configuration panel, set the Dependency parameters.

    Parameter

    Description

    Start Parsing

    If the node's task type is SQL, you can click Start Parsing. The system parses the tables referenced in the code and matches each table name against existing output names. The node whose output name matches becomes an upstream dependency of the current node.

    If the code references project variables or does not specify a project, the system defaults to the production project name to ensure scheduling stability. For example, if the development project name is onedata_dev:

    • Code specifying select * from s_order results in a dependency of onedata.s_order.

    • Code with select * from ${onedata}.s_order also results in a dependency of onedata.s_order.

    • Code specifying select * from onedata.s_order results in a dependency of onedata.s_order.

    • Code specifying select * from onedata_dev.s_order results in a dependency of onedata_dev.s_order.

    Upstream Dependency

    To add an upstream node that the current node depends on for scheduling:

    1. Click Manually Add Upstream.

    2. In the New Upstream Dependency dialog box, search for dependency nodes by:

      • Entering the output name keyword of the dependent node.

      • Entering virtual to find virtual nodes (each tenant or enterprise has a root node upon initialization).

      Note: The output name of a node is globally unique and case-insensitive.

    3. Click Confirm Addition.

    To delete an added dependency node, click the delete icon in the Actions column.

    Current Node

    To set the output name of the current node so that other nodes can depend on it:

    1. Click Manually Add Output.

    2. In the Add Current Node Output dialog box, enter the output name. Use a consistent naming convention, typically project name.table name (case-insensitive). This helps identify the table produced by this node and makes it easier for other nodes to select it as a scheduling dependency.

      For example, for a development project named onedata_dev, the recommended output name is onedata.s_order. If you set the output name to onedata_dev.s_order instead, only code that explicitly references the development project, such as select * from onedata_dev.s_order, resolves to this node as an upstream dependency.

    3. Click Confirm Addition.

    For existing output names on the current node, you can:

    • Delete an output name by clicking the delete icon in the Actions column.

    • View the dependent downstream nodes by clicking the icon in the Actions column. This applies if the node has been submitted or published and has downstream dependencies (with submitted tasks).

  3. Complete the offline mode dependency configuration by clicking Confirm.
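
The parsing and matching rules described in the procedure can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Dataphin's actual parser: the regular expression only recognizes table references that follow FROM or JOIN, and the function names `resolve_dependencies` and `find_upstream_nodes`, as well as the node identifiers in the usage note, are hypothetical.

```python
import re

# Simplified pattern (assumption): a table reference follows FROM or JOIN and has
# an optional project prefix that is either a literal name or a ${variable}.
TABLE_REF = re.compile(
    r"\b(?:from|join)\s+(?:(\$\{\w+\}|\w+)\.)?(\w+)",
    re.IGNORECASE,
)


def resolve_dependencies(sql, production_project):
    """Return the dependency names implied by the table references in the SQL."""
    names = []
    for project, table in TABLE_REF.findall(sql):
        # A missing project or a project variable defaults to the production project.
        if not project or project.startswith("${"):
            project = production_project
        # Output names are case-insensitive, so normalize to lowercase.
        names.append(f"{project}.{table}".lower())
    return names


def find_upstream_nodes(sql, production_project, outputs):
    """Match resolved names against output names (output name -> node), ignoring case."""
    registry = {name.lower(): node for name, node in outputs.items()}
    resolved = resolve_dependencies(sql, production_project)
    return [registry[name] for name in resolved if name in registry]


print(resolve_dependencies("select * from ${onedata}.s_order", "onedata"))
```

With a hypothetical registry `{"onedata.s_order": "node_a"}`, the query `select * from s_order` matches `node_a`. If the output name had been registered as `onedata_dev.s_order`, the same query would match nothing, which is why the production-style output name is recommended.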
