Overview of Offline Task Property Configuration

To run an offline task on a recurring schedule, define its scheduling properties. These include the scheduling cycle, scheduling dependencies, and scheduling parameters. This topic describes offline task properties and scheduling.

Important Notes

The system supports scheduling configuration only for offline computing tasks with the scheduling type set to auto triggered task.
A dependency defines the execution order between two nodes. The status of an upstream node affects the execution status of downstream nodes.
When you configure dependencies, the system schedules downstream nodes as follows. First, it waits until the upstream node completes successfully. Then, it checks whether the scheduled time for the downstream node has been reached.
If you submit scheduling configuration before the scheduled time, the configuration takes effect after that time. If you configure dependencies after the scheduled time, the system creates instances one day later.
Scheduling configuration defines only the properties used when the task runs on schedule. To apply this configuration, publish the task to the production environment.
The scheduled time defines the expected execution time. The actual execution time depends on upstream node status. For details about task execution conditions, see Instance Run Diagnostics.
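The two-step check described in these notes can be sketched in a few lines of Python. This is only an illustrative model, not Dataphin's scheduler: the `Status` enum and the `can_run` function are hypothetical names introduced here, and the sketch assumes a downstream instance starts only when every upstream node has succeeded and its own scheduled time has been reached.

```python
from datetime import datetime
from enum import Enum


class Status(Enum):
    """Hypothetical upstream node states used only for this sketch."""
    SUCCESS = "success"
    RUNNING = "running"
    FAILED = "failed"


def can_run(upstream_statuses, scheduled_time, now):
    """Return True when a downstream instance is allowed to start."""
    # Step 1: every upstream node must have completed successfully.
    if not all(s is Status.SUCCESS for s in upstream_statuses):
        return False
    # Step 2: the downstream node's own scheduled time must have been reached.
    return now >= scheduled_time


scheduled = datetime(2024, 1, 1, 2, 0)

# Upstream succeeded and the scheduled time has passed: the instance can start.
print(can_run([Status.SUCCESS], scheduled, datetime(2024, 1, 1, 3, 30)))
# Upstream is still running: the instance waits even though its time has passed.
print(can_run([Status.RUNNING], scheduled, datetime(2024, 1, 1, 3, 30)))
# Upstream succeeded early: the instance still waits for its scheduled time.
print(can_run([Status.SUCCESS], scheduled, datetime(2024, 1, 1, 1, 0)))
```

The second call shows why the actual execution time can be later than the scheduled time: a slow upstream node delays the downstream instance.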

Access Offline Task Properties

On the Dataphin homepage, in the top menu bar, click Develop > Data Development.
On the Develop page, in the top menu bar, click Project.
In the navigation pane on the left, select Data Processing > Compute Job. In the Compute Job list, click the target job name.
On the task tab, click Property on the right to open the Property panel.

Configure Offline Task Properties

On the offline task property page, configure the task’s basic information and scheduling properties as described in the following table.

Configuration Item	Description
Basic Information	Includes task name, ID, node type, development owner, O&M owner, and description. Task Name: The name entered when you created the task. Node ID: A unique identifier for the node. The system generates it after you submit the node. Development Owner: Defaults to the current user. You can select any member of the current project. Note: In the production environment, you cannot configure the development owner. The value from the development environment applies. O&M Owner: Defaults to the node creator. You can select any member of the current project as the O&M owner.
Runtime Resources	CPU and memory resources assigned to run the task. Note: This setting applies only to Python, Shell, Spark on MaxCompute, Spark on Yarn, MapReduce on MaxCompute, and MapReduce on Yarn tasks.
Python Third-Party Packages	Select the Python third-party packages to import. Note: This setting applies only to Python and Shell tasks. After you add a third-party module to Python Third-Party Packages, you must declare a reference to it in the task before importing it in your code. You can configure the referenced module in Compute Task Properties > Python Third-Party Packages.
Runtime Parameters	Define parameters used during node scheduling. Dataphin provides built-in parameters and supports custom parameters to enable dynamic parameter assignment at runtime. Note: If you define variables in your node code, assign values to them here. If no variables are defined, skip this step.
Scheduling Properties	Define how the task runs on a recurring schedule in the production environment. Scheduling Type: Defines the execution status of task instances in the production environment. Priority: Sets the task priority. When you create a task, the system uses the default priority from Management Hub > Development Platform > Node Task Settings > Default Priority. Note: After you publish the task to the production environment or submit it in Basic mode, you cannot change the priority when editing the task. Update it through O&M operations in the production environment instead. The priority value shown reflects the latest setting in the production environment. Effective Date: Defines the date range during which the task runs on schedule. After the range ends, the system stops generating instances. Scheduling Cycle: Defines how often the task runs. Conditional Scheduling: Defines conditions under which the task runs. You can set multiple condition groups. The system evaluates them in order from top to bottom. When a condition matches, the system runs the corresponding schedule and stops evaluating further conditions. If no condition matches, the system uses the default schedule.
Scheduling Dependencies	Define upstream and downstream dependencies for the task. Dependencies ensure orderly execution: downstream nodes start only after upstream nodes succeed. This guarantees timely delivery of valid business data. Use automatic parsing to quickly set dependencies, or add them manually.
Runtime Configuration	Define the timeout period and retry policy for failed task runs. This prevents resource waste from long-running tasks and improves reliability.
Resource Configuration	Select the resource group for the compute task. The system uses resources from this group when scheduling the task.
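The top-to-bottom evaluation described for Conditional Scheduling is a first-match rule, which the following Python sketch illustrates. It is a simplified model under stated assumptions, not Dataphin's implementation: `ConditionGroup`, `pick_schedule`, the example conditions, and the schedule strings are all hypothetical and do not reflect how conditions are actually expressed in the product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List


@dataclass
class ConditionGroup:
    """Hypothetical condition group: a predicate plus the schedule it selects."""
    name: str
    matches: Callable[[date], bool]
    schedule: str


def pick_schedule(groups: List[ConditionGroup], default_schedule: str, day: date) -> str:
    """Evaluate groups in list order and return the first matching schedule."""
    for group in groups:
        if group.matches(day):
            # First match wins; later groups are not evaluated.
            return group.schedule
    # No group matched, so fall back to the default schedule.
    return default_schedule


# Order matters: groups listed first take precedence over later ones.
groups = [
    ConditionGroup("month_end", lambda d: d.day >= 28, "run at 01:00"),
    ConditionGroup("weekend", lambda d: d.weekday() >= 5, "skip run"),
]

# 2024-06-29 is a Saturday late in the month; both groups would match,
# but "month_end" is listed first, so its schedule is selected.
print(pick_schedule(groups, "run at 03:00", date(2024, 6, 29)))
# 2024-06-08 is a Saturday early in the month; only "weekend" matches.
print(pick_schedule(groups, "run at 03:00", date(2024, 6, 8)))
# 2024-06-10 is a Monday; nothing matches, so the default schedule applies.
print(pick_schedule(groups, "run at 03:00", date(2024, 6, 10)))
```

Because evaluation stops at the first match, placing the most specific condition groups at the top avoids having them shadowed by broader ones.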

What to do next

After configuring task properties, submit and publish the task to the production environment. Then perform related O&M operations in the production environment. For details, see Operation Center.
