Overview of Offline Task Property Configuration
To run an offline task on a recurring schedule, define its scheduling properties. These include the scheduling cycle, scheduling dependencies, and scheduling parameters. This topic describes offline task properties and scheduling.
Important Notes
The system supports scheduling configuration only for offline computing tasks with the scheduling type set to auto triggered task.
A dependency defines the execution order between two nodes. The status of an upstream node affects the execution status of downstream nodes.
When you configure dependencies, the system schedules downstream nodes as follows. First, it waits until the upstream node completes successfully. Then, it checks whether the scheduled time for the downstream node has been reached.
If you submit the scheduling configuration before the scheduled time, the configuration takes effect once that time is reached. If you configure dependencies after the scheduled time has passed, the system creates instances one day later.
Scheduling configuration defines only the properties used when the task runs on schedule. To apply this configuration, publish the task to the production environment.
The scheduled time defines the expected execution time. The actual execution time depends on upstream node status. For details about task execution conditions, see Instance Run Diagnostics.
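The two-step check described in the notes above (upstream success first, then the scheduled time) can be sketched as follows. This is a minimal illustration, not Dataphin's implementation: the `Status` enum, the `Instance` class, and the `can_start` function are hypothetical names introduced here, and the sketch assumes a downstream node may have several upstream nodes.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Status(Enum):
    """Hypothetical instance states, used only for this illustration."""
    WAITING = "waiting"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class Instance:
    """Hypothetical model of one scheduled run of a node."""
    name: str
    scheduled_time: datetime
    status: Status = Status.WAITING


def can_start(downstream: Instance, upstreams: list[Instance], now: datetime) -> bool:
    """Return True only when both scheduling conditions hold."""
    # Step 1: every upstream instance must have completed successfully.
    if any(u.status is not Status.SUCCEEDED for u in upstreams):
        return False
    # Step 2: the downstream node's own scheduled time must have been reached.
    return now >= downstream.scheduled_time


upstream = Instance("upstream", datetime(2024, 1, 1, 1, 0), Status.SUCCEEDED)
downstream = Instance("downstream", datetime(2024, 1, 1, 2, 0))

# Upstream succeeded, but the scheduled time (02:00) is not reached yet.
too_early = can_start(downstream, [upstream], datetime(2024, 1, 1, 1, 30))
# Upstream succeeded and the scheduled time has passed.
ready = can_start(downstream, [upstream], datetime(2024, 1, 1, 2, 30))
```

Because the upstream check comes first, a late upstream run delays the downstream node past its scheduled time, which is why the scheduled time is only the expected execution time.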
Access Offline Task Properties
On the Dataphin homepage, in the top menu bar, click Develop > Data Development.
On the Develop page, in the top menu bar, click Project.
In the navigation pane on the left, select Data Processing > Compute Job. In the Compute Job list, click the target job name.
On the task tab, click Property on the right to open the Property panel.
Configure Offline Task Properties
On the offline task property page, configure the task’s basic information and scheduling properties using the table below.
| Configuration Item | Description |
| | Includes task name, ID, node type, development owner, O&M owner, and description. |
| | CPU and memory resources assigned to run the task. Note: This setting applies only to Python, Shell, Spark on MaxCompute, Spark on Yarn, MapReduce on MaxCompute, and MapReduce on Yarn tasks. |
| Python Third-Party Packages | Select the Python third-party packages to import. |
| | Define parameters used during node scheduling. Dataphin provides built-in parameters and supports custom parameters to enable dynamic parameter assignment at runtime. Note: If you define variables in your node code, assign values to them here. If no variables are defined, skip this step. |
| | Define how the task runs on a recurring schedule in the production environment. |
| | Define upstream and downstream dependencies for the task. Dependencies ensure orderly execution: downstream nodes start only after upstream nodes succeed. This guarantees timely delivery of valid business data. Use automatic parsing to quickly set dependencies, or add them manually. |
| | Define the timeout period and retry policy for failed task runs. This prevents resource waste from long-running tasks and improves reliability. |
| | Select the resource group for the compute task. The system uses resources from this group when scheduling the task. |
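To illustrate the idea of dynamic parameter assignment mentioned in the table, the sketch below replaces variables in task code with values supplied at run time. It uses Python's standard `string.Template`, whose `${name}` placeholder syntax is an assumption made for this example and is not taken from Dataphin. The function `render_task_code`, the variable `run_date`, and the sample SQL are likewise hypothetical.

```python
from string import Template


def render_task_code(code: str, params: dict[str, str]) -> str:
    """Replace ${name} placeholders in task code with parameter values.

    Template.substitute raises KeyError when a variable used in the code
    has no assigned value, mirroring the rule that every variable defined
    in the node code needs a value.
    """
    return Template(code).substitute(params)


# Hypothetical task code containing one variable.
task_code = "SELECT * FROM orders WHERE ds = '${run_date}'"

# The value would normally come from the scheduling parameter settings.
rendered = render_task_code(task_code, {"run_date": "20240101"})
```

Code that contains no variables passes through unchanged, which corresponds to skipping the parameter step when no variables are defined.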
What to Do Next
After configuring task properties, submit and publish the task to the production environment. Then perform related O&M operations in the production environment. For details, see Operation Center.