A PAI Designer task calls and schedules machine learning tasks that you build on the PAI platform.
Prerequisites
Before you start, make sure that the following requirements are met:
-
A machine learning workflow has been created in PAI Designer. For more information, see Create and manage workflows.
-
PAI is enabled as the compute source for your project. For more information, see Create a general-purpose project.
Limits
-
Dataphin supports importing only PAI Designer workflows that are created using MaxCompute compute resources.
-
You can create PAI Designer tasks only in Dataphin projects that use MaxCompute as the compute engine.
-
Only the new version of Machine Learning Designer integration is supported.
-
When you create a Machine Learning Designer workspace, you must select MaxCompute as the compute resource.
-
When you configure the PAI workspace, publish the scheduled workflow to the workspace using the AccessKey (AK) of the MaxCompute offline compute engine. Dataphin then retrieves all public and private workflows associated with the AK.
-
The amount of MaxCompute resources consumed by PAI tasks is not included in the project's total consumption that is displayed in the asset overview.
-
You must configure PAI parameters on the PAI Designer task.
-
If nodes that read or write data tables in a PAI Designer workflow access tables in other projects, Dataphin cannot automatically switch those tables between the development and production environments. You must explicitly specify the environment of the data that the nodes access.
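The last limit can be illustrated with a small sketch. It assumes, purely for illustration, that a development project is named after its production project with a `_dev` suffix. The helper `qualified_table` and the project and table names are hypothetical and are not part of Dataphin or PAI. Check the actual project names in your own environment before you rely on this pattern.

```python
def qualified_table(project: str, table: str, environment: str) -> str:
    """Build an explicit project.table reference for a cross-project table.

    Assumption for this sketch: the development project is the production
    project name plus a "_dev" suffix. Adjust this to your own naming.
    """
    if environment == "prod":
        return f"{project}.{table}"
    if environment == "dev":
        return f"{project}_dev.{table}"
    raise ValueError(f"unknown environment: {environment!r}")


# Hypothetical names. Because automatic switching is not supported for
# cross-project tables, the environment is spelled out in the reference.
print(qualified_table("sales_mart", "orders", "prod"))
print(qualified_table("sales_mart", "orders", "dev"))
```

Writing the environment into the table reference makes it visible in the workflow which data a node reads or writes, at the cost of updating the reference by hand when the task moves between environments.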
Procedure
-
On the Dataphin homepage, choose Development > Data Development from the top menu bar.
-
On the Development page, select a Project from the top menu bar. In Dev-Prod mode, you must also select an environment.
-
In the navigation pane on the left, choose Data Processing > Script Task. In the Script Task list, click the icon and choose PAI Designer.
-
In the New PAI Designer task dialog box, configure the following parameters.
Parameter
Description
Name
Enter a name for the offline computing task.
The name can be up to 256 characters in length. It cannot contain vertical bars (|), forward slashes (/), backslashes (\), colons (:), question marks (?), angle brackets (<>), asterisks (*), or double quotation marks (").
Schedule Type
Select the scheduling type for the machine learning task.
-
Recurring Task Node: The PAI Designer node is created as an auto triggered task, which Dataphin schedules automatically. For more information about auto triggered tasks, see Scheduling methods.
-
One-time Task Node: The PAI Designer node is created as a one-time task, which you must trigger manually each time you want it to run. For more information about one-time tasks, see Manage one-time tasks.
Select Directory
Select the folder where you want to store the task.
If you do not have a folder, you can create a new folder as follows:
-
Above the task list on the left, click the icon to open the New Folder dialog box.
-
In the New Folder dialog box, enter a Name for the folder and select a location in Select Directory as needed.
-
Click OK.
Description
Enter a brief description of the machine learning task. The description can be up to 1,000 characters in length.
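The length and character rules for Name and Description can be checked in advance, for example when task names are generated by a script. The following is a minimal sketch of those two rules only. The function names are hypothetical, and Dataphin may apply additional checks that are not described here.

```python
# Characters that a task name cannot contain, per the Name rule above.
FORBIDDEN_NAME_CHARS = set('|/\\:?<>*"')
MAX_NAME_LENGTH = 256
MAX_DESCRIPTION_LENGTH = 1000


def validate_task_name(name: str) -> list:
    """Return a list of problems with the name; an empty list means it passes."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append(
            f"name has {len(name)} characters; the limit is {MAX_NAME_LENGTH}"
        )
    bad = sorted(set(name) & FORBIDDEN_NAME_CHARS)
    if bad:
        problems.append("name contains forbidden characters: " + " ".join(bad))
    return problems


def validate_description(description: str) -> list:
    """Return a list of problems with the description; empty means it passes."""
    if len(description) > MAX_DESCRIPTION_LENGTH:
        return [
            f"description has {len(description)} characters; "
            f"the limit is {MAX_DESCRIPTION_LENGTH}"
        ]
    return []


# Hypothetical task names used only to exercise the checks.
print(validate_task_name("pai_churn_model_daily"))
print(validate_task_name("train/eval:v2"))
```

Returning a list of problems instead of raising on the first one lets a script report every issue with a name in a single pass.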
-
Click OK.
-
On the tab of the PAI Designer task, click Import Workflow above the code editor to import a developed workflow.
Note
-
After the workflow is imported, it is displayed on the page. You can also click Edit Workflow at the top of the page to open and edit the workflow in Machine Learning Designer. For more information, see Create and manage workflows.
-
If the workflow is not displayed after import, click Refresh Workflow at the top of the page.
-
Click Property in the right-side pane. On the Property tab, configure the Basic Information, Runtime Parameter, Schedule Property (for auto triggered tasks), Schedule Dependency (for auto triggered tasks), Runtime Configuration, and Resource Configuration for the task.
-
Basic Information
Configure basic information for the task, such as its name, owner, and description. For more information, see Configure basic information of a task.
-
Runtime Parameter
If your task uses parameter variables, assign values on this tab. The variables are automatically replaced with the specified values during scheduling. For more information, see Configure runtime parameters for an offline task.
-
Schedule Property (for auto triggered tasks)
If the Schedule Type is Recurring Task Node, configure the scheduling properties in addition to Basic Information. For more information, see Configure scheduling properties for an offline task.
-
Schedule Dependency (for auto triggered tasks)
If the Schedule Type is Recurring Task Node, configure the scheduling dependencies in addition to Basic Information. For more information, see Configure scheduling dependencies for an offline task.
-
Runtime Configuration
Optionally, configure a task-level timeout and retry policy. If you do not configure them, the tenant-level default settings apply. For more information, see Configure runtime settings for a compute task.
-
Resource Configuration
Configure a scheduling resource group for the task. The task consumes the resource quota of this group during scheduling. For more information, see Configure resources for a compute task.
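The variable replacement described under Runtime Parameter can be pictured with a short sketch. This is not Dataphin's implementation. It uses Python's standard `string.Template` to show the idea, and the `${name}` placeholder syntax, the variable names, and the parameter text are assumptions made for this example.

```python
from string import Template

# Hypothetical parameter text containing placeholder variables.
# The ${name} syntax is an assumption made for this sketch.
workflow_param = Template(
    "input_table=sales_orders partition=ds=${bizdate} model_version=${model_version}"
)

# Hypothetical values, as if assigned under Runtime Parameter.
runtime_values = {"bizdate": "20240101", "model_version": "v3"}


def missing_variables(template: Template, values: dict) -> list:
    """List placeholder names in the template that have no assigned value."""
    names = set()
    for match in template.pattern.finditer(template.template):
        name = match.group("named") or match.group("braced")
        if name:
            names.add(name)
    return sorted(names - set(values))


def render(template: Template, values: dict) -> str:
    """Replace every placeholder; raises KeyError if a variable has no value."""
    return template.substitute(values)


resolved = render(workflow_param, runtime_values)
print(resolved)
```

Checking for unassigned variables before a run, as `missing_variables` does here, mirrors the point of the step: every variable that the task uses needs a value before scheduling replaces it.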
-
On the tab of the PAI Designer task, save and commit the task.
-
Click the icon above the code editor to save the code.
-
Click the icon above the code editor to commit the code.
-
On the Submitting Log page, confirm the Submission Content and the results of the Pre-check. Then, enter remarks. For more information, see Instructions on how to commit an offline computing task.
-
After you confirm the information, click Confirm and Commit.
What to do next
-
In Dev-Prod mode, after a task is committed, you must go to the release list to publish the task to the production environment. For more information, see Manage release tasks.
-
In Basic mode, after a PAI Designer task is committed, it can be scheduled in the production environment. You can then view the published task in the Operation Center. For more information, see Manage integration and compute tasks and Manage one-time tasks.