Create a Shell offline computing task in Dataphin to run custom scripts on a schedule or on demand.
Limitations
-
You can add a dataset only after you enable the unstructured data feature.
-
Only Shell tasks in a Basic project support referencing datasets.
Permissions
The following project roles can use all datasets within the project in the task's Dataset properties:
-
Dev and Basic projects: project administrator, developer, and analyst.
-
Prod projects: project administrator and operator.
-
Custom project roles with the Dataset - Use permission.
Procedure
-
In the top navigation bar of the Dataphin homepage, choose Development > Data Development.
-
On the Development page, select a Project. In Dev-Prod mode, also select an environment.
-
In the navigation pane on the left, choose Data Processing > Compute Task. In the Compute Task list, click the icon and select Shell.
-
In the Create Shell Task dialog box, configure the following parameters.
Parameter
Description
Task name
Enter the task name.
The name can contain up to 256 characters and cannot include any of the following: | / \ : ? < > * "
Scheduling type
Select a scheduling type. Options:
-
Periodic Task: Runs automatically on a schedule.
-
Manual Task: Runs only when triggered manually.
Select directory
Select the folder to store the task.
To create a new folder:
-
Above the task list, click the icon to open the Create Folder dialog box.
-
In the Create Folder dialog box, enter a Name and optionally select a directory.
-
Click OK.
Use template
Turn on Use Template to apply a code template, then select a template and template version.
Template code is read-only; you configure only its parameters. For details, see Create an offline computing template.
Python third-party package
Select the Python third-party packages that the task needs. For details, see Install a Python module.
Note: After adding a module, declare it in the task before importing it in your code. Configure referenced modules in the task's Python third-party package properties.
Description
Enter a task description. Maximum 1,000 characters.
-
Click OK.
-
In the code editor, write your Shell script. Click Run in the toolbar to execute it.
-
In the right-side pane, click Properties. In the Properties panel, configure General, Runtime resources, Python third-party package, Dataset, Runtime parameters, Scheduling properties, Scheduling dependencies, Runtime configurations, and Resource configurations.
-
General
Basic task information: name, owner, and description. For details, see Configure general information for a task.
-
Runtime resources
CPU and memory allocated to the task. The defaults are 0.1 core and 256 MB. For details, see Configure runtime resources for an offline task.
-
Python third-party package
Select the Python packages to include. For details, see Install a Python module.
-
Dataset
Select up to 5 datasets to reference.
-
Runtime parameters
Assign values to the parameter variables used in your task code. During scheduled runs, each variable is replaced with its assigned value. For details, see Configure runtime parameters for an offline task.
-
Scheduling properties (for periodic tasks)
Required when the scheduling type is Periodic Task. Configure these settings in addition to Basic Information. For details, see Configure scheduling properties for offline tasks.
-
Scheduling dependencies (for periodic tasks)
Required when the scheduling type is Periodic Task. Configure these settings in addition to Basic Information. For details, see Configure offline task scheduling dependencies.
-
Runtime configurations
Task-level settings such as timeout and retry policy. If you do not configure them, the task inherits the tenant defaults. For details, see Configure runtime settings for a compute task.
-
Resource configurations
Assign a scheduling resource group. The task consumes the group's resource quota during scheduling. For details, see Configure resources for a compute task.
-
On the Shell task tab, save and submit the task.
-
In the toolbar, click the save icon to save the task.
-
Click the submit icon to submit it.
-
On the Submission Details page, review the Content to Submit and the pre-check results, then enter any remarks. For details, see Submission instructions for offline compute tasks.
-
Click OK and Submit.
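As an illustration of the script-writing step in the procedure above, the following is a minimal sketch of what a Shell task body might look like. It is not taken from Dataphin: the variable name biz_date, the function names, and the /tmp/demo_output path are all hypothetical, and a fixed value stands in for a runtime parameter because the exact substitution syntax is not covered in this topic. The sketch validates its input and returns a non-zero status when the input is malformed, which is a common convention for scripts run by a scheduler.

```shell
#!/bin/sh
# Sketch of a Shell task body. "biz_date" and the /tmp path are
# hypothetical names chosen for illustration, not Dataphin settings.

# Return 0 if the argument is exactly eight digits (YYYYMMDD).
is_valid_date() {
  case "$1" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the output path for one business date, or fail with a
# non-zero status if the date is malformed.
output_path() {
  if ! is_valid_date "$1"; then
    echo "invalid business date: $1" >&2
    return 1
  fi
  echo "/tmp/demo_output/dt=$1/part-0.txt"
}

# Example run: a fixed value stands in for the runtime parameter.
biz_date="20240101"
target="$(output_path "$biz_date")" || exit 1
echo "would write results to: $target"
```

Keeping the validation in small functions makes the script easy to try with Run in the code editor before you submit it, because each function can be called with a test value on its own.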
Next steps
-
In Dev-Prod mode, release the submitted task to the production environment from the release list. For details, see Manage release tasks.
-
In Basic mode, the task is ready for scheduling as soon as it is submitted. View and manage it in the operation center. For details, see Manage integration and compute tasks and Manage manual tasks.