Create a database SQL task to run SQL statements against external data sources in Dataphin.
Limitations
-
Stored procedures are supported but not validated or parsed.
-
Executing a database SQL task requires data source execution permissions. Table-level permission checks are not supported.Apply for, renew, and return data source permissions.
-
MySQL and Oracle data source SQL tasks cannot trigger data source quality rules.Create data quality rules for a Dataphin table.
-
Only sample data can be downloaded from query results.
-
Offline computing template nodes do not support the database SQL task type.
-
In projects that use the Dev-Prod mode, if you use a data source configured only for the production environment, running backfill or other operations in the development environment may alter production data. Use with caution.
Procedure
-
In the top navigation bar of the Dataphin homepage, choose Develop > Data Development.
-
In the top navigation bar of the Develop page, select a Project. If your project is in Dev-Prod mode, you must also select an environment.
-
In the left-side navigation pane, choose Data Processing > Compute Task. In the Compute Task list, click the
icon and choose Database SQL. -
In the Create Database SQL Task dialog box, configure the following parameters.
Parameter
Description
Basic information
Task name
Enter a name for the compute task.
The name can be up to 256 characters long and cannot contain vertical bars (
|), forward slashes (/), backslashes (\), colons (:), question marks (?), angle brackets (<>), asterisks (*), or double quotation marks (").Scheduling type
Select a scheduling type:
-
Periodic task: Runs automatically on a recurring schedule.
-
Manual task: Must be triggered manually.
Select directory
Select the directory to store the task.
To create a directory, click New Folder:
-
Above the compute task list, click the
icon to open the New Folder dialog box. -
In the New Folder dialog box, enter a folder Name and, if necessary, select its location in the Select Directory list.
-
Click OK.
Use template
To use a code template, turn on the Use Template switch, and then select a template and a template version. By default, the latest version of the selected template is used.
Template code is read-only — configure only the template parameters.Create an offline computing template.
NoteIf a database SQL task that uses an AnalyticDB for PostgreSQL, StarRocks, Doris, SelectDB, or GaussDB (DWS) data source references an offline computing template, you cannot change the task type.
Description
Enter a brief description of the task, up to 1,000 characters.
Data source configuration
Data source type
Select the data source type. Supported types are listed in the Offline Development-Database SQL column in Data sources supported by Dataphin.
With a template, the data source type is locked to the template's type.
Data source
Select a data source, or click + New Data Source to create one.
With a template, the data source defaults to the template's data source but can be changed.
Catalog
Required for Presto and Trino data sources.
Database/Schema
If the data source type is MySQL, PostgreSQL, AnalyticDB for PostgreSQL, Oracle, Presto, GaussDB (DWS), Microsoft SQL Server, ClickHouse, Hologres, Doris, openGauss, StarRocks, DM, OceanBase (Oracle tenant mode), SelectDB, Trino, PolarDB-X 2.0, or Lindorm, you must configure the Database/Schema after you configure the data source.
With a template, the Database/Schema defaults to the template's schema but can be changed.
-
-
Click OK.
-
In the code editor on the task tab, write the code for the database SQL task. Database SQL tasks support cross-node output parameters.Assign a value to a cross-node output parameter and Best practices for cross-node parameters. After writing the code, click Run above the code editor.
-
On the right sidebar, click Properties. In the Properties panel, configure the task's Basic Information, Parameters, scheduling properties (for periodic tasks), scheduling dependencies (for periodic tasks), runtime configuration, and resource configuration.
-
Basic information
Configure the task name, owner, and description.Configure basic information for a task.
-
Parameters
Assign values to parameter variables in the Properties panel. Dataphin substitutes the assigned values at scheduling time.Configure runtime parameters for an offline task.
-
Scheduling properties (for periodic tasks)
Required for Periodic Task scheduling type, in addition to Basic Information.Configure scheduling properties for offline tasks.
-
Scheduling dependencies (for periodic tasks)
Required for Periodic Task scheduling type, in addition to Basic Information.Configure scheduling dependencies for offline tasks.
ImportantAutomatic dependency parsing can be inaccurate. Verify the dependencies based on your code.
-
Runtime configuration
Set a task-level runtime timeout and retry policy. Tenant-level defaults apply if not configured.Configure runtime settings for a compute task.
-
Resource configuration
Database SQL tasks use shared resources and do not support custom resource groups.Configure resources for a compute task.
-
-
On the task tab, save and commit the current task.
-
Click the
icon above the code editor to save the code. -
Click the
icon above the code editor to commit the code.
-
-
On the Commit Details page, verify the Content to Commit and Pre-check results, and enter a commit description.Commit an offline compute task.
-
Click Confirm and Commit.
What to do next
-
In Dev-Prod mode, publish the task to the production environment from the release list after committing.Manage release tasks.
-
In Basic mode, the task is scheduled in the production environment after submission. View published tasks in O&M:Manage integration and compute tasks, Manage manual tasks.