Serverless Kyuubi node development - DataWorks - Alibaba Cloud Help Center

Usage notes

Compute resource limitations: Only bound EMR Serverless Spark compute resources are supported. You must ensure that the resource group and the compute resource can communicate over the network.

Resource group constraints: This task runs only in a Serverless resource group.

(Optional, for RAM users) The Resource Access Management (RAM) user for task development must be added to the workspace and assigned the Development or Workspace Administrator role. The Workspace Administrator role includes extensive permissions, so grant it with caution. For more information, see Add workspace members.

If you are using a root account, skip this step.

Create a node

For instructions, see Create a node.

Develop the node

Write task code in the SQL editor. Define variables by using the ${variable_name} syntax, and then assign their values in the Scheduling Parameters section of the Scheduling Settings panel on the right. The system replaces each variable with its assigned value at runtime. For more information, see Scheduling parameter sources and expressions. Example:

SHOW TABLES;
SELECT * FROM kyuubi040702 WHERE age >= '${a}'; -- Use with a scheduling parameter.
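
To make the runtime replacement concrete, the following Python sketch imitates how a ${variable_name} placeholder could be swapped for its assigned value before the statement is submitted. It illustrates the behavior only and is not how DataWorks implements substitution; the render_sql helper and the sample value 18 are hypothetical.

```python
import re

# Matches ${name} placeholders. Assumption: names are identifier-like.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def render_sql(sql: str, params: dict) -> str:
    """Replace each ${name} with its assigned value (hypothetical helper)."""

    def substitute(match):
        name = match.group(1)
        if name not in params:
            # Fail loudly instead of sending an unresolved placeholder to Spark.
            raise KeyError(f"no value assigned for ${{{name}}}")
        return params[name]

    return PLACEHOLDER.sub(substitute, sql)


sql = "SELECT * FROM kyuubi040702 WHERE age >= '${a}';"
rendered = render_sql(sql, {"a": "18"})
print(rendered)  # SELECT * FROM kyuubi040702 WHERE age >= '18';
```

Because the replacement is textual, the quotes around '${a}' in the example stay in the final statement, so the assigned value reaches Spark as a string literal.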

Note

An SQL statement cannot exceed 130 KB.
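
A local check along these lines can flag oversized statements before a run. This is a rough sketch under several assumptions that the limit description does not confirm: 1 KB is taken as 1,024 bytes, size is measured on the UTF-8 encoding, and statements are split naively on semicolons (so a semicolon inside a string literal would be split incorrectly). The helper names are hypothetical.

```python
# Assumption: the 130 KB limit is 130 * 1024 bytes of UTF-8 text.
MAX_STATEMENT_BYTES = 130 * 1024


def statement_size_bytes(statement: str) -> int:
    """Size of one statement in bytes when encoded as UTF-8."""
    return len(statement.encode("utf-8"))


def oversized_statements(script: str) -> list:
    """Return the statements in a script that exceed the assumed limit.

    Splitting on ';' is a simplification and ignores semicolons that
    appear inside string literals or comments.
    """
    statements = [part.strip() for part in script.split(";") if part.strip()]
    return [s for s in statements if statement_size_bytes(s) > MAX_STATEMENT_BYTES]


script = "SHOW TABLES; SELECT * FROM kyuubi040702 WHERE age >= '18';"
print(oversized_statements(script))  # []
```

Whether the limit applies before or after variables are replaced is not stated here, so checking the rendered statement is the more conservative choice.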

Debug the node

In the Run Configuration, configure parameters such as Compute Resource and Resource Group.

Parameter	Description
Compute Resource	Select a bound EMR Serverless Spark compute resource. You must first bind an EMR Serverless Spark compute resource. If no compute resources are available, select Create Compute Resource from the drop-down list to create one.
Resource Group	Select a resource group that is bound to the workspace.
Script Parameters	If you define variables by using the `${parameter_name}` syntax in the node content, specify the Parameter Name and Parameter Value in the Script Parameters section. The system replaces the variables with their actual values at runtime. For more information, see Scheduling parameter sources and expressions.
ServerlessSpark Node Parameters	Built-in Spark property parameters. For more information, see open source Spark property parameters and EMR Serverless Spark configuration parameters. Configuration format: `"spark.eventLog.enabled": false`. Note: You can set global Spark parameters at the workspace level to apply them to all DataWorks modules, and configure whether they take priority over module-level parameters. For more information, see Configure global Spark parameters.
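
The configuration format shown for ServerlessSpark Node Parameters resembles a JSON key-value pair, so one way to produce entries without quoting mistakes is to generate them. The sketch below assumes JSON-style quoting for keys and values, which matches the single documented example but is not otherwise confirmed; format_spark_property is a hypothetical helper, and how multiple entries are separated in the console is not specified here.

```python
import json


def format_spark_property(key: str, value) -> str:
    """Render one property as "key": value, using JSON quoting rules.

    Assumption: booleans and numbers are written bare (false, 200) and
    strings are double-quoted, as in the documented example.
    """
    return f"{json.dumps(key)}: {json.dumps(value)}"


entry = format_spark_property("spark.eventLog.enabled", False)
print(entry)  # "spark.eventLog.enabled": false
```

Generating the entry this way keeps Python's False from leaking into the field as a capitalized literal, which would not match the documented lowercase form.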

On the toolbar at the top of the node editor, click Run to run the task.

Important
Before you deploy the node, synchronize the ServerlessSpark Node Parameters from Run Configuration to the ServerlessSpark Node Parameters section of Scheduling Settings.

Next steps

Configure node scheduling: If you need to run a node periodically, configure its Scheduling Policy in the Scheduling Settings panel on the right.
Publish a node: To run a task in the production environment, click the icon to publish the node. A node runs on schedule only after it is published to the production environment.
Task O&M: After a task is published, you can monitor the status of its periodic runs in the Operation Center. For more information, see Get started with Operation Center.