This topic explains how to use Resource Management to create different types of CDH resources and functions.
Prerequisites
- You have registered a CDH cluster with DataWorks. All resource and function operations use CDH compute resources.
- Your resource files have been developed and are ready to be uploaded from your local machine.
Access resource management
- Go to the Workspaces page in the DataWorks console. In the top navigation bar, select the desired region. Find the desired workspace and open it from the Actions column.
- In the left-side navigation pane, click the Resource Management icon to go to the Resource Management page.
- On the Resource Management page, click the icon to create a resource or function. Alternatively, you can first create a directory to organize your resources by clicking Create Directory. Then, right-click the target directory, select Create, and choose the type of resource or function that you want to create.
Create and use resources
Resources
In Data Studio, you can upload local resources to a CDH cluster through DataWorks. The following table lists the supported resource types for developing CDH jobs or creating custom functions.
| Resource type | Description | Supported upload methods: Local | Supported upload methods: OSS |
| --- | --- | --- | --- |
| CDH Jar | A compiled Java JAR package used to run Java programs. The file extension is .jar. | | |
| CDH File | You can upload any file type as a CDH File resource. Its use depends on the compute engine. | | |
Limitations
The following limitations apply when you upload resources:
- Resource size:
- Resource publishing: If you use a standard mode workspace, you must publish the resource to the production environment before you can use it.
  Note: Data source configurations may differ between the development and production environments. Before you query tables or use resources, confirm the data source configuration for the current environment.
- Resource management: In DataWorks, you can view and manage only the resources that are uploaded through the DataWorks UI.
Create a resource
You can upload CDH resources from your local machine. After you create a resource, you can reference it directly in data development or register it as a custom function.
- On the Resource Management page, create a resource. The Create Resource and Function dialog box opens. Configure the resource Type, storage Path, and resource Name.
- After you create the resource entry, upload a local file. The following table describes the key upload parameters:
| Parameter | Description |
| --- | --- |
| Storage Path | The default path is /user/admin/lib. Note: If Kerberos authentication is enabled, you must first grant the current user write permissions to this directory. |
| Data Sources | Select an existing CDH data source. |
| Resource Group | Select a Serverless resource group that can connect to the CDH cluster. |
- In the top toolbar, click Save and then click Publish to publish the resource. You can use only published resources in data development.
Use a resource
After you create a resource, you can reference it during data development. In the left-side navigation pane, click Resource Management, find the target resource or function, right-click it, and select Insert Resource Path. This action inserts a code snippet in the format ##@resource_reference{"Resource Name"} into your editor.
For example, in a CDH Hive node, the reference might look like ##@resource_reference{"example"}. The format may vary between different node types. Refer to the actual UI for the correct format.
In addition to using resources directly, you can also create a function from a resource and then use the function in your development nodes.
Create and use functions
Functions
Data Studio allows you to register resources as CDH functions. In data development or SQL queries, you can use both the built-in functions provided by Hive and the custom functions that you create.
Create a function
- On the Resource Management page, create a function. The Create Resource and Function dialog box opens. Configure the function Type, storage Path, and function Name.
- Click Confirm to create the function. Then, configure the function details based on its type.
  Before you configure a CDH function, ensure that you have registered the CDH cluster as a compute resource in DataWorks and have uploaded the required CDH resource. The following table describes the key parameters for a CDH function.
| Parameter | Description |
| --- | --- |
| Function type | Select a function type: MATH (mathematical), AGGREGATE (aggregation), STRING (string manipulation), DATE (date), ANALYTIC (analytic), or OTHER (other). |
| Data Sources | Select an existing CDH data source from the drop-down list. |
| Class Name | The class name for the user-defined function (UDF), in the format ResourceName.ClassName. The resource name can be a Java package name or a file resource name. When you create a custom function in DataWorks, you can use either JAR or File type CDH resources. If the resource type is JAR, the Class Name format is PackageName.ActualClassName. You can obtain this value from IntelliJ IDEA by using the Copy Reference command. For example, if the package name is com.aliyun.cdh.examples.udf and the actual class name is UDAFExample, set the Class Name parameter to com.aliyun.cdh.examples.udf.UDAFExample. Note: Do not include the .jar suffix when you enter the resource name. You must publish the resource before you can use it. |
| Resource List | For a CDH function, only visual mode is supported. This requires selecting a CDH Jar or CDH File resource. |
| Command Format | A usage example for the UDF. |
- In the top toolbar, click Save and then click Publish to publish the function. You can use only published functions in data development.
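For orientation, the function Name, Class Name, and Resource List settings described above correspond to the three parts of a function declaration in plain Hive SQL. The following is a minimal sketch under stated assumptions, not the statement that DataWorks runs on your behalf: the function name example_udaf and the file name example.jar are hypothetical, the class name is the example value from the Class Name parameter, and /user/admin/lib is the default storage path for uploaded resources.

```sql
-- Illustrative Hive-native counterpart of the dialog settings (not a required step).
-- Function name -> example_udaf (hypothetical)
-- Class Name    -> com.aliyun.cdh.examples.udf.UDAFExample (example value from this topic)
-- Resource List -> a JAR stored on HDFS; example.jar is an assumed file name
CREATE FUNCTION example_udaf
  AS 'com.aliyun.cdh.examples.udf.UDAFExample'
  USING JAR 'hdfs:///user/admin/lib/example.jar';
```

Because registration in DataWorks goes through the dialog box, the sketch is mainly useful for checking that the Class Name you enter is the fully qualified class inside the JAR that you select in Resource List.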
Use a function
After you create and publish a function, you can reference it directly in data development or SQL queries.
- When you edit a data development node, click Resource Management in the left-side navigation pane. Find the target function, right-click it, and select Insert Function. The function name, such as example_function(), is automatically inserted into the editor.
- When you edit a SQL query, you can use the created function directly in your SQL statement.
  SELECT example_function(column_name) FROM table_name;
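As a further sketch, a published function can be inspected and called like any other Hive function. The table orders and its column customer_name are hypothetical, and the query assumes that example_function is a scalar function that takes one argument. Adjust both to match your own function and data.

```sql
-- Show the class and usage text that Hive has recorded for the function.
DESCRIBE FUNCTION EXTENDED example_function;

-- Call the function in the select list and in a filter.
-- orders and customer_name are placeholder names for illustration only.
SELECT customer_name,
       example_function(customer_name) AS transformed_name
FROM orders
WHERE example_function(customer_name) IS NOT NULL
LIMIT 10;
```

If the function is an aggregate function instead, it cannot appear in the WHERE clause. Call it in the select list, optionally with a GROUP BY clause.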
Manage resources and functions
After you create resources and functions, you can manage them from the Resource Management page. Click a resource or function to open it in the editor.
- View historical versions: In the right-side pane of the editor, click the versions icon. You can view and compare saved or submitted versions to track changes.
  Note: You must select at least two versions to run a comparison.
- Delete a resource or function: In the Resource Management pane, right-click the target item and select Delete.
  To delete a resource or function from the production environment, you must publish this change. After the publishing task is complete, the item is deleted from the production environment.