Create and manage MaxCompute nodes

DataWorks provides multiple MaxCompute node types and flexible scheduling configurations for task development. This topic covers how to create and manage MaxCompute nodes.

Prerequisites

Your account must be added to the workspace with the Development or Workspace Administrator role. The Workspace Administrator role has extensive permissions, so grant it with caution. For details, see Add workspace members.

Create a MaxCompute node

  1. Log on to the DataWorks console. In the target region, click Data Development and O&M > Data Development in the left-side navigation pane. Select a workspace from the drop-down list and click Go to Data Development.

  2. Bind a MaxCompute compute resource to the workspace and create a workflow.

    In Data Studio, workflows organize development by compute engine. Create a workflow before creating a node.
  3. Create a MaxCompute node. The following example uses an ODPS SQL node.

    1. Right-click a workflow and choose Create Node > MaxCompute > ODPS SQL. Alternatively, click Create in the top menu bar and follow the prompts.

      Important

      If the Create Node > MaxCompute > ODPS SQL option is unavailable, click Computing Resource in the left-side navigation pane to verify that a MaxCompute compute resource is bound. Bind the resource and refresh the page before creating MaxCompute nodes.

    2. In the dialog box, enter a node name and click OK. The node editor opens for task development and configuration.

Develop MaxCompute tasks

The following table lists the supported MaxCompute node types.

Note
  • When you run a MaxCompute task, a cost estimate is displayed. MaxCompute charges this fee, and the actual amount is the one that appears on your bill. For details, see MaxCompute billing items and methods.

  • A cost estimate error usually means that the table does not exist or that you lack the required permissions. You can ignore the error at this stage and address it when the node runs.

| Type | Scenario | Guide |
| --- | --- | --- |
| ODPS SQL | Develops MaxCompute SQL tasks. | Develop an ODPS SQL task |
| SQL Snippet | Develops MaxCompute SQL tasks. When multiple SQL procedures share similar logic but reference different tables with identical or compatible structures, you can abstract the common logic into an SQL Snippet and parameterize the input and output tables for reuse. | SQL Snippet overview |
| PyODPS 3 | Develops MaxCompute PyODPS tasks. The PyODPS 3 node is based on Python 3. | Develop a PyODPS 3 task |
| PyODPS 2 | Develops MaxCompute PyODPS tasks. The PyODPS 2 node is based on Python 2. | Develop a PyODPS 2 task |
| ODPS Spark | Develops MaxCompute Spark tasks. | Develop an ODPS Spark task |
| ODPS Script | Develops MaxCompute SQL script tasks. | Develop an ODPS Script task |
| ODPS MR | Develops MaxCompute MapReduce tasks. | Develop an ODPS MR task |
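
The idea behind the SQL Snippet node type (shared logic with parameterized input and output tables) can be illustrated with a minimal Python sketch. This is not DataWorks code: the `render_snippet` helper, the table names, and the sample query are all hypothetical, and the `@@{name}` placeholder style is an assumption modeled on how snippet parameters are typically written. Check the SQL Snippet overview for the exact syntax your workspace supports.

```python
import re

# Hypothetical snippet body: the shared logic, with the tables left as parameters.
# The @@{name} placeholder style is an assumption for illustration only.
SNIPPET = """
INSERT OVERWRITE TABLE @@{output_table}
SELECT region, SUM(amount) AS total_amount
FROM @@{input_table}
GROUP BY region;
""".strip()

PLACEHOLDER = re.compile(r"@@\{(\w+)\}")


def render_snippet(template: str, params: dict) -> str:
    """Replace every @@{name} placeholder with the matching parameter value."""

    def substitute(match):
        name = match.group(1)
        if name not in params:
            # Fail loudly rather than emit SQL with an unresolved placeholder.
            raise KeyError(f"missing snippet parameter: {name}")
        return params[name]

    return PLACEHOLDER.sub(substitute, template)


# Two callers reuse the same logic against different (hypothetical) tables.
daily_sql = render_snippet(
    SNIPPET,
    {"input_table": "orders_daily", "output_table": "sales_by_region_daily"},
)
monthly_sql = render_snippet(
    SNIPPET,
    {"input_table": "orders_monthly", "output_table": "sales_by_region_monthly"},
)

print(daily_sql)
```

Both rendered statements share one body, so a fix to the aggregation logic only has to be made once. This works only when the referenced tables have identical or compatible structures, as noted for the SQL Snippet node type.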

Create tables, resources, and functions

In addition to task development, you can create and manage MaxCompute tables, resources, and functions in DataWorks to make development more efficient.

Next steps

After you finish development, you can perform the following operations:

  • Scheduling configuration: Configure periodic scheduling properties, such as rerun settings and dependencies, for tasks that run regularly. For details, see Overview of task scheduling configuration.

  • Task debugging: Test and run the node code to verify its logic. For details, see Task debugging process.

  • Task deployment: Deploy nodes so that they run periodically based on their scheduling configurations. For details, see Deploy tasks.