Create a workspace

All job development is performed within a workspace, so you must create one first. This topic describes how to create a workspace in the E-MapReduce console.

Prerequisites

You have registered an Alibaba Cloud account and completed real-name verification.
Your account has the permissions required to create a workspace:
- If you use your Alibaba Cloud account, see Assign roles to an Alibaba Cloud account for authorization details.
- If you use a RAM user or a RAM role, ensure that the AliyunEMRServerlessSparkFullAccess, AliyunOSSFullAccess, and AliyunDLFFullAccess access policies are attached to the RAM user or RAM role. Then, on the Access Control page, add the RAM user or RAM role and grant it the administrator role. For more information, see Grant permissions to a RAM user and Manage users and roles.
Data Lake Formation (DLF) is activated. For more information, see Quick Start. For a list of regions that support DLF, see Regions and endpoints.
Object Storage Service (OSS) is activated and a bucket is created. For more information, see Activate OSS and Create a bucket.
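
The policy requirement for a RAM user or RAM role can be checked before you open the console. The following is a minimal sketch, not an official tool: it assumes you have already collected the names of the attached policies yourself (for example, by copying them from the RAM console), and the function name `missing_policies` is hypothetical. It makes no API calls.

```python
# Policies that the prerequisites above require for a RAM user or RAM role.
REQUIRED_POLICIES = frozenset({
    "AliyunEMRServerlessSparkFullAccess",
    "AliyunOSSFullAccess",
    "AliyunDLFFullAccess",
})


def missing_policies(attached):
    """Return the required policy names that are absent from `attached`, sorted.

    `attached` is any iterable of policy-name strings that you gathered yourself;
    this helper is hypothetical and does not query RAM.
    """
    return sorted(REQUIRED_POLICIES - set(attached))


if __name__ == "__main__":
    # Example input: only the OSS policy is attached so far.
    print(missing_policies(["AliyunOSSFullAccess"]))
```

An empty result means all three policies are present. The administrator role on the Access Control page still has to be granted separately.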

Precautions

You are responsible for managing and configuring the runtime environment for your code.

Create a subscription workspace

Go to the EMR Serverless Spark page.
1. Log on to the EMR console.
2. In the left-side navigation pane, choose EMR Serverless > Spark.
3. In the top navigation bar, select a region based on your requirements.
  Important
  You cannot change the region of a workspace after you create it.
Click Create Workspace.

On the Create Workspace page, configure the following parameters.

Parameter	Description	Example
Region	Select the region where your data is located.	China (Hangzhou)
Billing Method	Select Subscription.	subscription
Workspace Name	The name must be 1 to 64 characters long and can contain only Chinese characters, letters, digits, hyphens (-), and underscores (_). Note If you enter a name that is already in use, the system prompts you to provide a different one.	emr-serverless-spark
CU Quota	The maximum number of concurrent compute units (CUs) available for processing jobs in the workspace. Note The maximum CU quota for a workspace is 1,000 CUs. If you need a higher quota, submit a ticket.	1000
Workspace Directory	The directory used to store data files, such as job logs, runtime events, and resources. Select a bucket with OSS-HDFS enabled for native HDFS interface compatibility. If your use case does not require HDFS, you can select a standard OSS bucket. Note You can specify either a parent directory or a subdirectory for the OSS path based on your needs. Spark automatically creates a directory named after the workspace ID under the specified OSS path to store the following data: <workspace-ID>/spark/logs/: Spark job logs <workspace-ID>/spark/eventlogs/: Spark event logs <workspace-ID>/spark/snapshot/: Spark snapshot data	emr-oss-hdfs
DLF for Metadata Storage	Stores and manages your metadata. You can select a DLF or DLF-Legacy catalog. If DLF is activated, EMR defaults to a DLF catalog. If only DLF-Legacy is set up, EMR defaults to the DLF-Legacy catalog with an identical name to your UID. To use different data catalogs for different clusters, create a new catalog: Click Create Catalog. In the dialog box that appears, enter a Catalog Name and click Create Catalog. For more information, see Create a catalog. From the drop-down list, select the data catalog that you created.	emr-dlf
Execution Role	Specifies the role that Serverless Spark assumes to run jobs. The default role is AliyunEMRSparkJobRunDefaultRole. Serverless Spark uses this role to access your resources in other services, such as OSS and DLF. If you want to control the permissions of the execution role, you can use a custom execution role. For more information, see Execution role.	AliyunEMRSparkJobRunDefaultRole
(Optional) Advanced Settings	Tags: Tags are identifiers for your cloud resources. You can use tags to classify, search for, and aggregate resources that share the same characteristics, which improves resource management efficiency. You can bind up to 20 tags to each workspace. Each tag consists of a custom tag key and tag value. Tags also enable cost allocation and fine-grained management of pay-as-you-go resources. You can bind tags when you create a workspace, or add or modify tags at any time on the workspace list page. Binding tags to resources helps you classify them and optimize operations. For more information about tags, see What is a tag?.	Enter a custom tag key and tag value
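
The Workspace Name rule in the table can be checked locally before you submit the form. The sketch below is an approximation under one stated assumption: "Chinese characters" is interpreted as the CJK Unified Ideographs range U+4E00 to U+9FFF, which may be narrower or wider than what the console accepts. The function name `is_valid_workspace_name` is hypothetical, and the check does not detect names that are already in use.

```python
import re

# 1 to 64 characters: Chinese characters (assumed to be U+4E00..U+9FFF),
# letters, digits, hyphens (-), and underscores (_).
_WORKSPACE_NAME_RE = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")


def is_valid_workspace_name(name: str) -> bool:
    """Return True if `name` satisfies the documented length and character rules."""
    return _WORKSPACE_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["emr-serverless-spark", "my workspace", ""]:
        print(repr(candidate), is_valid_workspace_name(candidate))
```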

Click Create Workspace.
Click Confirm Order and complete the payment.
After the payment is complete, the workspace appears on the EMR Serverless > Spark page while it is being created. Creation usually takes 3 to 5 minutes.

Create a pay-as-you-go workspace

Go to the EMR Serverless Spark page.
1. Log on to the EMR console.
2. In the left-side navigation pane, choose EMR Serverless > Spark.
3. In the top navigation bar, select a region based on your requirements.
  Important
  You cannot change the region of a workspace after you create it.
Click Create Workspace.

On the Create Workspace page, configure the following parameters.

Parameter	Description	Example
Region	Select the region where your data is located.	China (Hangzhou)
Billing Method	Select Pay-as-you-go.	pay-as-you-go
Workspace Name	The name must be 1 to 64 characters long and can contain only Chinese characters, letters, digits, hyphens (-), and underscores (_). Note If you enter a name that is already in use, the system prompts you to provide a different one.	emr-serverless-spark
Maximum Quota	Maximum compute units (CUs) available for concurrent job execution in the workspace. Note The maximum burst quota for a workspace is 5,000 CUs. If you need a higher quota, submit a ticket.	1000
Workspace Directory	The directory used to store data files, such as job logs, runtime events, and resources. Select a bucket with OSS-HDFS enabled for native HDFS interface compatibility. If your use case does not require HDFS, you can select a standard OSS bucket. Note You can specify either a parent directory or a subdirectory for the OSS path based on your needs. Spark automatically creates a directory named after the workspace ID under the specified OSS path to store the following data: <workspace-ID>/spark/logs/: Spark job logs <workspace-ID>/spark/eventlogs/: Spark event logs <workspace-ID>/spark/snapshot/: Spark snapshot data	emr-oss-hdfs
DLF for Metadata Storage	Stores and manages your metadata. You can select a DLF or DLF-Legacy catalog. If DLF is activated, EMR defaults to a DLF catalog. If only DLF-Legacy is set up, EMR defaults to the DLF-Legacy catalog with an identical name to your UID. To use different data catalogs for different clusters, create a new catalog: Click Create Catalog. In the dialog box that appears, enter a Catalog Name and click Create Catalog. For more information, see Create a catalog. From the drop-down list, select the data catalog that you created.	emr-dlf
Execution Role	Specifies the role that Serverless Spark assumes to run jobs. The default role is AliyunEMRSparkJobRunDefaultRole. Serverless Spark uses this role to access your resources in other services, such as OSS and DLF. If you want to control the permissions of the execution role, you can use a custom execution role. For more information, see Execution role.	AliyunEMRSparkJobRunDefaultRole
(Optional) Advanced Settings	Tags: Tags are identifiers for your cloud resources. You can use tags to classify, search for, and aggregate resources that share the same characteristics, which improves resource management efficiency. You can bind up to 20 tags to each workspace. Each tag consists of a custom tag key and tag value. Tags also enable cost allocation and fine-grained management of pay-as-you-go resources. You can bind tags when you create a workspace, or add or modify tags at any time on the workspace list page. Binding tags to resources helps you classify them and optimize operations. For more information about tags, see What is a tag?.	Enter a custom tag key and tag value
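
The directory layout described for Workspace Directory can be illustrated with a short sketch that derives the three paths from the OSS path you select. The base path and workspace ID in the example are hypothetical placeholders, as is the function name `workspace_paths`. The real workspace ID is assigned when the workspace is created.

```python
def workspace_paths(base_path: str, workspace_id: str) -> dict:
    """Build the per-workspace directories documented for Workspace Directory.

    `base_path` is the OSS path chosen in the console and `workspace_id` is the
    ID assigned to the workspace; both values below are illustrative only.
    """
    root = f"{base_path.rstrip('/')}/{workspace_id}/spark"
    return {
        "logs": f"{root}/logs/",            # Spark job logs
        "eventlogs": f"{root}/eventlogs/",  # Spark event logs
        "snapshot": f"{root}/snapshot/",    # Spark snapshot data
    }


if __name__ == "__main__":
    # Hypothetical bucket path and workspace ID.
    for kind, path in workspace_paths("oss://emr-oss-hdfs/spark-home/", "w-example").items():
        print(kind, path)
```

Knowing these paths in advance is useful when you set lifecycle rules or access policies on the bucket.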

Click Create Workspace.
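
The two billing methods have different default quota ceilings: 1,000 CUs for a subscription workspace and 5,000 CUs for a pay-as-you-go workspace. The following sketch checks a planned value against those documented ceilings. The lower bound of 1 CU and the function name `within_default_quota` are assumptions for illustration. A higher ceiling granted through a ticket is not modeled.

```python
# Default ceilings stated in the parameter tables, in compute units (CUs).
DEFAULT_MAX_CU = {
    "subscription": 1000,
    "pay-as-you-go": 5000,
}


def within_default_quota(billing_method: str, requested_cu: int) -> bool:
    """Return True if `requested_cu` fits the documented default ceiling.

    The minimum of 1 CU is an assumption; the source states only the maximums.
    """
    if billing_method not in DEFAULT_MAX_CU:
        raise ValueError(f"unknown billing method: {billing_method!r}")
    return 1 <= requested_cu <= DEFAULT_MAX_CU[billing_method]


if __name__ == "__main__":
    print(within_default_quota("subscription", 1000))
    print(within_default_quota("subscription", 2000))
    print(within_default_quota("pay-as-you-go", 2000))
```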
