How to create and run Spark applications in the AnalyticDB for MySQL console

Prerequisites

An AnalyticDB for MySQL Enterprise Edition, Basic Edition, or Data Lakehouse Edition cluster is created.
A job resource group is created for the AnalyticDB for MySQL Enterprise Edition, Basic Edition, or Data Lakehouse Edition cluster.
You have granted the required permissions to a RAM user. For more information, see Authorize a RAM user.
A database account is created for the AnalyticDB for MySQL cluster.
- If you use an Alibaba Cloud account, you only need to create a privileged account.
- If you use a Resource Access Management (RAM) user, you must create a privileged account and a standard account and associate the standard account with the RAM user.
AnalyticDB for MySQL is authorized to assume the AliyunADBSparkProcessingDataRole role to access other cloud resources.
The log storage path of Spark applications is configured.
Note
To configure the log storage path, log on to the AnalyticDB for MySQL console, find the cluster that you want to manage, and click the cluster ID. In the left-side navigation pane, choose Job Development > Spark JAR Development, and then click Log Settings. In the dialog box that appears, select the default path or specify a custom storage path. A custom storage path cannot be the root directory of an OSS bucket. It must contain at least one folder level.
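
The rule for custom log paths can be checked before you enter a path in Log Settings. The following is a minimal sketch, not part of the console or any SDK: the function name `is_valid_log_path` and the bucket and folder names are hypothetical, and the sketch assumes the path is written as an `oss://` URI.

```python
from urllib.parse import urlparse


def is_valid_log_path(path: str) -> bool:
    """Return True if `path` is an oss:// URI with a bucket and at least one folder."""
    parsed = urlparse(path)
    # Require the oss scheme and a non-empty bucket name.
    if parsed.scheme != "oss" or not parsed.netloc:
        return False
    # Drop empty segments so that "oss://bucket/" and "oss://bucket//" count as the root.
    folders = [segment for segment in parsed.path.split("/") if segment]
    return len(folders) >= 1


if __name__ == "__main__":
    # Hypothetical bucket and folder names, for illustration only.
    for candidate in ("oss://my-bucket/", "oss://my-bucket/spark-logs/"):
        print(candidate, is_valid_log_path(candidate))
```

The check only covers the shape of the path. It does not confirm that the bucket exists or that the cluster is allowed to write to it.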

Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
In the left-side navigation pane, click Job Development > Spark JAR Development.
On the Spark JAR Development page, click the icon to the right of Workspaces.

In the Create Application panel, set the following parameters.

Parameter	Description
Name	The name of the application or directory. Application and directory names are case-insensitive.
Type	Application: Creates a file. Directory: Creates a directory.
Parent Level	The parent directory of the file or directory.
Job type	Batch: A batch application. Streaming: A streaming application. SQL Engine: A Spark distributed SQL engine.
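
Because names are case-insensitive, two names that differ only in letter case refer to the same name. The sketch below shows one way to check a proposed name against names you already use. It assumes that such names collide within the same parent directory, which the table does not state explicitly. The helper `name_conflicts` and the sample names are hypothetical.

```python
from typing import Iterable


def name_conflicts(new_name: str, existing_names: Iterable[str]) -> bool:
    """Return True if `new_name` equals any existing name when case is ignored."""
    # casefold() is a more aggressive lower() intended for caseless comparison.
    taken = {name.casefold() for name in existing_names}
    return new_name.casefold() in taken


if __name__ == "__main__":
    # Hypothetical names already present under one parent directory.
    existing = ["Daily_ETL", "reports"]
    print(name_conflicts("daily_etl", existing))
    print(name_conflicts("hourly_etl", existing))
```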

After you configure the parameters, click OK to create the application template.
After you create the application template, you can configure the Spark application in the Spark editor. For more information, see Spark application development.
After you configure the Spark application, you can perform the following operations:

Click Save to save the Spark application for future use.
Click Run Now to run the Spark application immediately. The Applications tab displays the real-time execution status.

Note

Before you run a Spark application, select a job resource group and an application type.

On the Applications tab, find an application by its application ID and perform any of the following operations:
- Log: View the driver log for the Spark application or the execution log of SQL statements.
- UI: Open the Spark web UI of the application. Access to the UI is time-limited. If access expires, open the UI again.
- Details: View submission information for the application, such as the log path, web UI URL, cluster ID, and resource group name.
- Stop: Stop the currently running application.
- History: View the retry history for the application.
On the Tuning History tab, view the retry history for all applications.

Note
By default, a failed application is not automatically retried. To enable retries, configure the retry parameters (spark.adb.maxAttempts and spark.adb.attemptFailuresValidityInterval). For more information, see Spark application configuration parameters.
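
As an illustration of the note above, the following sketch adds the two retry parameters to an application configuration. It assumes the configuration is a JSON object whose Spark parameters sit under a `conf` key. The helper `with_retries`, the `name` field, and all values are hypothetical. The source does not state the unit or valid range of `spark.adb.attemptFailuresValidityInterval`, so check the parameter reference before you choose a value.

```python
import json


def with_retries(app_config: dict, max_attempts: int, validity_interval: str) -> dict:
    """Return a copy of `app_config` with the two retry parameters set under "conf"."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Copy the nested dict so that the caller's configuration is not modified.
    conf = dict(app_config.get("conf", {}))
    conf["spark.adb.maxAttempts"] = str(max_attempts)
    conf["spark.adb.attemptFailuresValidityInterval"] = validity_interval
    return {**app_config, "conf": conf}


if __name__ == "__main__":
    # Illustrative values only; the interval value here is a placeholder.
    base = {"name": "retry-demo", "conf": {"spark.executor.instances": "2"}}
    print(json.dumps(with_retries(base, 3, "3600000"), indent=2))
```

The helper returns a new dictionary, so the same base configuration can be reused for runs with and without retries.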