Spark editor
This topic describes how to create and run Spark applications in the AnalyticDB for MySQL console.
Overview
- Create and run Spark batch or streaming applications.
- View the driver log and submission information for your Spark application.
- View the execution log of SQL statements.
Prerequisites
- An AnalyticDB for MySQL Enterprise Edition, Basic Edition, or Data Lakehouse Edition cluster is created.
- A job resource group is created for the AnalyticDB for MySQL Enterprise Edition, Basic Edition, or Data Lakehouse Edition cluster.
- If you use a Resource Access Management (RAM) user, the required permissions are granted to the RAM user. For more information, see Authorize a RAM user.
- A database account is created for the AnalyticDB for MySQL cluster.
  - If you use an Alibaba Cloud account, you only need to create a privileged account.
  - If you use a RAM user, you must create a privileged account and a standard account, and associate the standard account with the RAM user.
- AnalyticDB for MySQL is authorized to assume the AliyunADBSparkProcessingDataRole role to access other cloud resources.
- The log storage path of Spark applications is configured.
  Note: Log on to the AnalyticDB for MySQL console. Find the cluster that you want to manage and click the cluster ID. In the left-side navigation pane, choose . Click Log Settings. In the dialog box that appears, select the default path or specify a custom storage path. The custom storage path cannot be the root directory of an OSS bucket. It must contain at least one folder level.
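The rule for custom log paths (not the OSS root, at least one folder level) can be checked before you enter a path in the console. The following is a minimal sketch, not part of the product: the helper name `is_valid_log_path`, the `oss://bucket/folder/` path form, and the example bucket name are assumptions made for illustration.

```python
from urllib.parse import urlparse


def is_valid_log_path(path: str) -> bool:
    """Return True if path looks like oss://<bucket>/<folder>/... (hypothetical check)."""
    parsed = urlparse(path)
    # Assumption: custom log paths use the oss:// scheme and name a bucket.
    if parsed.scheme != "oss" or not parsed.netloc:
        return False
    # Ignore empty segments so that "oss://bucket/" counts as the bucket root.
    folders = [segment for segment in parsed.path.split("/") if segment]
    return len(folders) >= 1


print(is_valid_log_path("oss://my-bucket/spark-logs/"))
print(is_valid_log_path("oss://my-bucket/"))
```

The first path contains one folder level below the bucket and passes. The second points at the bucket root and is rejected.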
Create and run a Spark application
- Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
- In the left-side navigation pane, click .
- On the Spark JAR Development page, click the icon to the right of Workspaces.
- In the Create Application panel, set the following parameters.
| Parameter | Description |
| --- | --- |
| Name | The name of the application or directory. Application and directory names are case-insensitive. |
| Type | Application: Creates a file. Directory: Creates a directory. |
| Parent Level | The parent directory of the file or directory. |
| Job type | Batch: A batch application. Streaming: A streaming application. SQL Engine: A Spark distributed SQL engine. |
- After you configure the parameters, click OK to create the application template.
- After you create the application template, you can configure the Spark application in the Spark editor. For more information, see Spark application development.
- After you configure the Spark application, you can perform the following operations:
  - Click Save to save the Spark application for future use.
  - Click Immediately to run the Spark application. The Applications tab displays the real-time execution status.
    Before you run a Spark application, select a job resource group and an application type.
View Spark application information
- On the Workspaces tab, find an application by its Application ID and perform one of the following operations:
  - Log: View the driver log for the Spark application or the execution log of SQL statements.
  - UI: Open the application's Spark web UI. Access is temporary, and if the session expires, reopen the UI.
  - Details: View submission information for the application, such as the log path, web UI URL, cluster ID, and resource group name.
  - Stop: Stop the currently running application.
  - History: View the retry history for the application.
- On the Tuning History tab, view the retry history for all applications.
Note: By default, a failed application is not automatically retried. To enable retries, configure the retry parameters (spark.adb.maxAttempts and spark.adb.attemptFailuresValidityInterval). For more information, see Spark application configuration parameters.
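As an illustration of the retry parameters, the sketch below adds both keys to an application configuration and prints the result as JSON. Only the two parameter names come from this topic. The `name` and `conf` fields, the helper `with_retries`, and the values `3` and `"6000"` are assumptions chosen for the example, so check the accepted value formats in Spark application configuration parameters before you use them.

```python
import json


def with_retries(app_config: dict, max_attempts: int, validity_interval: str) -> dict:
    """Return a copy of app_config with the two retry parameters set (hypothetical helper)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Copy the existing conf map so the caller's dictionary is left unchanged.
    conf = dict(app_config.get("conf", {}))
    conf["spark.adb.maxAttempts"] = str(max_attempts)
    conf["spark.adb.attemptFailuresValidityInterval"] = validity_interval
    return {**app_config, "conf": conf}


# Illustrative values only; the real defaults and units are documented elsewhere.
base = {"name": "example-batch-app", "conf": {}}
config = with_retries(base, max_attempts=3, validity_interval="6000")
print(json.dumps(config, indent=2))
```

Returning a new dictionary instead of modifying the input keeps a base template reusable across applications that need different retry settings.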