Spark SQL execution modes-AnalyticDB(AnalyticDB)-阿里云帮助中心

AnalyticDB for MySQL supports executing Spark SQL in two modes: batch and interactive. Both modes connect directly to the AnalyticDB for MySQL metadata service, allowing you to read from and write to AnalyticDB for MySQL databases and tables. This topic describes the usage notes, use cases, features, and startup procedures for both execution modes.

Batch execution

Usage notes

When executing SQL in batch mode, begin with the USE <database_name>; statement to select a database.
When you specify a table in an SQL statement, you must use thedatabase_name.table_name format.
When you run DML, DDL, or DQL statements in batch mode, the system returns only a success or failure status, not the data. A successful execution prints a sample of its results to the log. To view the data returned by an SQL statement, see View Spark application information.

Use cases

Executing multiple, interdependent SQL statements.
Enforcing strict resource isolation for SQL statements.
Processing large-scale data, such as running a one-time ETL job with SQL.
Uploading and iterating on complex third-party dependency packages.

Features

SQL submitted in batch mode runs in a dedicated Spark application, ensuring stability.
Supports independent configuration settings defined within an SQL statement, such asSET spark.sql.adaptive.coalescePartitions.minPartitionSize = 2MB;.
If a batch SQL statement contains a SELECT statement, the system prints a sample of the results to the log.

Startup method

In the SQL Console, select a job resource group and the Spark engine, enter your SQL statement, and click Run Now.

Interactive execution

Usage notes

In interactive mode, DML statements return a maximum of 1,000 rows.
DDL statements executed in interactive mode, such asCREATE TABLE, return only a success or failure status, not data from the table. This behavior is consistent with open-source conventions.
A Spark interactive resource group takes some time to start. If startup fails, wait a few moments and try again.

Use cases

Running ad hoc queries where a full result set is not required.
Executing a large number of DDL statements.
For immediate execution of DQL statements in environments where resource isolation and occasional failures are acceptable.

Features

Resource isolation is thread-level. Multiple users executing SQL in the same Spark application can interfere with each other.
You can apply thread-level configuration settings by using SQL statements.
Application-level configuration settings take effect only after you restart the Spark interactive resource group. To modify application-level settings, you must first stop the Spark interactive resource group, configure the parameters, and then restart the Spark interactive resource group.

Startup method

In the SQL Console, select a Spark interactive resource group and the Spark engine, enter your SQL statement, and click Run Now.