Manage SQL sessions


A session is a Spark instance in an EMR Serverless Spark workspace. You use an SQL session to run SQL queries and perform data science analysis. This topic describes how to create an SQL session and view its execution records.

Create an SQL session

After you create an SQL session, you can select it when you create an SQL job.

  1. Go to the Sessions page.

    1. Log on to the EMR console.

    2. In the left-side navigation pane, choose EMR Serverless > Spark.

    3. On the Spark page, click the name of the target workspace.

    4. On the EMR Serverless Spark page, click Sessions in the left-side navigation pane.

  2. On the SQL Sessions page, click Create SQL Session.

  3. On the Create SQL Session page, configure the following parameters and click Create.

    Important

    The maximum concurrency of the selected resource queue must be at least the amount of resources that the SQL session requires. The console displays the specific value.

    Parameter

    Description

    Name

    The name of the new SQL session.

    The name must be 1 to 64 characters in length and can contain only letters, digits, hyphens (-), underscores (_), and spaces.

    Resource Queue

    The resource queue in which the SQL session runs. You can select a queue designated for development or one shared between development and production.

    For more information about queues, see Manage resource queues.

    Engine Version

    The engine version for the SQL session. For more information about engine versions, see Engine versions.

    Use Fusion Acceleration

    Fusion Engine can accelerate Spark workloads and reduce the total cost of tasks. For billing information, see Billing. For more information about Fusion Engine, see Fusion Engine.

    Automatic Stop

    Enabled by default. You can specify a custom idle timeout, after which the system automatically stops the SQL session.

    Normal Network Connection

    Select an existing network connection to access data sources in a VPC or external services. For more information about how to create a network connection, see Network connectivity between EMR Serverless Spark and other VPCs.

    spark.driver.cores

    The number of CPU cores for the driver process in the Spark application. The default value is 1 CPU.

    spark.driver.memory

    The amount of memory for the driver process in the Spark application. The default value is 3.5 GB.

    spark.executor.cores

    The number of CPU cores for each executor process. The default value is 1 CPU.

    spark.executor.memory

    The amount of memory for each executor process. The default value is 3.5 GB.

    spark.executor.instances

    The number of executors for the Spark application. The default value is 2.

    Dynamic Resource Allocation

    This feature is disabled by default. If you enable it, you must configure the following parameters:

    • Minimum Number of Executors: The default value is 2.

    • Maximum Number of Executors: If spark.executor.instances is not set, the default value is 10.

    More Memory Configurations

    • spark.driver.memoryOverhead: The amount of non-heap memory allocated to the driver. If this parameter is not set, Spark uses the default value max(384 MB, 10% × spark.driver.memory).

    • spark.executor.memoryOverhead: The amount of non-heap memory allocated to each executor. If this parameter is not set, Spark uses the default value max(384 MB, 10% × spark.executor.memory).

    • spark.memory.offHeap.size: The amount of off-heap memory available to Spark. Default value: 1 GB.

      This setting takes effect only when spark.memory.offHeap.enabled is set to true. When you use Fusion Engine, off-heap memory is enabled by default and its size is set to 1 GB.

    Spark Configuration

    Enter Spark configuration properties. Separate each key from its value with a space. For example: spark.sql.catalog.paimon.metastore dlf.

    After you create an SQL session, its status is initially Starting. The session is ready when its status changes to Running. On the SQL Sessions page, you can stop, edit, and delete existing sessions.
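The naming rule and the memory overhead defaults listed in the parameter table can be checked offline before you fill in the form. The following is a minimal sketch, not part of the console or any EMR SDK. The function names `is_valid_session_name` and `default_memory_overhead_mb` are hypothetical, and the sketch assumes that "letters" means ASCII letters and that 1 GB equals 1024 MB.

```python
import re

# Naming rule from the Name parameter: 1 to 64 characters, limited to
# letters, digits, hyphens, underscores, and spaces.
# Assumption: "letters" is read as ASCII letters only.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_\- ]{1,64}")

# Floor of the default memory overhead, in MB.
MIN_OVERHEAD_MB = 384


def is_valid_session_name(name: str) -> bool:
    """Return True if the name satisfies the documented naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None


def default_memory_overhead_mb(heap_memory_mb: float) -> float:
    """Return max(384 MB, 10% x heap memory), the default used when
    spark.driver.memoryOverhead or spark.executor.memoryOverhead is unset."""
    return max(MIN_OVERHEAD_MB, 0.10 * heap_memory_mb)


if __name__ == "__main__":
    # Default spark.driver.memory and spark.executor.memory: 3.5 GB.
    # Assumption: 1 GB = 1024 MB.
    heap_mb = 3.5 * 1024
    print(is_valid_session_name("dev sql-session_01"))
    print(default_memory_overhead_mb(heap_mb))
```

With the default 3.5 GB of driver or executor memory, 10% is 358.4 MB, which is below the 384 MB floor, so the default overhead is 384 MB. The 10% term only takes over once the configured memory exceeds 3,840 MB.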

View execution records

After a job completes, you can view its execution records.

  1. On the SQL Sessions page, click the name of the desired session.

  2. Click the Execution Records tab.

    On this tab, you can view details for each job execution, such as the run ID, start time, and a link to the Spark UI.


Related topics