Enable LDAP authentication for Spark Thrift Server

Enable LDAP authentication to secure your Spark Thrift Server. This requires clients to provide a valid username and password to connect and run SQL queries, preventing unauthorized access to sensitive data and features.

Limitations

This procedure applies only to the following EMR Serverless Spark engine versions:

  • esr-4.x: esr-4.2.0 and later.

  • esr-3.x: esr-3.0.1 and later.

  • esr-2.x: esr-2.4.1 and later.
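
The version limits above can be checked programmatically before you change a session. The following is a minimal sketch, not an official tool: the function name supports_ldap_auth is hypothetical, and it assumes that engine version strings begin with esr-<major>.<minor>.<patch>. Major lines that are not listed above are treated as unsupported, which is an assumption of the sketch rather than a documented rule.

```python
import re

# Minimum supported engine version per major line, taken from the list above.
MIN_SUPPORTED = {
    4: (4, 2, 0),
    3: (3, 0, 1),
    2: (2, 4, 1),
}

# Assumption: version strings start with "esr-<major>.<minor>.<patch>".
_VERSION_RE = re.compile(r"^esr-(\d+)\.(\d+)\.(\d+)")


def supports_ldap_auth(engine_version: str) -> bool:
    """Return True if the version meets the minimum for its major line."""
    match = _VERSION_RE.match(engine_version.strip())
    if match is None:
        raise ValueError(f"unrecognized engine version: {engine_version!r}")
    version = tuple(int(part) for part in match.groups())
    minimum = MIN_SUPPORTED.get(version[0])
    # Assumption: major lines not listed above are treated as unsupported.
    return minimum is not None and version >= minimum
```

For example, supports_ldap_auth("esr-3.0.1") returns True, and supports_ldap_auth("esr-2.4.0") returns False.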

Prerequisites

Procedure

Step 1: Prepare the network

Before you begin, establish network connectivity between EMR Serverless Spark and the virtual private cloud (VPC) in which your LDAP service runs. This allows the Spark Thrift Server to reach the LDAP service for authentication. For instructions, see Network connectivity between EMR Serverless Spark and other VPCs.

Step 2: Configure Spark Thrift Server startup parameters

To enable LDAP authentication, stop the target Spark Thrift Server session and edit its configuration. In the Normal Network Connection drop-down list, select the network connection that you created in Step 1. In the Spark Configuration section, add the following parameters. Then restart the session to apply the changes.

spark.hive.server2.authentication              LDAP
spark.hive.server2.authentication.ldap.url     ldap://<ldap_url>:<ldap_port>
spark.hive.server2.authentication.ldap.baseDN  <ldap_base_dn>

The parameters are described below. Replace the placeholders with your specific values.

  • <ldap_url> and <ldap_port>: The address (hostname or IP address) and port of your LDAP server. If you connect to the OpenLDAP service on an Alibaba Cloud EMR on ECS cluster, you can set <ldap_url> to the internal IP address of the master node and <ldap_port> to 10389.

    Note

    If your LDAP service is configured for high availability (HA), separate multiple LDAP server addresses with a space. For example: ldap://<ldap_url_1>:<ldap_port> ldap://<ldap_url_2>:<ldap_port>.

  • <ldap_base_dn>: The base DN used for LDAP authentication. If you connect to the OpenLDAP service on an Alibaba Cloud EMR on ECS cluster, you can set it to ou=people,o=emr.
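
To illustrate how the placeholders combine, including the space-separated HA form, the following sketch renders the three configuration lines from a list of LDAP servers. The helper name ldap_spark_conf and the IP addresses in the usage note are hypothetical; the parameter names, the ldap:// URL form, port 10389, and the base DN come from the description above.

```python
def ldap_spark_conf(servers, base_dn):
    """Render the three Spark configuration lines for LDAP authentication.

    servers: list of (host, port) tuples; more than one entry means an HA setup.
    base_dn: the base DN, for example "ou=people,o=emr".
    """
    if not servers:
        raise ValueError("at least one LDAP server is required")
    # HA addresses are joined with a single space, as the note above describes.
    url = " ".join(f"ldap://{host}:{port}" for host, port in servers)
    settings = [
        ("spark.hive.server2.authentication", "LDAP"),
        ("spark.hive.server2.authentication.ldap.url", url),
        ("spark.hive.server2.authentication.ldap.baseDN", base_dn),
    ]
    # Pad the keys so that the values line up in a column.
    width = max(len(key) for key, _ in settings)
    return "\n".join(f"{key.ljust(width)}  {value}" for key, value in settings)
```

Calling print(ldap_spark_conf([("192.168.0.10", 10389), ("192.168.0.11", 10389)], "ou=people,o=emr")) produces text that you can paste into the Spark Configuration section. The two addresses are placeholders for the internal IP addresses of your own LDAP servers.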

Step 3: Connect to Spark Thrift Server

You can connect by using Beeline or a JDBC URL. Before you connect, replace the following placeholders with your information:

  • <endpoint>: The Endpoint (Public) or Endpoint (Internal) that you obtain from the Overview tab.

    Using an internal endpoint restricts access to the Spark Thrift Server to resources within the same VPC.

  • <token>: The token from the Tokens tab.

  • <port>: The port number. Use 443 for a public endpoint and 80 for an internal (same-region) endpoint.

  • <username> and <password>: The username and password for the LDAP service. If you connect to the OpenLDAP service on an Alibaba Cloud EMR on ECS cluster, use the username and password that you added on the Users page in the EMR on ECS console.

Method 1: Use Beeline

beeline -u 'jdbc:hive2://<endpoint>:<port>/;transportMode=http;httpPath=cliservice/token/<token>' -n <username> -p <password>

Method 2: Use a JDBC URL

To connect from other applications, such as a Java program, use a JDBC URL in the following format:

jdbc:hive2://<endpoint>:<port>/;transportMode=http;httpPath=cliservice/token/<token>;user=<username>;password=<password>
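
Both connection strings can be assembled from the same values. The following sketch builds the JDBC URL and the equivalent Beeline command line. The function names are hypothetical, and the sketch performs no escaping of the username or password: credentials that contain characters such as ";" may need handling that this document does not describe.

```python
import shlex


def default_port(public: bool) -> int:
    """Return 443 for a public endpoint and 80 for an internal endpoint."""
    return 443 if public else 80


def jdbc_url(endpoint, token, public=True, username=None, password=None):
    """Build the JDBC URL; credentials are appended only if both are given."""
    url = (
        f"jdbc:hive2://{endpoint}:{default_port(public)}/;"
        f"transportMode=http;httpPath=cliservice/token/{token}"
    )
    if username is not None and password is not None:
        # No escaping is applied here (assumption: plain credentials).
        url += f";user={username};password={password}"
    return url


def beeline_command(endpoint, token, username, password, public=True):
    """Build a shell-quoted Beeline command that passes credentials via -n and -p."""
    args = [
        "beeline",
        "-u", jdbc_url(endpoint, token, public),
        "-n", username,
        "-p", password,
    ]
    # shlex.join quotes each argument so that ";" in the URL is safe in a shell.
    return shlex.join(args)
```

Pass public=False when you use the internal endpoint so that port 80 is selected. Because the Beeline form places the password on the command line, it may end up in shell history.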