Access a VPC via direct connect


This topic describes how MaxCompute can access services in a virtual private cloud (VPC), such as ApsaraDB RDS, HBase clusters, and Hadoop clusters. Supported access methods include MaxCompute SQL, UDFs, Spark, PyODPS/Mars, foreign tables, and data lakehouse architectures.

Important

This solution creates two security groups for return traffic: MaxCompute-vpc-xxx and MaxCompute-backup-vpc-xxx, where xxx is the VPC ID that you provide. Do not modify the rules of these security groups or use them to manage security rules for other components. The platform is not responsible for any issues resulting from such modifications.

Limitations

  • Region and zone restrictions:

    • China (Hangzhou): Zones H, I, J, and K

    • China (Beijing): Zones F, G, H, I, and L

    • China (Shanghai): Zones B, E, G, M, and N

    • China (Zhangjiakou): Zones A, B, and C

    • China (Ulanqab): Zones B and C

    • China (Shenzhen): Zones C, D, E, and F

    • China (Hong Kong): Zones B and C

    • China East 2 (Shanghai) Finance: Zone F

    • Japan (Tokyo): Zones A and B

    • Singapore: Zones A, B, and C

    • Malaysia (Kuala Lumpur): Zones A and B

    • Indonesia (Jakarta): Zones A and B

    • Germany (Frankfurt): Zones A, B, and C

    • US (Silicon Valley): Zones A and B

    • US (Virginia): Zones A and B

  • Supported targets: VPC IP addresses or domain names, ApsaraDB RDS, HBase clusters, and Hadoop clusters.
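
The region and zone restrictions above can be captured in a small lookup, which may help when validating a deployment plan before you create a connection. This is a minimal sketch: the keys are the display names used in this topic (not API region IDs), and `is_zone_supported` is a hypothetical helper, not part of any MaxCompute SDK.

```python
# Supported zones per region, transcribed from the limitations list above.
# Keys are the display names used in this topic, not API region IDs.
SUPPORTED_ZONES = {
    "China (Hangzhou)": frozenset("HIJK"),
    "China (Beijing)": frozenset("FGHIL"),
    "China (Shanghai)": frozenset("BEGMN"),
    "China (Zhangjiakou)": frozenset("ABC"),
    "China (Ulanqab)": frozenset("BC"),
    "China (Shenzhen)": frozenset("CDEF"),
    "China (Hong Kong)": frozenset("BC"),
    "China East 2 (Shanghai) Finance": frozenset("F"),
    "Japan (Tokyo)": frozenset("AB"),
    "Singapore": frozenset("ABC"),
    "Malaysia (Kuala Lumpur)": frozenset("AB"),
    "Indonesia (Jakarta)": frozenset("AB"),
    "Germany (Frankfurt)": frozenset("ABC"),
    "US (Silicon Valley)": frozenset("AB"),
    "US (Virginia)": frozenset("AB"),
}


def is_zone_supported(region: str, zone: str) -> bool:
    """Return True if the zone letter is listed for the region (hypothetical helper)."""
    return zone.strip().upper() in SUPPORTED_ZONES.get(region, frozenset())
```

For example, `is_zone_supported("China (Hangzhou)", "H")` is true, while an unlisted region or zone returns false.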

Procedure

Step 1: Prepare account and project

Before you create a network connection between MaxCompute and the target service, make sure that the following prerequisites are met.

  1. Create a MaxCompute project. For data lakehouse scenarios, we recommend that you set the data type of the project to Hive-compatible.

  2. To access a service in a VPC, ensure that the Alibaba Cloud account that owns the VPC, the account used to access the MaxCompute project, and the administrator account for the target service all belong to the same primary Alibaba Cloud account.

Step 2: Create a direct network connection

1. Grant permissions

  • Authorize the operating user:

  • To allow MaxCompute to create elastic network interfaces (ENIs) in your VPC for connectivity, click Authorize while you are logged in to your Alibaba Cloud account.

2. Configure security group rules

In your VPC, create a dedicated security group to control MaxCompute's access to resources within the VPC.

We recommend that you create a new Basic Security Group. Avoid using other types of security groups or security groups that are already in use. MaxCompute will create an elastic network interface (ENI) in the VPC of the current user to access your services.

  • Configure the outbound rules of the security group to control which destination addresses MaxCompute jobs (via ENIs) can access. If you have no special requirements, you can leave the default outbound rules.

  • The inbound traffic to the ENI is return traffic. Therefore, you must allow all inbound traffic.

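  1. Log on to the VPC console.

  2. In the left-side navigation pane, choose VPC. In the upper-left corner, select a region.

  3. On the VPC page, click the Instance ID/Name of the target VPC.

  4. On the VPC details page, click the Resource Management tab.

  5. On the Resource Management tab, in the VPC Resources section, hover over the number under Security Group and click Add.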

    • For Security Group Type, select Basic Security Group.

      A Basic Security Group allows outbound traffic by default. An Advanced Security Group denies outbound traffic by default, which prevents access to any services in the VPC.

    • For Network, select the VPC that contains the service you want to access.

    For more information about how to create a security group, see Create a security group.

  6. Configure the security group to support the MaxCompute network.

    1. In the Actions column of the target security group, click Manage Rules.

    2. On the Access Rule tab, select the Inbound tab. In the Actions column of the target rule, click Edit. Configure the following parameters to allow all inbound traffic.

      • Set Authorization policy to Allow.

      • Set Priority to 1.

      • Set Protocol to All.

      • For Source, add the CIDR block of the VPC or the CIDR block of the vSwitch where the service that you want to access is deployed.

      • Destination is ALL(-1/-1) by default.

    For more information about how to configure rules, see Security group application guide and examples.

  7. For HBase, if you cannot grant network permissions to a security group, you must add the IP address of the ENI that MaxCompute creates to a whitelist. Because ENI IP addresses may change, we recommend that you add the CIDR block of the vSwitch instead. To obtain the ENI IP address, log on to the ECS console and click Network Interfaces in the left-side navigation pane.
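
To illustrate why whitelisting the vSwitch CIDR block is more robust than whitelisting a single ENI address, the following sketch uses Python's standard `ipaddress` module. The CIDR block and IP addresses are hypothetical placeholders, and `ip_in_cidr` is an illustrative helper rather than part of any Alibaba Cloud tooling.

```python
import ipaddress


def ip_in_cidr(ip: str, cidr: str) -> bool:
    """Return True if the IP address falls inside the CIDR block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


# Hypothetical values: replace with your vSwitch CIDR block and ENI addresses.
VSWITCH_CIDR = "192.168.10.0/24"

# If an ENI is recreated and receives a different address from the same
# vSwitch, a CIDR-based whitelist entry still covers it.
for eni_ip in ("192.168.10.17", "192.168.10.203"):
    print(eni_ip, ip_in_cidr(eni_ip, VSWITCH_CIDR))
```

Both sample addresses fall within the block, so a single whitelist entry for the CIDR block would cover either one.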

Note

During the network connection creation process, MaxCompute automatically creates two ENIs based on your bandwidth requirements. These ENIs are free of charge and are placed in this security group.

3. Create the network connection

An Alibaba Cloud account or a RAM user with the tenant-level Super_Administrator or Admin role for MaxCompute can create a connection between MaxCompute and a VPC in the MaxCompute console. For more information, see MaxCompute tenant-level roles. To create a connection:

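  1. Log on to the MaxCompute console. In the upper-left corner, select a region.

  2. In the left-side navigation pane, choose Manage Configurations > Network Connection.

  3. On the Network Connection page, click Add Network Connection.

  4. In the Add Network Connection dialog box, configure the parameters as prompted and click OK. The first time you add a connection, you must first complete the authorization that allows the MaxCompute platform to request ENIs on your behalf. Otherwise, the connection fails to be created.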

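    The parameters are described below.

    • Connection Name (required): A custom name for the connection. The name must meet the following requirements:

      • It starts with a letter.

      • It contains only letters, underscores (_), and digits.

      • It is 1 to 63 characters in length.

    • Type (required): Defaults to Passthrough, which corresponds to the VPC connection solution.

    • Region (required): Automatically set based on the region selected in the upper-left corner. For more information, see Supported regions.

    • VPC Selected (required): The ID of the VPC. A VPC is a securely isolated virtual network environment in the cloud that provides a secure, configurable private network space similar to a traditional data center. To create a VPC, see Create and delete a VPC. To obtain the VPC ID:

      1. Log on to the VPC console.

      2. In the left-side navigation pane, choose VPC. In the upper-left corner, select a region.

      3. On the VPC page, find the Instance ID/Name of the VPC.

      If you connect to an HBase or Hadoop cluster, you can obtain this information from the network connection details in the corresponding console.

    • Switch (required): The ID of a vSwitch attached to the VPC. A vSwitch divides a VPC into subnets, and vSwitches in the same VPC can communicate with each other over the internal network. Deploying cloud resources across vSwitches in different zones protects applications from a single-zone failure. If no vSwitch is available, see Create and delete a vSwitch. To obtain the vSwitch ID:

      1. Log on to the VPC console.

      2. In the left-side navigation pane, choose vSwitch. In the upper-left corner, select a region.

      3. On the vSwitch page, find the Instance ID/Name of the vSwitch.

      If you connect to an HBase or Hadoop cluster, you can obtain this information from the network connection details in the corresponding console.

    • Security group (required): The ID of the security group. A security group acts as a virtual firewall in the cloud. By managing security groups and their rules, you get fine-grained network isolation and access control. To create a security group, see Create a security group.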
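
The Connection Name format rules listed above can be checked locally before you submit the dialog box. The following is a minimal sketch that assumes "letters" means ASCII letters (the console may accept a different character set); `is_valid_connection_name` is a hypothetical helper, not a MaxCompute API.

```python
import re

# One leading ASCII letter, then up to 62 letters, digits, or underscores,
# for a total length of 1 to 63 characters.
# Assumption: "letters" means ASCII letters only.
_CONNECTION_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,62}")


def is_valid_connection_name(name: str) -> bool:
    """Return True if the name satisfies the documented format rules."""
    return _CONNECTION_NAME_RE.fullmatch(name) is not None
```

For example, `vpc_link_1` passes, while `1link` (starts with a digit) and `my-link` (contains a hyphen) do not.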

4. Configure target service security group

After you complete the preceding operations to enable the ENI-based direct connection, you must add security rules to the target service. These rules authorize the security group representing MaxCompute to access specific service ports, such as 9200 and 31000.

For example, to access ApsaraDB RDS, you must add a rule to ApsaraDB RDS that allows access from the security group that you created in Step 2. If the service that you want to access supports only IP addresses instead of security groups, you must add the entire CIDR block of the vSwitch where the target service resides.

  • Configure the security group for the Hadoop cluster.

    • Configure the following information for the security group of the Hadoop cluster to ensure that MaxCompute can access the cluster.

      • Configure the inbound access rules for the security group in which the Hadoop cluster resides.

      • The authorization object is the security group that you created in Step 2 for the ENI.

      • Hive Metastore port: 9083

      • HDFS NameNode port: 8020

      • HDFS DataNode port: 50010

    • For example, when you connect to a Hadoop cluster that is created on Alibaba Cloud E-MapReduce, you must configure the security group rules as described above. For more information, see Create a security group.

  • Configure the security group for the HBase cluster.

    • Add the security group that is created for MaxCompute or the ENI IP address to the security group or IP address whitelist of the HBase cluster.

    • For example, when you connect to an Alibaba Cloud HBase cluster:

      1. Log on to the HBase Management Console. In the upper-left corner, select a region.

      2. In the navigation pane on the left, select Clusters.

      3. On the Clusters page, click the name of the target cluster.

      4. In the navigation pane on the left, select Access Control.

      5. On the Whitelist Setting and Security Group tabs, you can click Add Whitelist or Add Security Group. If you cannot add a security group, add the ENI's IP address on the Whitelist Setting tab. Because ENI IP addresses can change if the MaxCompute configuration is modified, we recommend that you add the CIDR block of the vSwitch to the whitelist.

    For more information about how to add a security group or IP address whitelist, see Set whitelists and security groups.

  • Configure the security group for ApsaraDB RDS.

    • Add the security group that is created for MaxCompute or the ENI IP address to the ApsaraDB RDS security group or IP address whitelist.

    • For example, when you connect to an ApsaraDB RDS instance:

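      1. Log on to the ApsaraDB RDS console.

      2. In the left-side navigation pane, choose Instances. In the upper-left corner, select a region.

      3. In the left-side navigation pane, click Whitelist and SecGroup.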

      4. On the Whitelist Settings and Security Group tabs, add an IP address whitelist or a security group. Because the ENI IP address may change if the MaxCompute configuration is modified, we recommend that you add the CIDR block of the vSwitch to the whitelist.

    For more information about how to add a security group or an IP address whitelist, see Configure a security group or Configure an IP address whitelist.
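
After the rules are in place, a quick TCP reachability check against the Hadoop ports listed above (9083, 8020, and 50010) can help separate network problems from job configuration problems. This is a sketch using only the Python standard library. It assumes you run it from a machine inside the VPC, and the host name you pass in is your own. It confirms only that a port accepts connections from that machine, not that the MaxCompute security group is authorized.

```python
import socket

# Ports taken from the Hadoop security group rules above.
HADOOP_PORTS = {
    "Hive Metastore": 9083,
    "HDFS NameNode": 8020,
    "HDFS DataNode": 50010,
}


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_hadoop_ports(host: str, timeout: float = 3.0) -> dict:
    """Map each service name to whether its port is reachable on the host."""
    return {
        name: port_is_open(host, port, timeout)
        for name, port in HADOOP_PORTS.items()
    }
```

Calling `check_hadoop_ports("<your-cluster-host>")` returns a dictionary such as `{"Hive Metastore": True, ...}`. A `False` entry suggests checking the inbound rule for that port first.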

Step 3: Access VPC resources

After you create the direct network connection, you must add the following configurations to access the VPC network using SQL or Spark.

For other types of jobs, adjust the configurations based on the job type.

SQL access

Spark access

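To access the VPC using Spark, after you create the direct network connection as described above, add the following settings to the spark-defaults.conf file or to the DataWorks configuration items: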

spark.hadoop.odps.cupid.eni.enable = true
# The format is regionid:vpcid, where vpcid is the ID of the target VPC used when creating the network connection.
spark.hadoop.odps.cupid.eni.info = regionid:vpc-**********
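
If you generate the Spark configuration programmatically, the two settings above can be assembled from the region ID and VPC ID. The sketch below is illustrative only: `eni_spark_conf` and `render_spark_defaults` are hypothetical helpers, and the region and VPC IDs in the example call are placeholders, not real resources.

```python
def eni_spark_conf(region_id: str, vpc_id: str) -> dict:
    """Build the two ENI settings shown above for a given region and VPC."""
    return {
        "spark.hadoop.odps.cupid.eni.enable": "true",
        # Format is regionid:vpcid, as described in the comment above.
        "spark.hadoop.odps.cupid.eni.info": f"{region_id}:{vpc_id}",
    }


def render_spark_defaults(conf: dict) -> str:
    """Render settings as 'key = value' lines, matching the format above."""
    return "\n".join(f"{key} = {value}" for key, value in conf.items())


# Placeholder IDs for illustration; substitute your own region and VPC ID.
print(render_spark_defaults(eni_spark_conf("cn-hangzhou", "vpc-example123")))
```

The same key-value pairs could also be passed one at a time to `SparkSession.builder.config(key, value)` in a PySpark job, assuming your submission method honors settings supplied that way.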

(Optional) Step 4: Configure the whitelist

If your server has access control enabled, you must add the security group for the direct network connection to the server's whitelist.