Create an Amazon RDS for SQL Server data source

Last updated: 2026-01-17 21:41:47

By creating an Amazon RDS for SQL Server data source, you can enable Dataphin to read business data from or write data to Amazon RDS for SQL Server. This topic describes how to create an Amazon RDS for SQL Server data source.

Permission requirements

Only users who have the Create Data Source permission point in a custom global role and users who have the super administrator, data source administrator, domain architect, or project administrator role can create data sources.

Procedure

  1. On the Dataphin homepage, choose Management Hub > Datasource Management from the top navigation bar.

  2. On the Datasource page, click +Create Data Source.

  3. On the Create Data Source page, select Amazon RDS for SQL Server in the Relational Database section.

    If you have recently used Amazon RDS for SQL Server, you can also select it in the Recently Used section. Alternatively, you can enter keywords in the search box to quickly search for Amazon RDS for SQL Server.

  4. On the Create Amazon RDS For SQL Server Data Source page, configure the connection parameters.

    1. Configure the basic information of the data source.

      Parameter

      Description

      Datasource Name

      Enter a name for the data source. The name must meet the following requirements:

      • It can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-).

      • It cannot exceed 64 characters in length.
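The naming rules above can be checked before you submit the form. The following is a minimal sketch, not Dataphin's own validation logic. The function name is_valid_datasource_name is hypothetical, "Chinese characters" is assumed to mean the CJK Unified Ideographs range U+4E00 to U+9FFF, and an empty name is assumed to be invalid.

```python
import re

# Assumption: "Chinese characters" is approximated by U+4E00..U+9FFF.
# Assumption: a name must contain at least one character.
_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")


def is_valid_datasource_name(name: str) -> bool:
    """Return True if the name uses only allowed characters and has 1 to 64 of them."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, a name that contains a space or is longer than 64 characters is rejected by this check.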

      Datasource Code

      The data source code lets you reference tables of this data source in Flink_SQL tasks or through the Dataphin JDBC client, in the format data source code.table name or data source code.schema.table name. To switch data sources automatically based on the task execution environment, use the variable format ${data source code}.table or ${data source code}.schema.table instead. For more information, see Flink_SQL task development method.

      Important
      • The data source code cannot be modified after it is configured.

      • You can preview data on the object details page in the asset directory and asset checklist only after the data source code is configured.

      • In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, SelectDB, and GaussDB data warehouse service (DWS) data sources are currently supported.
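The two reference formats described for the data source code can be illustrated with a small helper. This is only a sketch of the string formats. The function table_reference and the sample values rds_mssql, dbo, and orders are hypothetical and do not come from Dataphin.

```python
from typing import Optional


def table_reference(code: str, table: str, schema: Optional[str] = None,
                    as_variable: bool = False) -> str:
    """Build code.table or code.schema.table, optionally in ${code} variable form."""
    # The variable form lets the reference follow the task execution environment.
    prefix = "${" + code + "}" if as_variable else code
    parts = [prefix, schema, table] if schema else [prefix, table]
    return ".".join(parts)
```

With the hypothetical values above, table_reference("rds_mssql", "orders", schema="dbo", as_variable=True) produces a reference in the ${data source code}.schema.table format.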

      Data Source Description

      Enter a brief description of the data source. The description cannot exceed 128 characters.

      Data Source Configuration

      Select the data source that you want to configure:

      • If your business data source distinguishes between production and development data sources, select Production + Development Data Source.

      • If your business data source does not distinguish between production and development data sources, select Production Data Source.

      Tag

      You can categorize and tag data sources based on tags. For information about how to create tags, see Manage data source tags.

    2. Configure the connection parameters between the data source and Dataphin.

      If you selected Production + Development Data Source, configure connection information for both the production and the development data source. If you selected Production Data Source, configure connection information for the production data source only.

      Note

      In most cases, the production data source and development data source should be configured as different data sources to isolate the development environment from the production environment and reduce the impact of the development data source on the production data source. However, Dataphin also supports configuring them as the same data source with identical parameter values.

      For Configuration Method, you can select either JDBC URL or Host. The default selection is JDBC URL.

      JDBC URL configuration method

      Parameter

      Description

      JDBC URL

      The format of the connection address is jdbc:sqlserver://host:port/dbname.

      Schema

      Enter the schema associated with the username.

      Username, Password

      Enter the authentication username and password. To ensure that tasks can be executed properly, make sure that the user has the required data permissions.

      Type

      Valid values: Directly Connectable Database, ApsaraDB, and Self-managed Database On ECS (VPC). Select and configure the type that matches your database and business requirements.

      • Directly Connectable Database: Connect to the database directly through the default scheduling cluster or a registered scheduling cluster. This option is suitable for the following scenarios: ① Public network databases, ② Databases in the same network environment as the registered scheduling cluster. If you need to add an access whitelist, you can add the public network outbound IP address of the Dataphin default scheduling cluster: 47.102.192.174.

      • ApsaraDB: A database purchased on Alibaba Cloud. Supports access through VPC Proxy or Direct Connection.

        • VPC Proxy: When the database is in a VPC network environment on Alibaba Cloud, specify the authorized IP whitelist: 100.104.0.0/16.

          • Region: The region where the database is located. Only databases in the same region as your Dataphin instance are supported. If your Dataphin instance is in China (Shanghai), you can only select the China (Shanghai) region.

          • VPC ID: Enter the VPC ID of the VPC network where the database is located. You can log on to the Virtual Private Cloud console to view it. The following figure shows the VPC ID:

            [Figure: VPC ID in the Virtual Private Cloud console]

          • VPC Instance ID: Enter the VPC instance ID of the database, which is VpcCloudInstanceId. You can obtain it by calling the DescribeDrdsInstance API. For more information, see DescribeDrdsInstance.

        • Direct Connection: Connect to the database directly through the default scheduling cluster or a registered scheduling cluster. If you need to add an access whitelist, you can add the public network outbound IP address of the Dataphin default scheduling cluster: 47.102.192.174.

      • Self-managed Database On ECS (VPC): A database that you create on an Alibaba Cloud ECS instance. Supports access through VPC. To access a database in a VPC network, configure the following information:

        • Region: The region where the database is located. Only databases in the same region as your Dataphin instance are supported. If your Dataphin instance is in China (Shanghai), you can only select the China (Shanghai) region.

        • VPC ID: Enter the VPC ID of the VPC network where the ECS instance is located. You can log on to the Virtual Private Cloud console to view it. The following figure shows the VPC ID:

          [Figure: VPC ID in the Virtual Private Cloud console]

        • ECS ID: Enter the ECS ID of the ECS server where the database is deployed. You can log on to the ECS console to view it. The following figure shows the ECS ID:

          [Figure: ECS ID in the ECS console]
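When you review the access whitelist on the database side, it can help to confirm that the addresses quoted above are actually covered by the entries you added. The following sketch uses Python's standard ipaddress module. The function name is_covered is hypothetical, and the check only reflects the whitelist entries you pass in, not any Dataphin or database API.

```python
import ipaddress
from typing import Iterable

# Values quoted from the text above.
DEFAULT_CLUSTER_EGRESS_IP = "47.102.192.174"
VPC_PROXY_CIDR = "100.104.0.0/16"


def is_covered(source_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if source_ip falls inside any whitelist entry (single IP or CIDR)."""
    addr = ipaddress.ip_address(source_ip)
    # A bare IP address is treated as a single-host network.
    return any(addr in ipaddress.ip_network(entry, strict=False) for entry in whitelist)
```

For example, is_covered(DEFAULT_CLUSTER_EGRESS_IP, ["47.102.192.174"]) confirms that the default scheduling cluster's outbound address is allowed.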

      Host configuration method

      • Host configuration method

        Parameter

        Description

        Server Address

        Enter the IP address and port number of the server.

        You can click +Add to add multiple sets of IP addresses and port numbers, and click the delete icon to remove extra sets. You must keep at least one set.

        dbname

        Enter the database name.

      • Parameter configuration

        Parameter

        Description

        Parameter

        • Parameter name: Only supports selecting an existing parameter name.

        • Parameter value: When a parameter name is selected, the parameter value is required. It can only contain uppercase and lowercase letters, digits, periods (.), underscores (_), and hyphens (-), and cannot exceed 256 characters in length.

        Note

        You can click +Add Parameter to add multiple parameters, and click the delete icon to remove extra parameters. You can add up to 30 parameters.
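The constraints on parameter values and on the number of parameters can be sketched as a pre-check. This is an illustration of the stated rules only. The function validate_parameters and the parameter name encrypt used in the usage note are hypothetical, and the set of selectable parameter names is defined by Dataphin, not by this code.

```python
import re
from typing import Dict, List

# Allowed in values: letters, digits, periods, underscores, hyphens; 1 to 256 characters.
_VALUE_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,256}")
MAX_PARAMETERS = 30


def validate_parameters(params: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the parameters pass the stated rules."""
    problems = []
    if len(params) > MAX_PARAMETERS:
        problems.append(f"too many parameters: {len(params)} > {MAX_PARAMETERS}")
    for name, value in params.items():
        # The value is required once a parameter name is selected.
        if _VALUE_PATTERN.fullmatch(value) is None:
            problems.append(f"invalid value for {name!r}")
    return problems
```

Calling validate_parameters({"encrypt": "true"}) returns an empty list, while a value that contains a space is reported as invalid.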

        Schema

        Enter the schema information associated with the username.

        Username, Password

        Enter the authentication username and password. To ensure that tasks can be executed properly, make sure that the user has the required data permissions.

      Note

      If you create the data source with the Host configuration method and later switch to the JDBC URL configuration method, the system builds the JDBC URL from the server IP address and port number and fills it in automatically.
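As a rough illustration of how a server address, port, and database name map to the JDBC URL format stated for the JDBC URL parameter, consider the sketch below. The function build_jdbc_url and the sample values are hypothetical, and how Dataphin combines multiple server addresses is not described here, so only a single address is handled. Note also that the Microsoft JDBC Driver for SQL Server documents the database as a ;databaseName= connection property, so verify which form your Dataphin version accepts.

```python
def build_jdbc_url(host: str, port: int, dbname: str) -> str:
    """Assemble a URL in the jdbc:sqlserver://host:port/dbname format stated in this topic."""
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return f"jdbc:sqlserver://{host}:{port}/{dbname}"
```

For example, build_jdbc_url("203.0.113.10", 1433, "sales") yields jdbc:sqlserver://203.0.113.10:1433/sales, where the address and database name are placeholders.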

    3. Configure advanced settings for the data source.

      Parameter

      Description

      loginTimeout

      The login timeout for the database connection, in seconds. The default value is 900 seconds (15 minutes).

      Note
      • If you have configured loginTimeout in the JDBC URL, the loginTimeout value will be the timeout value configured in the JDBC URL.

      • For data sources created before Dataphin V3.11, the default loginTimeout value is -1, which indicates no timeout limit.

      socketTimeout

      The socket timeout for the database connection, in milliseconds. The default value is 1800000 milliseconds (30 minutes).

      Note
      • If you have configured socketTimeout in the JDBC URL, the socketTimeout value will be the timeout value configured in the JDBC URL.

      • For data sources created before Dataphin V3.11, the default socketTimeout value is -1, which indicates no timeout limit.
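The way a timeout in the JDBC URL overrides the default can be sketched as follows. This is an illustration under stated assumptions, not Dataphin's implementation: the function names are hypothetical, properties are assumed to be appended to the URL as ;name=value pairs (the form the Microsoft JDBC Driver for SQL Server documents), and property names are matched exactly as written.

```python
from typing import Dict, Optional

# Defaults quoted from this topic: loginTimeout in seconds, socketTimeout in milliseconds.
DEFAULTS = {"loginTimeout": 900, "socketTimeout": 1800000}


def url_properties(jdbc_url: str) -> Dict[str, str]:
    """Collect ;name=value pairs that follow the address part of the URL."""
    props = {}
    for piece in jdbc_url.split(";")[1:]:
        name, sep, value = piece.partition("=")
        if sep and name.strip():
            props[name.strip()] = value.strip()
    return props


def effective_timeout(jdbc_url: str, name: str,
                      created_before_v3_11: bool = False) -> Optional[int]:
    """Return the timeout in its own unit, or None when there is no limit."""
    raw = url_properties(jdbc_url).get(name)
    if raw is not None:
        return int(raw)   # a value in the JDBC URL wins
    if created_before_v3_11:
        return None       # legacy default of -1: no timeout limit
    return DEFAULTS[name]
```

Keeping the two units separate matters here: 900 is a number of seconds, whereas 1800000 is a number of milliseconds.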

      Connection Retries

      If the database connection times out, the system will automatically retry the connection until the specified number of retries is reached. If the connection still fails after the maximum number of retries, the connection is considered failed.

      Note
      • The default number of retries is 1. You can configure a value between 0 and 10.

      • The connection retry count will be applied by default to offline integration tasks and global quality (requires the Asset Quality feature module to be enabled). In offline integration tasks, you can configure task-level retry counts separately.
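The retry behavior described above can be sketched as a small loop. This is a generic illustration, not Dataphin's code. The function connect_with_retries and the connect callable are hypothetical, and the sketch assumes that a retry count of N means one initial attempt plus up to N further attempts.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retries(connect: Callable[[], T], retries: int = 1) -> T:
    """Call connect(); on timeout, retry up to `retries` more times, then fail."""
    if not 0 <= retries <= 10:
        raise ValueError("retries must be between 0 and 10")
    last_error = None
    for _attempt in range(retries + 1):  # initial attempt plus retries
        try:
            return connect()
        except TimeoutError as exc:
            last_error = exc
    # All attempts timed out: the connection is considered failed.
    raise ConnectionError(f"connection failed after {retries} retries") from last_error
```

With the default of 1, a connection that times out once and then succeeds is reported as successful on the second attempt.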

      Note

      Rules for duplicate parameter values:

      • If a parameter exists in the JDBC URL, Advanced Settings parameters, and Host configuration method's parameter configuration, the value in the JDBC URL takes precedence.

      • If a parameter exists in both the JDBC URL and Advanced Settings parameters, the value in the JDBC URL takes precedence.

      • If a parameter exists in both the Advanced Settings parameters and Host configuration method's parameter configuration, the value in the Advanced Settings parameters takes precedence.
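The precedence rules above amount to a layered lookup: the JDBC URL first, then Advanced Settings, then the parameter configuration of the Host method. A minimal sketch with Python's collections.ChainMap follows. The function resolve_parameters and the sample parameter values in the usage note are hypothetical.

```python
from collections import ChainMap
from typing import Dict, Mapping


def resolve_parameters(jdbc_url_params: Mapping[str, str],
                       advanced_settings: Mapping[str, str],
                       host_params: Mapping[str, str]) -> Dict[str, str]:
    """Merge the three sources; earlier sources win when a name is duplicated."""
    # ChainMap looks up each key in the first mapping that contains it.
    return dict(ChainMap(jdbc_url_params, advanced_settings, host_params))
```

For example, if loginTimeout is 30 in the JDBC URL and 900 in Advanced Settings, the resolved value is 30. If socketTimeout appears only in Advanced Settings and in the Host parameter configuration, the Advanced Settings value is used.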

  5. Select a Default Resource Group, which will be used to run tasks related to the current data source, including database SQL, offline database migration, data preview, and more.

  6. Click Test Connection, or click OK directly to save the configuration and finish creating the Amazon RDS for SQL Server data source.

    When you click Test Connection, the system tests whether the data source can connect to Dataphin properly. If you directly click OK, the system automatically tests the connection for all selected clusters. However, even if all selected clusters fail the connection test, the data source can still be created normally.

    Test Connection tests the connection for the Public Scheduling Cluster or Registered Scheduling Clusters that have been registered in Dataphin and are in normal use. The Public Scheduling Cluster is selected by default and cannot be deselected. If there are no resource groups under a Registered Scheduling Cluster, connection testing is not supported. You need to create a resource group first before testing the connection.

    • The selected clusters are only used to test network connectivity with the current data source and are not used for running related tasks later.

    • The connection test usually takes less than 2 minutes. If it times out, you can click the corresponding icon to view the specific reason and then retry.

    • Regardless of whether the test result is Connection Failed, Connection Successful, or Succeeded With Warning, the system will record the generation time of the final result.

      Note

      Only the test results for the Public Scheduling Cluster include three connection statuses: Succeeded With Warning, Connection Successful, and Connection Failed. The test results for Registered Scheduling Clusters in Dataphin only include two connection statuses: Connection Successful and Connection Failed.

    • When the test result is Connection Failed, you can click the corresponding icon to view the specific failure reason.

    • When the test result is Succeeded With Warning, the application cluster connected successfully but the scheduling cluster did not, so the current data source cannot be used for data development and integration. You can click the corresponding icon to view the log information.
