Create a TDSQL for PostgreSQL data source-Dataphin(Dataphin)-阿里云帮助中心

This topic describes how to create a TDSQL for PostgreSQL data source, which enables Dataphin to read data from or write data to TDSQL for PostgreSQL.

Limitations

Only custom global roles with the Create Data Source permission and users with the following system roles can create data sources: Super Admin, Data Source Administrator, Business Segment Architect, and Project Administrator.

Procedure

On the Dataphin homepage, choose Management Center > Data Source Management from the top navigation bar.
On the Data Source page, click + New Data Source.
On the New Data Source page, in the Relational Database section, select TDSQL for PostgreSQL.
If you have recently used TDSQL for PostgreSQL, you can also select TDSQL for PostgreSQL in the Recently Used section. You can also enter keywords for TDSQL for PostgreSQL in the search box to quickly search.

On the Create TDSQL for PostgreSQL Data Source page, configure the connection parameters.

Configure basic information for the data source.

Parameter	Description
Data source name	The naming rules are as follows: Can only contain Chinese characters, uppercase and lowercase letters, digits, underscores (_), and hyphens (-). Cannot be longer than 64 characters.
Data source code	After you configure the data source code, you can reference tables from this data source in Flink SQL tasks by using the format `data_source_code.table_name` or `data_source_code.schema.table_name`. To automatically access the data source corresponding to the current environment, use the variable format `${data_source_code}.table` or `${data_source_code}.schema.table`. For more information, see Develop Dataphin data source tables. Important The data source code cannot be modified after it is configured. You can preview data on the object details page in the asset directory and asset checklist only after the data source code is configured. In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, SelectDB, and GaussDB data warehouse service (DWS) data sources are currently supported.
Data source description	A description of the data source, which can be up to 128 characters long.
Time zone	Date and time data in integration tasks is processed based on the selected time zone. The default time zone is Asia/Shanghai. Click Modify to select a different time zone. The options are as follows: GMT: GMT-12:00, GMT-11:00, GMT-10:00, GMT-09:30, GMT-09:00, GMT-08:00, GMT-07:00, GMT-06:00, GMT-05:00, GMT-04:00, GMT-03:00, GMT-03:00, GMT-02:30, GMT-02:00, GMT-01:00, GMT+00:00, GMT+01:00, GMT+02:00, GMT+03:00, GMT+03:30, GMT+04:00, GMT+04:30, GMT+05:00, GMT+05:30, GMT+05:45, GMT+06:00, GMT+06:30, GMT+07:00, GMT+08:00, GMT+08:45, GMT+09:00, GMT+09:30, GMT+10:00, GMT+10:30, GMT+11:00, GMT+12:00, GMT+12:45, GMT+13:00, GMT+14:00. Daylight Saving Time: Africa/Cairo, America/Chicago, America/Denver, America/Los_Angeles, America/New_York, America/Sao_Paulo, Asia/Bangkok, Asia/Dubai, Asia/Kolkata, Asia/Shanghai, Asia/Tokyo, Atlantic/Azores, Australia/Sydney, Europe/Berlin, Europe/London, Europe/Moscow, Europe/Paris, Pacific/Auckland, Pacific/Honolulu.
Data source configuration	Select the configuration that matches your environment setup: If you use separate environments for production and development, select Production + Development Data Source. Otherwise, select Production Data Source.
Tag	You can use tags to categorize and label the data source. For more information about how to create tags, see Manage data source tags.

Configure the connection parameters for the data source.

If you selected Production + Development Data Source, you must configure connection information for both production and development environments. If you selected Production Data Source, you only need to configure it for the production data source.

Note

We recommend using different instances for production and development data sources to ensure environment isolation and prevent development activities from impacting production.

For Configuration Method, choose either JDBC URL (the default) or Host.

JDBC URL

Parameter	Description
JDBC URL	The connection URL format is `jdbc:postgresql://host:port/dbname`.
Schema	Enter the schema that is associated with the username.
Username and Password	Enter the username and password for database authentication. To ensure that tasks run correctly, make sure the account has the required data permissions.

Host

Host information

Parameter

Description

Server address

Enter the server's IP address and port number.

Click + Add to add multiple IP address and port number pairs. Click the icon to remove extra pairs. You must keep at least one pair.

dbname

Enter the database name.

Parameter configuration

Parameter	Description
Parameter	Parameter Name: Select an existing parameter name or enter a custom one. A custom parameter name can contain only uppercase and lowercase letters, digits, periods (.), underscores (_), and hyphens (-). Parameter Value: This field is required if a parameter name is selected. The value can contain only uppercase and lowercase letters, digits, periods (.), underscores (_), and hyphens (-), and its length cannot exceed 256 characters. Note Click + Add Parameter to add more parameters, up to a maximum of 30. Click the icon to remove a parameter.
Schema	Enter the schema that is associated with the username.
Username and Password	Enter the username and password for database authentication. To ensure that tasks run correctly, make sure the account has the required data permissions.

Note

If you select Host as the configuration method and later switch to the JDBC URL method after creating the data source, the system automatically populates the JDBC URL field by combining the server's IP address and port number.

Configure advanced settings for the data source.

Parameter	Description
connectTimeout	The connection timeout in seconds. The default is 900 seconds (15 minutes). Note If you include a `connectTimeout` setting in the JDBC URL, that value takes precedence.
socketTimeout	The socket timeout in seconds. The default is 1800 seconds (30 minutes). Note If you include a `socketTimeout` setting in the JDBC URL, that value takes precedence.
Connection retries	If a connection times out, the system automatically retries it up to the specified number of times. If all retries fail, the connection attempt is considered failed. Note The default number of retries is 1. You can set this value to an integer from 0 to 10. The number of connection retries applies by default to offline integration tasks and Global Quality (requires the Asset Quality feature to be enabled). You can configure task-level retries for individual offline integration tasks.

Note

If a parameter is defined in multiple locations, its value is determined by the following order of precedence:

If a parameter is defined in the JDBC URL, Advanced settings, and Host configuration, the value from the JDBC URL is used.
If a parameter is defined in both the JDBC URL and Advanced settings, the value from the JDBC URL is used.
If a parameter is defined in both the Advanced settings and the Host configuration, the value from the Advanced settings is used.

Select Default resource group. This resource group is used to run tasks related to the current data source, such as database SQL, offline full-database migration, and data preview.
Click Test Connection to verify connectivity, or click OK to save the configuration and create the TDSQL for PostgreSQL data source.
Clicking Test Connection verifies connectivity to Dataphin. If you click OK directly, the system automatically tests the connection against all selected clusters. The data source is created even if all connection tests fail.
Test Connection tests the connection for the Default Cluster or Registered Scheduling Clusters that have been registered in Dataphin and are in normal use. The Default Cluster is selected by default and cannot be deselected. If there are no resource groups under a Registered Scheduling Cluster, connection testing is not supported. You need to create a resource group first before testing the connection.
- The selected clusters are only used to test network connectivity with the current data source and are not used for running related tasks later.
- The test connection usually takes less than 2 minutes. If it times out, you can click the icon to view the specific reason and retry.
- Regardless of whether the test result is Connection Failed, Connection Successful, or Succeeded With Warning, the system will record the generation time of the final result.
  Note
  Only the test results for the Default Cluster include three connection statuses: Succeeded With Warning, Connection Successful, and Connection Failed. The test results for Registered Scheduling Clusters in Dataphin only include two connection statuses: Connection Successful and Connection Failed.
- When the test result is Connection Failed, you can click the icon to view the specific failure reason.
- When the test result is Succeeded With Warning, it means that the application cluster connection is successful but the scheduling cluster connection failed. The current data source cannot be used for data development and integration. You can click the icon to view the log information.

上一篇: Create a GBase 8c data source 下一篇: Create an OushuDB data source