Create an ArgoDB data source

Create an ArgoDB data source to enable Dataphin to read business data from or write data to ArgoDB.

Permissions

Only users with the permission to create data sources in a custom global role, or users assigned the super administrator, data source administrator, domain architect, or project administrator role can create data sources.

Procedure

On the Dataphin homepage, click Management Center > Datasource Management in the top navigation bar.
On the Datasource page, click +Create Data Source.
In the Big Data section of the Create Data Source page, select ArgoDB.

If you have recently used ArgoDB, you can also find it in the Recently Used section, or enter keywords in the search box to locate it.

On the Create ArgoDB Data Source page, configure the basic information.

Parameter	Description
Datasource Name	The name must meet the following requirements: The name can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-). The name cannot exceed 64 characters in length.
Datasource Code	After you configure the data source code, you can reference tables in the data source in a Flink_SQL node using the `data source code.table name` or `data source code.schema.table name` format. To automatically access the data source in the corresponding environment based on the current environment, use the variable format `${data source code}.table` or `${data source code}.schema.table`. For more information, see Development method for Flink_SQL nodes. Important The data source code cannot be modified after it is configured. You can preview data on the object details page in the asset directory and asset checklist only after the data source code is configured. In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, SelectDB, and GaussDB data warehouse service (DWS) data sources are currently supported.
Version	Currently, only version 5.2 is supported.
Data Source Description	A brief description of the data source, up to 128 characters.
Data Source Configuration	Select the data source configuration: If the data source is divided into a production data source and a development data source, select Production + Development Data Source. If the data source is not divided into a production data source and a development data source, select Production Data Source.
Tag	Categorize data sources by assigning tags. For information about how to create tags, see Manage data source tags.
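
The Datasource Name rules in the table above can be checked before you submit the form. The following is a minimal sketch, not part of Dataphin: the function name `is_valid_datasource_name` is hypothetical, and the sketch assumes that "Chinese characters" means the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) and that "letters" means ASCII letters. Dataphin's own validation may accept a wider or narrower range.

```python
import re

# Assumption: Chinese characters = U+4E00..U+9FFF, letters = ASCII letters.
# Allowed: Chinese characters, letters, digits, underscores, hyphens; 1 to 64 characters.
_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")


def is_valid_datasource_name(name: str) -> bool:
    """Return True if the name satisfies the documented character and length rules."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_datasource_name("argodb_prod-01")` passes, while a name that contains a space or is longer than 64 characters is rejected.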

Configure the connection parameters between the data source and Dataphin.

If you select Production + Development Data Source, configure the connection information for both environments. If you select Production Data Source, configure the connection information for the production environment only.

Note

In most cases, configure the production and development data sources as separate sources to isolate the two environments and minimize the impact of development on production. However, Dataphin also allows you to configure them as the same data source with identical parameter values.

Configure the parameters in the Cluster Configuration section.

Parameter	Description
NameNode	The hostname or IP address and the ports of the NameNode in the HDFS cluster. Example: `host=192.x.x.169,webUiPort=,ipcPort=8020`. In a TDH environment, the default values of `webUiPort` and `ipcPort` are 50070 and 8020, respectively. Specify the ports based on your actual configuration.
Configuration File	Upload Hadoop configuration files such as `hdfs-site.xml` and `core-site.xml`, exported from your Hadoop cluster.
Authentication Type	If the HDFS cluster does not require authentication, select No Authentication. If the HDFS cluster requires authentication, Dataphin supports Kerberos. If you select Kerberos, configure the following authentication information: Kerberos configuration: KDC Server: The unified service address of the KDC. Multiple addresses are supported and must be separated by semicolons (;). Krb5 file configuration: Upload the Krb5 file. HDFS configuration: HDFS Keytab File: The keytab file for HDFS, which is the Kerberos authentication file. HDFS Principal: The Kerberos authentication principal name. The format is `XXXX/hadoopclient@xxx.xxx`.
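
To illustrate the NameNode value format described above, the following sketch parses a `host=...,webUiPort=...,ipcPort=...` string into its parts. It is an illustration only: the function name `parse_namenode` is hypothetical, and filling an empty port with the TDH defaults (50070 and 8020) is an assumption made for this example, not confirmed Dataphin behavior. The address in the usage note is a placeholder.

```python
# TDH default ports as stated in the parameter table.
DEFAULT_PORTS = {"webUiPort": 50070, "ipcPort": 8020}


def parse_namenode(value: str) -> dict:
    """Split a NameNode setting into host, webUiPort, and ipcPort."""
    fields = {}
    for part in value.split(","):
        key, sep, raw = part.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {part!r}")
        fields[key.strip()] = raw.strip()

    host = fields.get("host")
    if not host:
        raise ValueError("host is required")

    result = {"host": host}
    for key, default in DEFAULT_PORTS.items():
        raw = fields.get(key, "")
        # Assumption: an empty or missing port falls back to the TDH default.
        result[key] = int(raw) if raw else default
    return result
```

Calling `parse_namenode("host=192.0.2.10,webUiPort=,ipcPort=8020")` yields the host together with both ports as integers, which makes it easy to spot a missing or mistyped port before you save the data source.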

Configure the parameters in the ArgoDB Configuration section.

Parameter	Description
JDBC URL	The JDBC URL for connecting to ArgoDB. The format is `jdbc:hive2://host:port/dbname`.
Authentication Type	If the ArgoDB cluster does not require authentication, select No Authentication. If the ArgoDB cluster requires authentication, Dataphin supports LDAP or Kerberos: Kerberos: Upload a Keytab File and configure a Principal. The Keytab File is the Kerberos authentication file. The Principal format is `XXXX/hadoopclient@xxx.xxx`. LDAP: Configure the username and password for LDAP authentication.
Username	The username for ArgoDB.
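
A common mistake in the JDBC URL is a missing colon after `hive2` or a misspelled host segment. The following sketch checks a URL against the plain `jdbc:hive2://host:port/dbname` form before you enter it. The function name `parse_argodb_jdbc_url` and the sample host name are hypothetical. The sketch deliberately covers only the plain form shown in the table, so URLs that carry extra parameters after the database name are rejected.

```python
import re

# Matches only the plain form: jdbc:hive2://host:port/dbname
_JDBC_PATTERN = re.compile(
    r"jdbc:hive2://(?P<host>[^:/\s]+):(?P<port>\d+)/(?P<dbname>[^/?;\s]+)"
)


def parse_argodb_jdbc_url(url: str) -> dict:
    """Return host, port, and dbname, or raise ValueError if the URL is malformed."""
    match = _JDBC_PATTERN.fullmatch(url)
    if match is None:
        raise ValueError(f"not in jdbc:hive2://host:port/dbname format: {url!r}")
    port = int(match["port"])
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return {"host": match["host"], "port": port, "dbname": match["dbname"]}
```

For instance, `parse_argodb_jdbc_url("jdbc:hive2://argodb.example.com:10000/default")` returns the three components, while `jdbc:hive2//argodb.example.com:10000/default` (no colon before `//`) raises `ValueError`.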

Configure the parameters in the Metadatabase Configuration section.

Parameter	Description
Metadata Retrieval Method	You can retrieve metadata directly from the metadatabase or from HMS. To use HMS, upload the hive-site.xml configuration file. Supported authentication methods are No Authentication, LDAP, and Kerberos. For Kerberos authentication, also upload a Keytab File and configure a Principal.
Database Type	Select the database type based on the type of metadatabase used in your cluster. ArgoDB is supported.
JDBC URL	The JDBC URL of the ArgoDB metadatabase. The format is `jdbc:hive2://host:port/dbname`.
Authentication Method	Three authentication methods are supported: No Authentication, LDAP, and Kerberos. For Kerberos authentication, you also need to upload a Keytab File and configure a Principal.
Username, Password	The username and password for logging on to the metadatabase.

Select a Default Resource Group. This resource group runs tasks related to the data source, including database SQL, offline database migration, and data preview.
Click Test Connection or click OK to save the ArgoDB data source.

If you click Test Connection, the system verifies whether the data source can connect to Dataphin. If you click OK directly, the system tests the connection for all selected clusters automatically. The data source can still be created even if all connection tests fail.
