Create an ArgoDB data source
Create an ArgoDB data source to enable Dataphin to read business data from or write data to ArgoDB.
Permissions
Only users with the permission to create data sources in a custom global role, or users assigned the super administrator, data source administrator, domain architect, or project administrator role can create data sources.
Procedure
-
On the Dataphin homepage, click Management Center > Datasource Management in the top navigation bar.
-
On the Datasource page, click +Create Data Source.
-
In the Big Data section of the Create Data Source page, select ArgoDB.
If you have recently used ArgoDB, you can also find it in the Recently Used section, or enter keywords in the search box to locate it.
-
On the Create ArgoDB Data Source page, configure the basic information.
Parameter
Description
Datasource Name
The name must meet the following requirements:
-
The name can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-).
-
The name cannot exceed 64 characters in length.
Datasource Code
After you configure the data source code, you can reference tables in the data source in a Flink_SQL node using the
data source code.table name or data source code.schema.table name format. To automatically access the data source in the corresponding environment based on the current environment, use the variable format ${data source code}.table or ${data source code}.schema.table. For more information, see Development method for Flink_SQL nodes.
Important: The data source code cannot be modified after it is configured.
You can preview data on the object details page in the asset directory and asset checklist only after the data source code is configured.
In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, SelectDB, and GaussDB data warehouse service (DWS) data sources are currently supported.
Version
Currently, only version 5.2 is supported.
Data Source Description
A brief description of the data source, up to 128 characters.
Data Source Configuration
Select the data source configuration:
-
If the data source is divided into a production data source and a development data source, select Production + Development Data Source.
-
If the data source is not divided into a production data source and a development data source, select Production Data Source.
Tag
Categorize data sources by assigning tags. For information about how to create tags, see Manage data source tags.
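The naming and length limits above can be checked before you fill in the form. The following is a minimal sketch, not part of Dataphin: the function names are hypothetical, and "Chinese characters" is assumed to mean the CJK Unified Ideographs range U+4E00 to U+9FFF, which may be narrower or wider than what Dataphin actually accepts.

```python
import re

# Assumption: "Chinese characters" is read as CJK Unified Ideographs
# (U+4E00-U+9FFF). Letters are assumed to be ASCII letters.
# The name must be 1 to 64 characters long.
_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")

# The description limit stated in the parameter table.
MAX_DESCRIPTION_LENGTH = 128


def is_valid_datasource_name(name: str) -> bool:
    """Return True if the name uses only allowed characters and fits in 64 characters."""
    return _NAME_PATTERN.fullmatch(name) is not None


def is_valid_description(description: str) -> bool:
    """Return True if the description is at most 128 characters long."""
    return len(description) <= MAX_DESCRIPTION_LENGTH
```

For example, is_valid_datasource_name("argodb_prod-01") passes, while a name that contains a space or is longer than 64 characters does not.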
-
Configure the connection parameters between the data source and Dataphin.
If you select Production + Development Data Source, configure the connection information for both environments. If you select Production Data Source, configure the connection information for the production environment only.
Note: In most cases, configure the production and development data sources as separate sources to isolate the two environments and minimize the impact of development on production. However, Dataphin also allows you to configure them as the same data source with identical parameter values.
-
Configure the parameters in the Cluster Configuration section.
Parameter
Description
NameNode
The hostname or IP address and port of the NameNode in the HDFS cluster.
Example:
host=192.x.x.169,webUiPort=,ipcPort=8020. In a TDH environment, the default values of webUiPort and ipcPort are 50070 and 8020, respectively. Specify the ports based on your actual configuration.
Configuration File
Upload Hadoop configuration files, such as hdfs-site.xml and core-site.xml, exported from your Hadoop cluster.
Authentication Type
If the HDFS cluster does not require authentication, select No Authentication. If the HDFS cluster requires authentication, Dataphin supports Kerberos.
If you select Kerberos, configure the following authentication information:
-
Kerberos configuration:
-
KDC Server: The unified service address of the KDC. To specify multiple addresses, separate them with semicolons (;).
-
Krb5 file configuration: Upload the Krb5 file.
-
HDFS configuration:
-
HDFS keytab File: The keytab file for HDFS, which is the Kerberos authentication file.
-
HDFS Principal: The Kerberos authentication principal name. The format is
XXXX/hadoopclient@xxx.xxx.
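The NameNode value is a comma-separated list of key=value pairs. The sketch below shows one way to parse and sanity-check such a string before pasting it into the form. The function name is hypothetical, and falling back to the TDH defaults (50070 and 8020) when a port is left empty is an assumption made for illustration; it is not confirmed to be how Dataphin itself treats an empty value.

```python
# TDH default ports as stated in the NameNode parameter description.
DEFAULT_PORTS = {"webUiPort": 50070, "ipcPort": 8020}


def parse_namenode(value: str) -> dict:
    """Parse 'host=...,webUiPort=...,ipcPort=...' into a dictionary.

    Assumption: an empty or missing port falls back to the TDH default.
    """
    fields = {}
    for part in value.split(","):
        key, _, raw = part.partition("=")
        fields[key.strip()] = raw.strip()

    if not fields.get("host"):
        raise ValueError("NameNode value must include a host")

    result = {"host": fields["host"]}
    for key, default in DEFAULT_PORTS.items():
        raw = fields.get(key, "")
        result[key] = int(raw) if raw else default
    return result
```

Here nn1.example.com is a placeholder host: parse_namenode("host=nn1.example.com,webUiPort=,ipcPort=8020") yields the host together with webUiPort 50070 and ipcPort 8020.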
-
Configure the parameters in the ArgoDB Configuration section.
Parameter
Description
JDBC URL
The JDBC URL for connecting to ArgoDB. The format is
jdbc:hive2://host:port/dbname.
Authentication Type
If the ArgoDB cluster does not require authentication, select No Authentication. If the ArgoDB cluster requires authentication, Dataphin supports LDAP or Kerberos:
-
Kerberos: Upload a Keytab File and configure a Principal. The Keytab File is the Kerberos authentication file. The Principal format is
XXXX/hadoopclient@xxx.xxx.
-
LDAP: Configure the username and password for LDAP authentication.
Username
The username for ArgoDB.
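A malformed JDBC URL, such as one missing the colon after hive2, is easy to overlook. The sketch below checks a URL against the jdbc:hive2://host:port/dbname format and extracts its parts. The function name is hypothetical, and tolerating trailing parameters that start with ";" or "?" is an assumption; the parameters your ArgoDB deployment accepts may differ.

```python
import re

# Matches jdbc:hive2://host:port/dbname, optionally followed by
# parameters introduced by ';' or '?' (assumed, not verified for ArgoDB).
_JDBC_HIVE2 = re.compile(
    r"jdbc:hive2://(?P<host>[^/:]+):(?P<port>\d+)/(?P<dbname>[^/?;]+)"
    r"(?P<params>[;?].*)?"
)


def parse_argodb_jdbc_url(url: str) -> tuple:
    """Return (host, port, dbname), or raise ValueError if the URL is malformed."""
    match = _JDBC_HIVE2.fullmatch(url.strip())
    if match is None:
        raise ValueError(f"Not a jdbc:hive2://host:port/dbname URL: {url!r}")
    return match["host"], int(match["port"]), match["dbname"]
```

With a placeholder host, parse_argodb_jdbc_url("jdbc:hive2://argodb.example.com:10000/default") returns the host, the port as an integer, and the database name, while a URL that starts with jdbc:hive2// raises ValueError.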
-
Configure the parameters in the Metadatabase Configuration section.
Parameter
Description
Metadata Retrieval Method
You can retrieve metadata directly from the metadatabase or from HMS.
To use HMS, upload the hive-site.xml configuration file. Supported authentication methods are No Authentication, LDAP, and Kerberos. For Kerberos authentication, also upload a Keytab File and configure a Principal.
Database Type
Select the database type based on the type of metadatabase used in your cluster. ArgoDB is supported.
JDBC URL
The JDBC URL of the ArgoDB metadatabase. The format is
jdbc:hive2://host:port/dbname.
Authentication Method
Three authentication methods are supported: No Authentication, LDAP, and Kerberos.
For Kerberos authentication, you also need to upload a Keytab File and configure a Principal.
Username, Password
The username and password for logging on to the metadatabase.
-
Select a Default Resource Group. This resource group runs tasks related to the data source, including database SQL, offline database migration, and data preview.
-
Click Test Connection or click OK to save the ArgoDB data source.
If you click Test Connection, the system verifies whether the data source can connect to Dataphin. If you click OK directly, the system tests the connection for all selected clusters automatically. The data source can still be created even if all connection tests fail.
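Because the data source can be saved even when every connection test fails, it can help to confirm basic network reachability yourself. The following is a minimal sketch using only the Python standard library; the function name is hypothetical. It checks only that a TCP connection can be opened from the machine running the script, which is not necessarily the network path used by the Dataphin resource group, and it does not perform any authentication.

```python
import socket


def can_reach(host: str, port: int, timeout_seconds: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

You might call can_reach with the host and port from the JDBC URL, or with the NameNode host and its ipcPort, before clicking Test Connection.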