Before you can use OSS-based Hive foreign tables for offline integration in Dataphin with an E-MapReduce 5.x Hadoop compute engine, you must configure the required parameters.
Configuration instructions
Configure the required parameters in the core-site.xml file of the Hive data source or Hadoop compute engine, and then upload the file.
- If Dataphin and OSS are in the same region, configure the fs.oss.endpoint parameter in the core-site.xml file.
- If Dataphin and OSS are in different regions, you must also configure the accessKeyId and accessKeySecret parameters in addition to fs.oss.endpoint.
You do not need to configure accessKeyId and accessKeySecret for internal endpoints.
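The property snippets on this page are fragments. In a standard Hadoop core-site.xml file they sit inside the root <configuration> element. The following is a minimal sketch of that layout, assuming an otherwise empty file. The endpoint value is the cn-hangzhou example used on this page, not a recommendation for your region.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Properties that already exist in your cluster's file stay here. -->

  <!-- OSS endpoint: example value, replace it with the endpoint for your region. -->
  <property>
    <name>fs.oss.endpoint</name>
    <value>oss-cn-hangzhou-internal.aliyuncs.com</value>
  </property>
</configuration>
```

If the file you obtained from the cluster already contains a <configuration> element, add the <property> blocks inside it instead of creating a second root element.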
Configuration examples
- Dataphin and OSS are in the same region.
<property>
<name>fs.oss.endpoint</name>
<value>oss-cn-hangzhou-internal.aliyuncs.com</value>
</property>
- Dataphin and OSS are in different regions.
<property>
<name>fs.oss.endpoint</name>
<value>oss-cn-hangzhou-internal.aliyuncs.com</value>
</property>
<property>
<name>fs.oss.accessKeyId</name>
<value>ak</value>
</property>
<property>
<name>fs.oss.accessKeySecret</name>
<value>ks</value>
</property>
Note
- Set {value} for fs.oss.endpoint based on your region. For more information, see Regions and endpoints.
- For fs.oss.accessKeyId and fs.oss.accessKeySecret, set {value} to your AccessKey information. For more information about how to obtain an AccessKey, see Create AccessKey.
FAQ
Error during offline integration: com.alibaba.dt.pipeline.plugin.center.exception.DataXException: Code:[HDFSConnection-06], Description:[An IO exception occurred while establishing a connection with HDFS.]. - java.io.IOException: No FileSystem for scheme: oss
Add the following configuration to your core-site.xml file to resolve this error:
<property>
<name>fs.oss.impl</name>
<value>com.aliyun.jindodata.oss.JindoOssFileSystem</value>
</property>
<property>
<name>fs.AbstractFileSystem.oss.impl</name>
<value>com.aliyun.jindodata.oss.OSS</value>
</property>
<property>
<name>fs.jindofsx.data.cache.enable</name>
<value>false</value>
</property>
<property>
<name>fs.jindofsx.namespace.rpc.address</name>
<value>emr-cluster:8101</value>
</property>
Set {value} for fs.jindofsx.namespace.rpc.address based on your cluster configuration. If you are unsure of the value, contact the EMR product helpdesk.
Error during offline integration: Description:[An IO exception occurred while establishing a connection with HDFS.]. - java.io.IOException: ERROR: not found login secrets, please configure the accessKeyId and accessKeySecret
Add the following configuration to your core-site.xml file to resolve this error:
<property>
<name>fs.jindofsx.namespace.rpc.address</name>
<value>emr-cluster:8101</value>
</property>
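Taken together, the same-region endpoint setting and the FAQ fixes above could be combined into one file, as sketched below. This assumes the standard <configuration> root element of a Hadoop core-site.xml file. The cn-hangzhou endpoint and emr-cluster:8101 are the placeholder values from this page and must be replaced with the values for your own region and cluster.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- OSS endpoint (example value for the same-region case). -->
  <property>
    <name>fs.oss.endpoint</name>
    <value>oss-cn-hangzhou-internal.aliyuncs.com</value>
  </property>

  <!-- Registers the oss scheme; addresses "No FileSystem for scheme: oss". -->
  <property>
    <name>fs.oss.impl</name>
    <value>com.aliyun.jindodata.oss.JindoOssFileSystem</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.oss.impl</name>
    <value>com.aliyun.jindodata.oss.OSS</value>
  </property>
  <property>
    <name>fs.jindofsx.data.cache.enable</name>
    <value>false</value>
  </property>

  <!-- Placeholder address; also addresses the "not found login secrets" error. -->
  <property>
    <name>fs.jindofsx.namespace.rpc.address</name>
    <value>emr-cluster:8101</value>
  </property>
</configuration>
```

For the different-region case, the fs.oss.accessKeyId and fs.oss.accessKeySecret properties from the configuration examples would be added inside the same <configuration> element.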