Troubleshooting Data Integration Errors


Common data integration errors and how to troubleshoot them

Problem 1

  • Symptom

    Chinese content read from an Alibaba Cloud MySQL column of type varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_bin comes back garbled

  • Cause analysis

    Although the dlink engine uses the MySQL 8 driver for RDS MySQL data sources, it still bundles the MySQL 5.1.47 driver as well, so this is most likely a collation conflict. On the customer side, run SHOW VARIABLES LIKE 'collation_connection' to find the database server's default collation, then change the column's collation to match it: alter table sff_utf8mb4 modify name varchar(255) CHARACTER SET utf8mb4 COLLATE <collation value>;

  • Solution

    Supported in dlink since v5.2.0: simply select RDS MYSQL as the data source version. On earlier versions, change the column's collation to match the database's default collation_connection value
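
    A minimal SQL sketch of the pre-v5.2.0 workaround (sff_utf8mb4 and name come from the example above; utf8mb4_general_ci is only an assumed result of the first query, use whatever collation_connection actually returns):

    -- Check the server's default connection collation
    SHOW VARIABLES LIKE 'collation_connection';
    -- Suppose it returns utf8mb4_general_ci (assumption); align the column with it
    ALTER TABLE sff_utf8mb4 MODIFY name VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;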

Problem 2

  • Symptom

    Writing to a Dameng (DM) clob column fails with: dm.jdbc.driver.DMException: 字符串截断 (string truncation)

    at dm.jdbc.driver.DBError.throwz(SourceFile:754) ~[DmJdbcDriver18-8.1.3.62.jar:- 8.1.3.62 - Production]

    at dm.jdbc.driver.DmdbPreparedStatement.checkBindParameters(SourceFile:433) ~[DmJdbcDriver18-8.1.3.62.jar:- 8.1.3.62 - Production]

    at dm.jdbc.driver.DmdbPreparedStatement.beforeExectueWithParameters(SourceFile:468) ~[DmJdbcDriver18-8.1.3.62.jar:- 8.1.3.62 - Production]

    at dm.jdbc.driver.DmdbPreparedStatement.do_execute(SourceFile:562) ~[DmJdbcDriver18-8.1.3.62.jar:- 8.1.3.62 - Production]

    at dm.jdbc.driver.DmdbPreparedStatement$6.run(SourceFile:2098) ~[DmJdbcDriver18-8.1.3.62.jar:- 8.1.3.62 - Production]

    at dm.jdbc.driver.DmdbPreparedStatement$6.run(SourceFile:1) ~[DmJdbcDriver18-8.1.3.62.jar:- 8.1.3.62 - Production]

    at dm.jdbc.driver.DmdbPreparedStatement.execute(SourceFile:2114) ~[DmJdbcDriver18-8.1.3.62.jar:- 8.1.3.62 - Production]

    at com.alibaba.datax.plugin.rdbms.writer.CommonRdbmsWriter$Task.doOneInsert(CommonRdbmsWriter.java:524) [plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

  • Cause analysis

    Database configuration issue: the DM parameter that limits CLOB length is set too small for the data being written

  • Solution

    Adjust the database parameter with SP_SET_PARA_VALUE(2,'CLOB_LIKE_MAX_LEN',102400), then restart the database
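
    A hedged sketch of the parameter change on the DM side (run with a privileged account; the first argument selects the scope of the change, and per the note above a restart is still required for it to take effect):

    -- Raise the CLOB length limit to 102400
    CALL SP_SET_PARA_VALUE(2, 'CLOB_LIKE_MAX_LEN', 102400);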

Problem 3

  • Symptom

    Reading from Hologres fails with: Query Table Group. sql: SELECT tgp.tablegroup_name, tgp.property_key, tgp.property_value
    FROM hologres.hg_table_properties tp, hologres.hg_table_group_properties tgp
    WHERE tp.table_namespace=?
    AND tp.table_name=?
    AND tp.property_key='table_group'
    AND tp.property_value=tgp.tablegroup_name;
    2025-09-16 14:45:51.531 [job-50876540] ERROR DlinkTransPreview - Exception when job run
    com.alibaba.dt.pipeline.plugin.center.exception.DataXException: Code:[HoloWriter-06], Description:[null]. - Failed to query table's Table Group info.
    at com.alibaba.dt.pipeline.plugin.center.exception.DataXException.asDataXException(DataXException.java:51) ~[plugin.center.base-0.0.1-SNAPSHOT.jar:na]

  • Cause analysis

    Integration currently needs to read the table group information of the Hologres table, and only Hologres internal tables have a table group; views, foreign tables, and system tables have none, so reading objects that are not Hologres internal tables is not supported

  • Solution

    Use the querySql mode instead
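
    To check whether the object being read actually has a table group, you can run the same lookup the plugin performs (a sketch; the schema and object names are placeholders to substitute):

    SELECT tgp.tablegroup_name, tgp.property_key, tgp.property_value
    FROM hologres.hg_table_properties tp, hologres.hg_table_group_properties tgp
    WHERE tp.table_namespace = 'public'            -- assumption: your schema
      AND tp.table_name = 'my_view'                -- assumption: the object being read
      AND tp.property_key = 'table_group'
      AND tp.property_value = tgp.tablegroup_name;
    -- No rows back means the object has no table group and must be read via querySql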

Problem 4

  • Symptom

    Chinese content written to an Impala varchar column is garbled when queried

  • Cause analysis

    Impala writes use the optimized insert syntax by default, which batches inserts internally, but that syntax can garble Chinese characters

  • Solution

    1. Change the varchar column to string type; or

    2. Disable Impala's default optimization by adding OptimizedInsert=0 to the JDBC URL
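
    For the second option, the URL would look like the following (a sketch; host, port, and database are placeholders, and properties in the Impala JDBC driver are appended with semicolons):

    jdbc:impala://<host>:<port>/<database>;OptimizedInsert=0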

Problem 5

  • Symptom

    Reading a decimal column from an ORC file fails with: java.io.EOFException: Reading BigInteger past EOF from compressed stream Stream for column 7 kind DATA position: 10701 length: 10701 range: 0 offset: 555634 limit: 555634 range 0 = 0 to 10701 uncompressed: 929 to 929

  • Cause analysis

    The customer declared the decimal column with precision (3,0), i.e. 3 significant digits, but values longer than 3 significant digits were written, so the underlying data no longer matches the declared decimal precision and the read fails

  • Solution

    Correct the column's declared precision
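
    A hedged Hive DDL sketch of widening the declared precision (assuming the ORC table is Hive-managed; my_table and amount are placeholder names, and files already written with out-of-range values may still need to be rewritten under the corrected type):

    ALTER TABLE my_table CHANGE COLUMN amount amount DECIMAL(18,0);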

Problem 6

  • Symptom

    The row count of a MaxCompute logical table exported to Holo differs from a direct count query on the logical table

  • Cause analysis

    Because related fields were selected for the logical-table export, many dimension logical tables are joined into the synchronized output, whereas a direct count on the logical table only counts the rows of the primary table, so the row counts can differ. This is expected from the business design.

  • Solution

Problem 7

  • Symptom

    An API input component that uses a cross-node parameter as the request end value fails at runtime

    Error stack: java.lang.NumberFormatException: For input string: "${total_hz}"

    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)

    at java.lang.Integer.parseInt(Integer.java:569)

    at java.lang.Integer.parseInt(Integer.java:615)

    at com.alibaba.fastjson.util.TypeUtils.castToInt(TypeUtils.java:996)

    at com.alibaba.fastjson.JSONObject.getInteger(JSONObject.java:252)

    at com.alibaba.dataphin.pipeline.plugin.base.BasePluginConfig.getInteger(BasePluginConfig.java:470)

    at com.alibaba.dataphin.pipeline.plugins.input.restapi.RestApiInputConfig.getRequestEndValue(RestApiInputConfig.java:301)

    at com.alibaba.dataphin.pipeline.plugins.input.restapi.RestApiInputConfig.getEndIndex(RestApiInputConfig.java:317)

    at com.alibaba.dataphin.pipeline.plugins.input.restapi.RestApiInputConfig.fillScheduleJson(RestApiInputConfig.java:169)

    at com.alibaba.dataphin.pipeline.plugin.base.JsonAble.toScheduleJson(JsonAble.java:37)

    at com.alibaba.dataphin.pipeline.domain.model.pipeline.Step.fillScheduleJson(Step.java:158)

    at com.alibaba.dataphin.pipeline.plugin.base.JsonAble.toScheduleJson(JsonAble.java:37)

    [omitted 7 frames]

    at com.alibaba.dataphin.pipeline.domain.model.pipeline.Pipeline.fillScheduleJson(Pipeline.java:156)

  • Cause analysis

    The product does not support cross-node parameters in request parameters, so the placeholder ${total_hz} is passed through literally and then fails integer parsing

  • Solution

Problem 8

  • Symptom

    MySQL one-click table creation times out

  • Cause analysis

    The reverse-VPC IP list configured in the Apsara Stack environment omitted the IPs of the containers running dataphin-pipeline, so the network was unreachable

  • Solution

    Have the ops team add the dataphin-pipeline IP list to the whitelist, then retry

Problem 9

  • Symptom

    2025-02-11 11:42:45.880 [Thread-22] ERROR StarRocksStreamLoadVisitor - StreamLoad response: {"Status":"Fail","BeginTxnTimeMs":0,"Message":"too many filtered rows","NumberUnselectedRows":0,"CommitAndPublishTimeMs":0,"Label":"aecae9b8-a444-424e-ac93-6713527df252","LoadBytes":19689745,"StreamLoadPlanTimeMs":1,"NumberTotalRows":56559,"WriteDataTimeMs":523,"TxnId":248643660,"LoadTimeMs":524,"ErrorURL":"http://172.23.21.31:8040/api/_load_error_log?file=error_log_2f42e57951e0444d_d654ea5d66a4c191","ReadDataTimeMs":48,"NumberLoadedRows":56537,"NumberFilteredRows":22}

    2025-02-11 11:42:45.880 [Thread-22] INFO StarRocksStreamLoadVisitor - Executing GET from http://172.23.21.31:8040/api/_load_error_log?file=error_log_2f42e57951e0444d_d654ea5d66a4c191.

    2025-02-11 11:43:41.929 [Metric-collector-1] INFO KettleMetricCollector - Total 169645 records, 36109619 bytes | Speed 587.72KB/s, 2827 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.291s | All Task WaitReaderTime 10.905s | Percentage 100.00%

    2025-02-11 11:44:41.930 [Metric-collector-1] INFO KettleMetricCollector - Total 169645 records, 36109619 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.291s | All Task WaitReaderTime 10.905s | Percentage 100.00%

    2025-02-11 11:44:55.601 [Thread-22] WARN StarRocksStreamLoadVisitor - Get Error URL failed. http://172.23.21.31:8040/api/_load_error_log?file=error_log_2f42e57951e0444d_d654ea5d66a4c191

    org.apache.http.conn.HttpHostConnectException: Connect to 172.23.21.31:8040 [/172.23.21.31] failed: Connection timed out (Connection timed out)

    at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:159) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:373) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108) ~[httpclient-4.5.4.jar:4.5.4]

    at com.alibaba.datax.plugin.writer.starrockswriter.manager.StarRocksStreamLoadVisitor.doHttpGet(StarRocksStreamLoadVisitor.java:203) ~[starrockswriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.starrockswriter.manager.StarRocksStreamLoadVisitor.doStreamLoad(StarRocksStreamLoadVisitor.java:78) ~[starrockswriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.starrockswriter.manager.StarRocksWriterManager.asyncFlush(StarRocksWriterManager.java:169) [starrockswriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.starrockswriter.manager.StarRocksWriterManager.access$000(StarRocksWriterManager.java:21) [starrockswriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.starrockswriter.manager.StarRocksWriterManager$1.run(StarRocksWriterManager.java:138) [starrockswriter-0.0.1-SNAPSHOT.jar:na]

    at java.lang.Thread.run(Thread.java:882) [na:1.8.0_152]

    Caused by: java.net.ConnectException: Connection timed out (Connection timed out)

    at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0_152]

    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[na:1.8.0_152]

    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[na:1.8.0_152]

    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[na:1.8.0_152]

    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.8.0_152]

    at java.net.Socket.connect(Socket.java:643) ~[na:1.8.0_152]

    at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142) ~[httpclient-4.5.4.jar:4.5.4]

    ... 15 common frames omitted

    2025-02-11 11:44:55.601 [Thread-22] WARN StarRocksWriterManager - Failed to flush batch data to StarRocks, retry times = 0

    java.io.IOException: Failed to flush data to StarRocks.

    too many filtered rows

    at com.alibaba.datax.plugin.writer.starrockswriter.manager.StarRocksStreamLoadVisitor.doStreamLoad(StarRocksStreamLoadVisitor.java:87) ~[starrockswriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.starrockswriter.manager.StarRocksWriterManager.asyncFlush(StarRocksWriterManager.java:169) [starrockswriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.starrockswriter.manager.StarRocksWriterManager.access$000(StarRocksWriterManager.java:21) [starrockswriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.starrockswriter.manager.StarRocksWriterManager$1.run(StarRocksWriterManager.java:138) [starrockswriter-0.0.1-SNAPSHOT.jar:na]

    at java.lang.Thread.run(Thread.java:882) [na:1.8.0_152]

  • Cause analysis

    The configured per-batch stream load size was too large, so the upload timed out

  • Solution

    Reduce the per-batch upload size in the configuration and rerun

Problem 10

  • Symptom

    Writing a string column of roughly 500 KB to a FusionInsight ORC-format table fails with: Caused by: java.lang.NullPointerException: null

    at java.lang.System.arraycopy(Native Method) ~[na:1.8.0_152]

    at org.apache.orc.impl.DynamicByteArray.add(DynamicByteArray.java:115) ~[hive-exec-3.1.0-hw-ei-302002.jar:3.1.0-hw-ei-302002]

    at org.apache.orc.impl.StringRedBlackTree.addNewKey(StringRedBlackTree.java:48) ~[hive-exec-3.1.0-hw-ei-302002.jar:3.1.0-hw-ei-302002]

    at org.apache.orc.impl.StringRedBlackTree.add(StringRedBlackTree.java:60) ~[hive-exec-3.1.0-hw-ei-302002.jar:3.1.0-hw-ei-302002]

    at org.apache.orc.impl.writer.StringTreeWriter.writeBatch(StringTreeWriter.java:70) ~[hive-exec-3.1.0-hw-ei-302002.jar:3.1.0-hw-ei-302002]

    at org.apache.orc.impl.writer.StructTreeWriter.writeRootBatch(StructTreeWriter.java:56) ~[hive-exec-3.1.0-hw-ei-302002.jar:3.1.0-hw-ei-302002]

    at org.apache.orc.impl.WriterImpl.addRowBatch(WriterImpl.java:557) ~[hive-exec-3.1.0-hw-ei-302002.jar:3.1.0-hw-ei-302002]

    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushInternalBatch(WriterImpl.java:297) ~[hive-exec-3.1.0-hw-ei-302002.jar:3.1.0-hw-ei-302002]

    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:316) ~[hive-exec-3.1.0-hw-ei-302002.jar:3.1.0-hw-ei-302002]

    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:91) ~[hive-exec-3.1.0-hw-ei-302002.jar:3.1.0-hw-ei-302002]

    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:69) ~[hive-exec-3.1.0-hw-ei-302002.jar:3.1.0-hw-ei-302002]

    at com.alibaba.datax.plugin.writer.fusioninsight.hdfswriter.FiHdfsWriterHelper.doOrcFileStartWrite(FiHdfsWriterHelper.java:331) [fusioninsight_hdfswriter-0.0.1-SNAPSHOT.jar:na]

  • Cause analysis

    This is an ORC bug (https://issues.apache.org/jira/browse/ORC-299): it triggers when writing large values and the record count reaches 10,000 rows or more

  • Solution

    Switch the target table to Parquet format, or configure {"orc.column.encoding.direct":"<column name>"} in the Hive output component's performance settings (separate multiple columns with commas)
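
    For the second option, the performance-configuration entry would look like this (big_col1 and big_col2 are placeholder names for the large string columns):

    {"orc.column.encoding.direct": "big_col1,big_col2"}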

Problem 11

  • Symptom

    A kingbase2odps task fails at runtime

    2025-01-23 15:16:01.250 [0-0-0-reader] ERROR CommonRdbmsReader$Task - error occurred when reading.

    com.alibaba.dt.pipeline.plugin.center.exception.DataXException: Code:[DBUtilErrorCode-12], Description:[不支持的数据库类型. 请注意查看 DataX 已经支持的数据库类型以及数据库版本.]. - 您的配置文件中的列配置信息有误. 因为DataX 不支持数据库读取这种字段类型. 字段名:[tenant_level], 字段名称:[1111], 字段Java类型:[java.lang.Object]. 请尝试使用数据库函数将其转换datax支持的类型 或者不同步该字段 .

    at com.alibaba.dt.pipeline.plugin.center.exception.DataXException.asDataXException(DataXException.java:51) ~[plugin.center.base-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.handleUnusualColumnType(CommonRdbmsReader.java:361) ~[plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.buildRowMeta(CommonRdbmsReader.java:354) ~[plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.startRead(CommonRdbmsReader.java:230) ~[plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.reader.kingbasees8reader.KingBaseES8R6Reader$Task.startRead(KingBaseES8R6Reader.java:75) [kingbasees8r6reader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.adaptor.engine.ReaderRunnerAdaptor.run(ReaderRunnerAdaptor.java:71) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at java.lang.Thread.run(Thread.java:882) [na:1.8.0_152]

    2025-01-23 15:16:01.252 [0-0-0-reader] ERROR ReaderRunner - Reader runner Received Exceptions:

  • Cause analysis

    The underlying KingBase column type is tinyint, but the metadata of the SQL result reports type code 1111, which in java.sql.Types is the OTHER type; dlink currently cannot read OTHER columns, hence the exception

  • Solution

    Work with the customer to investigate why the engine reports the tinyint as type 1111, or have them alter the column type to int as a workaround
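
    A hedged sketch of the workaround DDL (KingbaseES largely follows PostgreSQL syntax; my_table is a placeholder, tenant_level is the column named in the error):

    ALTER TABLE my_table ALTER COLUMN tenant_level TYPE int;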

Problem 12

  • Symptom

    A drds2odps task fails at runtime

    2025-02-26 09:05:04.745 [DlinkTrans - MaxCompute_1] INFO KettleMetricCollector - Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 755.962s | Percentage 0.00%

    2025-02-26 09:05:04.746 [DlinkTrans - MaxCompute_1] INFO DirtyRecordCheckThread - dirty record check thread begin to stop ...

    2025-02-26 09:05:04.746 [dirty_record_check_thread] INFO DirtyRecordCheckThread - dirty record check thread stopped ...

    2025/02/26 09:05:04 - DlinkTrans - ERROR (version 8.0.0.0-28, build 8.0.0.0-28 from 2017-12-22 02.21.20 by xin.fanx) : 错误被检测到!

    2025-02-26 09:05:04.747 [job-1049200] ERROR DlinkLogbackListener - DlinkTrans - 错误被检测到!

    2025-02-26 09:05:04.749 [job-1049200] ERROR DlinkTrans - Exception when job run

    java.lang.RuntimeException: dlink trans run error

    at com.alibaba.dt.dlink.core.trans.DlinkTrans.doSchedule(DlinkTrans.java:220) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.DlinkTrans.start(DlinkTrans.java:125) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.Engine.runTrans(Engine.java:91) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.Engine.entry(Engine.java:174) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.Engine.main(Engine.java:253) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    java.lang.OutOfMemoryError: Java heap space

    at com.mysql.jdbc.MysqlIO.nextRowFast(MysqlIO.java:2213) ~[na:na]

    at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1992) ~[na:na]

    at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:3413) ~[na:na]

    at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:471) ~[na:na]

    at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:3115) ~[na:na]

    at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:2344) ~[na:na]

    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2739) ~[na:na]

    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2491) ~[na:na]

    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2449) ~[na:na]

    at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1381) ~[na:na]

    at com.alibaba.datax.plugin.rdbms.util.DBUtil.query(DBUtil.java:707) ~[na:na]

    at com.alibaba.datax.plugin.rdbms.util.DBUtil.query(DBUtil.java:693) ~[na:na]

    at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.startRead(CommonRdbmsReader.java:221) ~[na:na]

    at com.alibaba.datax.plugin.reader.drdsreader.DrdsReader$Task.startRead(DrdsReader.java:139) ~[na:na]

    at com.alibaba.dt.dlink.core.trans.adaptor.engine.ReaderRunnerAdaptor.run(ReaderRunnerAdaptor.java:71) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at java.lang.Thread.run(Thread.java:882) ~[na:1.8.0_152]

  • Cause analysis

    Because of the DRDS version used in the Apsara Stack environment, streaming reads could not be used, and the reader ran out of memory

  • Solution

    Set fetchSize to -2147483648 (Integer.MIN_VALUE) to force the older DRDS into streaming reads and prevent the out-of-memory error
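
    In script mode the reader parameter would look like the following (a sketch, assuming the reader accepts a fetchSize parameter as in DataX; the rest of the reader configuration is omitted):

    "parameter": {
        "fetchSize": -2147483648
    }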

Problem 13

  • Symptom

    Writing to CDP Hive fails with: RetryInvocationHandler - java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "xc-prod-dtstackpoc06":9000; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost, while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over xc-prod-dtstackpoc06:9000 after 2 failover attempts. Trying to failover after sleeping for 2215ms.

  • Cause analysis

    Whenever java.net.UnknownHostException appears in the log, first confirm that core-site.xml and hdfs-site.xml have been uploaded; if they have, go straight to ops to configure host mappings for the relevant Hadoop cluster nodes

  • Solution
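
    A hedged example of the host mapping ops would add on the integration side (the hostname comes from the error above; the IP is a placeholder):

    # /etc/hosts entry on the machines running the integration tasks
    10.0.0.1    xc-prod-dtstackpoc06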

Problem 14

  • Symptom

    Reading MySQL fails with: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Application was streaming results when the connection failed. Consider raising value of 'net_write_timeout' on the server.

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_152]

    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_152]

    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_152]

    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_152]

    at com.mysql.jdbc.Util.handleNewInstance(Util.java:425) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:990) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3562) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3462) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3905) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:871) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1999) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.RowDataDynamic.nextRecord(RowDataDynamic.java:374) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.RowDataDynamic.next(RowDataDynamic.java:354) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.ResultSetImpl.next(ResultSetImpl.java:6313) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.startRead(CommonRdbmsReader.java:224) ~[plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.reader.tidbreader.TiDBReader$Task.startRead(TiDBReader.java:83) [tidbreader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.adaptor.engine.ReaderRunnerAdaptor.run(ReaderRunnerAdaptor.java:57) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at java.lang.Thread.run(Thread.java:882) [na:1.8.0_152]

    Caused by: java.io.EOFException: Can not read response from server. Expected to read 578 bytes, read 287 bytes before connection was unexpectedly lost.

    at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3014) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3522) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    ... 11 common frames omitted

  • Cause analysis

    When the database server is writing data to the client, net_write_timeout controls the timeout; this error means the transfer was killed server-side for exceeding it.

    When the server is reading data from the client, net_read_timeout controls the timeout, i.e. the connection is dropped automatically if the client's read has not completed within that many seconds.

    1. In some cases the writer side stalling or running slowly also makes the reader time out, because it stops pulling data for too long; check the writer for errors first and fix those if present;

    2. Also check whether the input component has a split key configured; if so, increase the concurrency to reduce the data volume per task;

    3. Add a net_write_timeout setting to the JDBC URL (see the sketch after this list); if that has no effect, adjust the server-side setting as well;

    4. If none of the above helps, check the database server logs, which usually contain the error details, e.g. an OOM
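
    For step 3, one common form with MySQL Connector/J is to set the session variable through the URL, and the server-side default can be raised in SQL (a sketch; host, database, and the 10000-second value are placeholders):

    jdbc:mysql://<host>:3306/<db>?sessionVariables=net_write_timeout=10000

    -- Server-side equivalent, applied globally:
    SET GLOBAL net_write_timeout = 10000;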

  • Solution

Problem 15

  • Symptom

    Poor performance when writing to Redis in cluster mode

  • Cause analysis

    Addressed by a Dataphin product optimization

  • Solution

    Already optimized

Problem 16

  • Symptom

    Saving an existing oracle2Hive pipeline task reports an internal exception

    Error stack: java.lang.NullPointerException

    at com.alibaba.dataphin.pipeline.application.pipeline.PipelineApplicationService.getNodeUpstreamRelations(PipelineApplicationService.java:1095)

    at com.alibaba.dataphin.pipeline.application.pipeline.PipelineApplicationService.getNodeRelations(PipelineApplicationService.java:974)

    at com.alibaba.dataphin.pipeline.application.pipeline.PipelineApplicationService$$FastClassBySpringCGLIB$$836a0191.invoke(<generated>)

    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)

    at org.springframework.aop.framework.CglibAopProxy.invokeMethod(CglibAopProxy.java:386)

    at org.springframework.aop.framework.CglibAopProxy.access$000(CglibAopProxy.java:85)

    [omitted 1 frames]

    at com.alibaba.dataphin.pipeline.application.pipeline.PipelineApplicationService$$EnhancerBySpringCGLIB$$c1a30409.getNodeRelations(<generated>)

    at com.alibaba.dataphin.pipeline.adapter.rpc.PipelineOpenServiceImpl.parseNodeRelations(PipelineOpenServiceImpl.java:693)

    at com.alibaba.dataphin.pipeline.adapter.ide.controller.PipelineCommonController.getNodeRelations(PipelineCommonController.java:556)

    at com.alibaba.dataphin.pipeline.adapter.ide.controller.PipelineCommonController$$FastClassBySpringCGLIB$$4dfa06b4.invoke(<generated>)

  • Cause analysis

    The Oracle data source had no Schema configured, which caused a null pointer exception when configuring downstream tasks

  • Solution

Problem 17

  • Symptom

    Searching the Hologres table list by table-name keyword times out

  • Cause analysis

    The customer's database performance is poor and the table count is very large, so loading the list times out

  • Solution

    Have the customer improve database performance, then retry

Problem 18

  • Symptom

    A logical2mysql task fails at runtime

    A single execution can only return one query result. This session contains temporary tables and will return the result of the last query.

    2025-01-09 16:58:44.460 [job-1197957] INFO OsLMClient - sql run success.

    2025-01-09 16:58:44.463 [job-1197957] ERROR BaseOsInstance - instance close failed.

    java.lang.NoSuchMethodError: com.google.common.util.concurrent.SimpleTimeLimiter.create(Ljava/util/concurrent/ExecutorService;)Lcom/google/common/util/concurrent/SimpleTimeLimiter;

    at com.alibaba.dt.oneservice.sdk.utils.OsTimeLimiter.callWithTimeoutWithOutException(OsTimeLimiter.java:54) ~[oneservice-sdk-base-4.4.0.jar:na]

    at com.alibaba.dt.oneservice.sdk.task.AbstractHiveTask.closeTask(AbstractHiveTask.java:74) ~[oneservice-sdk-base-4.4.0.jar:na]

    at com.alibaba.dt.oneservice.sdk.task.BaseOsTask.close(BaseOsTask.java:200) ~[oneservice-sdk-base-4.4.0.jar:na]

    at java.util.ArrayList.forEach(ArrayList.java:1257) ~[na:1.8.0_152]

    at com.alibaba.dt.oneservice.sdk.job.OsJob.close(OsJob.java:170) ~[oneservice-sdk-base-4.4.0.jar:na]

    at java.util.ArrayList.forEach(ArrayList.java:1257) ~[na:1.8.0_152]

    at com.alibaba.dt.oneservice.sdk.executor.BaseOsExecutor.close(BaseOsExecutor.java:98) ~[oneservice-sdk-base-4.4.0.jar:na]

    at com.alibaba.dt.oneservice.sdk.executor.DefaultOsExecutor.close(DefaultOsExecutor.java:38) ~[oneservice-sdk-base-4.4.0.jar:na]

    at com.alibaba.dt.oneservice.sdk.BaseOsInstance.close(BaseOsInstance.java:233) ~[oneservice-sdk-base-4.4.0.jar:na]

    at com.alibaba.dt.oneservice.sdk.OsLMClient.runSql(OsLMClient.java:132) [oneservice-sdk-base-4.4.0.jar:na]

    at com.alibaba.dt.oneservice.sdk.OsLMClient.parse(OsLMClient.java:84) [oneservice-sdk-base-4.4.0.jar:na]

    at com.alibaba.datax.plugin.reader.logicaltablereader.LogicalTableReader$Job.parsePhysicalTable(LogicalTableReader.java:209) [logicaltablereader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.reader.logicaltablereader.LogicalTableReader$Job.init(LogicalTableReader.java:105) [logicaltablereader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.DlinkTransRunner.initJobReader(DlinkTransRunner.java:53) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.DlinkTransPreview.doInit(DlinkTransPreview.java:289) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.DlinkTransPreview.start(DlinkTransPreview.java:226) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.Engine.runTransPreview(Engine.java:102) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.Engine.entry(Engine.java:175) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.Engine.main(Engine.java:249) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

  • Cause analysis

    A conflict between old versions of internal dependency packages caused an inconsistent version of the Guava library to be loaded, producing the NoSuchMethodError

  • Solution

    A historical issue; coordinate with the customer to upgrade to v4.5.0 or later, where it is fixed.

Problem 19

  • Symptom

    In salesforce2odps, time fields earlier than 1972 end up 9 seconds off after being synchronized to odps

  • Cause analysis

    The odps client library used by dlink handles this differently from other clients; MaxCompute officially advises upgrading promptly.

    To guarantee the correctness of DATETIME data across multiple time zones, the MaxCompute service, Java SDK, and client will be version-updated (Java SDK or client versions with the -oversea suffix); after the update, the display of DATETIME data earlier than 1928 that is already stored in MaxCompute may be affected.

    For regions other than China's UTC+8, you should update the Java SDK or client in step, to keep SQL results and Tunnel-transferred data accurate and consistent for datetimes after 1900-01-01. For DATETIME data earlier than 1900-01-01, SQL results and Tunnel data may still differ by 343 seconds. With the new SDK or client, DATETIME data earlier than 1928-01-01 that was uploaded previously will display 352 seconds earlier.

    If you keep using an SDK or client without the -oversea suffix, SQL results and Tunnel data will differ: by 9 seconds for data earlier than 1900-01-01, and by 352 seconds for data between 1900-01-01 and 1928-01-01.

    https://help.aliyun.com/zh/MaxCompute/user-guide/time-zone-configuration-operations?spm=a2c6h.13066369.question.4.2e4acdebKVjfj6

  • Solution

    Store the value in a non-datetime field; v4.5.0 has already upgraded the SDK, so verify whether it resolves the issue

Problem 20

  • Symptom

    Reading Elasticsearch fails with: com.alibaba.dt.pipeline.plugin.center.exception.UnRetrieableException: column:[hashCode] is not exists in elastic,the configured columns must be all exist
    at com.aibaba.datax.plugin.reader.elasticsearchreader.ESReader$Task.getRowMeta(ESReader.java:161) ~[elasticsearchreader-0.0.1-SNAPSHOT.jar:na]
    at com.aibaba.datax.plugin.reader.elasticsearchreader.ESReader$Task.init(ESReader.java:124) ~[elasticsearchreader-0.0.1-SNAPSHOT.jar:na]

  • Cause analysis

    The ES source is configured with an index alias that maps to multiple indexes, one of which is missing a field, so the integration task cannot build the field metadata (rowmeta) from the data

  • Solution

    The customer needs to make the field structure of all the ES indexes behind the alias consistent

Problem 21

  • Symptom

    Reading an FTP file fails with: Caused by: java.lang.NullPointerException: null
    at java.io.Reader.<init>(Reader.java:78) ~[na:1.8.0_152]
    at java.io.InputStreamReader.<init>(InputStreamReader.java:97) ~[na:1.8.0_152]
    at com.alibaba.datax.plugin.unstructuredstorage.reader.UnstructuredStorageReaderUtil.readFromStream(UnstructuredStorageReaderUtil.java:92) ~[plugin-unstructured-storage-util-0.0.1-SNAPSHOT.jar:na]

  • Cause analysis

    The FTP file exists, but the file stream object created for it is null: the FTP server has not enabled download permission, which must be granted on the FTP side

  • Solution

    Grant the permission in the FTP server's access controls

Problem 22

  • Symptom

    An oracle2starrocks task fails after a new field is added, reporting

    Caused by: java.sql.SQLSyntaxErrorException: ORA-00904: "ba_code": 标识符无效

  • Cause analysis

    The front end did not recognize the newly added field name as uppercase and configured it in lowercase, so Oracle cannot resolve the identifier (ORA-00904: invalid identifier)

  • Solution

    Re-fetch the fields instead of adding the new field through field management, as a temporary workaround

Problem 23

  • Symptom

    An oracle2tdh task fails at runtime

    Caused by: java.sql.SQLException: ORA-01555: 快照过旧: 回退段号 345 (名称为 "_SYSSMU345_4005072484$") 过小

    at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:494) ~[ojdbc8-12.2.0.1.jar:12.2.0.1.0]

    at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:446) ~[ojdbc8-12.2.0.1.jar:12.2.0.1.0]

    at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:1054) ~[ojdbc8-12.2.0.1.jar:12.2.0.1.0]

    at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:623) ~[ojdbc8-12.2.0.1.jar:12.2.0.1.0]

    at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:252) ~[ojdbc8-12.2.0.1.jar:12.2.0.1.0]

    at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:612) ~[ojdbc8-12.2.0.1.jar:12.2.0.1.0]

    at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:213) ~[ojdbc8-12.2.0.1.jar:12.2.0.1.0]

    at oracle.jdbc.driver.T4CStatement.fetch(T4CStatement.java:1009) ~[ojdbc8-12.2.0.1.jar:12.2.0.1.0]

    at oracle.jdbc.driver.OracleStatement.fetchMoreRows(OracleStatement.java:3353) ~[ojdbc8-12.2.0.1.jar:12.2.0.1.0]

    at oracle.jdbc.driver.InsensitiveScrollableResultSet.fetchMoreRows(InsensitiveScrollableResultSet.java:736) ~[ojdbc8-12.2.0.1.jar:12.2.0.1.0]

    at oracle.jdbc.driver.InsensitiveScrollableResultSet.absoluteInternal(InsensitiveScrollableResultSet.java:692) ~[ojdbc8-12.2.0.1.jar:12.2.0.1.0]

    at oracle.jdbc.driver.InsensitiveScrollableResultSet.next(InsensitiveScrollableResultSet.java:406) ~[ojdbc8-12.2.0.1.jar:12.2.0.1.0]

    at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.hasNext(CommonRdbmsReader.java:249) ~[plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.startRead(CommonRdbmsReader.java:228) ~[plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

  • Cause analysis

    The customer's Oracle snapshot (undo) area is sized too small; upstream data keeps being written while the Dlink task runs, so the snapshot the task reads from becomes unavailable (ORA-01555: snapshot too old)

  • Solution

    Optimize the Dlink task so it finishes sooner, enlarge the relevant Oracle setting, or move the task's start time to avoid the periods of frequent upstream writes
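
    A hedged sketch of the Oracle-side adjustment (UNDO_RETENTION, in seconds, is one of the relevant knobs; the value is a placeholder and the undo tablespace size may also need to grow):

    -- Keep undo data longer so long-running reads retain a consistent snapshot
    ALTER SYSTEM SET UNDO_RETENTION = 3600;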

Problem 24

  • Symptom

    Writing to ADB PG in copy mode fails with: AdbpgCopyProxy - copy from stream meet a exception,byteSize:10485711, error message: Database connection failed when starting copy

  • Cause analysis

    When writing in copy mode, ADB PG does not register each write as activity on the connection: after the connection is created, data keeps flowing in through COPY, yet the database still considers the connection idle, so it is killed by the idle-connection timeout

  • Solution

    Increase the idle-connection timeout on the database side

Problem 25

  • Symptom

    经Dlink智能分析,该任务最可能的错误原因是:

    异常数据源(dsType): Hdfs writer

    异常信息(errorMsg): Code:[HdfsWriter-02], Description:[您填写的参数值不合法.]. - 您配置的path: [/user/hive/warehouse_ext/DL/ods_cc_as_dev/s_ods02_owods02_tf_master_retail_invoice_20ss] 不存在, 请先在hive端创建对应的数据库和表.

    异常参数(params): null

    异常堆栈(errorStack): com.alibaba.dt.pipeline.plugin.center.exception.DataXException: Code:[HdfsWriter-02], Description:[您填写的参数值不合法.]. - 您配置的path: [/user/hive/warehouse_ext/DL/ods_cc_as_dev/s_ods02_owods02_tf_master_retail_invoice_20ss] 不存在, 请先在hive端创建对应的数据库和表.

    at com.alibaba.dt.pipeline.plugin.center.exception.DataXException.asDataXException(DataXException.java:51)

    at com.alibaba.datax.plugin.writer.cdp.hdfswriter.HdfsWriter$Job.prepare(HdfsWriter.java:272)

    at com.alibaba.datax.plugin.writer.cdp.hivewriter.HiveWriter$Job.prepare(HiveWriter.java:52)

    at com.alibaba.dt.dlink.core.trans.DlinkTransRunner.prepareJobWriter(DlinkTransRunner.java:90)

    at com.alibaba.dt.dlink.core.trans.DlinkTrans.doPrepare(DlinkTrans.java:295)

    at com.alibaba.dt.dlink.core.trans.DlinkTrans.start(DlinkTrans.java:115)

    at com.alibaba.dt.dlink.core.Engine.runTrans(Engine.java:90)

    at com.alibaba.dt.dlink.core.Engine.entry(Engine.java:173)

    at com.alibaba.dt.dlink.core.Engine.main(Engine.java:249)

    异常类(errorClass): com.alibaba.dt.pipeline.plugin.center.exception.DataXException

    上层异常信息(parentErrorMsg): null

    上层异常类(parentErrorClass): null

    child_process.returncode: 1

    2024-11-08 12:56:57.647 DQE callback - checkDqeStatus request polling, retry count:0

    2024-11-08 12:56:57.712 Task status is not SUCCESS, ignore check!

    2024-11-08 12:56:57.772 No outputData produced.

    2024-11-08 12:56:57.644 Dlink command exit with code: 1

  • Cause analysis

    The defaultFS address in core-site.xml is wrong, so HDFS cannot resolve the path and the run fails

  • Solution

    Have the customer update the data source's core-site.xml, then rerun
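
    The relevant core-site.xml entry looks like the following (a sketch; the NameNode address is a placeholder and must point at the cluster that actually holds the path):

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://<namenode-host>:8020</value>
    </property>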

Problem 26

  • Symptom

    odps2adbmysql

    During a large-volume write, speed drops to 0 after a while, which trips the no-traffic timeout and fails the task

  • Cause analysis

    The customer's adbmysql instance spec is too low: the large-scale write drives the worker load too high and the instance cannot sustain the writes

    The ADB team's explanation of the cause:

    This is a storage-performance problem. TPS drops to 0 because TPS is only counted after a SQL statement fully completes; if the instance is handling large batched SQL with high write RT, then for a stretch of time no write SQL completes and TPS reads as 0, even though data is still flowing into storage. The storage team will try adjusting a tail-block cache to lower max RT; if that is not enough, scaling up may be needed.

    Single max RT on the storage side sometimes reaches 1s+: writes to this large wide table use big batches that overflow a write cache, triggering slow cache swap-in/swap-out. Two optimizations are available:

    1. We can help increase the write cache's memory value

    2. Adjust block_size for this wide table, then rebuild the full table during off-peak hours

  • Solution

    Guide the customer to configure flow-rate and concurrency controls to slow the task down, or to upgrade the adbmysql instance spec

Problem 27

  • Symptom

    Submitting a pipeline task with multiple input tables fails with the following exception

    org.springframework.dao.DataIntegrityViolationException: ### Error querying database. Cause: org.postgresql.util.PSQLException: ERROR: value too long for type character varying(254)### The error may exist in URL [jar:file:/home/admin/dataphin-pipeline/dataphin-pipeline.jar!/BOOT-INF/lib/dataphin-pipeline-adapter-persistence-3.14.0.jar!/pipeline/persistence/mybatis/mapper/pipeline/stream-pipeline-mapper.xml]### The error may involve com.alibaba.dataphin.pipeline.adapter.persistence.realtime.mybatis.mapper.StreamPipelineMapper.addSubmitStep-Inline### The error occurred while setting parameters### SQL: INSERT INTO "dev_29".od_pipeline_stream_submit_step (id, step_name, step_type, parent_step_id, parent_step_name , status, tenant_id, gmt_create, gmt_modify, biz_data) VALUES (?, ?, ?, ?, ? , ?, ?, now(), now(), ?) RETURNING id### Cause: org.postgresql.util.PSQLException: ERROR: value too long for type character varying(254); ERROR: value too long for type character varying(254); nested exception is org.

  • Cause analysis

    The server uses the concatenation of all the input table names as the submit step's name; with more than 128 tables the name becomes too long and exceeds the database column length (character varying(254))

  • Solution

    Have the customer split the submission across multiple pipeline tasks, or raise a change ticket to amend the table schema

Problem 28

  • Symptom

    Reading or writing Hive on OSS fails with: error occurred when check path[/warehouse/hive/henlius_dev.db/ztfibcb001] exists

    java.io.IOException: ErrorCode : 3005, ErrorMessage : Caused by error 6403: [E1010]HTTP/1.1 403 Forbidden: <?xml version="1.0" encoding="UTF-8"?><Error> <Code>AccessDenied</Code> <Message>The bucket you access does not belong to you.</Message> <RequestId>673DB054E891D33535F1D04E</RequestId> <HostId>fx-emr-datastore.cn-shanghai.oss-dls.aliyuncs.com</HostId> <EC>0003-00000001</EC> <RecommendDoc>https://api.aliyun.com/troubleshoot?q=0003-00000001</RecommendDoc></Error> [ErrorCode]: 1010

    at com.aliyun.jindodata.call.JindoGetFileStatusCall.execute(JindoGetFileStatusCall.java:47) ~[jindo-sdk-4.6.4.jar:na]

    at com.aliyun.jindodata.common.JindoHadoopSystem.getFileStatus(JindoHadoopSystem.java:640) ~[jindo-sdk-4.6.4.jar:na]

    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1683) ~[hadoop-common-3.2.1.jar:na]

  • Cause analysis

    Access to the underlying OSS file is denied, usually because the configured AK account lacks permission. If the Hive compute source or data source explicitly configures an AK, that AK is used; otherwise the AK in core-site.xml is used

  • Solution

    Check whether the AK in core-site.xml has permission; replace it with an authorized account and re-upload the XML file
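
    The AK entries in core-site.xml typically look like the following (a sketch using the Hadoop OSS/JindoSDK property names; the values are placeholders):

    <property>
        <name>fs.oss.accessKeyId</name>
        <value><your-access-key-id></value>
    </property>
    <property>
        <name>fs.oss.accessKeySecret</name>
        <value><your-access-key-secret></value>
    </property>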

Problem 29

  • Symptom

    sqlserver2odps

    The upstream SQL Server table has column names that start with digits; during the read, the leading digits are stripped from the column name and returned as a numeric value instead, so incorrect data is written downstream

  • Cause analysis

    Due to SQL Server's parsing rules, a column name starting with digits returns incorrect data when it is not wrapped in double quotes: the digits are returned as the column's value, and the name with the digits stripped is returned as a new column name

  • Solution

    Guide the user to switch to script mode, wrap the column names in double quotes, and rerun
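
    A hedged T-SQL illustration of the behavior and the fix (2023sales and t are made-up names):

    -- Unquoted, the leading digits parse as a numeric literal with the rest as its alias:
    -- SELECT 2023sales FROM t   -- returns the constant 2023 in a column named "sales"
    -- Quoting the identifier reads the actual column:
    SELECT "2023sales" FROM t;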

Problem 30

  • Symptom

    Reading table data through SAP RFC hits a type error

    2024-11-21 18:13:42.472 [0-0-0-reader] ERROR ReaderRunner - Reader runner Received Exceptions:

    java.lang.NumberFormatException: For input string: "150.00-"

    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043) ~[na:1.8.0_152]

    at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110) ~[na:1.8.0_152]

    at java.lang.Double.parseDouble(Double.java:538) ~[na:1.8.0_152]

    at java.lang.Double.valueOf(Double.java:502) ~[na:1.8.0_152]

    at com.alibaba.datax.plugin.reader.saptablereader.SapTableUtils.getColumnFromField(SapTableUtils.java:185) ~[saptablereader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.reader.saptablereader.SapTableReader$Task.startRead(SapTableReader.java:634) ~[saptablereader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.adaptor.engine.ReaderRunnerAdaptor.run(ReaderRunnerAdaptor.java:57) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at java.lang.Thread.run(Thread.java:882) [na:1.8.0_152]

    org.pentaho.di.core.exception.KettleException:

    Input.Error.NormalException

    Reader adaptor run error

    at com.alibaba.dt.dlink.core.trans.adaptor.BaseReaderStepAdaptor.processRow(BaseReaderStepAdaptor.java:117)

    at com.alibaba.dt.dlink.core.trans.adaptor.OptimizeReaderStepAdaptor.processRow(OptimizeReaderStepAdaptor.java:65)

    at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)

    at java.lang.Thread.run(Thread.java:882)

    Caused by: java.lang.RuntimeException: Reader adaptor run error

    at com.alibaba.dt.dlink.core.trans.adaptor.BaseReaderStepAdaptor.processRow(BaseReaderStepAdaptor.java:104)

    ... 3 more

    Caused by: java.lang.NumberFormatException: For input string: "150.00-"

    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)

    at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)

    at java.lang.Double.parseDouble(Double.java:538)

    at java.lang.Double.valueOf(Double.java:502)

    at com.alibaba.datax.plugin.reader.saptablereader.SapTableUtils.getColumnFromField(SapTableUtils.java:185)

    at com.alibaba.datax.plugin.reader.saptablereader.SapTableReader$Task.startRead(SapTableReader.java:634)

    at com.alibaba.dt.dlink.core.trans.adaptor.engine.ReaderRunnerAdaptor.run(ReaderRunnerAdaptor.java:57)

    ... 1 more

  • Cause analysis

    In the raw data read through SAP RFC, a value such as 150.00- (with a trailing minus sign) is legal double-type data on the SAP side; the function reports the column type as Double, but the string is not a legal double in Java, so the type conversion throws

  • Solution

    No workaround available

Problem 31

  • Symptom

    Reading or writing Hive fails with: 2024-11-22 10:19:25.457 FAILED: InvocationException: Target: [java.util.List com.alibaba.dt.oneservice.api.api.SqlParserApi.sqlLegalityVerify(com.alibaba.dt.oneservice.api.query.UserQuery,java.lang.String,java.lang.String,java.util.List,boolean,boolean)] Message: [HTTP-ERROR 504: [null]] Cause: [com.alibaba.dt.onedata.rpc.core.http.HttpStatusNotOKException: HTTP-ERROR 504: [null]]

  • Cause analysis

    A 504 from the RPC call to the oneservice interface generally means the application service has a problem, or the call timed out

  • Solution

    In this case the customer's Hive metastore MySQL database must have changed: when Dataphin accesses a compute-source table it queries the MySQL metastore, and here the connection failed until it finally timed out. The customer needs to check for changes on the MySQL database such as a firewall or whitelist being enabled

Problem 32

  • Symptom

    Reading a logical table fails with: Caused by: java.net.UnknownHostException: dataphin-oneservice.dataphin.service: Name or service not known

  • Cause analysis

    The cluster has no domain name resolution configured

  • Solution

    Contact ops to configure the domain name, DNS, and so on

Problem 33

  • Symptom

    The field-calculation component's GET_JSON_OBJECT function fails intermittently with: java.lang.ClassCastException: java.util.LinkedHashMap$Entry cannot be cast to java.util.HashMap$TreeNode

    at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1832) ~[na:1.8.0_152]

    at java.util.HashMap$TreeNode.treeify(HashMap.java:1949) ~[na:1.8.0_152]

    at java.util.HashMap.treeifyBin(HashMap.java:772) ~[na:1.8.0_152]

    at java.util.HashMap.putVal(HashMap.java:644) ~[na:1.8.0_152]

    at java.util.HashMap.put(HashMap.java:612) ~[na:1.8.0_152]

    at com.alibaba.datax.plugin.trans.calculation.functions.strings.GetJsonObjectFunction.call(GetJsonObjectFunction.java:115) ~[na:na]

  • Cause analysis

    The function's implementation has a bug: it is not thread-safe under concurrent use (the stack shows a shared HashMap being mutated from multiple threads)

  • Solution

    The customer can set the concurrency to 1 as a workaround

Problem 34

  • Symptom

    A Kafka2Odps task fails after a long period with no traffic

    1-26 01:08:01.740 [job-2105164] ERROR Engine -

    经Dlink智能分析,该任务最可能的错误原因是:

    异常数据源(dsType): all all

    异常信息(errorMsg): ZERO_FLOW_TIMEOUT - 无流量持续时间:[1801s] 超过配置阈值:[1800s],通常由于输入数据源或者输出数据源负载较大、数据量较大、网络带宽打满等原因,导致输入端拉取不到数据或者输出端写不进数据,需要客户侧分析数据源、排查网络、优化管道任务(如问题出现在数据库输入表数据量较大,可为输入表添加整型类型的主键或者索引,在管道任务配置切分键,过滤条件尽可能使用索引、并增大并发等)等,如果确实会存在较长时间无流量,则可以调整管道任务的无流量超时配置

    异常参数(params): null

    异常堆栈(errorStack): com.alibaba.dt.dlink.core.exception.DlinkException: ZERO_FLOW_TIMEOUT - 无流量持续时间:[1801s] 超过配置阈值:[1800s],通常由于输入数据源或者输出数据源负载较大、数据量较大、网络带宽打满等原因,导致输入端拉取不到数据或者输出端写不进数据,需要客户侧分析数据源、排查网络、优化管道任务(如问题出现在数据库输入表数据量较大,可为输入表添加整型类型的主键或者索引,在管道任务配置切分键,过滤条件尽可能使用索引、并增大并发等)等,如果确实会存在较长时间无流量,则可以调整管道任务的无流量超时配置

    at com.alibaba.dt.dlink.core.metric.KettleMetricCollector.logTransMetric(KettleMetricCollector.java:346)

    at com.alibaba.dt.dlink.core.metric.KettleMetricCollector.lambda$startTransMetricReport$4(KettleMetricCollector.java:302)

    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)

    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:186)

    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:300)

    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)

    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627)

    at java.lang.Thread.run(Thread.java:882)

    异常类(errorClass): com.alibaba.dt.dlink.core.exception.DlinkException

    上层异常信息(parentErrorMsg): null

    上层异常类(parentErrorClass): null

  • Cause analysis

    The customer configured the "stop at a specified offset" end strategy, but many partitions of the topic have no data, so the sync waits up to the 10-minute maximum for each of them and trips the no-traffic timeout

  • Solution

    Guide the customer to switch the end strategy to "stop after one minute with no data", then retry

Problem 35

  • Symptom

    A hive2hbase task fails at runtime

    2024-11-26 08:20:37.371 [0-0-0-writer] ERROR WriterRunner - Writer Runner Received Exceptions:

    com.alibaba.dt.pipeline.plugin.center.exception.DataXException: Code:[Hbasewriter-12], Description:[获取hbase BufferedMutator 时出错.]. - callTimeout=60000, callDuration=97786: callTimeout=60000, callDuration=68432: Call to ztdbtestxc21/10.50.4.21:60020 failed on local exception: javax.security.sasl.SaslException: No common protection layer between client and server row 'quickdecision_300003801:velocity_two_7' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=ztdbtestxc21,60020,1731403864912, seqNum=-1

    at com.alibaba.dt.pipeline.plugin.center.exception.DataXException.asDataXException(DataXException.java:65) ~[plugin.center.base-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.emr5.hbase23xwriter.Hbase23xHelper.getBufferedMutator(Hbase23xHelper.java:123) ~[emr5_hbase23xwriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.emr5.hbase23xwriter.HbaseAbstractTask.<init>(HbaseAbstractTask.java:38) ~[emr5_hbase23xwriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.emr5.hbase23xwriter.NormalTask.<init>(NormalTask.java:23) ~[emr5_hbase23xwriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.emr5.hbase23xwriter.Hbase23xWriter$Task.init(Hbase23xWriter.java:67) ~[emr5_hbase23xwriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.WriterRunner.run(WriterRunner.java:48) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at java.lang.Thread.run(Thread.java:750) [na:1.8.0_422]

    Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=97786: callTimeout=60000, callDuration=68432: Call to ztdbtestxc21/10.50.4.21:60020 failed on local exception: javax.security.sasl.SaslException: No common protection layer between client and server row 'quickdecision_300003801:velocity_two_7' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=ztdbtestxc21,60020,1731403864912, seqNum=-1

    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:159) ~[hbase-client-2.3.4.jar:2.3.4]

    at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3014) ~[hbase-client-2.3.4.jar:2.3.4]

    at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3006) ~[hbase-client-2.3.4.jar:2.3.4]

    at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:464) ~[hbase-client-2.3.4.jar:2.3.4]

    at com.alibaba.datax.plugin.writer.emr5.hbase23xwriter.Hbase23xHelper.checkHbaseTable(Hbase23xHelper.java:214) ~[emr5_hbase23xwriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.emr5.hbase23xwriter.Hbase23xHelper.getBufferedMutator(Hbase23xHelper.java:113) ~[emr5_hbase23xwriter-0.0.1-SNAPSHOT.jar:na]

    ... 5 common frames omitted

    Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=68432: Call to ztdbtestxc21/10.50.4.21:60020 failed on local exception: javax.security.sasl.SaslException: No common protection layer between client and server row 'quickdecision_300003801:velocity_two_7' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=ztdbtestxc21,60020,1731403864912, seqNum=-1

    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:159) ~[hbase-client-2.3.4.jar:2.3.4]

    at org.apache.hadoop.hbase.client.HTable.get(HTable.java:383) ~[hbase-client-2.3.4.jar:2.3.4]

    at org.apache.hadoop.hbase.client.HTable.get(HTable.java:357) ~[hbase-client-2.3.4.jar:2.3.4]

    at org.apache.hadoop.hbase.MetaTableAccessor.getTableState(MetaTableAccessor.java:1166) ~[hbase-client-2.3.4.jar:2.3.4]

    at org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:463) ~[hbase-client-2.3.4.jar:2.3.4]

    at org.apache.hadoop.hbase.client.HBaseAdmin$6.rpcCall(HBaseAdmin.java:467) ~[hbase-client-2.3.4.jar:2.3.4]

    at org.apache.hadoop.hbase.client.HBaseAdmin$6.rpcCall(HBaseAdmin.java:464) ~[hbase-client-2.3.4.jar:2.3.4]

    at org.apache.hadoop.hbase.client.RpcRetryingCallable.call(RpcRetryingCallable.java:58) ~[hbase-client-2.3.4.jar:2.3.4]

    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107) ~[hbase-client-2.3.4.jar:2.3.4]

    ... 10 common frames omitted

    Caused by: javax.security.sasl.SaslException: Call to ztdbtestxc21/10.50.4.21:60020 failed on local exception: javax.security.sasl.SaslException: No common protection layer between client and server

  • Cause analysis

    The hbase-site.xml uploaded by the customer does not specify an RPC protection level matching the server's, so data cannot be written ("No common protection layer between client and server")

  • Solution

    Set the hbase.rpc.protection property in the XML (one of authentication, integrity, privacy) to the value matching the server, then retry

    The property looks like this

    <property>

    <name>hbase.rpc.protection</name>

    <value>privacy</value>

    </property>

Problem 36

  • Symptom

    When reading data from MongoDB, some columns come back empty; three columns contained no data

  • Cause analysis

    The customer's MongoDB field names are in camelCase, but the integration script wrote the column names in lowercase, so the names did not match and no data could be read

  • Solution

    Guide the user to correct the column names to match the MongoDB field names exactly, then rerun

Problem 37

  • Symptom

    Submitting a pipeline task created via the OpenAPI fails on the page with: write javaBean error, fastjson version 1.2.76, class com.alibaba.dataphin.pipeline.common.facade.event.PipelineNodeActionEvent, write javaBean error, fastjson version 1.2.76, class com.alibaba.dataphin.pipeline.plugin.base.DefaultOutputPluginConfig, fieldName : 0

  • Cause analysis

    The pipeline configuration created through the API is invalid; the likely cause is an incorrect columnMappings setting on the output component. The minimal valid form of that setting is:

    [{
        "order": 0,
        "sourceColumn": "id",
        "targetColumn": "id"
    }]

  • Solution

    Fix the columnMappings configuration in the OpenAPI payload so that it follows the required format

Problem 38

  • Symptom

    Reading Hive fails with the following exception:

    java.io.EOFException: Reading BigInteger past EOF from compressed stream Stream for column 24 kind DATA position: 8 length: 8 range: 0 offset: 799619 limit: 799619 range 0 = 0 to 8 uncompressed: 5 to 5

    at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readBigInteger(SerializationUtils.java:176) ~[hive-exec-1.1.0-cdh5.16.2.jar:1.1.0-cdh5.16.2]

    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$DecimalTreeReader.next(RecordReaderImpl.java:1236) ~[hive-exec-1.1.0-cdh5.16.2.jar:1.1.0-cdh5.16.2]

    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1890) ~[hive-exec-1.1.0-cdh5.16.2.jar:1.1.0-cdh5.16.2]

    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:3143) ~[hive-exec-1.1.0-cdh5.16.2.jar:1.1.0-cdh5.16.2]

    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.next(OrcRawRecordMerger.java:263) ~[hive-exec-1.1.0-cdh5.16.2.jar:1.1.0-cdh5.16.2]

    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.next(OrcRawRecordMerger.java:547) ~[hive-exec-1.1.0-cdh5.16.2.jar:1.1.0-cdh5.16.2]

    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$1.next(OrcInputFormat.java:1156) ~[hive-exec-1.1.0-cdh5.16.2.jar:1.1.0-cdh5.16.2]

    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$1.next(OrcInputFormat.java:1140) ~[hive-exec-1.1.0-cdh5.16.2.jar:1.1.0-cdh5.16.2]

    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$NullKeyRecordReader.next(OrcInputFormat.java:1073) ~[hive-exec-1.1.0-cdh5.16.2.jar:1.1.0-cdh5.16.2]

    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$NullKeyRecordReader.next(OrcInputFormat.java:1059) ~[hive-exec-1.1.0-cdh5.16.2.jar:1.1.0-cdh5.16.2]

    at com.alibaba.datax.plugin.reader.hdfsreader.HdfsHelper.doOrcFileStartRead(HdfsHelper.java:278) [hdfsreader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.reader.hdfsreader.HdfsHelper.lambda$orcFileStartRead$3(HdfsHelper.java:220) [hdfsreader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.hadoop.BaseDfsUtil.callRunnable(BaseDfsUtil.java:510) ~[plugin-hadoop-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.hadoop.BaseDfsUtil.lambda$hdfsOperation$2(BaseDfsUtil.java:527) ~[plugin-hadoop-util-0.0.1-SNAPSHOT.jar:na]

    at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_152]

    at javax.security.auth.Subject.doAs(Subject.java:360) ~[na:1.8.0_152]

    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1904) ~[hadoop-common-2.6.0-cdh5.16.2.jar:na]

    at com.alibaba.datax.plugin.hadoop.BaseDfsUtil.hdfsOperation(BaseDfsUtil.java:525) ~[plugin-hadoop-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.reader.hdfsreader.HdfsHelper.orcFileStartRead(HdfsHelper.java:220) [hdfsreader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.reader.hdfsreader.HdfsReader$Task.startRead(HdfsReader.java:379) ~[hdfsreader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.adaptor.engine.ReaderRunnerAdaptor.run(ReaderRunnerAdaptor.java:57) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at java.lang.Thread.run(Thread.java:882) ~[na:1.8.0_152]

    2024-10-15 12:23:51.771 [0-0-0-reader] ERROR BaseDfsUtil - error occurred where call hadoop api

  • Cause analysis

    Decimal fields written through dlink use the fixed underlying shape decimal(38,18); the upstream wrote a 22-digit integer via SQL, which dlink then cannot read back

  • Solution

    Resolved by upgrading the product version

Problem 39

  • Symptom

    java.sql.SQLException: Incorrect string value: '\xF0\xA0\x82\x86\xE6\x88...' for column 'plan_remark' at row 1

    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:965) ~[na:na]

    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3978) ~[na:na]

    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3914) ~[na:na]

    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2530) ~[na:na]

    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2683) ~[na:na]

    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2495) ~[na:na]

    at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1903) ~[na:na]

    at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1242) ~[na:na]

    at com.alibaba.datax.plugin.rdbms.writer.CommonRdbmsWriter$Task.doOneInsert(CommonRdbmsWriter.java:457) ~[na:na]

    at com.alibaba.datax.plugin.rdbms.writer.CommonRdbmsWriter$Task.doBatchInsert(CommonRdbmsWriter.java:440) ~[na:na]

    at com.alibaba.datax.plugin.rdbms.writer.CommonRdbmsWriter$Task.startWriteWithConnection(CommonRdbmsWriter.java:360) ~[na:na]

    at com.alibaba.datax.plugin.rdbms.writer.CommonRdbmsWriter$Task.startWrite(CommonRdbmsWriter.java:385) ~[na:na]

    at com.alibaba.datax.plugin.writer.mysqlwriter.MysqlWriter$Task.startWrite(MysqlWriter.java:126) ~[na:na]

    at com.alibaba.dt.dlink.core.trans.WriterRunner.run(WriterRunner.java:55) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at java.lang.Thread.run(Thread.java:750) ~[na:1.8.0_342]

  • Cause analysis

    The data read upstream contains emoji characters, which the downstream MySQL column's character set cannot store

  • Solution

    Guide the customer to change the downstream MySQL table's character set to utf8mb4, then rerun
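
    A hedged sketch of the MySQL-side change (my_table is a placeholder, plan_remark is the column named in the error; utf8mb4 is required for 4-byte characters such as emoji):

    -- Convert the whole table to utf8mb4
    ALTER TABLE my_table CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
    -- Or just the affected column
    ALTER TABLE my_table MODIFY plan_remark TEXT CHARACTER SET utf8mb4;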

Problem 40

  • Symptom

    Testing the data source connection fails with: user does not have permission to submit application to queue default

  • Cause analysis

    The connection test for the compute source runs on the default queue

  • Solution

    Override the queue parameter in the JDBC URL
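
    For Hive JDBC, the queue can typically be overridden with a configuration parameter in the URL (a sketch; host, database, and queue name are placeholders, and the property depends on the execution engine, e.g. mapreduce.job.queuename for MR or tez.queue.name for Tez):

    jdbc:hive2://<host>:10000/<db>?mapreduce.job.queuename=<your_queue>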

Problem 41

  • Symptom

    Caused by: com.alibaba.dt.pipeline.plugin.center.exception.UnRetrieableException: Code:[DBUtilErrorCode-10], Description:[连接数据库失败. 请检查您的 账号、密码、数据库名称、IP、Port或者向 DBA 寻求帮助(注意网络环境).]. - 无法在 Active Directory 中对用户 DP.Rea***@Rexel.com.cn 进行身份验证(Authentication=ActiveDirectoryPassword)。 java.net.ConnectException: Connection timed out (Connection timed out)

    at com.alibaba.datax.plugin.rdbms.util.RdbmsException.asConnException(RdbmsException.java:38) ~[plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.rdbms.util.DBUtil.connect(DBUtil.java:680) [plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.rdbms.util.DBUtil.connect(DBUtil.java:569) [plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.rdbms.util.DBUtil.connect(DBUtil.java:529) [plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.rdbms.util.DBUtil.testConnWithoutRetry(DBUtil.java:936) [plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.rdbms.util.DBUtil.lambda$chooseJdbcUrl$1(DBUtil.java:247) [plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    ... 15 common frames omitted

    Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: 无法在 Active Directory 中对用户 DP.Rea***@Rexel.com.cn 进行身份验证(Authentication=ActiveDirectoryPassword)。 java.net.ConnectException: Connection timed out (Connection timed out)

    at com.microsoft.sqlserver.jdbc.SQLServerMSAL4JUtils.getCorrectedException(SQLServerMSAL4JUtils.java:449) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerMSAL4JUtils.getSqlFedAuthToken(SQLServerMSAL4JUtils.java:111) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerConnection.getFedAuthToken(SQLServerConnection.java:5996) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerConnection.onFedAuthInfo(SQLServerConnection.java:5963) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerConnection.processFedAuthInfo(SQLServerConnection.java:5797) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.TDSTokenHandler.onFedAuthInfo(tdsparser.java:322) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.TDSParser.parse(tdsparser.java:130) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.TDSParser.parse(tdsparser.java:42) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerConnection.sendLogon(SQLServerConnection.java:6855) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerConnection.logon(SQLServerConnection.java:5402) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerConnection.access$300(SQLServerConnection.java:94) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerConnection$LogonCommand.doExecute(SQLServerConnection.java:5334) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7739) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:4384) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:3823) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:3348) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:3179) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:1953) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1263) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    at java.sql.DriverManager.getConnection(DriverManager.java:674) ~[na:1.8.0_152]

    at java.sql.DriverManager.getConnection(DriverManager.java:217) ~[na:1.8.0_152]

    at com.alibaba.datax.plugin.rdbms.util.DBUtil.connect(DBUtil.java:677) [plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    ... 19 common frames omitted

    Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.net.ConnectException: Connection timed out (Connection timed out)

    at com.microsoft.sqlserver.jdbc.SQLServerMSAL4JUtils.getCorrectedException(SQLServerMSAL4JUtils.java:447) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    ... 40 common frames omitted

    Caused by: java.lang.RuntimeException: java.net.ConnectException: Connection timed out (Connection timed out)

    at com.microsoft.sqlserver.jdbc.SQLServerMSAL4JUtils.getCorrectedException(SQLServerMSAL4JUtils.java:439) ~[resource.fs_aa521113-15c2-4f21-8a76-615654f335f4.jar:na]

    ... 40 common frames omitted

  • Cause

    A Microsoft authentication domain resolves to a dynamically assigned IP, and the IP had changed, so the previously whitelisted address no longer matched.

  • Solution

    The currently blocked address was retrieved from the acjar log and the whitelist was updated accordingly.

Problem 42

  • Symptom

    Writing to SelectDB fails with: RetryExec - I/O exception (java.net.SocketException) caught when processing request to {}->http://10.64.22.45:9030: Connection reset

    RetryExec - Retrying request to {}->http://10.64.22.45:9030

    Failed to flush batch data to SelectDB, retry times = 0

    java.net.SocketException: Connection reset

    at java.net.SocketInputStream.read(SocketInputStream.java:210) ~[na:1.8.0_152]

    at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[na:1.8.0_152]

    at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) ~[httpcore-4.4.6.jar:4.4.6]

    at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) ~[httpcore-4.4.6.jar:4.4.6]

    at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282) ~[httpcore-4.4.6.jar:4.4.6]

    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) ~[httpcore-4.4.6.jar:4.4.6]

    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) ~[httpcore-4.4.6.jar:4.4.6]

    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:220) ~[httpcore-4.4.6.jar:4.4.6]

    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123) ~[httpcore-4.4.6.jar:4.4.6]

    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108) ~[httpclient-4.5.4.jar:4.5.4]

    at com.alibaba.datax.plugin.writer.selectdbwriter.SelectDBStreamLoadObserver.put(SelectDBStreamLoadObserver.java:199) ~[selectdbwriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.selectdbwriter.SelectDBStreamLoadObserver.streamLoad(SelectDBStreamLoadObserver.java:80) ~[selectdbwriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.selectdbwriter.SelectDBWriterManager.asyncFlush(SelectDBWriterManager.java:183) [selectdbwriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.selectdbwriter.SelectDBWriterManager.access$000(SelectDBWriterManager.java:36) [selectdbwriter-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.writer.selectdbwriter.SelectDBWriterManager$1.run(SelectDBWriterManager.java:154) [selectdbwriter-0.0.1-SNAPSHOT.jar:na]

  • Cause

    A connection reset when writing to SelectDB, Doris, or StarRocks almost always means the BE node port configured in the data source is wrong: no Stream Load service is listening on that port, so the connection is reset. SelectDB is written via Stream Load, which connects directly to the BE node, so the BE port in the data source configuration must be correct.

  • Solution

    Check the BE node port configured in the SelectDB data source; note that it is not the same as the JDBC port.
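
    A quick sanity check is to probe the configured port over HTTP: the Stream Load port speaks HTTP, while the JDBC port speaks the MySQL protocol and will not answer. A minimal Java sketch follows; the host is taken from the log above, while port 8040 and the /api/health path assume a Doris-compatible BE and must be confirmed against the actual cluster.

    import java.io.IOException;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Probe the port that Stream Load will use. Getting any HTTP status back
    // means an HTTP service is listening; a connection reset or timeout
    // reproduces the symptom above (e.g. when pointed at the JDBC port 9030).
    public class StreamLoadPortProbe {
        public static void main(String[] args) throws IOException {
            URL url = new URL("http://10.64.22.45:8040/api/health"); // assumed BE http_port
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            System.out.println("HTTP status: " + conn.getResponseCode());
            conn.disconnect();
        }
    }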

Problem 43

  • Symptom

    mysql2hive jobs intermittently fail with an HDFS write exception:

    java.io.FileNotFoundException: File does not exist: /dtInsight/hive/warehouse/zk_data_prod.db/ods_klgo_zk_odp_stock_summary_history_df/ds=20241021/source=jilin_klgo__6b1dcca5_849f_4ce8_8fc4_99943494fb1f/ods_klgo_zk_odp_stock_summary_history_df (inode 147685406) Holder DFSClient_NONMAPREDUCE_1935927339_1 does not have any open files.

    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2898)

    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)

    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)

    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2777)

    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:892)

    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)

    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)

    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)

    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)

    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)

    at java.security.AccessController.doPrivileged(Native Method)

    at javax.security.auth.Subject.doAs(Subject.java:422)

    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)

    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_342]

    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_342]

    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_342]

    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_342]

    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121) ~[hadoop-common-3.1.1.7.1.5.0-257.jar:na]

    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88) ~[hadoop-common-3.1.1.7.1.5.0-257.jar:na]

    at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1088) ~[hadoop-hdfs-client-3.1.1.7.1.5.0-257.jar:na]

    at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1866) ~[hadoop-hdfs-client-3.1.1.7.1.5.0-257.jar:na]

    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1668) ~[hadoop-hdfs-client-3.1.1.7.1.5.0-257.jar:na]

    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716) ~[hadoop-hdfs-client-3.1.1.7.1.5.0-257.jar:na]

  • Cause

    The customer's EMR cluster had been short on resources for about a week; the resulting instability caused intermittent HDFS read/write failures.

  • Solution

    Guide the customer to tune task priorities and cap cluster resource usage, then observe whether stability recovers.

Problem 44

  • Symptom

    Submitting or running from the pipeline development page fails with: 数据源与环境不一致:[env:DEV, dsId:xxxx] (data source does not match the environment)

  • Cause

    The error means the data source ID configured in the pipeline does not match the current environment; for example, the current environment is dev but the pipeline references a production data source ID.

    Possible causes:

    1. In script mode, a production data source ID may have been explicitly specified; the dsId entry should be removed from the script.

    2. If the pipeline was created via the OpenAPI, the call must use the ds id of the development data source.

    3. It may be a browser cache issue; clear the cache and refresh.

  • Solution

    Apply the fix matching the cause above: delete the dsId from the script-mode configuration, use the development environment's data source ID in the OpenAPI call, or clear the browser cache and retry.

Problem 45

  • Symptom

    {"record":[{"byteSize":5,"index":0,"rawData":95010,"type":"LONG"},{"byteSize":24,"index":1,"rawData":"6716cf5b5a0a65a5cba8528a","type":"STRING"},{"byteSize":21,"index":2,"rawData":"202410141439430294790","type":"STRING"},{"byteSize":20,"index":3,"rawData":"20241014143943029479","type":"STRING"},{"byteSize":0,"index":4,"rawData":"","type":"STRING"},{"byteSize":7,"index":5,"rawData":"1610376","type":"STRING"},{"byteSize":4,"index":6,"rawData":"8407","type":"STRING"},{"byteSize":5,"index":7,"rawData":22200,"type":"LONG"},{"byteSize":0,"index":8,"rawData":"","type":"STRING"},{"byteSize":1,"index":9,"rawData":0,"type":"LONG"},{"byteSize":1,"index":10,"rawData":1,"type":"LONG"},{"byteSize":1,"index":11,"rawData":2,"type":"LONG"},{"byteSize":5,"index":12,"rawData":"10001","type":"STRING"},{"byteSize":0,"index":13,"rawData":"","type":"STRING"},{"byteSize":0,"index":14,"rawData":"","type":"STRING"},{"byteSize":21,"index":15,"rawData":"automatic night audit","type":"STRING"},{"byteSize":21,"index":16,"rawData":"automatic night audit","type":"STRING"},{"byteSize":0,"index":17,"rawData":"","type":"STRING"},{"byteSize":1,"index":18,"rawData":0,"type":"LONG"},{"byteSize":1,"index":19,"rawData":0,"type":"LONG"},{"byteSize":8,"index":20,"rawData":1729440000000,"type":"DATE"},{"byteSize":8,"index":21,"rawData":1729548123333,"type":"DATE"},{"byteSize":8,"index":22,"rawData":1729548123333,"type":"DATE"},{"byteSize":1,"index":23,"rawData":0,"type":"LONG"},{"byteSize":1,"index":24,"rawData":0,"type":"LONG"},{"byteSize":1,"index":25,"rawData":0,"type":"LONG"},{"byteSize":4,"index":26,"rawData":"0.00","type":"DOUBLE"},{"byteSize":0,"index":27,"rawData":"","type":"STRING"},{"byteSize":1,"index":28,"rawData":0,"type":"LONG"},{"byteSize":1,"index":29,"rawData":0,"type":"LONG"},{"byteSize":1,"index":30,"rawData":1,"type":"LONG"},{"byteSize":1,"index":31,"rawData":0,"type":"LONG"},{"byteSize":1,"index":32,"rawData":0,"type":"LONG"},{"byteSize":0,"index":33,"rawData":"","type":"STRING"},{"byteSize":1,"index":34,"rawData":0,"type":"LONG"},{"byteSize":1,"index":35,"rawData":1,"type":"LONG"},{"byteSize":1,"index":36,"rawData":0,"type":"LONG"},{"byteSize":1,"index":37,"rawData":1,"type":"LONG"},{"byteSize":0,"index":38,"rawData":"","type":"STRING"},{"byteSize":0,"index":39,"rawData":"","type":"STRING"},{"byteSize":0,"index":40,"rawData":"","type":"STRING"},{"byteSize":0,"index":41,"rawData":"","type":"STRING"},{"byteSize":0,"index":42,"rawData":"","type":"STRING"},{"byteSize":0,"index":43,"rawData":"","type":"STRING"},{"byteSize":1,"index":44,"rawData":0,"type":"LONG"},{"byteSize":0,"index":45,"rawData":"","type":"STRING"},{"byteSize":4,"index":46,"rawData":"0.00","type":"DOUBLE"},{"byteSize":1,"index":47,"rawData":0,"type":"LONG"},{"byteSize":1,"index":48,"rawData":0,"type":"LONG"},{"byteSize":0,"index":49,"rawData":"","type":"STRING"},{"byteSize":0,"index":50,"rawData":"","type":"STRING"},{"byteSize":0,"index":51,"rawData":"","type":"STRING"},{"byteSize":0,"index":52,"rawData":"","type":"STRING"},{"byteSize":0,"index":53,"rawData":"","type":"STRING"},{"byteSize":0,"index":54,"rawData":"","type":"STRING"},{"byteSize":0,"index":55,"rawData":"","type":"STRING"},{"byteSize":0,"index":56,"rawData":"","type":"STRING"},{"byteSize":8,"index":57,"rawData":"dataphin","type":"STRING"},{"byteSize":7,"index":58,"rawData":"account","type":"STRING"},{"byteSize":8,"index":59,"rawData":1729760552502,"type":"DATE"}],"type":"writer","message":"字段类型转换错误:你目标字段为[INT]类型,实际字段值为[]."}

    2024-10-24 17:02:32.895 [0-0-0-writer] ERROR DlinkTaskPluginCollector - 脏数据:

    {"record":[{"byteSize":5,"index":0,"rawData":95011,"type":"LONG"},{"byteSize":24,"index":1,"rawData":"6716cf5b5a0a65a5cba8528b","type":"STRING"},{"byteSize":21,"index":2,"rawData":"202410141455470294830","type":"STRING"},{"byteSize":20,"index":3,"rawData":"20241014145547029483","type":"STRING"},{"byteSize":0,"index":4,"rawData":"","type":"STRING"},{"byteSize":7,"index":5,"rawData":"1610376","type":"STRING"},{"byteSize":4,"index":6,"rawData":"8311","type":"STRING"},{"byteSize":5,"index":7,"rawData":20000,"type":"LONG"},{"byteSize":0,"index":8,"rawData":"","type":"STRING"},{"byteSize":1,"index":9,"rawData":0,"type":"LONG"},{"byteSize":1,"index":10,"rawData":1,"type":"LONG"},{"byteSize":1,"index":11,"rawData":2,"type":"LONG"},{"byteSize":5,"index":12,"rawData":"10001","type":"STRING"},{"byteSize":0,"index":13,"rawData":"","type":"STRING"},{"byteSize":0,"index":14,"rawData":"","type":"STRING"},{"byteSize":21,"index":15,"rawData":"automatic night audit","type":"STRING"},{"byteSize":21,"index":16,"rawData":"automatic night audit","type":"STRING"},{"byteSize":0,"index":17,"rawData":"","type":"STRING"},{"byteSize":1,"index":18,"rawData":0,"type":"LONG"},{"byteSize":1,"index":19,"rawData":0,"type":"LONG"},{"byteSize":8,"index":20,"rawData":1729440000000,"type":"DATE"},{"byteSize":8,"index":21,"rawData":1729548123466,"type":"DATE"},{"byteSize":8,"index":22,"rawData":1729548123466,"type":"DATE"},{"byteSize":1,"index":23,"rawData":0,"type":"LONG"},{"byteSize":1,"index":24,"rawData":0,"type":"LONG"},{"byteSize":1,"index":25,"rawData":0,"type":"LONG"},{"byteSize":4,"index":26,"rawData":"0.00","type":"DOUBLE"},{"byteSize":0,"index":27,"rawData":"","type":"STRING"},{"byteSize":1,"index":28,"rawData":0,"type":"LONG"},{"byteSize":1,"index":29,"rawData":0,"type":"LONG"},{"byteSize":1,"index":30,"rawData":1,"type":"LONG"},{"byteSize":1,"index":31,"rawData":0,"type":"LONG"},{"byteSize":1,"index":32,"rawData":0,"type":"LONG"},{"byteSize":0,"index":33,"rawData":"","type":"STRING"},{"byteSize":1,"index":34,"rawData":0,"type":"LONG"},{"byteSize":1,"index":35,"rawData":1,"type":"LONG"},{"byteSize":1,"index":36,"rawData":0,"type":"LONG"},{"byteSize":1,"index":37,"rawData":1,"type":"LONG"},{"byteSize":0,"index":38,"rawData":"","type":"STRING"},{"byteSize":0,"index":39,"rawData":"","type":"STRING"},{"byteSize":0,"index":40,"rawData":"","type":"STRING"},{"byteSize":0,"index":41,"rawData":"","type":"STRING"},{"byteSize":0,"index":42,"rawData":"","type":"STRING"},{"byteSize":0,"index":43,"rawData":"","type":"STRING"},{"byteSize":1,"index":44,"rawData":0,"type":"LONG"},{"byteSize":0,"index":45,"rawData":"","type":"STRING"},{"byteSize":4,"index":46,"rawData":"0.00","type":"DOUBLE"},{"byteSize":1,"index":47,"rawData":0,"type":"LONG"},{"byteSize":1,"index":48,"rawData":0,"type":"LONG"},{"byteSize":0,"index":49,"rawData":"","type":"STRING"},{"byteSize":0,"index":50,"rawData":"","type":"STRING"},{"byteSize":0,"index":51,"rawData":"","type":"STRING"},{"byteSize":0,"index":52,"rawData":"","type":"STRING"},{"byteSize":0,"index":53,"rawData":"","type":"STRING"},{"byteSize":0,"index":54,"rawData":"","type":"STRING"},{"byteSize":0,"index":55,"rawData":"","type":"STRING"},{"byteSize":0,"index":56,"rawData":"","type":"STRING"},{"byteSize":8,"index":57,"rawData":"dataphin","type":"STRING"},{"byteSize":7,"index":58,"rawData":"account","type":"STRING"},{"byteSize":8,"index":59,"rawData":1729760552502,"type":"DATE"}],"type":"writer","message":"字段类型转换错误:你目标字段为[INT]类型,实际字段值为[]."}

  • Cause

    The upstream MySQL column customer_id is a string, mapped to an INT column downstream. Some upstream values are empty strings rather than NULL, so the type conversion fails and the rows are reported as dirty data.

  • Solution

    Guided the customer to change the downstream mapped column type to a string type, which resolved the issue.
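
    The failure mode is easy to reproduce: an empty string is not NULL, so converting it to INT throws. A minimal Java sketch (toIntOrNull is a hypothetical helper, shown only to illustrate the empty-string-to-null mapping that cleaning the data upstream would perform):

    // Demonstrates why "" fails an INT conversion while NULL would not.
    public class EmptyStringToInt {
        static Integer toIntOrNull(String raw) {
            if (raw == null || raw.trim().isEmpty()) {
                return null;                      // "" becomes SQL NULL
            }
            return Integer.parseInt(raw);         // "8407" -> 8407
        }

        public static void main(String[] args) {
            System.out.println(toIntOrNull("8407")); // 8407
            System.out.println(toIntOrNull(""));     // null, no exception
            // Integer.parseInt("") would throw java.lang.NumberFormatException,
            // which is what surfaces as the dirty-data records above.
        }
    }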

Problem 46

  • Symptom

    An odps2adbMYSQL job fails with:

    Caused by: java.net.SocketException: Connection reset

    at java.net.SocketInputStream.read(SocketInputStream.java:210) ~[na:1.8.0_152]

    at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[na:1.8.0_152]

    at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3011) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3472) ~[mysql-connector-java-5.1.47.jar:5.1.47]

    ... 18 common frames omitted

  • Cause

    The ADB MySQL data source was systemically unstable, causing the run failures. The ADB MySQL engineering team had not yet identified the root cause; in the meantime, setting the write concurrency to 1 and the write batch size to 2 MB / 256 records made the job run stably.

  • Solution

    Set the write concurrency to 1 and the write batch size to 2 MB / 256 records, then rerun.

Problem 47

  • Symptom

    A KingBase2Odps job fails:

    2024-10-30 10:06:02.797 [0-0-0-reader] INFO CommonRdbmsReader$Task - Finished read record by Sql: [select id,product_name,product_price,competitor_name,competitor_price,cv_difference,pi_value,project_code,date_value,project_type,"Equipment_name","product_CV","competitor_CV" from products_comparison

    ] jdbcUrl:[jdbc:kingbase8://10.120.169.38:33321/yggzt_cpch?currentSchema=public].

    2024-10-30 10:06:02.797 [DlinkTrans - MaxCompute_1] ERROR DlinkLogbackListener - MaxCompute_1 - org.pentaho.di.core.exception.KettleException:

    Input.Error.NormalException

    Writer adaptor run error

    at com.alibaba.dt.dlink.core.trans.adaptor.BaseWriterStepAdaptor.processRow(BaseWriterStepAdaptor.java:212)

    at com.alibaba.dt.dlink.core.trans.adaptor.OptimizeWriterStepAdaptor.processRow(OptimizeWriterStepAdaptor.java:64)

    at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)

    at java.lang.Thread.run(Thread.java:882)

    Caused by: java.lang.RuntimeException: Writer adaptor run error

    at com.alibaba.dt.dlink.core.trans.adaptor.BaseWriterStepAdaptor.processRow(BaseWriterStepAdaptor.java:200)

    ... 3 more

    Caused by: com.alibaba.dt.pipeline.plugin.center.exception.DataXException: Code:[OdpsWriter-09], Description:[写入数据到 ODPS 目的表失败.]. - 写入 ODPS 目的表失败. 请联系 ODPS 管理员处理.

    at com.alibaba.dt.pipeline.plugin.center.exception.DataXException.asDataXException(DataXException.java:58)

    at com.alibaba.datax.plugin.writer.odpswriter.OdpsWriter$Task.startWrite(OdpsWriter.java:398)

    at com.alibaba.dt.dlink.core.trans.WriterRunner.run(WriterRunner.java:55)

    ... 1 more

    Caused by: java.lang.NullPointerException: Record columns [[{"name":"id","type":"LONG"},{"name":"product_name","type":"STRING"},{"name":"product_price","type":"DOUBLE"},{"name":"competitor_name","type":"STRING"},{"name":"competitor_price","type":"DOUBLE"},{"name":"cv_difference","type":"DOUBLE"},{"name":"pi_value","type":"STRING"},{"name":"project_code","type":"STRING"},{"name":"date_value","type":"STRING"},{"name":"project_type","type":"STRING"},{"name":"Equipment_name","type":"STRING"},{"name":"product_CV","type":"DOUBLE"},{"name":"competitor_CV","type":"DOUBLE"}]] has no sourceColName ["Equipment_name"].

    at org.apache.commons.lang3.Validate.notNull(Validate.java:222)

    at com.alibaba.dt.dlink.core.trans.adaptor.BaseWriterStepAdaptor.initColumnMappingIndex(BaseWriterStepAdaptor.java:85)

    at com.alibaba.dt.dlink.core.trans.adaptor.engine.RecordReceiverAdaptor.getFromReader(RecordReceiverAdaptor.java:148)

    at com.alibaba.datax.plugin.writer.odpswriter.OdpsWriter$Task.startWrite(OdpsWriter.java:392)

    ... 2 more

    2024/10/30 10:06:02 - MaxCompute_1.0 - 完成处理 (I=0, O=0, R=1, W=0, U=0, E=1)

  • Cause

    The customer's KingBase is an enterprise-customized build from which metadata cannot be fetched, so the job was submitted in script mode. The table contains camelCase column names, and the user wrapped them in double quotes both in the column definitions and in columnMapping. During column mapping, every sourceColName and dstColName is matched literally against the column names of the query result, so the quotes became part of the name being matched; the column headers returned by the read SQL do not contain quotes, hence the "has no sourceColName" error.

  • Solution

    Remove the double-quote wrapping from columnMapping. For RDBMS-type data sources whose tables have case-sensitive (camelCase) column names, add the escaping wrapper (such as `` or "") only in column, never in columnMapping.
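
    The matching behavior can be shown in a few lines of Java (column names taken from the error above; the lookup is a simplification of what the writer adaptor does when resolving sourceColName):

    import java.util.List;

    // The query result exposes column names WITHOUT the escaping quotes,
    // so a columnMapping entry that embeds quotes in the name never matches.
    public class ColumnMatchDemo {
        public static void main(String[] args) {
            List<String> resultColumns = List.of("id", "Equipment_name");
            String quoted = "\"Equipment_name\""; // as written in columnMapping
            String plain  = "Equipment_name";     // as it should be written
            System.out.println(resultColumns.contains(quoted)); // false -> "has no sourceColName"
            System.out.println(resultColumns.contains(plain));  // true
        }
    }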

Problem 48

  • Symptom

    A sync task (datax) fails reading from SFTP: com.jcraft.jsch.JSchException: Algorithm negotiation fail
    at com.jcraft.jsch.Session.receive_kexinit(Session.java:583)
    at com.jcraft.jsch.Session.connect(Session.java:320)
    at com.jcraft.jsch.Session.connect(Session.java:183)
    at com.alibaba.datax.plugin.reader.ftpreader.SftpHelper.loginFtpServer(SftpHelper.java:45)

  • Cause

    The datax sync task uses JSch 0.1.51 as its client, whose supported algorithm set does not match the customer's SSH server, hence "Algorithm negotiation fail".

  • Solution

    Use a data integration pipeline task instead: the dlink engine ships JSch 0.1.54, whose supported algorithms match the customer's SSH server.

Problem 49

  • Symptom

    Metadata reports a DDL parse failure.

  • Cause

    The fastsql parser does not support StarRocks list partitioning.

  • Solution

    Guide the customer to use expression partitioning instead; fixed in product version V4.4.

Problem 50

  • Symptom

    2024-09-03 04:50:54.029 [0-1-1-writer] INFO CommonRdbmsWriter$Task - batchWrite use time:538ms,batchSize:4096,batchBytes:1250680

    child_process.returncode: -9

    2024-09-03 04:51:21.890 DQE callback - checkDqeStatus request polling, retry count:0

    2024-09-03 04:51:21.976 Task status is not SUCCESS, ignore check!

    2024-09-03 04:51:22.062 No outputData produced.

    2024-09-03 04:51:21.886 Dlink command exit with code: 47

    2024-09-03 04:51:22.125 ===============================================================

    2024-09-03 04:51:22.125 Current task status: FAILED

    2024-09-03 04:51:22.125 Elapsed time: 1.447 h( Estimated: 10m )

    2024-09-03 04:51:22.125 ---------------- voldemort task ends ----------------

  • Cause

    Memory usage exceeded the limit and the process was terminated by the resource scheduler (the -9 return code indicates the process was killed).

  • Solution

    Guide the customer to lower the concurrency or throttle the traffic rate to improve run stability.

Problem 51

  • Symptom

    An hdfs2oracle job produces dirty data:

    2024-09-03 16:38:42.415 [0-0-0-writer] ERROR DlinkTaskPluginCollector - 脏数据:

    {"exception":"ORA-01722: 无效数字\n","record":[{"byteSize":4,"index":0,"rawData":"2331","type":"STRING"},{"byteSize":8,"index":1,"rawData":"P1233137","type":"STRING"},{"byteSize":7,"index":2,"rawData":"注塑1080C","type":"STRING"},{"byteSize":6,"index":3,"rawData":"233112","type":"STRING"},{"byteSize":6,"index":4,"rawData":"H1车间注塑","type":"STRING"},{"byteSize":4,"index":5,"rawData":"单色注塑","type":"STRING"},{"byteSize":4,"index":6,"rawData":"单色注塑","type":"STRING"},{"byteSize":23,"index":7,"rawData":"2024-09-03 16:00:49.026","type":"STRING"},{"byteSize":3,"index":8,"rawData":127,"type":"LONG"},{"byteSize":1,"index":9,"rawData":0,"type":"LONG"},{"byteSize":3,"index":10,"rawData":127,"type":"LONG"},{"byteSize":4,"index":11,"rawData":1199,"type":"LONG"},{"byteSize":1,"index":12,"rawData":0,"type":"LONG"},{"byteSize":2,"index":13,"rawData":-1,"type":"LONG"},{"byteSize":2,"index":14,"rawData":-1,"type":"LONG"},{"byteSize":8,"index":15,"rawData":"Infinity","type":"DOUBLE"},{"byteSize":18,"index":16,"rawData":"11.999722222222223","type":"DOUBLE"},{"byteSize":2,"index":17,"rawData":-1,"type":"LONG"},{"byteSize":8,"index":18,"rawData":"20240902","type":"STRING"}],"type":"writer"}

    2024/09/03 16:38:42 - Oracle_prod_pp_dp_production_plan_complete_statis_dh.0 - 完成处理 (I=0, O=3204, R=3204, W=0, U=0, E=1)

    2024-09-03 16:38:42.825 [DlinkTrans - Oracle_prod_pp_dp_production_plan_complete_statis_dh] INFO DlinkLogbackListener - Oracle_prod_pp_dp_production_plan_complete_statis_dh - 完成处理 (I=0, O=3204, R=3204, W=0, U=0, E=1)

    2024-09-03 16:38:42.975 [0-0-0-writer] ERROR DlinkTaskPluginCollector -

    java.sql.SQLSyntaxErrorException: ORA-01722: 无效数字

    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445) ~[ojdbc6-11.2.0.3.jar:11.2.0.3.0]

    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396) ~[ojdbc6-11.2.0.3.jar:11.2.0.3.0]

    at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:879) ~[ojdbc6-11.2.0.3.jar:11.2.0.3.0]

    at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:450) ~[ojdbc6-11.2.0.3.jar:11.2.0.3.0]

    at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:192) ~[ojdbc6-11.2.0.3.jar:11.2.0.3.0]

    at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531) ~[ojdbc6-11.2.0.3.jar:11.2.0.3.0]

  • Cause

    For the ORC-backed HDFS table, the front end passed a nullFormat, so when the reader hit empty values it replaced them with \n, which could not be converted to the target numeric type (ORA-01722).

  • Solution

    Guide the customer to delete the nullFormat setting via script mode and rerun; fixed in V4.2.0.
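
    Concretely: open the task in script mode, locate the nullFormat entry in the HDFS reader's configuration, and delete it, so that empty values pass through as real nulls instead of being rewritten to a literal that Oracle cannot parse as a number.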

Problem 52

  • Symptom

    Submitting a tag code-table task fails:

    系统内部错误:系统内部错误:

    ### Error updating database. Cause: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "g_dep_rel_coordinate_id_dep_coordinate_id_tenant_uidx"

    Detail: Key (app_name, coordinate_id, dep_coordinate_id, draft_online_type, tenant_id)=(PIPE_LINE, 43140_OFFLINE_PIPELINE, 7028769715231872_PROJECT, ONLINE, 300000907) already exists.

    ### The error may exist in class path resource [dataphin-global-dependence-sdk/mapper/global-dependence-relation-mapper.xml]

    ### The error may involve com.alibaba.dataphin.global.dependence.sdk.adapter.persistence.mybatis.mapper.GlobalDependenceRelationMapper.batchInsert-Inline

    ### The error occurred while setting parameters

    ### SQL: INSERT INTO "prod".global_dependence_relation (app_name, tenant_id, object_id, object_type, coordinate_id , hierarchy_num, hierarchy_content, env, draft_online_type, mode , cuttable, project_id, dep_object_id, dep_object_type, dep_coordinate_id , dep_hierarchy_num, dep_hierarchy_content, dep_env, dep_draft_online_

  • Cause

    When the tag platform submitted this pipeline task it did not set Mode, so the mode field in the global dependence table was null. Submitting from the page passes mode=BASIC, so the global-dependence logic did not recognize the existing record as one to delete and inserted directly, violating the unique key.

  • Solution

    The tag platform's code-table submission logic was optimized to always set the Mode parameter before submitting; fixed in V4.4.

Problem 53

  • Symptom

    Extracting Hive data through a custom hive jdbc data source fails: java.sql.SQLException: Method not supported

    at org.apache.hive.jdbc.HiveBaseResultSet.getDate(HiveBaseResultSet.java:277) ~[resource.fs_57d6b34a-86c5-47ec-bcf2-ad0b75c299c0.jar:na]

    at com.alibaba.datax.plugin.reader.rdbmsreader.SubCommonRdbmsReader$Task.transportOneRecord(SubCommonRdbmsReader.java:115) ~[rdbmsreader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.startRead(CommonRdbmsReader.java:233) [plugin-rdbms-util-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.reader.rdbmsreader.RdbmsReader$Task.startRead(RdbmsReader.java:85) [rdbmsreader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.adaptor.engine.ReaderRunnerAdaptor.run(ReaderRunnerAdaptor.java:57) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at java.lang.Thread.run(Thread.java:882) [na:1.8.0_152]

  • Cause

    For custom RDBMS-type components, the dlink framework reads data through the standard JDBC protocol. The Hive JDBC driver does not implement Date getDate(int columnIndex, Calendar cal), hence the error.

  • Solution

    1. Have the customer wrap (re-package) the JDBC driver and implement Date getDate(int columnIndex, Calendar cal); see the sketch below.

    2. Or change the Hive table schema to avoid the date type.
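
    For option 1, re-implementing the whole driver is unnecessary: the wrapper only needs to supply the one missing method. A minimal sketch using a JDK dynamic proxy at the ResultSet level, assuming the Hive date column's string form parses as yyyy-MM-dd:

    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.InvocationTargetException;
    import java.lang.reflect.Proxy;
    import java.sql.Date;
    import java.sql.ResultSet;
    import java.util.Calendar;

    // Wraps a Hive JDBC ResultSet so that getDate(int, Calendar) -- which the
    // driver does not implement -- falls back to parsing the string value.
    public final class HiveResultSetShim {
        public static ResultSet wrap(ResultSet delegate) {
            InvocationHandler handler = (proxy, method, args) -> {
                if ("getDate".equals(method.getName())
                        && args != null && args.length == 2
                        && args[1] instanceof Calendar) {
                    String s = delegate.getString((Integer) args[0]);
                    return s == null ? null : Date.valueOf(s); // expects yyyy-MM-dd
                }
                try {
                    return method.invoke(delegate, args); // everything else delegates
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // surface the driver's own exception
                }
            };
            return (ResultSet) Proxy.newProxyInstance(
                    ResultSet.class.getClassLoader(),
                    new Class<?>[]{ResultSet.class}, handler);
        }
    }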

Problem 54

  • Symptom

    2024-09-19 00:30:36.971 [0-0-0-reader] INFO CommonRdbmsReader$Task - Begin to read record by Sql: [select MANDT,OBJNR,STAT,INACT,CHGNR,ZSLTTIMESTAMP from cp2.CRM_JEST where (ZSLTTIMESTAMP >=20240918000000 and ZSLTTIMESTAMP <20240919000000)

    ] jdbcUrl:[jdbc:sqlserver://172.16.243.100:1433;DatabaseName=master;schema=cp2].

    2024-09-19 00:30:36.971 [0-0-0-reader] INFO CommonRdbmsReader$Task - fetchSize:1024

    2024-09-19 00:30:36.972 [0-0-0-reader] INFO DBUtil - start to open connection:jdbc:sqlserver://172.16.243.100:1433;DatabaseName=master;schema=cp2

    2024-09-19 00:30:36.994 [0-0-0-reader] INFO DBUtil - opened connection for jdbc:sqlserver://172.16.243.100:1433;DatabaseName=master;schema=cp2

    2024-09-19 00:30:37.962 [Metric-collector-2] INFO KettleMetricCollector - Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 0.00%

    2024-09-19 00:31:37.963 [Metric-collector-2] INFO KettleMetricCollector - Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 0.00%

    2024-09-19 00:32:37.964 [Metric-collector-2] INFO KettleMetricCollector - Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 0.00%

    2024-09-19 00:33:37.965 [Metric-collector-2] INFO KettleMetricCollector - Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 0.00%

    2024-09-19 00:34:37.966 [Metric-collector-2] INFO KettleMetricCollector - Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 0.00%

    2024-09-19 00:35:36.970 [Metric-collector-2] INFO VMInfo -

    ZERO_FLOW_TIMEOUT - 无流量持续时间:[1801s] 超过配置阈值:[1800s],通常由于输入数据源或者输出数据源负载较大、数据量较大、网络带宽打满等原因,导致输入端拉取不到数据或者输出端写不进数据,需要客户侧分析数据源、排查网络、优化管道任务(如问题出现在数据库输入表数据量较大,可为输入表添加整型类型的主键或者索引,在管道任务配置切分键,过滤条件尽可能使用索引、并增大并发等)等,如果确实会存在较长时间无流量,则可以调整管道任务的无流量超时配置

  • Cause

    The customer's SQL Server did not respond to the submitted query, so no data could be pulled; after waiting 30 minutes the job hit the zero-flow timeout and failed automatically.

  • Solution

    Guide the customer to investigate the SQL Server database engine issue.

Problem 55

  • Symptom

    Reading an Excel file with the FTP component fails: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)

  • Cause

    The file format was configured as xls, but the actual file is xlsx.

  • Solution

    Set the file format of the FTP input component to xlsx.

Problem 56

  • Symptom

    Writing to Oracle fails: java.sql.BatchUpdateException: ORA-24816: 在实际的 LONG 或 LOB 列之后提供了扩展的非 LONG 绑定数据 (expanded non-LONG bind data supplied after actual LONG or LOB column)

  • Cause

    When inserting into an Oracle table via SQL, LOB and LONG columns must come last in the bind order: no non-LOB/LONG column may follow a LONG or LOB column.

  • Solution

    Switch the Oracle output component to script mode and reorder the fields in column and columnMapping so that the LOB/LONG columns come last.
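
    For illustration (table and column names hypothetical): a compliant bind order is INSERT INTO t (id, name, remark_clob) VALUES (?, ?, ?), whereas INSERT INTO t (id, remark_clob, name) VALUES (?, ?, ?) places a non-LOB bind after the CLOB and raises ORA-24816.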

Problem 57

  • Symptom

    SQL Server connectivity test fails: Connection reset ClientConnectionId:

  • Cause

    The customer only added the IP whitelist entry; the network route itself was not open, so the connection could not be established.

  • Solution

    Guide the customer to open the network route and retry.

Problem 58

  • Symptom

    The compute engine connectivity test reports: Error while cleaning up the server resources

  • Cause

    On CDP, the connectivity test runs select 1, which launches an MR job on YARN. This was a CDP test environment with very little YARN capacity (only 4 GB), so the test SQL could not run and the Hive connectivity test failed.

  • Solution

    Guide the customer to terminate the YARN jobs occupying resources and rerun the connectivity test, which then passes.

Problem 59

  • Symptom

    The data source connectivity test passes, but fetching metadata when configuring an integration pipeline fails:

    从租户300010444的数据源7235506923571614976获取元数据失败DataSourceNotAvailableException com.microsoft.sqlserverjdbc.SQLServerException:

    The TCP/IP connection to the host TSCZCMSDB01, port 1433 has failed. Error: "null. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a

  • Cause

    The physical cluster hosting the Dataphin services was scaled out, and the metadata service ended up on machines that cannot reach the customer's SQL Server, so metadata retrieval failed. The connectivity test passed because the Datasource service still runs on the old machines, which have no network issue.

  • Solution

    Guide the customer to adjust the network infrastructure so that the new machines can reach SQL Server, then retry.

Problem 60

  • Symptom

    Reading from OSS with a custom domain reports that the bucket does not exist, and the name in the message is not the configured bucket name but a sub-path.

  • Cause

    The customer's OSS service uses a custom domain, but the cname property was not set on the data source, so the endpoint was mis-parsed and a misleading error was thrown.

  • Solution

    Guided the customer to configure the custom domain (xxx.xx) as the cname, after which the task ran successfully.

Problem 61

  • Symptom

    Reading files over FTP times out.

  • Cause

    Initially suspected to be a passive-mode issue preventing pipelines from running at high concurrency; further investigation suggested the source system itself may also have problems.

  • Solution

    Configured the task to run single-threaded, and stopped reading files from the test folder.

Problem 62

  • Symptom

    The pipeline page reports "元数据内部错误" (metadata internal error), so the customer cannot select an index document, even though the customer confirms the index document exists and the data source connectivity test passes.

  • Cause

    The data source URL had a trailing /.

  • Solution

    Validate on save whether the data source URL ends with / and block saving if it does; this has been optimized in the product.

Problem 63

  • Symptom

    A pipeline task fails with an OOM error.

  • Cause

    The customer was using a Hive table stored in ORC format.

  • Solution

    Had the customer switch the table to Parquet format, and the rerun succeeded.

Problem 64

  • Symptom

    StarRocks connection fails.

  • Cause

    The public-network IP whitelist entry was not added.

  • Solution

    Handled by the customer.

Problem 65

  • Symptom

    Reason: actual column number in csv file is less than schema column number.actual number: 22, schema column number: 107; line delimiter: [

    ], column separator: [ ],

  • Cause

    The customer's source data is malformed, so rows were split incorrectly (an actual row yielded 22 columns against a 107-column schema).

  • Solution

    The customer cleans up the data on their side.
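
    To locate the offending rows before resubmitting, scan the file and report lines whose column count deviates from the schema width. A minimal Java sketch; the file path and the \u0001 column separator are assumptions and must be replaced with the job's actual values:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    // Reports lines whose column count differs from the 107-column schema.
    public class CsvWidthCheck {
        public static void main(String[] args) throws IOException {
            int expected = 107;
            String sep = "\u0001"; // assumed column separator
            int lineNo = 0;
            for (String line : Files.readAllLines(Paths.get("data.csv"))) {
                lineNo++;
                int actual = line.split(sep, -1).length;
                if (actual != expected) {
                    System.out.printf("line %d: %d columns (expected %d)%n",
                            lineNo, actual, expected);
                }
            }
        }
    }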

Problem 66

  • Symptom

    Writing data from Hive to Doris is about 10000x slower than writing to PG, and enlarging the Doris write batch size brings no obvious speedup.

  • Cause

    The default batch used by the Stream Load write path was too small; enlarging it 20x improved performance.

  • Solution

    Guide the customer to adjust the batch size configuration and retry; starting from the defaults, try 128 MB and 100000 records.

Problem 67

  • Symptom

    Creating an asynchronously submitted pipeline task through the OpenAPI throws an internal error.

  • Cause

    The service hashes the output name to decide whether the same pipeline task is already being submitted. The customer did not configure an output name (the API doc marks the parameter as optional), so every creation hashed to the same blank value and was treated as the same pipeline task, and the lookup returned multiple rows, violating MyBatis's single-result expectation. Using one fixed output name to create several different pipeline tasks would trigger the same error; this needs a product fix.

  • Solution

    Guide the customer to configure a correct output name and submit again.

Problem 68

  • Symptom

    A Mongo2Odps job fails at concurrency 3 but succeeds at concurrency 1:

    is invalid. Size must be between 0 and 16793600(16MB) First element

  • Cause

    The customer's MongoDB enforces a traffic limit, so the job fails when run with concurrency 3.

  • Solution

    Guide the customer to investigate the server-side limits on their MongoDB.

Problem 69

  • Symptom

    In a kafka2hive task, if no messages exist within the configured sync time range, messages outside the range get synced to Hive.

  • Cause

    If no message exists between the configured start and end time, the reader keeps consuming messages beyond the end time for a while.

  • Solution

    Guide the customer to set skipExceedRecord to true via script mode to control this behavior; the engine default has been changed to skip, preventing reads past the boundary.
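
    In script mode this is a single reader parameter, e.g. adding "skipExceedRecord": true to the Kafka reader's configuration block; the exact nesting depends on the task template, so treat this as a sketch.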

Problem 70

  • Symptom

    The customer's ES cluster is in a bad state: the API returns HTTP 200 but reports cluster status red.

  • Cause

    An ES cluster whose status is red is unavailable.

  • Solution

    Handled by the customer.

Problem 71

  • Symptom

    The customer cannot configure a SQL Server data source.

  • Cause

    The customer's database name contains the special characters ().

  • Solution

    Handled by the customer.

Problem 72

  • Symptom

    An Oracle read hangs without ever disconnecting or reporting an error.

  • Cause

    Database performance problems, combined with no split key and no zero-flow timeout configured.

  • Solution

    Guide the customer to configure a split key and a zero-flow timeout.

Problem 73

  • Symptom

    Only 100 tables are visible for DB2.

  • Cause

    A code defect; resolved in 4.1.

  • Solution

    The customer requests an upgrade to resolve it.

Problem 75

  • Symptom

    Teradata does not support reading column information from views.

  • Cause

    The customer had previously worked around this via script mode and continues to use script mode.

  • Solution

    Handled by the customer. For Teradata, column information of a view could be obtained by parsing the CREATE VIEW statement.

Problem 76

  • Symptom

    openGauss tasks fail when scheduled together in a batch and can only be retried one at a time.

  • Cause

    Likely a performance problem caused by our SQL that queries index/column relationships, which exhausted the database's memory.

  • Solution

    Recommend the customer avoid copy mode for now.

Problem 77

  • Symptom

    Opening the Hive input component of a pipeline task to query a table's column metadata fails: 从租户xxx的数据源xx获取元数据失败NullPointerException

  • Cause

    The customer configured Hive metadata access in MySQL mode, but the Hive table is stored in Iceberg format, and the metadata application has a bug handling Iceberg tables that triggers the NullPointerException.

  • Solution

    Workaround: have the customer switch the Hive metadata configuration to HMS. Fixed in V4.3.

Problem 79

  • Symptom

    2024-08-17 00:38:57.842 [job-69418143] ERROR DlinkTrans - Exception when job run

    com.mongodb.MongoCommandException: Command failed with error 352: 'Unsupported OP_QUERY command: splitVector. The client driver may require an upgrade. For more details see https://dochub.mongodb.org/core/legacy-opcode-removal' on server dds-bp1bfe94eab36b841220-pub.mongodb.rds.aliyuncs.com:3717. The full response is { "ok" : 0.0, "errmsg" : "Unsupported OP_QUERY command: splitVector. The client driver may require an upgrade. For more details see https://dochub.mongodb.org/core/legacy-opcode-removal

    at com.mongodb.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:115) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.connection.CommandProtocol.execute(CommandProtocol.java:114) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:159) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:286) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.connection.DefaultServerConnection.command(DefaultServerConnection.java:173) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:215) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:186) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:178) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:91) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.Mongo.execute(Mongo.java:772) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.Mongo$2.execute(Mongo.java:759) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:130) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:124) ~[mongo-java-driver-3.2.2.jar:na]

    at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:114) ~[mongo-java-driver-3.2.2.jar:na]

    at com.alibaba.datax.plugin.reader.mongodbreader.util.CollectionSplitUtil.doSplitCollection(CollectionSplitUtil.java:135) ~[mongodbreader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.reader.mongodbreader.util.CollectionSplitUtil.doSplit(CollectionSplitUtil.java:47) ~[mongodbreader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.datax.plugin.reader.mongodbreader.MongoDBReader$Job.split(MongoDBReader.java:59) ~[mongodbreader-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.DlinkTransRunner.doReaderSplit(DlinkTransRunner.java:143) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.DlinkTrans.doSplit(DlinkTrans.java:344) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.DlinkTrans.start(DlinkTrans.java:119) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.Engine.runTrans(Engine.java:90) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.Engine.entry(Engine.java:173) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.Engine.main(Engine.java:249) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

  • Cause

    MongoDB version mismatch: the client driver sends a legacy command (splitVector via OP_QUERY) that the server no longer recognizes.

  • Solution

    Workaround: have the customer manually set the data source version to 3.4+. MongoDB 6.0 still errors on write due to a compatibility issue.

Problem 80

  • Symptom

    The run hangs, with no error stack and no writer stack information.

  • Cause

    When connecting to StarRocks, the writer tries to open the SR port; if the port is not open in the firewall, the connection hangs.

  • Solution

    Add the SR IP to the sandbox whitelist.

Problem 81

  • Symptom

    The error persists after the whitelist entry was added:

    Caused by: java.net.ConnectException: Connection timed out (Connection timed out)

    at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0_152]

    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[na:1.8.0_152]

    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[na:1.8.0_152]

    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[na:1.8.0_152]

    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.8.0_152]

    at java.net.Socket.connect(Socket.java:643) ~[na:1.8.0_152]

    at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75) ~[httpclient-4.5.4.jar:4.5.4]

    at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142) ~[httpclient-4.5.4.jar:4.5.4]

    ... 15 common frames omitted

  • Cause

    If StarRocks is deployed on the public cloud, network issues can still make it unreachable for writes.

  • Solution

    v4.3.2 supports running StarRocks data sources on an external cluster.

Problem 82

  • Symptom

    A pg2odps run fails with a column-does-not-exist error:

    java.lang.RuntimeException: dlink trans run error

    at com.alibaba.dt.dlink.core.trans.DlinkTransPreview.doPreview(DlinkTransPreview.java:446) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.trans.DlinkTransPreview.start(DlinkTransPreview.java:238) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.Engine.runTransPreview(Engine.java:102) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.Engine.entry(Engine.java:175) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at com.alibaba.dt.dlink.core.Engine.main(Engine.java:249) [dlink-engine-0.0.1-SNAPSHOT.jar:na]

    Caused by: java.lang.RuntimeException: dlink trans run error

    at com.alibaba.dt.dlink.core.trans.DlinkTransPreview.doPreview(DlinkTransPreview.java:437) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    ... 4 common frames omitted

    com.alibaba.dt.pipeline.plugin.center.exception.DlinkBizException: Code:[DBUtilErrorCode-06], Description:[执行数据库 Sql 失败, 请检查您的配置的 column/table/where/querySql或者向 DBA 寻求帮助.]. - 执行的SQL为: select factoryid,wo_code,lot_code,mname,mcd,fname,fcd,pname,pcd,totalnostatetime,headNoStateTime,tailNoStateTime,passNoStateTime,sample_count,noStateTime,totalStateTime from "aipass"."v_nostate"

    at com.alibaba.datax.plugin.rdbms.util.RdbmsException.asQueryException(RdbmsException.java:105) ~[na:na]

    at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.startRead(CommonRdbmsReader.java:237) ~[na:na]

    at com.alibaba.datax.plugin.reader.postgresqlreader.PostgresqlReader$Task.startRead(PostgresqlReader.java:87) ~[na:na]

    at com.alibaba.dt.dlink.core.trans.adaptor.engine.ReaderRunnerAdaptor.run(ReaderRunnerAdaptor.java:57) ~[dlink-engine-0.0.1-SNAPSHOT.jar:na]

    at java.lang.Thread.run(Thread.java:882) ~[na:1.8.0_152]

    Caused by: org.postgresql.util.PSQLException: ERROR: column "headnostatetime" does not exist

    建议:Perhaps you meant to reference the column "v_nostate.headNoStateTime".

    位置:82

    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2565) ~[na:na]

  • Cause

    The column names in the customer's PG database are camelCase and were created wrapped in double quotes. When the integration pipeline assembled the read SQL, it used the camelCase names without the double quotes, so the PG engine folded them to lowercase and reported that the column does not exist.

  • Solution

    Guide the customer either to rename the underlying PG columns to avoid camelCase, or to use script mode and wrap the column names in double quotes. For PG metadata to support case-sensitive names, follow the same usage configuration as Oracle, which enables automatic retrieval of case-sensitive column names.
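
    The driver's hint in the log already shows the shape of the fix: an unquoted headNoStateTime is folded to headnostatetime, while a quoted reference such as select "headNoStateTime" from "aipass"."v_nostate" preserves the case and matches the view's column.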

Problem 83

  • Symptom

    Configuring a Hologres data source reports a wrong account/password.

  • Cause

    The product prompt says the Hologres configuration needs an account and password, so the user created one in the security center and entered it, which then failed. What is actually required is an AccessKey/SecretKey pair: the Hologres data source configuration currently only supports AccessKey and SecretKey.

  • Solution

    Guide the customer to add an AK/SK and connect to Hologres with them.

Applies to

  • The Dataphin data integration module, regardless of deployment mode or version