FAQ about startup failures

Error messages and solutions for task startup failures.

Failed to load main class

  • Error message

    Error: Could not find or load main class com.alibaba.proxima.CentauriRunner.
    OK
    OK
    OK
    OK
    Error: Could not find or load main class com.alibaba.proxima.CentauriRunner
    FAILED: Run job failed.
    REPORT:
    https://dm.guide/report/Run job fail?data-dm-guide-action=4&data-dm-guide-extra-msg=ID:6192c00f-dc6f-46c5-9a13-93f1722920a4
    2021-06-18 16:28:38 INFO ============================================================
    2021-06-18 16:28:38 INFO Exit code of the Shell command 1
    2021-06-18 16:28:38 INFO --- Invocation of Shell command completed ---
    2021-06-18 16:28:38 ERROR Shell run failed!
    2021-06-18 16:28:38 ERROR Current task status: ERROR
    2021-06-18 16:28:38 INFO Cost time is: 1.25s
    /home/admin/alisatasknode/taskinfo//20210618/dide/16/28/32/95u0koh57ra79t6aft71l38t/T3_1231618871.log-END-EOF
  • Solution

    MaxCompute cannot load the Proxima CE executable JAR package. To get help, join the MaxCompute Developer Community through the application link or by searching for DingTalk group 11782920, and then contact the technical support team.
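
Before contacting support, one optional local sanity check is to confirm that the JAR you submit actually contains the main class named in the error. The following is a minimal sketch, assuming you have a local copy of the JAR; the helper name `jar_contains_class` and the script name are hypothetical and are not part of Proxima CE or MaxCompute.

```python
import sys
import zipfile


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the JAR at jar_path has an entry for class_name."""
    # A fully qualified class name maps to a path inside the JAR archive,
    # for example com.alibaba.proxima.CentauriRunner ->
    # com/alibaba/proxima/CentauriRunner.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage (hypothetical script name): python check_jar.py <path-to-your-jar>
    found = jar_contains_class(sys.argv[1], "com.alibaba.proxima.CentauriRunner")
    print("main class found" if found else "main class missing")
```

If the class is present locally but the task still fails, the problem is more likely on the platform side, which is when the support channel above is the right route.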

Incorrect delimiter

  • Error message

    FAILED: ODPS-0123131:User defined function exception - Traceback:
    ProximaCEException(code=20003, msg=Parameter validation exception, detailMsg=The data vector dimension [=1] does not match the vector dimension [=128] specified in config,)
            at com.alibaba.proxima.utils.VectorConvert.convert(VectorConvert.java:17)
            at com.alibaba.proxima.mr.BuildMapper.map(BuildMapper.java:58)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:497)
            at com.aliyun.odps.mapred.bridge.utils.MapReduceUtils.runMapper(MapReduceUtils.java:120)
            at com.aliyun.odps.mapred.bridge.LotMapperUDTF.run(LotMapperUDTF.java:807)
            at com.aliyun.odps.udf.impl.batch.BatchStandaloneUDTFEvaluator.run(BatchStandaloneUDTFEvaluator.java:53)
  • Solution

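    A data vector dimension of 1 means that the whole vector string was read as a single element because the delimiter in the data does not match the configured one. Use the -vector_separator command-line parameter to specify the correct delimiter. The default delimiter is a tilde (~). For more information, see Optional parameters.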

    Note

    Do not enclose the delimiter in single or double quotes. Use the character itself. For example, ',' is interpreted as the three-character string ',' (quotation marks included) rather than as the comma delimiter.
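
The effect of a mismatched delimiter can be illustrated with a short sketch. This is not Proxima CE's actual parsing code (that lives in VectorConvert, which is not shown here); the function `count_dimensions` is a hypothetical stand-in that only counts the fields produced by splitting a vector string.

```python
def count_dimensions(raw_vector: str, separator: str = "~") -> int:
    """Count the fields obtained by splitting raw_vector on separator."""
    return len(raw_vector.split(separator))


# A 128-dimensional vector whose values are separated by commas.
vector = ",".join(["0.5"] * 128)

# Default separator "~" never occurs in the data: one field.
print(count_dimensions(vector))         # 1
# Correct separator: all 128 fields are recovered.
print(count_dimensions(vector, ","))    # 128
# Quoted separator is a three-character string that never occurs: one field.
print(count_dimensions(vector, "','"))  # 1
```

The first and third calls reproduce the dimension of 1 reported in the error message, while the second call yields the expected 128.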

Incorrect JDK version in resource group

  • Error message

    OK
    Exception in thread "main" java.lang.UnsupportedClassVersionError: com/alibaba/proxima/CentauriRunner : Unsupported major.minor version 52.0
            at java.lang.ClassLoader.defineClass1(Native Method)
            at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
            at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
            at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
            at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
            at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
            at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
            at java.security.AccessController.doPrivileged(Native Method)
            at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
            at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
            at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:482)
    FAILED: Run job failed.
    2021-08-26 16:27:51 INFO ========================================================================
    2021-08-26 16:27:51 INFO Exit code of the Shell command 100
    2021-08-26 16:27:51 INFO --- Invocation of Shell command completed ---
    2021-08-26 16:27:51 ERROR Shell run failed!
    2021-08-26 16:27:51 ERROR Current task status: ERROR
    2021-08-26 16:27:51 INFO Cost time is: 1.206s
    /home/admin/alisatasknode/taskinfo//20210826/phoenixprod/16/27/36/6psm65y7cs39edodxp0sdcxq/T3_6801991809.log-END-EOF
  • Solution

    When you create a MaxCompute task, the scheduling configuration offers multiple gateway resource groups. Running Proxima CE requires JDK 1.8 or later (class file version 52.0 corresponds to Java 8). This error typically occurs because the JDK in the selected resource group is older than that. Switch to a different resource group to resolve this issue.
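
The "major.minor version 52.0" in the message comes from the header of the compiled class file. As a rough illustration of how that number relates to the JDK release, the sketch below reads the version fields from the first eight bytes of a class file; the function names are hypothetical, and the sample header is hand-built rather than taken from the Proxima CE JAR.

```python
import struct


def class_file_major_version(header: bytes) -> int:
    """Return the major version stored in the first 8 bytes of a .class file."""
    # Layout (big-endian): u4 magic 0xCAFEBABE, u2 minor_version, u2 major_version.
    magic, _minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major


def required_java_version(major: int) -> int:
    """Map a class file major version to the Java release that introduced it."""
    # Major version 45 belongs to JDK 1.1, and each later release adds one,
    # so 52 maps to Java 8.
    return major - 44


# Hand-built header: magic, minor version 0, major version 0x34 (52).
sample_header = bytes.fromhex("cafebabe00000034")
major = class_file_major_version(sample_header)
print(major, required_java_version(major))  # 52 8
```

A runtime older than the release computed this way rejects the class with UnsupportedClassVersionError, which matches the stack trace above.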

Volume directory not found

  • Error message

    ODPS-0010000: System internal error - Lost volume dir
    2021-08-27 11:49:45.689 [main] INFO  c.a.proxima.utils.CheckSignUtil - [] - project:taobao_machinelearning appId:201160 check sign pass. times(ms):314
    [500] com.aliyun.odps.OdpsException: ODPS-0010000: System internal error - Lost volume dir.
            at com.aliyun.odps.rest.RestClient.handleErrorResponse(RestClient.java:395)
            at com.aliyun.odps.rest.RestClient.request(RestClient.java:330)
            at com.aliyun.odps.rest.RestClient.request(RestClient.java:284)
            at com.aliyun.odps.Volume.reload(Volume.java:113)
            at com.aliyun.odps.Volumes.exists(Volumes.java:119)
            at com.aliyun.odps.Volumes.exists(Volumes.java:102)
            at com.alibaba.proxima.config.ConfigConvert.volumeProcess(ConfigConvert.java:182)
            at com.alibaba.proxima.config.ConfigConvert.convert(ConfigConvert.java:31)
            at com.alibaba.proxima.CentauriRunner.main(CentauriRunner.java:236)
    Caused by: com.aliyun.odps.rest.RestException: RequestId=612862E170BEC39964445860,Code=InternalServerError,Message=ODPS-0010000: System internal error - Lost volume dir.
            ... 9 more
    FAILED: Run job failed.
    2021-08-27 11:58:25 INFO =====================================================================
    2021-08-27 11:58:25 INFO Exit code of the Shell command 100
  • Solution

    This error can occur when the Volume directory exists but is corrupted. Try manually deleting the directory and re-running the task. Use the following ODPS SQL commands:

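    vfs -ls /; -- Lists the directories prefixed with 'proxima_v2/xxx'.
    vfs -rm -r -f /proxima_v2/xxx; -- Deletes this directory (the same Volume directory that is printed in the runLog). Use either this command or the next one.
    vfs -rmv /proxima_v2; -- Deletes the entire Volume. Use either this command or the previous one.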

Error: exceeds the allowed maximum length of '2097152'

  • Error message

    ODPS-0420031: Invalid xml in HTTP request body - The request body is malformed or the server version doesn't match this sdk/client. XML Schema validation failed: Element 'Value': [facet 'maxLength'] The value has a length of '7238452'; this exceeds the allowed maximum length of '2097152'.
  • Solution

    A common cause of this error is an incorrect -classpath startup parameter. Set the -classpath parameter as described in Run, and then re-run the task.

Error: java.lang.UnsatisfiedLinkError: no jniproxima in java.library.path

  • Error message

    17:57:05.229 [main] DEBUG org.bytedeco.javacpp.Loader - Loading library jniproxima
    17:57:05.229 [main] DEBUG org.bytedeco.javacpp.Loader - Failed to load for jniproxima: java.lang.UnsatisfiedLinkError: no jniproxima in java.library.path
    Can not load proxima core:java.lang.UnsatisfiedLinkError: no jniproxima in java.library.path
    java.lang.UnsatisfiedLinkError: no jniproxima in java.library.path
            at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
            at java.lang.Runtime.loadLibrary0(Runtime.java:870)
            at java.lang.System.loadLibrary(System.java:1122)
            at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1738)
            at org.bytedeco.javacpp.Loader.load(Loader.java:1345)
            at org.bytedeco.javacpp.Loader.load(Loader.java:1157)
            at org.bytedeco.javacpp.Loader.load(Loader.java:1133)
            at com.alibaba.proxima2.core.global.proxima.<clinit>(proxima.java:12)
            at java.lang.Class.forName0(Native Method)
            at java.lang.Class.forName(Class.java:348)
            at org.bytedeco.javacpp.Loader.load(Loader.java:1212)
            at org.bytedeco.javacpp.Loader.load(Loader.java:1157)
            at org.bytedeco.javacpp.Loader.load(Loader.java:1133)
            at com.alibaba.proxima2.core.IndexPluginBroker.<clinit>(IndexPluginBroker.java:16)
            at com.alibaba.proxima2.ce.utils.ProximaUtil.<clinit>(ProximaUtil.java:24)
            at com.alibaba.proxima2.ce.utils.ConfigParser.commandLineParserProcess(ConfigParser.java:128)
            at com.alibaba.proxima2.ce.utils.ConfigParser.parse(ConfigParser.java:36)
            at com.alibaba.proxima2.ce.ProximaCERunner.main(ProximaCERunner.java:139)
    Caused by: java.lang.UnsatisfiedLinkError: /home/ads/.javacpp/cache/main_sub0.jar/com/alibaba/proxima2/linux-x86_64/libjniproxima.so: /home/ads/.javacpp/cache/main_sub...
            at java.lang.ClassLoader$NativeLibrary.load(Native Method)
            at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
            at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
            at java.lang.Runtime.load0(Runtime.java:809)
            at java.lang.System.load(System.java:1086)
            at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1685)
            ... 14 more
    17:57:05.240 [main] INFO com.alibaba.proxima2.ce.utils.ConfigParser - odps.stage.mapper.split.size: 128M
    Running job in console.
    17:57:05.405 [main] INFO com.alibaba.proxima2.ce.utils.ConfigParser - projectName: alimama_kgb_algo, appId:200696
    17:57:05.921 [main] INFO com.alibaba.proxima2.ce.utils.CheckSignUtil - checkSign, projectName:alimama_kgb_algo, appId:200696, fc_name:vector_retrieval, result:{"code"...
    17:57:06.080 [main] INFO com.alibaba.proxima2.ce.utils.CheckSignUtil - checkSign, projectName:alimama_kgb_algo, appId:200696, fc_name:vector_retrieval_v2, result:{"co...
    ProximaCEException(code=20002, msg=xxx, detailMsg=project:alimama_kgb_algo appId:200696 xxx)
            at com.alibaba.proxima2.ce.utils.CheckSignUtil.checkSign(CheckSignUtil.java:39)
            at com.alibaba.proxima2.ce.utils.ConfigParser.checkSign(ConfigParser.java:452)
            at com.alibaba.proxima2.ce.utils.ConfigParser.parse(ConfigParser.java:38)
            at com.alibaba.proxima2.ce.ProximaCERunner.main(ProximaCERunner.java:139)
    ProximaCEException(code=20002, msg=xxx, detailMsg=xxx.)
            at com.alibaba.proxima2.ce.utils.ConfigParser.checkSign(ConfigParser.java:455)
            at com.alibaba.proxima2.ce.utils.ConfigParser.parse(ConfigParser.java:38)
            at com.alibaba.proxima2.ce.ProximaCERunner.main(ProximaCERunner.java:139)
    FAILED: Run job failed.
    2022-09-28 17:57:06 INFO ================================================================
  • Solution

    This error typically occurs when the MaxCompute instance fails to load the Proxima SDK, usually because the machine is outdated or its environment is not fully configured. This is rare. Re-run the task so that the system can schedule it on a compatible instance.