MaxCompute UDF (Python) FAQ


This topic describes the frequently asked questions (FAQ) about MaxCompute user-defined functions (UDFs) that are written in Python.

Class or resource issues

This section describes common class and resource issues that occur when you call a MaxCompute UDF.

  • Symptom 1: The error message function 'xxx' cannot be resolved is returned.

    • Causes:

      • Cause 1: You are calling the MaxCompute UDF from the wrong project. The UDF is not in the current MaxCompute project. For example, the UDF is registered in a development project, but you are calling it from a production project.

      • Cause 2: The class or resource of the MaxCompute UDF is incorrect.

      • Cause 3: The resource type that the MaxCompute UDF depends on is incorrect. For example, a PY file has the resource type PY, but the get_cache_file method in the UDF code requires the FILE type.

      • Cause 4: The resource that the MaxCompute UDF depends on is outdated. A delay can occur when you upload a resource from DataWorks to MaxCompute, which means the resource may not be the latest version.

      • Cause 5: The Python environment version is incorrect. By default, MaxCompute runs jobs in a Python 2 environment. An error occurs if the Python code contains non-ASCII characters.

    • Solutions:

      • Solution for Cause 1: On the MaxCompute client, run the list functions; command in the project where the error occurred and verify that the MaxCompute UDF exists in that project.

      • Solution for Cause 2: On the MaxCompute client, run the desc function <function_name>; command and verify that the Class and Resources in the output are correct.

        If they are incorrect, run the create function <function_name> as <'package_to_class'> using <'resource_list'>; command to register the function again. In this command, package_to_class is Python_script_name.class_name and resource_list includes all the file, table, and archive resources, or third-party packages that you need to reference in MaxCompute.

        For more information, see Register a function.

      • Solution for Cause 3: On the MaxCompute client, run the desc resource <resource_name>; command and check that the Type in the output is correct. If the resource type is incorrect, run the add <file_type> <file_name>; command to add the resource again.

        • If the resource is referenced using get_cache_file in the UDF code, the resource is a file resource. The resource type must be FILE.

        • If the resource is referenced using get_cache_table in the UDF code, the resource is a table resource. The resource type must be TABLE.

        • If the resource is referenced using get_cache_archive in the UDF code, the resource is an archive resource. The resource type must be ARCHIVE.

        For more information, see Add a resource.

      • Solution for Cause 4: On the MaxCompute client, run the desc resource <resource_name>; command and check LastModifiedTime in the output to confirm that the resource is the latest version.

      • Solution for Cause 5: Add the #coding:utf-8 or # -*- coding: utf-8 -*- encoding declaration at the beginning of the Python code. Alternatively, add the set odps.sql.python.version=cp37; statement before the SQL statement that calls the UDF. Then, submit them together to run the job in a Python 3 environment.
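
The resource type rule from the solution for Cause 3 can be checked before you register a function. The following is a minimal sketch, not part of any MaxCompute tooling: the helper name required_resource_types and the sample resource names are hypothetical, and the helper only recognizes get_cache_* calls whose argument is a string literal.

```python
import re

# Accessor name -> resource type it requires (taken from the list above).
REQUIRED_TYPE = {
    "get_cache_file": "FILE",
    "get_cache_table": "TABLE",
    "get_cache_archive": "ARCHIVE",
}

# Matches calls such as get_cache_file('name') with a literal string argument.
_CALL = re.compile(r"\b(get_cache_(?:file|table|archive))\(\s*['\"]([^'\"]+)['\"]")


def required_resource_types(udf_source):
    """Return {resource_name: required_type} for literal get_cache_* calls."""
    return {name: REQUIRED_TYPE[func] for func, name in _CALL.findall(udf_source)}


# Hypothetical UDF source text used only for this demonstration.
sample = '''
from odps.distcache import get_cache_file, get_cache_archive
words = get_cache_file('stopwords.txt')
model = get_cache_archive("model.zip")
'''
for resource, rtype in sorted(required_resource_types(sample).items()):
    print(resource, "must be added as", rtype)
```

Compare the reported types with the Type value in the output of desc resource <resource_name>; for each resource.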

  • Symptom 2: When you use get_cache_archive('xxx.zip') in a MaxCompute UDF, one of the following error messages is returned: IOError: Download resource: xxx.zip failed, odps.distcache.DistributedCacheError, or fuxi job failed: Download resource failed: xxx.zip.

    • Causes:

      • Cause 1: The archive resource does not exist. The archive resource was not specified when you registered the MaxCompute UDF.

      • Cause 2: The resource type of the archive is incorrect. It is not ARCHIVE.

      • Cause 3: The name or extension of the archive resource is inconsistent with the actual file. For example, the archive resource is named xxx.zip, but the uploaded file is xxx.tar.gz. The system tries to decompress the file in the ZIP format, which causes a decompression failure.

      • Cause 4: Two UDFs in the same job depend on resources that have the same name but are in different projects.

    • Solutions:

      • Solution for Cause 1: On the MaxCompute client, run the desc function <function_name>; command and check whether the Resources in the output contains the archive resource from the error message.

        If it is not included, run the create function <function_name> as <'package_to_class'> using <'resource_list'>; command to register the function again. Add the missing archive resource to resource_list.

        For more information, see Register a function.

      • Solution for Cause 2: Run the desc resource <resource_name>; command on the MaxCompute client to check if the Type in the output is ARCHIVE.

        If the type is not ARCHIVE, run the add archive <file_name>; command to upload the resource again.

        For more information, see Add a resource.

      • Solution for Cause 3: On the MaxCompute client, run the desc function <function_name>; command and check whether the name and extension of the archive resource in the Resources section of the output match the name and extension of the actual file.

        If they are inconsistent, run the add archive <file_name>; command to upload the resource again. The file_name must be the same as the name and extension of the actual archive resource.

      • Solution for Cause 4: Check all UDFs that the job depends on, including UDFs in views. Examine the project and name of each UDF and its corresponding resource. If resources with the same name exist in different projects, change the name of the dependent UDF or resource.
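
For Cause 3, you can verify locally that the content of an archive matches its extension before you upload it. The following sketch uses only the Python standard library. The function names are hypothetical, and the demo file is a tar.gz archive deliberately saved as xxx.zip to reproduce the mismatch described above.

```python
import os
import tarfile
import tempfile
import zipfile


def detect_archive_format(path):
    """Return 'zip', 'tar' (including .tar.gz/.tgz), or None based on file content."""
    if zipfile.is_zipfile(path):
        return "zip"
    if tarfile.is_tarfile(path):
        return "tar"
    return None


def extension_matches_content(path):
    """True when the file name's extension agrees with the detected format."""
    name = os.path.basename(path).lower()
    fmt = detect_archive_format(path)
    if name.endswith(".zip"):
        return fmt == "zip"
    if name.endswith((".tar", ".tar.gz", ".tgz")):
        return fmt == "tar"
    return False


# Demo: a gzip-compressed tar archive that was wrongly named xxx.zip.
workdir = tempfile.mkdtemp()
payload = os.path.join(workdir, "data.txt")
with open(payload, "w") as f:
    f.write("hello")
mislabeled = os.path.join(workdir, "xxx.zip")
with tarfile.open(mislabeled, "w:gz") as tar:
    tar.add(payload, arcname="data.txt")

print(detect_archive_format(mislabeled))
print(extension_matches_content(mislabeled))
```

If the check reports a mismatch, rename the file so that its extension reflects the real format, and then run add archive <file_name>; again.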

  • Symptom 3: When you use get_cache_table(table_name) in a MaxCompute UDF, the error message odps.distcache.DistributedCacheError: Table resource "xxx_table_name" not found is returned.

    • Causes:

      • Cause 1: The table resource does not exist. The table resource was not specified when you registered the MaxCompute UDF.

      • Cause 2: The resource type of the table is incorrect. It is not TABLE.

    • Solutions:

      • Solution for Cause 1: On the MaxCompute client, run the desc function <function_name>; command and check whether the Resources in the output contains the table resource from the error message.

        If it is not included, run the create function <function_name> as <'package_to_class'> using <'resource_list'>; command to register the function again. Add the missing table resource to resource_list.

        For more information, see Register a function.

      • Solution for Cause 2: On the MaxCompute client, run the desc resource <resource_name>; command and check whether the Type in the output is TABLE.

        If the type is not TABLE, run the add table <table_name>; command to upload the table resource again.

        For more information, see Add a resource.

  • Symptom 4: When a MaxCompute UDF references a third-party package, the error message ImportError: No module named 'xxx' is returned.

    • Causes:

      • Cause 1: The resource type of the third-party package is incorrect. The resource type must be ARCHIVE.

      • Cause 2: The third-party package was not specified when the MaxCompute UDF was registered.

      • Cause 3: The path to the third-party package is not added to the MaxCompute UDF code.

      • Cause 4: The third-party package is a WHEEL package, but the file has an incorrect extension or does not correspond to the Python environment version.

      • Cause 5: The third-party package is not a WHEEL package or a pure Python package, but it contains a setup.py file.

      • Cause 6: The name of the Python file for the MaxCompute UDF conflicts with the name of the third-party module that you want to reference. For example, if the Python file for the UDF is A.py, the system imports A.py by default instead of the module from the third-party package when you run import A.

    • Solutions:

      • Solution for Cause 1: From the MaxCompute client, execute the desc resource <resource_name>; command and check whether the Type parameter in the output is ARCHIVE.

        If the type is not ARCHIVE, run the add archive <file_name>; command to upload the resource again.

        For more information, see Add a resource.

      • Solution for Cause 2: On the MaxCompute client, run the desc function <function_name>; command and check whether the Resources in the output includes the third-party package.

        If they are not included, run the create function <function_name> as <'package_to_class'> using <'resource_list'>; command to register the function again. Add the third-party package to resource_list.

        For more information, see Register a function.

      • Solution for Cause 3: Check whether the path to the third-party package is added to the UDF code. Specifically, check that the code includes sys.path.insert(0, 'work/path_to_third_party_package'). For example, assume that the module name is A and the corresponding Python file is A.py. The following examples describe how to determine the resource package path and add the path to your code:

        • If the Python file is in the resource_dir folder and you directly compress the resource_dir folder into resource-of-A.zip, the path in sys.path.insert is work/resource-of-A.zip/resource_dir/.

        • If the Python file is in the resource_dir folder and you compress all files within the resource_dir folder into resource-of-A.zip, the path in sys.path.insert is work/resource-of-A.zip/.

        • If the Python file is in the resource_dir/path1/path2 folder and you compress all files within the resource_dir folder into resource-of-A.zip, the path in sys.path.insert is work/resource-of-A.zip/path1/path2/.

        Note

        By default, ARCHIVE resources are placed in the ./work/ directory, which is relative to the UDF execution path.

      • Solution for Cause 4: WHEEL files are different for Python 2 and Python 3 environments. For Python 2, the WHEEL file name must contain cp27-cp27m-manylinux1_x86_64. For Python 3, the WHEEL file name must contain cp37-cp37m-manylinux1_x86_64. Download the appropriate WHEEL file. You can directly change the extension of the downloaded WHEEL file to .zip. You do not need to compress the WHEEL file into another ZIP file.

      • Solution for Cause 5: You must first compile the setup.py file to generate a WHEEL package in an environment that is compatible with MaxCompute. Then, upload the resource and register the function. For more information about how to compile a third-party package, see Use a third-party package that requires compilation.

      • Solution for Cause 6: Rename the Python file for the MaxCompute UDF.
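
The path rules in the solution for Cause 3 can be derived from the archive itself. The following sketch is one way to do this under stated assumptions: the archive is a ZIP file that you can open locally, the helper name sys_path_entry is hypothetical, and the work/ prefix follows the note above about where ARCHIVE resources are placed.

```python
import io
import posixpath
import zipfile


def sys_path_entry(archive_name, archive_file, module_file):
    """Return the 'work/<archive>/<dir>' path whose directory holds module_file."""
    with zipfile.ZipFile(archive_file) as zf:
        for member in zf.namelist():
            if posixpath.basename(member) == module_file:
                inner_dir = posixpath.dirname(member)
                return posixpath.join("work", archive_name, inner_dir)
    raise LookupError("%s not found in %s" % (module_file, archive_name))


# Build an in-memory ZIP laid out like the third case above:
# A.py is stored under path1/path2 inside resource-of-A.zip.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("path1/path2/A.py", "VALUE = 1\n")
buf.seek(0)

entry = sys_path_entry("resource-of-A.zip", buf, "A.py")
print(entry)
# In the UDF code you would then call: sys.path.insert(0, entry)
```

Running the helper against the real archive before upload shows which directory level was actually compressed, which is the usual source of a wrong path.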

  • Symptom 5: When a MaxCompute UDF references a Python 3 standard library, the error message ImportError: No module named enum is reported.

    • Cause: Python 3 is not enabled for the MaxCompute project. By default, the MaxCompute UDF runs in a Python 2 environment, which cannot recognize Python 3 standard libraries.

    • Solution: Add the set odps.sql.python.version=cp37; statement before the SQL statement that calls the UDF and submit them together.

  • Symptom 6: The error message ModuleNotFoundError: No module named 'six' is returned.

    • Cause: The path to the third-party package is not added to sys.path. This prevents the Python UDF from importing the package.

    • Solution: Change include_package_path('six.zip') to sys.path.insert(0, 'work/six.zip'). For more information, see Run Scipy in a MaxCompute UDF.

  • Symptom 7: The error message failed to get Udf info from xxx.py is reported.

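    • Cause: The base class is imported with incorrect syntax in the user-defined table-valued function (UDTF) or user-defined aggregate function (UDAF). For example, the code uses import odps.udf.BaseUDTF or import odps.udf.BaseUDAF. BaseUDTF and BaseUDAF are classes, not modules, so they cannot be the last component of an import statement.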

    • Solution: Change the import statement to from odps.udf import BaseUDTF or from odps.udf import BaseUDAF.
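
The reason the import form matters in Symptom 7 can be reproduced without MaxCompute. The sketch below uses collections.OrderedDict from the standard library as a stand-in for odps.udf.BaseUDTF, because both are classes that live inside a package. The helper name try_import is hypothetical.

```python
import importlib


def try_import(dotted_name):
    """Import dotted_name as a module; return None on success or the error text."""
    try:
        importlib.import_module(dotted_name)
        return None
    except ImportError as exc:
        return str(exc)


# OrderedDict is a class inside the collections package, not a submodule,
# so importing it as a module fails in the same way as
# "import odps.udf.BaseUDTF".
print(try_import("collections.OrderedDict"))

# The "from package import name" form works for classes.
from collections import OrderedDict
print(OrderedDict.__name__)
```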

Performance issues

  • Symptom: The error message kInstanceMonitorTimeout is returned.

  • Cause: The UDF processing time is too long, which causes a timeout. By default, a time limit is imposed on UDF data processing. A batch of records, typically 1,024, must be processed within 1,800 seconds. This time limit applies to processing a small batch of records, not the total runtime of the worker. SQL typically processes data at a rate of more than 10,000 records per second. This limit is in place only to prevent an infinite loop in the UDF, which can cause prolonged CPU usage.

  • Solution:

    • Add logs to the MaxCompute UDF code to check for infinite loops. You can also print time information in the logs to check whether the processing time for a single record meets expectations. Add the following log printing information to your code. After the job runs successfully, you can view the log information in StdOut in Logview.

      • Python 2 environment

        import sys

        sys.stdout.write('your log')
        sys.stdout.flush()

      • Python 3 environment

        print('your log', flush=True)
    • If the actual computation is large and the UDF is expected to run for a long time, you can adjust the following parameters to prevent timeout errors.

      Parameter: set odps.function.timeout=xxx;
      Description: Adjusts the UDF runtime timeout period. The default value is 1800s. You can increase this value as needed. The value must be in the range of 1s to 3600s.

      Parameter: set odps.sql.executionengine.batch.rowcount=xxx;
      Description: Adjusts the number of data rows that MaxCompute processes at a time. The default value is 1024. You can decrease this value as needed.
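
To find out whether a single record takes longer than expected, you can wrap the evaluate method with a timing log. The following is a minimal sketch: the decorator name log_slow_calls and the class SlowUpper are hypothetical, and the class is a plain Python stand-in for a UDF so that the example runs outside MaxCompute. It uses sys.stdout.write and flush so that it works in both Python 2 and Python 3.

```python
import functools
import sys
import time


def log_slow_calls(threshold_seconds=1.0):
    """Decorator: write a log line when a call takes longer than the threshold."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = time.time() - start
                if elapsed >= threshold_seconds:
                    sys.stdout.write("%s took %.3fs\n" % (func.__name__, elapsed))
                    sys.stdout.flush()
        return wrapper
    return decorator


class SlowUpper(object):
    """Stand-in for a UDF class; only the evaluate method matters here."""

    @log_slow_calls(threshold_seconds=0.0)  # 0.0 logs every call, for the demo
    def evaluate(self, value):
        return value.upper() if value is not None else None


print(SlowUpper().evaluate("abc"))
```

In a real job, choose a threshold well above the normal per-record time so that StdOut in Logview contains only the slow records.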

Network issues

  • Symptom: An error occurs when you call a MaxCompute UDF to access the internet.

  • Cause: MaxCompute UDFs do not support internet access.

  • Solution: Fill out and submit the Network Connection Request form based on your business needs. The MaxCompute technical support team will contact you promptly to enable network access. For instructions on how to fill out the form, see Network access process.

Sandbox issues

  • Symptom: The error message RuntimeError: xxx has been blocked by sandbox is returned.

  • Cause: Some function calls in the Python UDF are blocked by the sandbox.

  • Solution:

    • Before the SQL statement that calls the Python UDF, add the set odps.isolation.session.enable=true; setting. Then, submit them together.

    • For Python 3 UDFs, odps.isolation.session.enable is set to true by default.

Encoding issues

This section describes common encoding issues that occur when calling a MaxCompute UDF.

  • Symptom 1: The error message SyntaxError: Non-ASCII character '\xe8' in file xxx. on line yyy is returned.

    • Cause: The Python file for the MaxCompute UDF contains non-ASCII characters and runs in a Python 2 environment.

    • Solution:

      • Add the set odps.sql.python.version=cp37; statement before the SQL statement that calls the UDF. Then, submit them together to run the job in a Python 3 environment.

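      • Declare the source encoding by adding #coding:utf-8 or # -*- coding: utf-8 -*- on the first line of the Python file. If you also need to change the default encoding of the Python 2 interpreter to UTF-8, add the following statements at the beginning of the Python file.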

        import sys
        reload(sys)
        sys.setdefaultencoding('utf-8')

  • Symptom 2: When you call a Python 2 UDF, the error message UnicodeEncodeError: 'ascii' codec can't encode characters in position x-y: ordinal not in range(128) is returned.

    • Cause: The return value type in the function signature is STRING, but the MaxCompute UDF returns a Python object of the UNICODE type. Assume that the object is named ret. By default, MaxCompute attempts to convert the return value ret to the STR type using the ASCII encoding format and returns str(ret). If ret contains only ASCII characters, it is successfully converted to the STR type. However, if ret contains non-ASCII characters, the conversion fails and an error is returned.

    • Solution: Add the following statement to the evaluate method in the Python code.

      return ret.encode('utf-8')
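
The fix for Symptom 2 can be wrapped in a small helper so that every return path is encoded. The following sketch assumes a Python 2 UDF whose signature returns STRING. The names to_utf8 and Greet are hypothetical, and Greet is a plain class that stands in for the UDF so that the example runs locally.

```python
# -*- coding: utf-8 -*-

TEXT_TYPE = type(u"")  # unicode on Python 2, str on Python 3


def to_utf8(ret):
    """Encode text to UTF-8 bytes; pass through None and already-encoded values."""
    if isinstance(ret, TEXT_TYPE):
        return ret.encode("utf-8")
    return ret


class Greet(object):
    """Stand-in for a Python 2 UDF whose signature returns STRING."""

    def evaluate(self, arg):
        # Stands for any unicode result that contains non-ASCII characters.
        ret = u"\u4e2d\u6587"
        return to_utf8(ret)


print(repr(Greet().evaluate(None)))
```
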
  • Symptom 3: When you call a Python 3 UDF, the error message UnicodeDecodeError: 'utf-8' codec can't decode byte xxx in position xxx: invalid continuation byte is returned.

    • Cause: The input parameter type in the function signature is STRING. However, the input string cannot be decoded to a Python object of the STR type using UTF-8 when you call the Python 3 UDF.

    • Solutions:

      • Avoid writing non-UTF-8 encoded strings to MaxCompute tables.

        For example, a Python 2 UDF returns a Python object of the STR type that is encoded in GBK. This object can be written to a MaxCompute table, but it cannot be read by a Python 3 UDF. Convert the data to UTF-8 encoding before the Python 2 UDF returns it. For example, return ret.decode('gbk').encode('utf-8').

      • In the SQL statement, use the built-in function is_encoding to filter out non-UTF-8 encoded data in advance. The following code provides an example.

        select py_udf(input_col) from example_table where is_encoding(input_col, 'utf-8', 'utf-8') = true;
      • Change the input parameter type in the function signature in the Python code to BINARY. In the SQL statement, convert the STRING type column to the BINARY type and use it as an input parameter for the Python 3 UDF. The following code provides an example.

        select py_udf(cast(input_col as binary)) from example_table;
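
When the input type is BINARY, the Python 3 UDF receives bytes and can decide how to decode them. The following sketch tries UTF-8 first and falls back to GBK. The fallback order, the helper name decode_with_fallback, and the class CleanText are assumptions for illustration, and the class is a plain stand-in for the UDF.

```python
def decode_with_fallback(raw, encodings=("utf-8", "gbk")):
    """Decode bytes by trying each encoding in order; return None if all fail."""
    if raw is None:
        return None
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return None


class CleanText(object):
    """Stand-in for a Python 3 UDF whose input type is BINARY (bytes)."""

    def evaluate(self, raw):
        return decode_with_fallback(raw)


gbk_bytes = u"\u4e2d\u6587".encode("gbk")
print(CleanText().evaluate(gbk_bytes) == u"\u4e2d\u6587")
```

Trying encodings in order is a heuristic: some byte sequences are valid in more than one encoding, so put the encoding that you expect most often first.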

Function signature issues

This section describes common function signature issues that occur when calling a MaxCompute UDF.

  • Symptom 1: The error message resolve annotation of class xxx for UDTF/UDF/UDAF yyy contains invalid content '<EOF>' is returned.

    • Cause: The input or output parameter of the MaxCompute UDF is a complex data type, but the function signature is invalid.

    • Solution: Modify the complex data type in the function signature to ensure that the signature is valid. For more information about function signatures, see Function signatures and data types.

  • Symptom 2: The error message TypeError: expected <class 'xxx'> but <class 'yyy'> found, value:zzz is returned.

    • Cause: The return value type specified in the function signature is inconsistent with the data type that the MaxCompute UDF code actually returns.

    • Solution: Confirm the expected return result. Modify the function signature or the MaxCompute UDF code to ensure that the data types are consistent.

  • Symptom 3: The error message Semantic analysis exception - evaluate function in class xxx.yyy for user defined function zz does not match annotation ***->*** is returned.

    • Cause: The number of input parameters specified in the function signature is inconsistent with the number of input parameters in the corresponding method in the MaxCompute UDF code.

    • Solution: Confirm the actual number of input parameters. Modify the function signature or the MaxCompute UDF code to ensure that the number of input parameters is consistent.
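
The mismatch in Symptom 3 can be caught locally by comparing the number of input types in the signature with the number of parameters of the method. The following sketch assumes Python 3 and a signature string in the arg_type_list->type form. All names are hypothetical, and variable-length signatures that use * are not handled.

```python
import inspect


def split_top_level(type_list):
    """Split 'a,map<string,bigint>' on commas that are not inside <...>."""
    parts, depth, current = [], 0, ""
    for ch in type_list:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        parts.append(current.strip())
    return parts


def signature_arity(signature):
    """Number of input types in a signature such as 'bigint,string->string'."""
    inputs = signature.split("->")[0]
    return len(split_top_level(inputs))


def method_arity(method):
    """Number of parameters of a method, not counting self."""
    params = list(inspect.signature(method).parameters)
    return len([p for p in params if p != "self"])


class Concat(object):
    """Stand-in for a UDF class with two input parameters."""

    def evaluate(self, left, right):
        return None if left is None or right is None else left + right


sig = "string,map<string,bigint>->string"
print(signature_arity(sig), method_arity(Concat.evaluate))
```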

Third-party package issues

  • Symptom: The error message GLIBCXX_x.x.x not found is returned.

  • Cause: The GLIBCXX version that the .so linked library file depends on is later than the version supported by MaxCompute. The same applies to GLIBC and CXXABI.

  • Solution: Use a compatible WHEEL package or recompile the .so linked library file in a compatible environment. The following list shows the latest versions of dependencies that are supported for binary executable files or .so linked library files in MaxCompute.

    GLIBC <= 2.17
    CXXABI <= 1.3.8
    GLIBCXX <= 3.4.19
    GCC <= 4.2.0
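
Before you upload a package, you can scan its .so files for symbol version strings that exceed the limits listed above. The following sketch is a rough heuristic that searches the raw bytes of a file. It is not an ELF parser, the function name too_new_versions is hypothetical, and the demo input is synthetic bytes rather than a real library.

```python
import re

# Upper bounds copied from the list above.
LIMITS = {"GLIBC": (2, 17), "CXXABI": (1, 3, 8), "GLIBCXX": (3, 4, 19)}

# GLIBCXX is listed before GLIBC so that the longer name is matched first.
_SYMBOL_VERSION = re.compile(br"(GLIBCXX|GLIBC|CXXABI)_(\d+(?:\.\d+)*)")


def too_new_versions(binary_data):
    """Return {name: highest_version} for symbol versions above LIMITS."""
    highest = {}
    for name, version in _SYMBOL_VERSION.findall(binary_data):
        key = name.decode("ascii")
        parsed = tuple(int(part) for part in version.decode("ascii").split("."))
        if parsed > highest.get(key, ()):
            highest[key] = parsed
    return {k: v for k, v in highest.items() if v > LIMITS[k]}


# Synthetic bytes standing in for the contents of a .so file.
fake_so = b"\x00GLIBC_2.14\x00GLIBCXX_3.4.21\x00CXXABI_1.3\x00GLIBC_2.17\x00"
print(too_new_versions(fake_so))
```

For a real file, read it in binary mode and pass the bytes to the function. An empty result means that no version string above the limits was found by this scan.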

UDTF-related issues

  • Symptom: The error message Semantic analysis exception - expect 2 aliases but have 0 is returned.

  • Cause: The output column names are not specified in the Python UDTF code.

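  • Solution: Specify the column names in the as clause of the SELECT statement that calls the Python UDTF. The number of aliases must match the number of columns that the UDTF returns. The following command provides an example for a UDTF that returns three columns.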

    select my_udtf(col0, col1) as (ret_col0, ret_col1, ret_col2) from tmp1;

UDAF-related issues

  • Symptom 1: The error message Script exception - ValueError: unmarshallable object is returned.

    • Cause: The buffer in the Python UDAF code is not a marshallable object, which means that the Python marshal module cannot serialize it.

    • Solution: When you assign a value to the buffer, make sure that the value is a marshallable object. For example, if you want to use two buffers of the LIST and DICT types in a Python UDAF, define the new_buffer method as return [list(), dict()]. When you use buffer/pbuffer in the iterate/merge/terminate method, the LIST-type buffer corresponds to buffer[0]/pbuffer[0], and the DICT-type buffer corresponds to buffer[1]/pbuffer[1]. If an element of the buffer is of the LIST or DICT type, every item in that element must also be a marshallable object.
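
You can test whether a buffer value is marshallable with the standard marshal module before you run the job. The following is a minimal sketch. The helper name is_marshallable and the class Point are hypothetical.

```python
import marshal


def is_marshallable(value):
    """True when the marshal module can serialize value."""
    try:
        marshal.dumps(value)
        return True
    except ValueError:
        return False


class Point(object):
    """A user-defined class; marshal cannot serialize its instances."""


good_buffer = [list(), dict()]          # the layout described above
bad_buffer = [list(), {"p": Point()}]   # a nested element is not marshallable

print(is_marshallable(good_buffer))
print(is_marshallable(bad_buffer))
```

Built-in types such as numbers, strings, lists, and dicts pass the check. Instances of user-defined classes do not.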

  • Symptom 2: The error message Python UDAF buffer size overflowed: 2821486749 is returned.

    • Cause: The size of the buffer in the Python UDAF exceeds 2 GB after it is serialized by marshal. This indicates that the buffer is used incorrectly, because the size of the buffer should not increase with the data volume.

    • Solution: Redesign the logic of the Python UDAF. The size of the buffer should not increase with the data volume. For example, if you declare a buffer as a list, you cannot continuously add data to the buffer during the iterate and merge phases. For more information about Python UDAFs, see UDAF overview.
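
A buffer that keeps running totals instead of the raw values stays the same size no matter how many rows are processed. The following sketch computes an average with a two-element buffer. Average is a plain class that mirrors the new_buffer, iterate, merge, and terminate methods so that it runs locally. In a real UDAF it would inherit from BaseUDAF and carry a function signature, both of which are omitted here.

```python
class Average(object):
    """Stand-in for a UDAF that keeps a fixed-size [sum, count] buffer."""

    def new_buffer(self):
        return [0.0, 0]

    def iterate(self, buffer, value):
        # Update the running totals; the buffer never grows.
        if value is not None:
            buffer[0] += value
            buffer[1] += 1

    def merge(self, buffer, pbuffer):
        # Combine a partial buffer from another worker into this one.
        buffer[0] += pbuffer[0]
        buffer[1] += pbuffer[1]

    def terminate(self, buffer):
        return buffer[0] / buffer[1] if buffer[1] else None


# Local simulation of two workers whose partial buffers are merged.
udaf = Average()
part1, part2 = udaf.new_buffer(), udaf.new_buffer()
for v in (1, 2, 3):
    udaf.iterate(part1, v)
for v in (4, None, 5):
    udaf.iterate(part2, v)
final = udaf.new_buffer()
udaf.merge(final, part1)
udaf.merge(final, part2)
print(udaf.terminate(final))
```

By contrast, a buffer that appends each input value to a list grows with the data volume and can reach the 2 GB limit.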