Other functions

更新时间:
复制 MD 格式

MaxCompute SQL provides other common functions for development. You can select functions as needed. This topic describes the command formats, parameters, and examples of functions such as CAST, FAILIF, and HASH.

Function

Feature

BETWEEN AND expression

Filters data that meets specified range conditions.

CASE WHEN expression

Returns different values based on the result of an expression.

CAST

Converts the result of an expression to a target data type.

COALESCE

Returns the first non-NULL value in a parameter list.

COMPRESS

Compresses an input parameter of the STRING or BINARY type using the GZIP algorithm.

CRC32

Calculates the cyclic redundancy check value of a string or binary data.

DECOMPRESS

Decompresses an input parameter of the BINARY type using the GZIP algorithm.

FAILIF

Returns true or a custom error message based on the result of an expression.

GET_IDCARD_AGE

Returns the current age based on a Chinese ID card number.

GET_IDCARD_BIRTHDAY

Returns the date of birth based on a Chinese ID card number.

GET_IDCARD_SEX

Returns the gender based on a Chinese ID card number.

GET_USER_ID

Obtains the ID of the current account.

HASH

Calculates a hash value based on input parameters.

IF

Checks whether a specified condition is true.

MAX_PT

Returns the maximum value of a level-1 partition in a partitioned table.

NULLIF

Compares two input parameters to check whether they are equal.

NVL

A result is returned if a parameter has a NULL value.

ORDINAL

Sorts input variables in ascending order and returns the value at a specified position.

PARTITION_EXISTS

Queries whether a specified partition exists.

SAMPLE

Samples all read column values and filters out rows that do not meet the sampling conditions.

SHA

Calculates the SHA-1 hash value of a string or binary data.

SHA1

Calculates the SHA-1 hash value of a string or binary data.

SHA2

Calculates the SHA-2 hash value of a string or binary data.

STACK

Splits a specified group of parameters into a specified number of rows.

STR_TO_MAP

Splits a string into key-value pairs based on specified separators.

TABLE_EXISTS

Queries whether a specified table exists.

TRANS_ARRAY

A user-defined table-valued function (UDTF) that transposes one row into multiple rows. It converts an array that is stored in a column and delimited by a fixed separator into multiple rows.

TRANS_COLS

A UDTF that transposes one row into multiple rows. It splits different columns into different rows.

UNIQUE_ID

Returns a random ID. This function is more efficient than the UUID function.

UUID

Returns a random ID.

BASE64

  • Command format

    string base64(binary <value>)
  • Description

    Converts the binary value to a BASE64-encoded string.

  • Parameters

    value: Required. A value of the BINARY type.

  • Return value

    Returns a value of the STRING type. If the input is NULL, NULL is returned.

  • Examples

    • Example 1: Convert the binary result of cast('alibaba' as binary) to a BASE64-encoded string.

      -- YWxpYmFiYQ== is returned.
      select base64(cast ('alibaba' as binary));
    • Example 2: The input parameter is NULL. The command is:

      -- NULL is returned.
      select base64(null);

BETWEEN AND expression

  • Command format

    <a> [NOT] between <b> and <c>
  • Description

    Filters data where the value of a is between b and c, or not between b and c.

  • Parameters

    • a: Required. The field to be filtered.

    • b, c: Required. The start and end of the specified range. The data types must be the same as the data type of a.

  • Return value

    Returns the data that meets the specified condition.

    If a, b, or c is NULL, the expression evaluates to NULL.

  • Examples

    The following data is in the emp table.

    | empno | ename | job | mgr | hiredate| sal| comm | deptno |
    7369,SMITH,CLERK,7902,1980-12-17 00:00:00,800,,20
    7499,ALLEN,SALESMAN,7698,1981-02-20 00:00:00,1600,300,30
    7521,WARD,SALESMAN,7698,1981-02-22 00:00:00,1250,500,30
    7566,JONES,MANAGER,7839,1981-04-02 00:00:00,2975,,20
    7654,MARTIN,SALESMAN,7698,1981-09-28 00:00:00,1250,1400,30
    7698,BLAKE,MANAGER,7839,1981-05-01 00:00:00,2850,,30
    7782,CLARK,MANAGER,7839,1981-06-09 00:00:00,2450,,10
    7788,SCOTT,ANALYST,7566,1987-04-19 00:00:00,3000,,20
    7839,KING,PRESIDENT,,1981-11-17 00:00:00,5000,,10
    7844,TURNER,SALESMAN,7698,1981-09-08 00:00:00,1500,0,30
    7876,ADAMS,CLERK,7788,1987-05-23 00:00:00,1100,,20
    7900,JAMES,CLERK,7698,1981-12-03 00:00:00,950,,30
    7902,FORD,ANALYST,7566,1981-12-03 00:00:00,3000,,20
    7934,MILLER,CLERK,7782,1982-01-23 00:00:00,1300,,10
    7948,JACCKA,CLERK,7782,1981-04-12 00:00:00,5000,,10
    7956,WELAN,CLERK,7649,1982-07-20 00:00:00,2450,,10
    7956,TEBAGE,CLERK,7748,1982-12-30 00:00:00,1300,,10

    The following example queries for data where the value of sal is greater than or equal to 1000 and less than or equal to 1500.

    select * from emp where sal between 1000 and 1500;

    The following result is returned.

    +-------+-------+-----+------------+------------+------------+------------+------------+
    | empno | ename | job | mgr        | hiredate   | sal        | comm       | deptno     |
    +-------+-------+-----+------------+------------+------------+------------+------------+
    | 7521  | WARD  | SALESMAN | 7698  | 1981-02-22 00:00:00 | 1250.0     | 500.0      | 30  |
    | 7654  | MARTIN | SALESMAN | 7698 | 1981-09-28 00:00:00 | 1250.0     | 1400.0     | 30 |
    | 7844  | TURNER | SALESMAN | 7698 | 1981-09-08 00:00:00 | 1500.0     | 0.0        | 30 |
    | 7876  | ADAMS | CLERK | 7788  | 1987-05-23 00:00:00 | 1100.0     | NULL     | 20   |
    | 7934  | MILLER | CLERK | 7782  | 1982-01-23 00:00:00 | 1300.0     | NULL      | 10  |
    | 7956  | TEBAGE | CLERK | 7748  | 1982-12-30 00:00:00 | 1300.0     | NULL      | 10  |
    +-------+-------+-----+------------+------------+------------+------------+------------+

CASE WHEN expression

  • Command format

    MaxCompute provides the following two case when formats:

    • case <value>
      when <value1> then <result1>
      when <value2> then <result2>
      ...
      else <resultn>
      end
    • case
      when (<_condition1>) then <result1>
      when (<_condition2>) then <result2>
      when (<_condition3>) then <result3>
      ...
      else <resultn>
      end
  • Description

    Returns different result values based on the evaluation of value or _condition.

  • Parameters

    • value: Required. The value to be compared.

    • _condition: Required. The condition to be evaluated.

    • result: Required. The return value.

  • Return value

    • If result contains only BIGINT or DOUBLE values, the function converts them to DOUBLE before returning the result.

    • If result contains a value of the STRING type, the function converts all values to the STRING type before returning the result. If a data type conversion is not supported, an error is returned. For example, data of the BOOLEAN type cannot be converted to the STRING type.

    • Conversions between other data types are not allowed.

  • Examples

    The sale_detail table contains the fields shop_name string, customer_id string, total_price double and the following data.

    +------------+-------------+-------------+------------+------------+
    | shop_name  | customer_id | total_price | sale_date  | region     |
    +------------+-------------+-------------+------------+------------+
    | s1         | c1          | 100.1       | 2013       | china      |
    | s2         | c2          | 100.2       | 2013       | china      |
    | s3         | c3          | 100.3       | 2013       | china      |
    | null       | c5          | NULL        | 2014       | shanghai   |
    | s6         | c6          | 100.4       | 2014       | shanghai   |
    | s7         | c7          | 100.5       | 2014       | shanghai   |
    +------------+-------------+-------------+------------+------------+

    The following command is an example.

    select 
    case  
    when region='china' then 'default_region'
    when region like 'shang%' then 'sh_region'
    end as region 
    from sale_detail;

    The following result is returned.

    +------------+
    | region     |
    +------------+
    | default_region |
    | default_region |
    | default_region |
    | sh_region  |
    | sh_region  |
    | sh_region  |
    +------------+

CAST

  • Command format

    cast(<expr> as <type>)
  • Description

    Converts the result of expr to the target data type type.

  • Parameters

    • expr: Required. The source data to be converted.

    • type: Required. The target data type.

      • cast(double as bigint): Converts a value of the DOUBLE data type to the BIGINT data type.

      • cast(string as bigint): When you convert a string to the BIGINT data type, if the string represents an integer, it is directly converted to BIGINT. If the string represents a floating-point number or a number in exponential notation, it is first converted to DOUBLE and then to BIGINT.

      • cast(string as datetime) or cast(datetime as string): The default date format yyyy-mm-dd hh:mi:ss is used.

  • Return value

    • The return value is of the target data type.

    • If you run the setproject odps.function.strictmode=false command, the function returns the digits that appear before any letters.

    • If you run the setproject odps.function.strictmode=true command, an error is returned.

    • When you convert a value to the DECIMAL type, if you set odps.sql.decimal.tostring.trimzero=true, trailing zeros after the decimal point are removed. If you set odps.sql.decimal.tostring.trimzero=false, trailing zeros after the decimal point are retained.

      Important

      The odps.sql.decimal.tostring.trimzero parameter applies to both data retrieved from tables and static values.

  • Examples

    • Example 1: Common usage.

      --Returns 1.
      select cast('1' as bigint);
    • Example 2: Convert a STRING value to a BOOLEAN value. If the STRING is empty, false is returned. Otherwise, true is returned.

      • The STRING is empty.

        select cast("" as boolean);
        --Returns
        +------+
        | _c0  |
        +------+
        | false |
        +------+
      • The STRING is not empty.

        select cast("false" as boolean);
        --Returns true
        +------+
        | _c0  |
        +------+
        | true |
        +------+
    • Example 3: Convert a string to a date.

      --Convert a string to a date.
      select cast("2022-12-20" as date);
      --Returns
      +------------+
      | _c0        |
      +------------+
      | 2022-12-20 |
      +------------+
      
      --Convert a date string with a time part to a date.
      select cast("2022-12-20 00:01:01" as date);
      --Returns
      +------------+
      | _c0        |
      +------------+
      | NULL       |
      +------------+
      --To ensure the value is displayed correctly, set the following parameter:
      set odps.sql.executionengine.enable.string.to.date.full.format= true;
      select cast("2022-12-20 00:01:01" as date);
      --Returns
      +------------+
      | _c0        |
      +------------+
      | 2022-12-20 |
      +------------+
      Note

      By default, the odps.sql.executionengine.enable.string.to.date.full.format parameter is false. To convert a date string that includes a time part, set this parameter to true.

    • Example 4 (Incorrect command example): Invalid usage. An exception is thrown if a conversion fails or is not supported. The following command is an example of incorrect usage.

      select cast('abc' as bigint);
    • Example 5: Example of a scenario in which setproject odps.function.strictmode=false is set.

      setprojectodps.function.strictmode=false;
      select cast('123abc'as bigint);
      --Returns
      +------------+
      |_c0|
      +------------+
      |123|
      +------------+
    • Example 6: Example of a scenario in which setproject odps.function.strictmode=true is set.

      setprojectodps.function.strictmode=true;
      select cast('123abc' as bigint);
      --Returns
      FAILED:ODPS-0130071:[0,0]Semanticanalysisexception-physicalplangenerationfailed:java.lang.NumberFormatException:ODPS-0123091:Illegaltypecast-Infunctioncast,value'123abc'cannotbecastedfromStringtoBigint.
    • Example 7: Example of a scenario in which odps.sql.decimal.tostring.trimzero is set.

      --Create a table.
      create table mf_dot (dcm1 decimal(38,18),
                           dcm2 decimal(38,18));
      --Insert data.
      insert into table mf_dot values (12.45500BD,12.3400BD);
      
      --When the flag is true or not set.
      set odps.sql.decimal.tostring.trimzero=true;
      --Remove trailing zeros after the decimal point.
      select cast(round(dcm1,3) as string),cast(round(dcm2,3) as string) from mf_dot;
      --Return value
      +------------+------------+
      | _c0        | _c1        |
      +------------+------------+
      | 12.455     | 12.34      |
      +------------+------------+
      
      --When the flag is false.
      set odps.sql.decimal.tostring.trimzero=false;
      --Keep trailing zeros after the decimal point.
      select cast(round(dcm1,3) as string),cast(round(dcm2,3) as string) from mf_dot;
      --Return value
      +------------+------------+
      | _c0        | _c1        |
      +------------+------------+
      | 12.455     | 12.340     |
      +------------+------------+
      
      --This parameter also applies to static values.
      set odps.sql.decimal.tostring.trimzero=false;
      select cast(round(12345.120BD,3) as string);
      --Returns:
      +------------+
      | _c0        |
      +------------+
      | 12345.120  |
      +------------+

COALESCE

  • Command format

    coalesce(<expr1>, <expr2>, ...)
  • Description

    Returns the first non-NULL value in <expr1>, <expr2>, ....

  • Parameters

    expr: Required. The value to be evaluated.

  • Return value

    The return value is of the same data type as the parameters.

  • Examples

    • Example 1: Common usage.

      -- 1 is returned.
      select coalesce(null,null,1,null,3,5,7);
    • Example 2: If a parameter's value type is undefined, an error is returned.

      • Incorrect command example

        -- The data type of the parameter abc is not defined. The system engine cannot recognize it, and an error is returned.
        select coalesce(null,null,1,null,abc,5,7);
      • Correct command example

        select coalesce(null,null,1,null,'abc',5,7);
    • Example 3: If all parameter values are NULL and data is not read from a table, an error is returned. The following is an incorrect command example.

      -- An error is returned, indicating that at least one parameter must not be NULL.
      select coalesce(null,null,null,null);
    • Example 4: Reading from a table returns NULL if all parameter values are empty.

      Source data table:

      +-----------+-------------+------------+
      | shop_name | customer_id | toal_price |
      +-----------+-------------+------------+
      | ad        | 10001       | 100.0      |
      | jk        | 10002       | 300.0      |
      | ad        | 10003       | 500.0      |
      | tt        | NULL        | NULL       |
      +-----------+-------------+------------+

      In the source data table, all values for tt are NULL. When you execute the following statement, NULL is returned.

      select coalesce(customer_id,total_price) from sale_detail where shop_name='tt';

COMPRESS

  • Command format

    binary compress(string <str>)
    binary compress(binary <bin>)
  • Description

    Compresses str or bin using the GZIP algorithm.

  • Parameters

    • str: Required. A value of the STRING type.

    • bin: Required. A value of the BINARY type.

  • Return value

    Returns a value of the BINARY type. If the input is NULL, NULL is returned.

  • Examples

    • Example 1: Compress the string hello using the GZIP algorithm.

      -- Returns: 1F=8B=08=00=00=00=00=00=00=03=CBH=CD=C9=C9=07=00=86=A6=106=05=00=00=00
      select compress('hello');
    • Example 2: The input parameter is an empty string.

      --Returns=1F=8B=08=00=00=00=00=00=00=03=03=00=00=00=00=00=00=00=00=00
      select compress('');
    • Example 3: The input parameter is NULL.

      -- NULL is returned.
      select compress(null);

CRC32

  • Command format

    bigint crc32(string|binary <expr>)
  • Description

    Calculates the cyclic redundancy check value of the string or binary expr.

  • Parameters

    expr: Required. A value of the STRING or BINARY type.

  • Return value

    Returns a value of the BIGINT type. The following rules apply:

    • If the input parameter is NULL, NULL is returned.

    • If the input parameter is an empty string, 0 is returned.

  • Examples

    • Example 1: Calculate the cyclic redundancy check value of the string ABC.

      -- 2743272264 is returned.
      select crc32('ABC');
    • Example 2: The input parameter is NULL.

      -- NULL is returned.
      select crc32(null);

DECODE

  • Command format

    decode(<expression>, <search>, <result>[, <search>, <result>]...[, <default>])
  • Description

    Implements the if-then-else branching logic.

  • Parameters

    • expression: Required. The expression to be compared.

    • search: Required. The search item to be compared with expression.

    • result: Required. The value that is returned if the value of search matches the value of expression.

    • default: Optional. If no search item matches, the default value is returned. If this parameter is not specified, NULL is returned.

    Note
    • All result values must have the same data type or be NULL. If the data types are inconsistent, an error is returned.

    • All search and expression values must have the same data type. Otherwise, an error is returned.

  • Return value

    • If a match is found, result is returned.

    • If no match is found, default is returned.

    • If default is not specified, NULL is returned.

    • If a duplicate search option exists and a match is found, the result corresponding to the first match is returned.

    • MaxCompute SQL typically returns NULL when it evaluates NULL=NULL. However, in this function, NULL is equal to NULL.

  • Examples

    The sale_detail table contains the fields shop_name string, customer_id string, total_price double and the following data.

    +------------+-------------+-------------+------------+------------+
    | shop_name  | customer_id | total_price | sale_date  | region     |
    +------------+-------------+-------------+------------+------------+
    | s1         | c1          | 100.1       | 2013       | china      |
    | s2         | c2          | 100.2       | 2013       | china      |
    | s3         | c3          | 100.3       | 2013       | china      |
    | null       | c5          | NULL        | 2014       | shanghai   |
    | s6         | c6          | 100.4       | 2014       | shanghai   |
    | s7         | c7          | 100.5       | 2014       | shanghai   |
    +------------+-------------+-------------+------------+------------+

    The following command is an example.

    -- If the value of customer_id is c1, Taobao is returned. If the value is c2, Alipay is returned. If the value is c3, Aliyun is returned. If the value is NULL, N/A is returned. Otherwise, Others is returned.
    select
    decode(customer_id,
    'c1', 'Taobao',
    'c2', 'Alipay',
    'c3', 'Aliyun',
    Null, 'N/A',
    'Others') as result
    from sale_detail;
    -- This is equivalent to the following statement.
    if customer_id = c1 then
    result := 'Taobao';
    elsif customer_id = c2 then
    result := 'Alipay';
    elsif customer_id = c3 then
    result := 'Aliyun';
    ...
    else
    result := 'Others';
    end if;

    The following is the result.

    +------------+
    | result     |
    +------------+
    | Others     |
    | Others     |
    | Others     |
    | Taobao     |
    | Alipay     |
    | Aliyun     |
    +------------+

DECOMPRESS

  • Command format

    binary decompress(binary <bin>)
  • Description

    Decompresses bin using the GZIP algorithm.

  • Parameters

    bin: Required. A value of the BINARY type.

  • Return value

    Returns a value of the BINARY type. If the input is NULL, NULL is returned.

  • Examples

    • Example 1: Decompress the compressed result of the string hello, world and convert it to a string.

      -- hello, world is returned.
      select cast(decompress(compress('hello, world')) as string);
    • Example 2: The input parameter is NULL.

      -- NULL is returned.
      select decompress(null);

GET_IDCARD_AGE

  • Command format

    get_idcard_age(<idcardno>)
  • Description

    Returns the current age based on a Chinese ID card number. The age is the difference between the current year and the birth year in the ID card number.

  • Parameters

    idcardno: Required. A 15- or 18-digit Chinese ID card number of the STRING type. The function validates the ID card number based on the province code and the last check digit. If the validation fails, NULL is returned.

  • Return value

    Returns a value of the BIGINT type. If the input is NULL, NULL is returned.

GET_IDCARD_BIRTHDAY

  • Command format

    get_idcard_birthday(<idcardno>)
  • Description

    Returns the date of birth based on a Chinese ID card number.

  • Parameters

    idcardno: Required. A 15- or 18-digit Chinese ID card number of the STRING type. The function validates the ID card number based on the province code and the last check digit. If the validation fails, NULL is returned.

  • Return value

    Returns a value of the DATETIME type. If the input is NULL, NULL is returned.

GET_IDCARD_SEX

  • Command format

    get_idcard_sex(<idcardno>)
  • Description

    Returns the gender based on a Chinese ID card number. The value is M (male) or F (female).

  • Parameters

    idcardno: Required. A 15- or 18-digit Chinese ID card number of the STRING type. The function validates the ID card number based on the province code and the last check digit. If the validation fails, NULL is returned.

  • Return value

    Returns a value of the STRING type. If the input is NULL, NULL is returned.

GET_USER_ID

  • Command format

    get_user_id()
  • Description

    Obtains the ID of the current account, which is the user ID or UID.

  • Parameters

    No input parameters are required.

  • Return value

    Returns the ID of the current account.

  • Examples

    select get_user_id();
    -- The following result is returned.
    +------------+
    | _c0        |
    +------------+
    | 1117xxxxxxxx8519 |
    +------------+

GREATEST

  • Command format

    greatest(<var1>, <var2>[,...])
  • Description

    Returns the maximum value among the input parameters.

  • Parameters

    var1, var2: Required. A value of the BIGINT, DOUBLE, DECIMAL, DATETIME, or STRING type.

  • Return value

    • Returns the maximum value among the input parameters. If no implicit conversion exists, the return value is of the same data type as the input parameters.

    • NULL is the minimum value.

    • If the input parameters have different data types, they are promoted to a common data type for comparison. For example, a mix of DOUBLE, BIGINT, DECIMAL, and STRING values results in a DOUBLE return type. A mix of STRING and DATETIME values results in a DATETIME return type. Other implicit conversions are not allowed.

    • When set odps.sql.hive.compatible=true; is set, if any input parameter is NULL, NULL is returned.

HASH

  • Command format

    • If the MaxCompute project is in Hive-compatible mode, the command format is as follows.

      INT HASH(<value1>, <value2>[, ...]);
    • If the MaxCompute project is not in Hive-compatible mode, the command format is as follows.

      BIGINT HASH(<value1>, <value2>[, ...]);
  • Description

    Performs a hash operation on value1 and value2 to obtain a hash value.

  • Parameters

    value1, value2: Required. The parameters from which to calculate the hash value. The parameters can be of different data types. The supported data types vary between Hive-compatible and non-Hive-compatible modes:

    • Hive-compatible mode: TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, BOOLEAN, STRING, CHAR, VARCHAR, DATETIME, and DATE.

    • Non-Hive-compatible mode: BIGINT, DOUBLE, BOOLEAN, STRING, and DATETIME.

    Note

    For two identical inputs, the returned hash values are always the same. However, two identical hash values do not guarantee that the input values are the same. A hash collision may occur.

  • Return value

    Returns a value of the INT or BIGINT type. If an input parameter is an empty string or NULL, 0 is returned.

  • Examples

    • Example 1: Calculate the hash value of input parameters of the same data type.

      -- 66 is returned.
      SELECT HASH(0L, 2L, 4L);
    • Example 2: Calculate the hash value of input parameters of different data types.

      -- 97 is returned.
      SELECT HASH(0L, 'a');
    • Example 3: An input parameter is an empty string or NULL.

      -- 0 is returned.
      SELECT HASH(0L, null);
      -- 0 is returned.
      SELECT HASH(0L, '');

IF

  • Command format

    if(<testCondition>, <valueTrue>, <valueFalseOrNull>)
  • Description

    Checks whether testCondition is true. If it is true, the value of valueTrue is returned. Otherwise, the value of valueFalseOrNull is returned.

  • Parameters

    • testCondition: Required. The expression to be evaluated. It must be of the BOOLEAN type.

    • valueTrue: Required. The value that is returned if the testCondition expression is True.

    • valueFalseOrNull: The value that is returned if the testCondition expression is False. This can be set to NULL.

  • Return value

    The return value is of the same data type as the valueTrue or valueFalseOrNull parameters.

  • Examples

    -- 200 is returned.
    select if(1=2, 100, 200); 

LEAST

  • Command format

    least(<var1>, <var2>[,...])
  • Description

    Returns the minimum value among the input parameters.

  • Parameters

    var: Required. An input parameter. It can be of the BIGINT, DOUBLE, DECIMAL, DATETIME, or STRING type.

  • Return value

    • The minimum value among the input parameters. If no implicit conversion exists, the return value is of the same data type as the input parameters.

    • When an implicit type conversion occurs, the resulting type is DOUBLE for conversions among DOUBLE, BIGINT, and STRING, DATETIME for conversions between STRING and DATETIME, and DECIMAL for conversions among DECIMAL, DOUBLE, BIGINT, and STRING. No other implicit type conversions are allowed.

    • NULL is the minimum value.

    • If all parameter values are NULL, NULL is returned.

  • Examples

    -- 2 is returned.
    select least(5, 2, 7);

MAX_PT

  • Command format

    MAX_PT(<table_full_name>)
  • Description

    Returns the maximum value of the level-1 partitions that contain data in a partitioned table. The partitions are sorted alphabetically.

  • Notes

    • The MAX_PT function can also be implemented using standard SQL. SELECT * FROM table WHERE pt=MAX_PT("table"); can be rewritten as SELECT * FROM table WHERE pt = (SELECT MAX(pt) FROM table);.

      Note

      MaxCompute does not provide the MIN_PT function. You cannot use the SQL statement SELECT * FROM table WHERE pt=MIN_PT("table"); to perform an operation analogous to MAX_PT. However, you can use the standard SQL statement SELECT * FROM table WHERE pt= (SELECT MIN(pt) FROM table); to achieve the same result.

    • If all partitions in the table are empty, the MAX_PT function fails. Make sure that at least one partition contains data.

    • OSS foreign tables also support the MAX_PT function. The behavior is the same as that for internal tables.

  • Parameters

    table_full_name: Required. The name of the table. It is of the STRING type. You must have read permission on the table.

  • Return value

    Returns the value of the largest level-1 partition.

    Note

    If you create a partition using only the ALTER TABLE statement and the partition does not contain any data, this partition is not returned.

  • Examples

    • Example 1: The tbl table is a partitioned table that contains the partitions 20120901 and 20120902. Both partitions contain data. In the following statement, MAX_PT returns '20120902'. The MaxCompute SQL statement reads data from the partition where pt='20120902'.

      SELECT * FROM tbl WHERE pt= MAX_PT('tbl');
      -- This is equivalent to the following statement.
      SELECT * FROM tbl WHERE pt= (SELECT MAX(pt) FROM tbl);
    • Example 2: In a multi-level partitioning scenario, use standard SQL to obtain data from the largest partition.

      SELECT * FROM table WHERE pt1 = (SELECT MAX(pt1) FROM table) AND pt2 = (SELECT MAX(pt2) FROM table WHERE pt1= (SELECT MAX(pt1) FROM table));

NULLIF

  • Command format

    T nullif(T <expr1>, T <expr2>)
  • Description

    Compares the values of expr1 and expr2. If they are equal, NULL is returned. Otherwise, expr1 is returned.

  • Parameters

    expr1, expr2: Required. An expression of any type. T represents the input data type, which can be any data type supported by MaxCompute.

  • Return value

    Returns NULL or expr1.

  • Examples

    -- 2 is returned.
    select nullif(2, 3);
    -- NULL is returned.
    select nullif(2, 2);
    -- 3 is returned.
    select nullif(3, null);

NVL

  • Command format

    nvl(T <value>, T <default_value>)
  • Description

    If the value of value is NULL, default_value is returned. Otherwise, value is returned. The two parameters must be of the same data type.

  • Parameters

    • value: Required. An input parameter. T represents the input data type, which can be any data type supported by MaxCompute.

    • default_value: Required. The replacement value. It must be of the same data type as value.

  • Examples

    The t_data table has three columns: c1 string, c2 bigint, and c3 datetime. The table contains the following data.

    +----+------------+------------+
    | c1 | c2 | c3 |
    +----+------------+------------+
    | NULL | 20 | 2017-11-13 05:00:00 |
    | ddd | 25 | NULL |
    | bbb | NULL | 2017-11-12 08:00:00 |
    | aaa | 23 | 2017-11-11 00:00:00 |
    +----+------------+------------+

    Use the nvl function to replace NULL values in c1 with 00000, NULL values in c2 with 0, and NULL values in c3 with -.

    select nvl(c1,'00000'),nvl(c2,0),nvl(c3,'-') from nvl_test;
    -- The following result is returned.
    +-----+------------+-----+
    | _c0 | _c1 | _c2 |
    +-----+------------+-----+
    | 00000 | 20 | 2017-11-13 05:00:00 |
    | ddd | 25 | - |
    | bbb | 0 | 2017-11-12 08:00:00 |
    | aaa | 23 | 2017-11-11 00:00:00 |
    +-----+------------+-----+

ORDINAL

  • Command format

    ordinal(bigint <nth>, <var1>, <var2>[,...])
  • Description

    Sorts the input variables in ascending order and returns the value at the nth position.

  • Parameters

    • nth: Required. The position number, starting from 1. It is of the BIGINT type. If nth is NULL, NULL is returned.

    • var: Required. The value to be sorted. It can be of the BIGINT, DOUBLE, DATETIME, or STRING type.

  • Return value

    • The value at the nth position. If no implicit conversion exists, the return value is of the same data type as the input parameters.

    • If the input parameters have different data types, they are promoted to a common data type. The return value is of this common data type. For example, a mix of DOUBLE, BIGINT, and STRING values results in a DOUBLE return type. A mix of STRING and DATETIME values results in a DATETIME return type. Other implicit conversions are not allowed.

    • NULL is the minimum value.

  • Examples

    -- 3 is returned.
    SELECT ordinal(3, 1, 3, 7, 5, 2, 4, 6); 

PARTITION_EXISTS

  • Command format

    boolean partition_exists(string <table_name>, string... <partitions>)
  • Description

    Checks whether a specified partition exists.

  • Parameters

    • table_name: Required. The name of the table. It is of the STRING type. You can specify the project name in the table name, for example, my_proj.my_table. If you do not specify a project name, the current project is used by default.

    • partitions: Required. The name of the partition. It is of the STRING type. Specify the partition values in the order of the partition key columns. The number of partition values must be the same as the number of partition key columns.

  • Return value

    Returns a value of the BOOLEAN type. If the specified partition exists, True is returned. Otherwise, False is returned.

  • Examples

    -- Create a partitioned table named foo.
    create table foo (id bigint) partitioned by (ds string, hr string);
    -- Add a partition to the foo table.
    alter table foo add partition (ds='20190101', hr='1');
    -- Query whether the partition ds='20190101' and hr='1' exists. The result is True.
    select partition_exists('foo', '20190101', '1');

SAMPLE

  • Command format

    boolean sample(<x>, <y>, [<column_name1>, <column_name2>[,...]])
  • Description

    Samples data based on the values read from the column_name columns and the settings for x and y, filtering out rows that do not meet the sampling conditions.

  • Parameters

    • x, y: x is required. It must be a BIGINT constant greater than 0. This parameter specifies that the data is hashed into x parts and the yth part is selected.

      y is optional. If it is omitted, the first part is selected by default. If you omit the y parameter, you must also omit the column_name parameter.

      If x or y is of another data type or is less than or equal to 0, an error occurs. If y is greater than x, an error also occurs. If x or y is NULL, NULL is returned.

    • column_name: Optional. The column based on which to sample. If this parameter is omitted, random sampling is performed based on the values of x and y. The column can be of any data type and its value can be NULL. No implicit type conversion is performed. If column_name is a constant NULL, an error is returned.

      Note
      • To prevent data skew caused by NULL values, NULL values in the column_name column are evenly hashed into x parts. If you do not specify column_name, the output may not be uniform when the data volume is small. In this case, specify column_name to obtain a more uniform sample.

      • Currently, random sampling is supported only for columns of the following data types: bigint, datetime, boolean, double, string, binary, char, and varchar.

  • Return value

    Returns a value of the BOOLEAN type.

  • Examples

    A table named tbla exists and contains a column named cola.

    -- This indicates that the values are hashed into four parts based on the cola column, and the first part is taken. The return value is True.
    select * from tbla where sample (4, 1 , cola);
    -- This indicates that each row of data is randomly hashed into four parts, and the second part is taken. The return value is True.
    select * from tbla where sample (4, 2);

SHA

  • Command format

    string sha(string|binary <expr>)
  • Description

    Calculates the SHA-1 hash value of the string or binary expr and returns the value as a hexadecimal string.

  • Parameters

    expr: Required. A value of the STRING or BINARY type.

  • Return value

    Returns a value of the STRING type. If the input is NULL, NULL is returned.

  • Examples

    • Example 1: Calculate the SHA hash value of the string ABC.

      -- 3c01bdbb26f358bab27f267924aa2c9a03fcfdb8 is returned.
      select sha('ABC');
    • Example 2: The input parameter is NULL.

      -- NULL is returned.
      select sha(null);

SHA1

  • Command format

    string sha1(string|binary <expr>)
  • Description

    Calculates the SHA-1 hash value of the string or binary expr and returns the value as a hexadecimal string.

  • Parameters

    expr: Required. A value of the STRING or BINARY type.

  • Return value

    Returns a value of the STRING type. If the input is NULL, NULL is returned.

  • Examples

    • Example 1: Calculate the SHA-1 hash value of the string ABC.

      -- 3c01bdbb26f358bab27f267924aa2c9a03fcfdb8 is returned.
      select sha1('ABC');
    • Example 2: The input parameter is NULL.

      -- NULL is returned.
      select sha1(null);

SHA2

  • Command format

    string sha2(string|binary <expr>, bigint <number>)
  • Description

    Calculates the SHA-2 hash value of the string or binary expr and returns the value in the specified number format.

  • Parameters

    • expr: Required. A value of the STRING or BINARY type.

    • number: Required. A value of the BIGINT type. The hash bit length. Valid values: 224, 256, 384, 512, and 0 (equivalent to 256).

  • Return value

    Returns a value of the STRING type. The following rules apply:

    • If any input parameter is NULL, NULL is returned.

    • If the value of number is not within the valid range, NULL is returned.

  • Examples

    • Example 1: Calculate the SHA-2 hash value of the string ABC.

      -- b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78 is returned.
      select sha2('ABC', 256);
    • Example 2: An input parameter is NULL.

      -- NULL is returned.
      select sha2('ABC', null);

STACK

  • Command format

    stack(n, expr1, ..., exprk) 
  • Description

    Splits expr1, ..., exprk into n rows. Unless otherwise specified, the output result uses the default column names col0, col1....

  • Parameters

    • n: Required. The number of rows to split into.

    • expr: Required. Specifies the parameters to be split, such as expr1, ..., exprk. The parameters must be integers, and the total number of parameters must be an integer multiple of n to be split into n complete rows. An error is returned if these conditions are not met.

  • Return value

    Returns a dataset with n rows. The number of columns is the quotient of the number of parameters divided by n.

  • Examples

    -- Arrange 1, 2, 3, 4, 5, 6 into 3 rows.
    select stack(3, 1, 2, 3, 4, 5, 6);
    -- The following result is returned.
    +------+------+
    | col0 | col1 |
    +------+------+
    | 1    | 2    |
    | 3    | 4    |
    | 5    | 6    |
    +------+------+
    
    -- Arrange 'A',10,date '2015-01-01','B',20,date '2016-01-01' into two rows.
    select stack(2,'A',10,date '2015-01-01','B',20,date '2016-01-01') as (col0,col1,col2);
    -- The following result is returned.
    +------+------+------+
    | col0 | col1 | col2 |
    +------+------+------+
    | A    | 10   | 2015-01-01 |
    | B    | 20   | 2016-01-01 |
    +------+------+------+
    
    -- Arrange a, b, c, d into two rows. If the source table has multiple rows, the stack function is executed row by row.
    select stack(2,a,b,c,d) as (col,value)
    from values 
        (1,1,2,3,4),
        (2,5,6,7,8),
        (3,9,10,11,12),
        (4,13,14,15,null)
    as t(key,a,b,c,d);
    -- The following result is returned.
    +------+-------+
    | col  | value |
    +------+-------+
    | 1    | 2     |
    | 3    | 4     |
    | 5    | 6     |
    | 7    | 8     |
    | 9    | 10    |
    | 11   | 12    |
    | 13   | 14    |
    | 15   | NULL  |
    +------+-------+
    
    -- Use with lateral view.
    select tf.* from (select 0) t lateral view stack(2,'A',10,date '2015-01-01','B',20, date '2016-01-01') tf as col0,col1,col2;
    -- The following result is returned.
    +------+------+------+
    | col0 | col1 | col2 |
    +------+------+------+
    | A    | 10   | 2015-01-01 |
    | B    | 20   | 2016-01-01 |
    +------+------+------+

STR_TO_MAP

  • Command format

    str_to_map([string <mapDupKeyPolicy>,] <text> [, <delimiter1> [, <delimiter2>]])
  • Description

    Splits text into key-value pairs using delimiter1, and then splits each key-value pair into a key and a value using delimiter2.

  • Parameters

    • mapDupKeyPolicy: optional. A value of the STRING type. This parameter specifies the method that is used to process duplicate keys. Valid values:

      • exception: An error is returned.

      • last_win: The latter key overwrites the former key.

      You can also specify the odps.sql.map.key.dedup.policy parameter at the session level to configure the method that is used to process duplicate keys. For example, you can set odps.sql.map.key.dedup.policy to exception. If you do not specify this parameter, the default value last_win is used.

      Note

      The behavior implementation of MaxCompute is determined based on mapDupKeyPolicy. If you do not specify mapDupKeyPolicy, the value of odps.sql.map.key.dedup.policy is used.

    • text: Required. A value of the STRING type. The string to be split.

    • delimiter1: Optional. A separator of the STRING type. If this parameter is not specified, the default separator is a comma (,).

    • delimiter2: Optional. A separator of the STRING type. If this parameter is not specified, the default separator is an equal sign (=).

      Note

      If the separator is a regular expression or a special character, you must add two backslashes (\\) before it for escaping. Special characters include colons (:), periods (.), question marks (?), plus signs (+), and asterisks (*).

  • Return value

    Returns a value of the map<string, string> type. The return value is the result of splitting text using delimiter1 and delimiter2.

  • Examples

    -- {test1:1, test2:2} is returned.
    select str_to_map('test1&1-test2&2','-','&');
    -- {test1:1, test2:2} is returned.
    select str_to_map("test1.1,test2.2", ",", "\\.");
    -- {test1:1, test2:3} is returned.
    select str_to_map("test1.1,test2.2,test2.3", ",", "\\.");

TABLE_EXISTS

  • Command format

    boolean table_exists(string <table_name>)
  • Description

    Checks whether a specified table exists.

  • Parameters

    table_name: Required. The name of the table. It is of the STRING type. You can specify the project name in the table name, for example, my_proj.my_table. If you do not specify a project name, the current project is used by default.

  • Return value

    Returns a value of the BOOLEAN type. If the specified table exists, True is returned. Otherwise, False is returned.

  • Examples

    -- Use in a select list.
    select if(table_exists('abd'), col1, col2) from src;

TRANS_ARRAY

  • Limits

    • All columns that are used as keys must be placed first, and the columns to be transposed must be placed after the key columns.

    • A select statement can contain only one user-defined table-valued function (UDTF) and cannot contain other columns.

    • This function cannot be used with group by, cluster by, distribute by, or sort by.

  • Command format

    trans_array (<num_keys>, <separator>, <key1>,<key2>,…,<col1>,<col2>,<col3>) as (<key1>,<key2>,...,<col1>, <col2>)
  • Description

    A UDTF that transposes one row into multiple rows. It converts an array that is stored in a column and delimited by a fixed separator into multiple rows.

  • Parameters

    • num_keys: Required. A BIGINT constant that must be >=0. This parameter specifies the number of columns to be used as transposition keys when converting to multiple rows.

    • separator: Required. A STRING constant that is used as a separator to split a string into multiple elements. If this parameter is an empty string, an error is returned.

    • keys: Required. The columns to be used as keys for transposition. The number of key columns is specified by num_keys. If num_keys specifies that all columns are used as keys (that is, num_keys is equal to the total number of columns), only one row is returned.

    • cols: Required. The arrays to be transposed into rows. All columns after the keys columns are considered arrays to be transposed. These columns must be of the STRING type and store arrays in string format, such as Hangzhou;Beijing;Shanghai, which is an array delimited by semicolons (;).

  • Return value

    Returns the transposed rows. The new column names are specified by as. The data types of the key columns remain unchanged. All other columns are of the STRING type. The number of resulting rows is determined by the array with the most elements. Arrays with fewer elements are padded with NULL.

  • Examples

    • Example 1: The t_table table contains the following data.

      +----------+----------+------------+
      | login_id | login_ip | login_time |
      +----------+----------+------------+
      | wangwangA | 192.168.0.1,192.168.0.2 | 20120101010000,20120102010000 |
      | wangwangB | 192.168.45.10,192.168.67.22,192.168.6.3 | 20120111010000,20120112010000,20120223080000 |
      +----------+----------+------------+
      -- Execute the SQL statement.
      select trans_array(1, ",", login_id, login_ip, login_time) as (login_id,login_ip,login_time) from t_table;
      -- The following result is returned.
      +----------+----------+------------+
      | login_id | login_ip | login_time |
      +----------+----------+------------+
      | wangwangB | 192.168.45.10 | 20120111010000 |
      | wangwangB | 192.168.67.22 | 20120112010000 |
      | wangwangB | 192.168.6.3 | 20120223080000 |
      | wangwangA | 192.168.0.1 | 20120101010000 |
      | wangwangA | 192.168.0.2 | 20120102010000 |
      +----------+----------+------------+
      
      -- If the table contains the following data.
      Login_id LOGIN_IP LOGIN_TIME 
      wangwangA 192.168.0.1,192.168.0.2 20120101010000
      -- The arrays with fewer elements are padded with NULL. 
      Login_id Login_ip Login_time 
      wangwangA 192.168.0.1 20120101010000
      wangwangA 192.168.0.2 NULL
    • Example 2: The mf_fun_array_test_t table contains the following data.

      +------------+------------+------------+------------+
      | id         | name       | login_ip   | login_time |
      +------------+------------+------------+------------+
      | 1          | Tom        | 192.168.100.1,192.168.100.2 | 20211101010101,20211101010102 |
      | 2          | Jerry      | 192.168.100.3,192.168.100.4 | 20211101010103,20211101010104 |
      +------------+------------+------------+------------+
      
      -- Use two keys, id and name, for array transposition. Execute the SQL statement.
      select trans_array(2, ",", Id,Name, login_ip, login_time) as (Id,Name,login_ip,login_time) from mf_fun_array_test_t;
      -- The following result is returned. The data has been split and grouped by the keys id and name.
      +------------+------------+------------+------------+
      | id         | name       | login_ip   | login_time |
      +------------+------------+------------+------------+
      | 1          | Tom        | 192.168.100.1 | 20211101010101 |
      | 1          | Tom        | 192.168.100.2 | 20211101010102 |
      | 2          | Jerry      | 192.168.100.3 | 20211101010103 |
      | 2          | Jerry      | 192.168.100.4 | 20211101010104 |
      +------------+------------+------------+------------+

TRANS_COLS

  • Limits

    • All columns that are used as keys must be placed first, and the columns to be transposed must be placed after the key columns.

    • A select statement can contain only one UDTF and cannot contain other columns.

  • Command format

    trans_cols (<num_keys>, <key1>,<key2>,…,<col1>, <col2>,<col3>) as (<idx>, <key1>,<key2>,…,<col1>, <col2>)
  • Description

    A UDTF that transposes one row into multiple rows. It splits different columns into different rows.

  • Parameters

    • num_keys: Required. A BIGINT constant that must be >=0. This parameter specifies the number of columns to be used as transposition keys when converting to multiple rows.

    • keys: Required. The columns to be used as keys for transposition. The number of key columns is specified by num_keys. If num_keys specifies that all columns are used as keys (that is, num_keys is equal to the total number of columns), only one row is returned.

    • idx: Required. Specifies the index of the transposed row.

    • cols: Required. The columns to be transposed into rows.

  • Return value

    Returns the transposed rows. The new column names are specified by the as clause. The output includes the key columns, an index column for the transposed row (starting from 1), and a value column containing the data from the transposed columns. The data types of the key columns remain unchanged. The value column will have a common supertype of the transposed input columns.

  • Examples

    The t_table table contains the following data.

    +----------+----------+------------+
    | Login_id | Login_ip1 | Login_ip2 |
    +----------+----------+------------+
    | wangwangA | 192.168.0.1 | 192.168.0.2 |
    +----------+----------+------------+
    -- Execute the SQL statement.
    select trans_cols(1, login_id, login_ip1, login_ip2) as (idx, login_id, login_ip) from t_table;
    -- The following result is returned.
    idx    login_id    login_ip
    1    wangwangA    192.168.0.1
    2    wangwangA    192.168.0.2

UNBASE64

  • Command format

    binary unbase64(string <str>)
  • Description

    Converts the BASE64-encoded string str to its binary representation.

  • Parameters

    str: Required. A value of the STRING type. The BASE64-encoded string to be converted.

  • Return value

    Returns a value of the BINARY type. If the input is NULL, NULL is returned.

  • Examples

    • Example 1: Convert the string YWxpYmFiYQ== to its binary value.

      -- alibaba is returned.
      select unbase64('YWxpYmFiYQ==');
    • Example 2: The input parameter is NULL.

      -- NULL is returned.
      select unbase64(null);

UNIQUE_ID

  • Command format

    string unique_id()
  • Description

    Returns a random unique ID. An example is 29347a88-1e57-41ae-bb68-a9edbdd9****_1. This function is more efficient than the UUID function. The returned ID is longer than that returned by the UUID function and contains an underscore (_) and a number, such as _1.

UUID

  • Command format

    string uuid()
  • Description

    Returns a random ID. An example is 29347a88-1e57-41ae-bb68-a9edbdd9****.

    Note

    UUID returns a random ID. The probability of duplication is very low.