Dataphin currently utilizes MaxCompute 2.0 as its compute engine. This topic outlines the data types that MaxCompute 2.0 supports within Dataphin.
Limitations
The data type system version 2.0 is intended for:
MaxCompute projects created after April 2020 that contain no pre-existing data.
Environments where all dependent product components are also compatible with the version 2.0 type system.
Basic data types
Type | Constant example | Description |
TINYINT | 1Y and -127Y | An 8-bit signed integer type. Valid values range from -128 to 127. |
SMALLINT | 32767S and -100S | A 16-bit signed integer type. Valid values range from -32768 to 32767. |
INT | 1000 and -15645787 | A 32-bit signed integer type. Valid values range from -231 to 231-1. |
BIGINT | 100000000000L and -1L | A 64-bit signed integer type. Valid values range from -263+1 to 263 -1. |
BINARY |
| A binary number with a maximum length of 8 MB. Note
|
FLOAT | 3.14F and cast(3.14159261E+7 as float) | A 32-bit binary floating-point type. Note FLOAT data type calculations may result in precision loss due to computer storage and internal computing logic. For high precision requirements, consider converting FLOAT data to DECIMAL data. |
DOUBLE | 3.14D and 3.14159261E+7 | A 64-bit binary floating-point type. Note DOUBLE data type calculations may result in precision loss due to computer storage and internal computing logic. For high precision requirements, consider converting DOUBLE data to DECIMAL data. |
DECIMAL(precision,scale) | 3.5BD and 99999999999.9999999BD | A precise numeric type based on the decimal system.
The default is Note
|
VARCHAR(n) | None | A variable-length character type where n specifies the length. Valid values: 1 to 65535. |
CHAR(n) | None | The fixed-length character type is defined by n, which specifies its length, with a maximum of 255 characters. When the length is insufficient, spaces are added for padding; however, these spaces are not considered during comparisons. |
STRING | "abc", 'bcd', "alibaba", and 'inc' | The string type with a maximum length of 8 MB. |
DATE | DATE'2017-11-11' | The date type formatted as Valid values span from 0001-01-01 to 9999-12-31. |
DATETIME | DATETIME'2017-11-11 00:00:00' | The DATETIME type. Valid values range from 0001-01-01 00:00:00.000 to 9999-12-31 23:59:59.999, accurate to the millisecond. |
TIMESTAMP | TIMESTAMP'2017-11-11 00:00:00.123456789' | The TIMESTAMP data type. Valid values span from 0001-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999, accurate to the nanosecond. Note The timestamp is time zone agnostic. It stores a date offset from Epoch (UTC 1970-01-01 00:00:00) that is consistent across all time zones. TIMESTAMP data can be manipulated for time zone calculations using built-in functions, such as converting TIMESTAMP to STRING data based on the current time zone with |
TIMESTAMP_NTZ | TIMESTAMP_NTZ '2017-11-11 00:00:00.123456789' | A TIMESTAMP data type that does not consider time zones. Valid values range from 0000-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999. For more details on the TIMESTAMP_NTZ data type, see the MaxCompute TIMESTAMP_NTZ data type documentation. |
BOOLEAN | True and False | The BOOLEAN type. Valid values: True and False. |
Please consider the following points:
All data types are nullable.
The INT keyword in an SQL statement refers to the 32-bit integer type.
-- Convert a into a 32-bit integer. CAST(a AS INT)By default, an integer constant is processed as the INT type. For example, integer constant 1 in
SELECT 1 + a;is processed as the INT type. If a constant exceeds the value range of the INT type but does not exceed the value range of the BIGINT type, the constant is processed as the BIGINT type. If the constant exceeds the value range of the BIGINT type, the constant is processed as the DOUBLE type.Implicit conversions
Certain implicit conversions (e.g.,
STRINGtoBIGINT,STRINGtoDATETIME,DOUBLEtoBIGINT,DECIMALtoDOUBLE,DECIMALtoBIGINT) are disabled to prevent potential data loss or errors. Use theCASTfunction for explicit conversions.Constants of the VARCHAR type can be implicitly converted into the STRING type.
Tables, Functions, and UDFs
Built-in functions that use 2.0 data types operate correctly in the 2.0 data type system.
UDFs are resolved and overloaded based on the 2.0 data type system.
Partition columns support
STRING,VARCHAR,CHAR,TINYINT,SMALLINT,INT, andBIGINTdata types.
MaxCompute automatically concatenates multiple adjacent string literals. For example,
select 'abc' 'efg' 'ddt';returnsabcefgddt.When inserting a constant into a
DECIMALfield, the constant's format must match its definition. For example:INSERT INTO test_tb(a) VALUES (3.5BD)DATETIME values do not include milliseconds. To specify the time format with millisecond precision, use the
-dfpparameter in the Tunnel command, as intunnel upload -dfp 'yyyy-MM-dd HH:mm:ss.SSS'. For more information, see Tunnel command documentation.
Complex data types
Data type | Definition | Constructor |
ARRAY |
|
|
MAP |
|
|
STRUCT |
|
|