This feature is in private preview. To apply, complete the Trial Application Form. The MaxCompute team reviews applications within 3 business days and sends results by SMS. For questions about the result, contact us through the Application Link.
Overview
MaxCompute stores Apache Iceberg tables in Object Storage Service (OSS) with unified metadata, permission, and lifecycle management. Iceberg tables are compatible with open-source engines such as Spark, Flink, Trino, and Presto, enabling multi-engine data sharing for a Lakehouse architecture.
The open Iceberg format eliminates data silos and enables seamless data sharing with Spark, Flink, and Hive. You can choose the right compute engine for each workload without vendor lock-in. The open standard also ensures long-term data portability and supports cross-organizational collaboration in multi-cloud scenarios.
MaxCompute-managed Iceberg tables offer the following features:
- Iceberg lake table data management: Create, read, and write Iceberg tables on OSS using MaxCompute SQL. DDL, DML, time travel, schema evolution, and partition evolution are supported.
- Automated background maintenance: Built-in optimization handles compaction, snapshot expiration, and metadata archiving automatically, so no manual maintenance is required.
- Metadata consistency: MaxCompute Metastore (MaxMeta) manages table metadata with ACID-compliant writes. All writes must go through the MaxCompute Storage Write API or a MaxCompute-native engine, which prevents metadata conflicts from multiple write sources.
- Open ecosystem integration: Trino, Presto, Spark, Flink, and Doris can access managed Iceberg tables natively, enabling low-latency, high-concurrency multi-engine queries.
- Unified permission management: Define permissions for your data lake and data warehouse in a single system. RAM permissions are integrated with data warehouse access control for enterprise-grade Lakehouse security.
Usage notes
The following operations on a MaxCompute-managed Iceberg table can cause data loss or make data unreadable:
| Forbidden operation | Consequence | Correct practice |
| --- | --- | --- |
| Modifying Iceberg table data through any interface other than MaxCompute write APIs. | The table may fail consistency checks and become unreadable. | Modify data only through MaxCompute write interfaces, such as MaxCompute SQL. |
| Uploading files to the managed OSS path. | The background service treats uploaded files as orphan files and deletes them. | Do not upload data to the managed path. |
| Sharing the same OSS path across multiple managed Iceberg tables. | Each table's garbage collection process deletes the other table's files. | Use a unique OSS path for each table. |
| Using OSS paths with a parent-child directory relationship for different managed Iceberg tables. | Each table's garbage collection treats the other table's files as orphan data and deletes them, causing data loss. | Use a unique OSS path for each managed Iceberg table. Avoid paths with parent-child relationships. |
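The path rules in the table above can be illustrated with a minimal sketch that reuses the CREATE ICEBERG TABLE syntax from this page. The table names, columns, and the lake/ prefix are hypothetical, and the placeholders in angle brackets must be replaced with your own values. Each table gets its own sibling directory, so neither path is a parent or child of the other.

```sql
-- Hypothetical table 1: owns only the lake/orders/ directory.
CREATE ICEBERG TABLE IF NOT EXISTS orders_iceberg (
  order_id bigint COMMENT 'Order ID',
  dt string COMMENT 'Partition field'
)
PARTITIONED BY (dt)
WITH CONNECTION <connection name>
OPTIONS(
  location='oss://<oss bucket>/lake/orders/'
);

-- Hypothetical table 2: owns only the sibling lake/customers/ directory.
CREATE ICEBERG TABLE IF NOT EXISTS customers_iceberg (
  customer_id bigint COMMENT 'Customer ID',
  dt string COMMENT 'Partition field'
)
PARTITIONED BY (dt)
WITH CONNECTION <connection name>
OPTIONS(
  location='oss://<oss bucket>/lake/customers/'
);

-- Avoid: giving one table 'oss://<oss bucket>/lake/' and another
-- 'oss://<oss bucket>/lake/orders/' (parent-child relationship),
-- or giving both tables the same location.
```

Keeping one directory per table under a common prefix makes it easy to confirm by inspection that no two managed tables overlap.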
Create a table
Syntax
CREATE ICEBERG TABLE [IF NOT EXISTS] <table name> (
<col_name> <data_type>,
...
)
PARTITIONED BY (<partitionExpression>)
WITH CONNECTION <connection name>
OPTIONS(
location='<oss_location>'
)
;
Parameters
Example
CREATE ICEBERG TABLE mc_iceberg_table (
id bigint COMMENT 'Unique user ID',
name string COMMENT 'User name',
age bigint COMMENT 'User age',
gender string COMMENT 'User gender',
height float COMMENT 'User height',
birthday date COMMENT 'User date of birth',
phone_number string COMMENT 'User phone number',
email string COMMENT 'User email address',
address string COMMENT 'User address',
salary decimal(18, 2) COMMENT 'User salary',
create_time timestamp COMMENT 'Time when user information was created',
update_time timestamp COMMENT 'Time when user information was last updated',
is_deleted boolean COMMENT 'Flag indicating if the user information has been deleted',
dt string COMMENT 'Partition field'
)
PARTITIONED BY (dt)
WITH CONNECTION <connection name>
OPTIONS(
location='oss://<oss bucket>/Demo-iceberg/'
);
Writing data
- MaxCompute write syntax: Syntax description.
- Example:

SET odps.sql.type.system.odps2=true;
SET odps.sql.decimal.odps2=true;

INSERT INTO mc_iceberg_table VALUES
(1, 'Zhang San', 18, 'Male', cast(178.56 as float), DATE '1990-01-01', '13800000000', 'zhangsan@example.com', 'Haidian District, Beijing', 5000.00, TIMESTAMP '2023-04-19 11:32:00', TIMESTAMP '2023-04-19 11:32:00', false, '20260402'),
(2, 'Li Si', 20, 'Female', cast(162.70 as float), DATE '1992-02-02', '13900000000', 'lisi@example.com', 'Pudong New Area, Shanghai', 6000.00, TIMESTAMP '2023-04-19 11:32:00', TIMESTAMP '2023-04-19 11:32:00', false, '20260401'),
(3, 'Wang Wu', 22, 'Male', cast(185.21 as float), DATE '1994-03-03', '14000000000', 'wangwu@example.com', 'Nanshan District, Shenzhen', 7000.00, TIMESTAMP '2023-04-19 11:32:00', TIMESTAMP '2023-04-19 11:32:00', false, '20260403');
Querying and analyzing data
- SELECT syntax: Syntax description.
- Example:

SELECT * FROM mc_iceberg_table;

-- The following result is returned:
+----+-----------+-----+--------+--------+------------+--------------+----------------------+----------------------------+--------+---------------------+---------------------+------------+----------+
| id | name      | age | gender | height | birthday   | phone_number | email                | address                    | salary | create_time         | update_time         | is_deleted | dt       |
+----+-----------+-----+--------+--------+------------+--------------+----------------------+----------------------------+--------+---------------------+---------------------+------------+----------+
| 1  | Zhang San | 18  | Male   | 178.56 | 1990-01-01 | 13800000000  | zhangsan@example.com | Haidian District, Beijing  | 5000   | 2023-04-19 03:32:00 | 2023-04-19 03:32:00 | false      | 20260402 |
| 2  | Li Si     | 20  | Female | 162.7  | 1992-02-02 | 13900000000  | lisi@example.com     | Pudong New Area, Shanghai  | 6000   | 2023-04-19 03:32:00 | 2023-04-19 03:32:00 | false      | 20260401 |
| 3  | Wang Wu   | 22  | Male   | 185.21 | 1994-03-03 | 14000000000  | wangwu@example.com   | Nanshan District, Shenzhen | 7000   | 2023-04-19 03:32:00 | 2023-04-19 03:32:00 | false      | 20260403 |
+----+-----------+-----+--------+--------+------------+--------------+----------------------+----------------------------+--------+---------------------+---------------------+------------+----------+
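Because the example table is partitioned by dt, a query can filter on that column to limit the partitions it reads. The following is a minimal sketch against the sample data inserted earlier. Whether the filter is actually applied as partition pruning depends on that feature being enabled for your project, as noted in the Limitations section.

```sql
-- Read a few columns from a single partition of the example table.
-- With the three sample rows above, only the row for Zhang San (id 1)
-- has dt = '20260402'.
SELECT id, name, salary
FROM mc_iceberg_table
WHERE dt = '20260402';
```

Selecting only the needed columns and filtering on the partition column is generally cheaper than SELECT * over the whole table.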
Iceberg external vs. managed tables
| Dimension | Iceberg external table | MaxCompute-managed Iceberg table |
| --- | --- | --- |
| CREATE TABLE statement |  | CREATE ICEBERG TABLE (see Create a table) |
| Table lifecycle | MaxCompute only maps the table. | MaxCompute manages the full lifecycle of metadata and OSS data. The specified OSS directory must be empty at creation. |
| Metadata and data governance | Same as standard external tables. | High-performance metadata caching. |
| Data openness (read/write by open-source engines) | Consistency guaranteed by the Iceberg format. | MaxCompute guarantees read/write consistency. Open-source engines read from OSS directly or use the MaxCompute Storage API. |
| Lake table maintenance | Maintained by the user. | Automatic background maintenance: compaction, snapshot expiration, and metadata archiving. |
Supported data types
| Iceberg data type | MaxCompute data type | Read/write support |
| --- | --- | --- |
| Types.BooleanType | BOOLEAN |  |
| Types.IntegerType | INT |  |
| Types.LongType | BIGINT |  |
| Types.FloatType | FLOAT |  |
| Types.DoubleType | DOUBLE |  |
| Types.DecimalType | DECIMAL(precision, scale) |  |
| Types.DateType | DATE |  |
| Types.TimeType | BIGINT |  |
| Types.TimestampType | TIMESTAMP_NTZ |  |
| Types.TimestampType_z | TIMESTAMP |  |
| Types.StringType | STRING |  |
| Types.UUIDType | BINARY |  |
| Types.FixedType | BINARY |  |
| Types.BinaryType | BINARY |  |
| TypeID.STRUCT | STRUCT |  |
| TypeID.LIST | ARRAY |  |
| TypeID.MAP | MAP |  |
| N/A | TINYINT, SMALLINT, VARCHAR(n), CHAR(n), DATETIME, JSON | N/A |
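As a rough illustration of the mapping above, the following sketch declares a managed Iceberg table using several of the listed MaxCompute types. The table name, column names, and OSS directory are hypothetical, and this page does not state which types are fully supported for both reads and writes, so treat the column list as an assumption to verify in your own project.

```sql
-- Enable the MaxCompute 2.0 type system, as in the write example above.
SET odps.sql.type.system.odps2=true;

-- Hypothetical table that uses only MaxCompute types with an Iceberg mapping.
CREATE ICEBERG TABLE IF NOT EXISTS mc_iceberg_types_demo (
  event_id bigint COMMENT 'Maps to Types.LongType',
  is_valid boolean COMMENT 'Maps to Types.BooleanType',
  amount decimal(18, 2) COMMENT 'Maps to Types.DecimalType',
  event_date date COMMENT 'Maps to Types.DateType',
  event_ts_ntz timestamp_ntz COMMENT 'Maps to Types.TimestampType',
  payload binary COMMENT 'Maps to Types.BinaryType',
  tags array<string> COMMENT 'Maps to TypeID.LIST',
  attrs map<string, bigint> COMMENT 'Maps to TypeID.MAP',
  dt string COMMENT 'Partition field'
)
PARTITIONED BY (dt)
WITH CONNECTION <connection name>
OPTIONS(
  location='oss://<oss bucket>/Demo-iceberg-types/'
);
```

Types in the last table row, such as TINYINT, SMALLINT, VARCHAR(n), CHAR(n), DATETIME, and JSON, have no Iceberg counterpart listed and are left out of the sketch.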
Limitations
- Version limitations
  - Only features compatible with Iceberg SDK 1.6.1 are supported.
  - Reading and writing are supported only for Iceberg table format v2.
  - Features of the Iceberg v3 format are not supported.
- Supported operations
  - MaxCompute data types: Data types (Version 1.0) and Data types (Version 2.0).
  - Schema evolution currently supports only adding and dropping columns.
  - Partition pruning is supported. If it is not enabled for your project, submit feedback through the Application Link or join the MaxCompute Developer Community DingTalk group (group ID: 11782920).
- Unsupported operations
  - For MaxCompute-managed Iceberg tables:
    - The XCOPY Cross-Region Replication statement is not supported.
    - The CLONE TABLE and RENAME TABLE statements are not supported.
    - Statements to rename table snapshots are not supported.
    - Setting default values for columns is not supported.
    - Modifying the data type of a column is not supported.
    - Materialized view operations are not supported.
    - The UPDATE statement is not supported.
    - Row-level access control is not supported.
    - The TRUNCATE statement is not supported.
    - CDC (private preview) is not supported.
    - Cross-region disaster recovery is not supported.
    - Local backup is not supported.
    - The CREATE OR REPLACE statement is not supported.
    - The LOAD and UNLOAD commands are not supported.
- Console queries and API responses display the storage size of managed Iceberg tables as 0 bytes.
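The schema evolution support listed under Supported operations (adding and dropping columns) might look like the following sketch. It assumes that the standard MaxCompute ALTER TABLE ... ADD COLUMNS and DROP COLUMNS statements apply unchanged to managed Iceberg tables, which this page does not confirm. The column name loyalty_level is hypothetical.

```sql
-- Add a new column to the example table (hypothetical column name).
ALTER TABLE mc_iceberg_table
  ADD COLUMNS (loyalty_level string COMMENT 'User loyalty tier');

-- Drop the same column again.
ALTER TABLE mc_iceberg_table
  DROP COLUMNS loyalty_level;

-- Not supported according to the list above: changing a column's data type
-- or setting a column default value.
```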