The Storage API (also called Open Storage) gives third-party compute engines direct read access to MaxCompute's underlying storage. The Storage API provides an efficient, low-latency, and secure method for data access. Engines such as Apache Spark, StarRocks, Presto, and Apache Flink connect through a connector, bypassing data exports and reducing access latency.

How it works
A third-party compute engine connects to MaxCompute through a connector that calls the Storage API. The API exposes MaxCompute tables with standard table semantics, so engines read columnar data directly from storage without moving data out of MaxCompute.
Key capabilities:
High throughput: Supports efficient columnar reads, predicate pushdown, and the Apache Arrow data format.
Secure and user-friendly: Enforces project isolation, access control, and data encryption at the storage layer — without exposing storage internals to the caller.
Ecosystem integration: Spark on EMR and StarRocks support dedicated connectors for direct MaxCompute access.
Use cases
The Storage API suits workloads that require multi-engine access to the same data without duplication:
Cross-engine analytics: Run Spark, StarRocks, or Presto queries directly on MaxCompute tables without exporting data first.
Flexible framework switching: Switch between compute frameworks for different processing needs while keeping data in one place.
Limitations
Supported and unsupported table types
Third-party engines can read the following table types through the Storage API:
Standard tables
Partitioned tables
Clustered tables
Delta Tables
Materialized views
The following table types are not supported:
External tables
Logical views
Unsupported data types
Reading data of the JSON type is not supported.
Throughput limits (pay-as-you-go)
| Limit | Value |
|---|---|
| Concurrent requests per tenant | 1,000 |
| Transmission rate per concurrent request | 10 MB/s |
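The two limits above bound how fast a pay-as-you-go read can finish. The following Java sketch estimates a best-case transfer time from them. It assumes every request sustains the full 10 MB/s and ignores protocol overhead. The class and method names are hypothetical and are not part of any MaxCompute SDK.

```java
final class StorageApiThroughput {
    // Limits taken from the pay-as-you-go table above.
    static final int MAX_CONCURRENT_REQUESTS = 1_000;
    static final double MB_PER_SECOND_PER_REQUEST = 10.0;

    /**
     * Best-case transfer time in seconds for dataMb megabytes,
     * assuming each concurrent request runs at the full per-request rate.
     * Requests beyond the per-tenant limit are capped at the limit.
     */
    static double minTransferSeconds(double dataMb, int concurrentRequests) {
        if (dataMb < 0 || concurrentRequests < 1) {
            throw new IllegalArgumentException("dataMb must be >= 0 and concurrentRequests >= 1");
        }
        int effective = Math.min(concurrentRequests, MAX_CONCURRENT_REQUESTS);
        return dataMb / (effective * MB_PER_SECOND_PER_REQUEST);
    }

    public static void main(String[] args) {
        // 1 TB, taken here as 1,048,576 MB, read with 100 concurrent requests.
        System.out.println(minTransferSeconds(1_048_576, 100) + " seconds (best case)");
    }
}
```

Actual transfer times depend on the engine, the network, and the data layout, so treat the result as a lower bound rather than a prediction.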
Data transmission resources
To run data transmission tasks, use exclusive resource groups for Data Transmission Service (DTS). DTS resource groups use the subscription billing method — charges are based on the number of concurrent instances purchased.
| Resource | Billing | Supported regions | Setup guide |
|---|---|---|---|
| Subscription fees for exclusive data transmission resources | Subscription: charged per concurrent instance | | Purchase and use an exclusive resource group for Data Transmission Service |

The pay-as-you-go Storage API resource is billed by usage. Each tenant receives a free monthly quota of 1 TB for data reads and writes. After the free quota is used up, you are charged for the logical size of the data that is read or written.
To monitor usage, go to the Resource Observation page. For more information, see Resource Observation and pay-as-you-go Storage API resources.
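The free-quota rule can be expressed as simple arithmetic. The following Java sketch computes how many bytes of a month's reads and writes fall outside the 1 TB free quota. It assumes 1 TB means 2^40 bytes, and it does not apply a price because this page does not state one. The class and method names are hypothetical.

```java
final class FreeQuota {
    // Assumption: 1 TB is interpreted as 2^40 bytes.
    static final long FREE_BYTES_PER_MONTH = 1L << 40;

    /** Returns the logical bytes that exceed the free monthly quota. */
    static long billableBytes(long logicalBytesThisMonth) {
        if (logicalBytesThisMonth < 0) {
            throw new IllegalArgumentException("usage must not be negative");
        }
        return Math.max(0L, logicalBytesThisMonth - FREE_BYTES_PER_MONTH);
    }

    public static void main(String[] args) {
        long used = 3L << 39; // 1.5 TB of logical reads and writes
        System.out.println(billableBytes(used) + " bytes are billable");
    }
}
```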
Use the pay-as-you-go Storage API
Enable the Storage API Switch.
Log on to the MaxCompute console, and select a region in the upper-left corner.
In the navigation pane on the left, choose .
On the Tenants page, click the Tenant Property tab.
On the Tenant Property tab, turn on the Storage API Switch.
The name of the pay-as-you-go Storage API resource is pay-as-you-go. For more information, see Tenant Properties.
Grant permissions
By default, no accounts, including Alibaba Cloud accounts or roles, have the permissions to specify a quota at the job level. You must grant the required permissions.
Add a role.
Log on to the MaxCompute console, and select a region in the upper-left corner.
In the navigation pane on the left, choose .
On the Tenants page, click the Roles tab.
On the Roles tab, click Add Role. In the Add Role dialog box, enter a Role Name and Policy Content, and then click OK to create the role.
    {
        "Statement": [
            {
                "Action": ["odps:List", "odps:Usage"],
                "Effect": "Allow",
                "Resource": ["acs:odps:*:regions/*/quotas/pay-as-you-go"]
            }
        ],
        "Version": "1"
    }

Parameter description:
Action: The operation permission that you want to grant. You can specify multiple operations in a single authorization statement. If you specify multiple operations, separate them with commas (,). For more information about the valid values, see MaxCompute permissions. For more information about the parameters in a policy document, see Basic elements of an access policy.
Resource: The scope of the authorized resource. The format is ["acs:odps:Tenant/${tenant_id}:regions/${region_id}/quotas/${quota_name}"]. For example, ["acs:odps:*:regions/*/quotas/pay-as-you-go"] specifies the pay-as-you-go quota for the Storage API in all regions of the current tenant.
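To make the Resource format concrete, the following Java sketch assembles both forms of the string: one scoped to a specific tenant and region, and the wildcard form used in the policy above. The helper class is hypothetical, and the tenant ID and region ID in main are placeholder values, not real identifiers.

```java
final class QuotaResource {
    /** Builds acs:odps:Tenant/<tenant_id>:regions/<region_id>/quotas/<quota_name>. */
    static String forTenant(String tenantId, String regionId, String quotaName) {
        return "acs:odps:Tenant/" + tenantId + ":regions/" + regionId + "/quotas/" + quotaName;
    }

    /** Builds the wildcard form that covers all regions of the current tenant. */
    static String forAllRegions(String quotaName) {
        return "acs:odps:*:regions/*/quotas/" + quotaName;
    }

    public static void main(String[] args) {
        // "1234567890" and "cn-hangzhou" are placeholders for illustration only.
        System.out.println(forTenant("1234567890", "cn-hangzhou", "pay-as-you-go"));
        System.out.println(forAllRegions("pay-as-you-go"));
    }
}
```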
Grant the role to the account that you use to specify a quota at the job level.
By default, an Alibaba Cloud account or a RAM user who has the Super_Administrator role at the tenant level can grant permissions.
The procedure depends on the type of account that you want to authorize.
Grant permissions to an Alibaba Cloud account.
Run the following command to grant permissions to an Alibaba Cloud account.
    -- Add an Alibaba Cloud account to the tenant and grant a role to the Alibaba Cloud account.
    Add tenant user <Aliyun$xxxx>;
    Grant tenant role <role_name> to user <Aliyun$xxxx>;
    -- View the permissions of a role or user in the tenant.
    Show grants for tenant role <role_name>;
    Show grants for tenant user <user_name>;
    Show principals for tenant [role] <role_name>;

Grant permissions to a RAM user.
Log on to the MaxCompute console, and select a region in the upper-left corner.
In the navigation pane on the left, choose .
On the Tenants page, click the Users tab.
In the Edit Role dialog box, select the roles to assign to the user from the Available Roles area, move them to the Added Roles area, and then click OK.
Use pay-as-you-go Storage API resources
When a third-party engine accesses MaxCompute, you can set the quota name to pay-as-you-go. The following example uses the Java SDK.

    // The AccessKey ID and AccessKey secret of your Alibaba Cloud account or RAM user.
    // An AccessKey pair of an Alibaba Cloud account has permissions to call all API operations,
    // so using it directly is a high-risk operation. We recommend that you create a RAM user
    // in the RAM console to call API operations or perform routine O&M.
    // In this example, the AccessKey pair is read from environment variables. You can also
    // store it in a configuration file. Do not hard-code the AccessKey pair in your code.
    private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
    private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");

    // The name of the pay-as-you-go quota that is used to access MaxCompute.
    String quotaName = "pay-as-you-go";
    // The name of the MaxCompute project.
    String project = "<project>";
    // The endpoint of the MaxCompute service. Only VPCs are supported.
    String endpoint = "<endpoint>";

    // Create an Odps object to connect to the MaxCompute service.
    Account account = new AliyunAccount(accessId, accessKey);
    Odps odps = new Odps(account);
    odps.setDefaultProject(project);
    odps.setEndpoint(endpoint);

    Credentials credentials = Credentials.newBuilder()
            .withAccount(odps.getAccount())
            .withAppAccount(odps.getAppAccount())
            .build();
    EnvironmentSettings settings = EnvironmentSettings.newBuilder()
            .withCredentials(credentials)
            .withServiceEndpoint(odps.getEndpoint())
            .withQuotaName(quotaName)
            .build();

Note: Configure an endpoint based on the region and network connectivity that you selected when you created the MaxCompute project. For more information about the endpoints for each region and network type, see Endpoints.
Arrow data type mapping
The Storage API uses Apache Arrow as the in-memory data format for transmission. Data written through the Storage API is not processed — duplicate keys in MAP columns are retained as-is.
TIMESTAMP and INTERVAL_DAY_TIME types are subject to precision loss. TIMESTAMP values beyond the Arrow TimestampType range have the high-precision part truncated. INTERVAL_DAY_TIME (nanosecond precision) is truncated to milliseconds.
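The INTERVAL_DAY_TIME truncation can be illustrated with the JDK alone. The following Java sketch uses java.time.Duration as a stand-in for an interval value and drops everything below millisecond precision. It only models the precision loss described above and is not how the Storage API is implemented. The class name is hypothetical, and the sketch covers only non-negative intervals because this page does not describe how negative values are handled.

```java
import java.time.Duration;
import java.time.temporal.ChronoUnit;

final class IntervalTruncation {
    /** Drops the sub-millisecond part of a non-negative interval. */
    static Duration truncateToMillis(Duration interval) {
        if (interval.isNegative()) {
            throw new IllegalArgumentException("only non-negative intervals are modeled here");
        }
        return interval.truncatedTo(ChronoUnit.MILLIS);
    }

    public static void main(String[] args) {
        // 1 second plus 123,456,789 nanoseconds.
        Duration original = Duration.ofSeconds(1, 123_456_789);
        Duration transmitted = truncateToMillis(original);
        System.out.println("nanoseconds lost: " + original.minus(transmitted).toNanos());
    }
}
```

If sub-millisecond precision matters downstream, one option is to store the value in a column type that keeps it, such as BIGINT nanoseconds. That is a design suggestion rather than something this page prescribes.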
The following table maps MaxCompute types to their Arrow equivalents.
| MaxCompute type | Arrow type | Notes |
|---|---|---|
| TINYINT | Int8Type | |
| SMALLINT | Int16Type | |
| INT | Int32Type | |
| BIGINT | Int64Type | |
| FLOAT | FloatType | |
| DOUBLE | DoubleType | |
| BOOLEAN | BooleanType | |
| DECIMAL | Decimal128Type | Read: converted to decimal(38,18); overflow throws an exception. Write: Arrow decimal(precision,scale) maps to DECIMAL(38,18); precision and scale must match. |
| DECIMAL(precision, scale) | Decimal128Type | |
| STRING | StringType | |
| BINARY | BinaryType | |
| VARCHAR | StringType | |
| CHAR | StringType | |
| DATETIME | TimestampType | Time unit: milliseconds. Timezone: UTC. |
| TIMESTAMP | TimestampType | Time unit: nanoseconds. Timezone: UTC. Supports a wider value range; data beyond the Arrow precision range is truncated. |
| DATE | Date32Type | |
| INTERVAL_DAY_TIME | DayTimeIntervalType | MaxCompute precision: nanoseconds. Arrow precision: milliseconds. The nanosecond part is truncated on read and write. |
| INTERVAL_YEAR_MONTH | MonthIntervalType | |
| ARRAY | ListType | |
| MAP | MapType | Duplicate keys are retained on write. On query, the SQL engine applies "last write wins" (example: writing {'a': 1, 'b': 2, 'a': 3} returns {'a': 3, 'b': 2}). |
| STRUCT | StructType | |
| JSON | StringType | Not supported: Reading JSON data through the Storage API is not supported. |
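The "last write wins" behavior noted for MAP columns can be mimicked in plain Java. The following sketch replays written entries into a LinkedHashMap, where a later put for an existing key overwrites the earlier value. It models the query-time result from the table's example and does not reflect the SQL engine's internals. The class name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MapLastWriteWins {
    /** Resolves duplicate keys by keeping the value written last. */
    static <K, V> Map<K, V> resolve(List<Map.Entry<K, V>> writtenEntries) {
        Map<K, V> resolved = new LinkedHashMap<>();
        for (Map.Entry<K, V> entry : writtenEntries) {
            // put() replaces the value if the key is already present.
            resolved.put(entry.getKey(), entry.getValue());
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Entries as written: {'a': 1, 'b': 2, 'a': 3}
        List<Map.Entry<String, Integer>> written =
                List.of(Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        System.out.println(resolve(written)); // prints {a=3, b=2}
    }
}
```

To avoid relying on this behavior, deduplicate keys before writing so that the stored data matches what queries return.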
What's next
Access MaxCompute using a connector:
Access MaxCompute using an SDK: