External Catalog
External Catalogs let AnalyticDB for MySQL query external data sources directly with zero ETL—no data import or migration is required. You can plug in Apache Hive, Apache Iceberg, Delta Lake, and Apache Paimon data lakes, plus any MySQL-compatible database (RDS for MySQL, StarRocks, or another AnalyticDB for MySQL instance). The result is transparent cross-source queries, lakehouse analytics, and on-demand data ingestion.
Supported catalog types
|
Type |
Source |
Cluster version requirement |
Metastore |
Supported operations |
|
Paimon Catalog |
Apache Paimon |
3.2.7 |
|
Query, write, DDL (from 3.2.8) |
|
Hive Catalog |
Apache Hive |
3.2.8 |
HMS |
Query, write, DDL |
|
Iceberg Catalog |
Apache Iceberg |
3.2.8 |
HMS |
Query, write, DDL |
|
Delta Catalog |
Delta Lake |
3.2.8 |
HMS |
Query only |
|
MySQL Catalog |
RDS for MySQL, AnalyticDB for MySQL, StarRocks, or any MySQL-compatible database |
3.2.8 |
N/A (direct JDBC) |
Query only |
Prerequisites
-
An AnalyticDB for MySQL cluster is running on a kernel version that meets the cluster version requirement of the catalog type you plan to use.
-
ENI network access is enabled.
-
The OSS bucket is in the same region as the cluster (VPCs in the same region can access OSS over the internal network by default).
-
When using HMS: the cluster and HMS are in the same VPC, and the HMS security group allows traffic from the cluster's ENI CIDR range.
-
When using DLF 2.5: the cluster's VPC is on the DLF allowlist, and the service-linked role
AliyunServiceRoleForAnalyticDBForMySQLhas been granted permissions on the target Catalog in the DLF console under Data Permissions (select the preset permission type Data Editor). -
When using MySQL Catalog: the target database's security group or allowlist permits traffic from the cluster's ENI CIDR range.
Important notes
CREATE EXTERNAL CATALOG validates only SQL syntax and parameter format. It does not verify connectivity to the underlying source. Even if the OSS path is missing or the Metastore endpoint is wrong, the statement still returns success—errors surface only when you query through the catalog. After creation, immediately run SHOW DATABASES FROM <catalog_name> to confirm connectivity.
Hive Catalog
Hive Catalog manages and queries Apache Hive data. Available since AnalyticDB for MySQL 3.2.8; HMS only.
Syntax
CREATE EXTERNAL CATALOG <catalog_name>
PROPERTIES
(
"type" = "hive",
"metastore-uri" = "<hms_thrift_uri>"
);
|
Property |
Required |
Description |
|
|
Yes |
Catalog type. Must be |
|
|
Yes |
The Hive Metastore Thrift endpoint, in the form |
Example
CREATE EXTERNAL CATALOG hive_hms
PROPERTIES
(
"type" = "hive",
"metastore-uri" = "thrift://hms-host:9083"
);
DDL and DML
From AnalyticDB for MySQL 3.2.8 onward, Hive Catalog supports the following:
-- Create database
CREATE DATABASE IF NOT EXISTS <catalog_name>.<db_name>;
-- Create table (PARQUET only)
CREATE TABLE IF NOT EXISTS <catalog_name>.<db_name>.<table_name> (
<col1> <data_type>,
<col2> <data_type>
)
STORED AS PARQUET;
-- Insert
INSERT INTO <catalog_name>.<db_name>.<table_name> VALUES (...);
-- Bulk insert
INSERT INTO <catalog_name>.<db_name>.<table_name>
SELECT * FROM <source_catalog>.<source_db>.<source_table>;
-- CTAS
CREATE TABLE <catalog_name>.<db_name>.<new_table>
AS SELECT * FROM <source_catalog>.<source_db>.<source_table>;
-- Drop
DROP TABLE IF EXISTS <catalog_name>.<db_name>.<table_name>;
DROP DATABASE IF EXISTS <catalog_name>.<db_name>;
Iceberg Catalog
Iceberg Catalog manages and queries Apache Iceberg data. Available since AnalyticDB for MySQL 3.2.8; HMS only.
Syntax
CREATE EXTERNAL CATALOG <catalog_name>
PROPERTIES
(
"type" = "iceberg",
"metastore-uri" = "thrift://<host>:<port>"
);
Property semantics are the same as Hive Catalog; just change type to iceberg.
Example
CREATE EXTERNAL CATALOG iceberg_hms
PROPERTIES
(
"type" = "iceberg",
"metastore-uri" = "thrift://hms-host:9083"
);
DDL and DML
From 3.2.8 onward, the supported DDL and DML are identical to Hive Catalog. Use STORED AS ICEBERG when creating a table:
CREATE TABLE IF NOT EXISTS <catalog_name>.<db_name>.<table_name> (
<col1> <data_type>,
<col2> <data_type>
)
STORED AS ICEBERG;
Delta Catalog
Delta Catalog queries Delta Lake-formatted data. Available since AnalyticDB for MySQL 3.2.8; HMS only; query only—writes and DDL are not supported.
Syntax
CREATE EXTERNAL CATALOG <catalog_name>
PROPERTIES
(
"type" = "delta",
"metastore-uri" = "thrift://<host>:<port>"
);
Property semantics are the same as Hive Catalog; just change type to delta.
Example
CREATE EXTERNAL CATALOG delta_hms
PROPERTIES
(
"type" = "delta",
"metastore-uri" = "thrift://hms-host:9083"
);
Paimon Catalog
Paimon Catalog manages and queries Apache Paimon data. Available since AnalyticDB for MySQL 3.2.7. Compared with the other types, Paimon Catalog supports more metastores: FileSystem, HMS, and DLF 2.5. Write and DDL operations are supported from 3.2.8.
Supported metastores
|
Metastore |
Description |
|
FileSystem |
Paimon writes metadata directly into the warehouse directory. No separate metastore is required. Best for lightweight setups. |
|
DLF 2.5 |
Alibaba Cloud Data Lake Formation (DLF 2.5) as a unified metadata service. Recommended for production. |
|
HMS |
Hive Metastore. Best for environments with an existing Hadoop ecosystem. |
FileSystem (no separate metastore)
Paimon writes its metadata directly into the warehouse directory. Best for lightweight setups.
CREATE EXTERNAL CATALOG <catalog_name>
PROPERTIES
(
"type" = "paimon",
"paimon.catalog.type" = "paimon",
"paimon.catalog.metastore" = "filesystem",
"paimon.catalog.warehouse" = "oss://<bucket>/<path>/",
-- Optional tuning properties
"paimon.enable_raw_delete_vector" = "true",
"paimon.enable_metadata_cache" = "true",
"paimon.enable_bucket_join" = "true",
"paimon.meta_cache_expired_time" = "1800"
);
-- Example
CREATE EXTERNAL CATALOG paimon_oss
PROPERTIES
(
"type" = "paimon",
"paimon.catalog.type" = "paimon",
"paimon.catalog.metastore" = "filesystem",
"paimon.catalog.warehouse" = "oss://my-bucket/paimon/warehouse/"
);
DLF 2.5 (recommended for production)
Use Alibaba Cloud Data Lake Formation (DLF 2.5) as the unified metastore.
DLF 1.0 is not supported. Use DLF 2.5 or later.
Prerequisites:
-
DLF 2.5 is enabled and a target Catalog has been created in the DLF console under Data Catalogs. Its name is what you fill into
paimon.catalog.warehouse. -
In the DLF console under Data Permissions, grant the AnalyticDB for MySQL service-linked role
AliyunServiceRoleForAnalyticDBForMySQLpermissions on the target Catalog: select the preset permission type Data Editor.
Syntax
CREATE EXTERNAL CATALOG <catalog_name>
PROPERTIES
(
"type" = "paimon",
"paimon.catalog.type" = "paimon",
"paimon.catalog.metastore" = "rest",
"paimon.catalog.warehouse" = "<dlf_catalog_name>",
"paimon.catalog.uri" = "<dlf_endpoint>",
"paimon.catalog.token.provider" = "dlf",
-- The following properties are required only outside the managed-credential scenario
"paimon.catalog.dlf.access-key-id" = "<access_key_id>",
"paimon.catalog.dlf.access-key-secret" = "<access_key_secret>"
);
Properties
|
Property |
Required |
Description |
|
|
Yes |
Catalog type. Must be |
|
|
Yes |
Paimon Catalog engine type. Must be |
|
|
Yes |
Metastore type. Set to |
|
|
Yes |
The DLF Catalog name (not an OSS path). |
|
|
Yes |
The VPC-internal endpoint of the DLF service. Do not include a protocol prefix. Format: |
|
|
Yes |
Token provider. Must be |
|
|
No |
Alibaba Cloud AccessKey ID. Not required when the AnalyticDB for MySQL cluster uses managed credentials. |
|
|
No |
Alibaba Cloud AccessKey Secret. Not required when the AnalyticDB for MySQL cluster uses managed credentials. |
Examples
With managed credentials:
CREATE EXTERNAL CATALOG paimon_catalog_dlf
PROPERTIES
(
"type" = "paimon",
"paimon.catalog.type" = "paimon",
"paimon.catalog.metastore" = "rest",
"paimon.catalog.warehouse" = "my_paimon_catalog",
"paimon.catalog.uri" = "cn-hangzhou-vpc.dlf.aliyuncs.com",
"paimon.catalog.token.provider" = "dlf"
);
With an explicit AccessKey pair:
CREATE EXTERNAL CATALOG paimon_catalog_dlf
PROPERTIES
(
"type" = "paimon",
"paimon.catalog.type" = "paimon",
"paimon.catalog.metastore" = "rest",
"paimon.catalog.warehouse" = "my_paimon_catalog",
"paimon.catalog.uri" = "cn-hangzhou-vpc.dlf.aliyuncs.com",
"paimon.catalog.token.provider" = "dlf",
"paimon.catalog.dlf.access-key-id" = "<your_access_key_id>",
"paimon.catalog.dlf.access-key-secret" = "<your_access_key_secret>"
);
HMS (existing Hadoop ecosystem)
CREATE EXTERNAL CATALOG <catalog_name>
PROPERTIES
(
"type" = "paimon",
"paimon.catalog.type" = "paimon",
"paimon.catalog.metastore" = "hive",
"paimon.catalog.metastore.thrift.uri.selection" = "SEQUENTIAL",
"paimon.catalog.warehouse" = "oss://<bucket>/<path>/",
"paimon.catalog.uri" = "thrift://<hms_host>:<port>"
);
paimon.catalog.metastore.thrift.uri.selection must be set to SEQUENTIAL in the current release. Setting it to Random can cause HMS connection failures.
Key properties
|
Property |
Required |
Description |
|
|
Yes |
Must be |
|
|
Yes |
Metastore type: |
|
|
Yes |
FileSystem and HMS modes: the OSS path of the Paimon warehouse, in the form |
|
|
No |
Whether to enable metadata caching. Default |
|
|
No |
Cache TTL in seconds. Default |
|
|
No |
Whether to enable bucket-join optimization. Default |
|
|
No |
Whether to enable raw delete-vector optimization. Default |
DDL and DML
From 3.2.8 onward, Paimon Catalog supports the operations below. Table properties such as primary-key, partition, and bucket go inside TBLPROPERTIES. Paimon supports several table types: Append Only, primary-key (MOR/COW/MOW), aggregation, and First Row.
CREATE TABLE IF NOT EXISTS <catalog_name>.<db_name>.<table_name> (
order_key BIGINT,
order_status VARCHAR,
total_price DOUBLE,
order_date DATE
)
TBLPROPERTIES (
'file.format' = 'PARQUET',
'partition' = 'order_date',
'bucket' = '2',
'bucket-key' = 'order_key'
);
MySQL Catalog
MySQL Catalog talks to MySQL-compatible external databases (RDS for MySQL, another AnalyticDB for MySQL instance, StarRocks) over JDBC. Available since AnalyticDB for MySQL 3.2.8; no metastore is required; query only.
Supported data sources
|
Data source |
Description |
|
RDS for MySQL |
Alibaba Cloud ApsaraDB RDS for MySQL. Connects through the standard MySQL port (default 3306). |
|
AnalyticDB for MySQL |
Another AnalyticDB for MySQL instance. Connects through the standard MySQL port (default 3306). |
|
StarRocks |
Connects to the StarRocks FE node over the MySQL protocol (default port 9030). The MariaDB driver must be enabled. |
Syntax
CREATE EXTERNAL CATALOG <catalog_name>
PROPERTIES
(
"type" = "mysql",
"connection-url" = "jdbc:mysql://<host>:<port>",
"connection-user" = "<username>",
"connection-password" = "<password>"
-- For StarRocks, also set:
-- , "mysql.use-mariadb-driver" = "true"
);
|
Property |
Required |
Description |
|
|
Yes |
Must be |
|
|
Yes |
JDBC URL of the target database. Must start with |
|
|
Yes |
Credentials for the target database. |
|
|
No |
Whether to use the MariaDB JDBC driver. Default |
Examples
-- Connect to RDS for MySQL
CREATE EXTERNAL CATALOG rds_mysql_catalog
PROPERTIES
(
"type" = "mysql",
"connection-url" = "jdbc:mysql://rm-bp1xxxxxxx.mysql.rds.aliyuncs.com:3306",
"connection-user" = "<your_username>",
"connection-password" = "<your_password>"
);
-- Connect to another AnalyticDB for MySQL instance
CREATE EXTERNAL CATALOG adb_mysql_catalog
PROPERTIES
(
"type" = "mysql",
"connection-url" = "jdbc:mysql://am-bp1xxxxxxx.ads.aliyuncs.com:3306",
"connection-user" = "<your_username>",
"connection-password" = "<your_password>"
);
-- Connect to StarRocks (FE port defaults to 9030; MariaDB driver is required)
CREATE EXTERNAL CATALOG sr_catalog
PROPERTIES
(
"type" = "mysql",
"connection-url" = "jdbc:mysql://fe-host:9030",
"connection-user" = "<your_username>",
"connection-password" = "<your_password>",
"mysql.use-mariadb-driver" = "true"
);
Manage External Catalogs
View and drop
-- View the creation statement of an External Catalog
SHOW CREATE CATALOG <catalog_name>;
-- Drop a catalog (the original data in the external source is untouched)
DROP CATALOG <catalog_name>;
Permission control
External Catalogs offer a separate catalog-level permission model (the Managed Catalog adb reuses the existing database/table/column three-level model):
-- Grant
GRANT <privilege> ON CATALOG <catalog_name> TO '<user_name>'@'%';
-- Revoke
REVOKE <privilege> ON CATALOG <catalog_name> FROM '<user_name>'@'%';
|
Privilege |
Effect |
|
|
Access databases and tables under the catalog |
|
|
Create catalogs |
|
|
Drop catalogs |
|
|
Alter catalogs |
Ingestion and query examples
Import from an external source into AnalyticDB for MySQL
Use INSERT INTO ... SELECT to materialize data from an External Catalog into an internal table. The pattern works for every catalog type.
INSERT INTO <adb_db>.<adb_table>
SELECT * FROM <catalog_name>.<lake_db>.<lake_table>;
-- From Hive Catalog
INSERT INTO analytics_db.orders_adb
SELECT * FROM hive_hms.warehouse_db.orders;
-- From Iceberg Catalog
INSERT INTO analytics_db.events_adb
SELECT * FROM iceberg_hms.lakehouse.events;
-- From Delta Catalog
INSERT INTO analytics_db.transactions_adb
SELECT * FROM delta_hms.finance_db.transactions;
-- From Paimon Catalog
INSERT INTO analytics_db.lineitem_adb
SELECT * FROM paimon_oss.tpch.lineitem;
Hive Catalog query
-- List databases in the Hive Catalog
SHOW DATABASES FROM hive_hms;
-- Query a Hive table
SELECT
order_id,
customer_name,
order_date,
total_amount
FROM hive_hms.sales_db.orders
WHERE order_date >= '2024-01-01'
ORDER BY total_amount DESC
LIMIT 100;
Iceberg Catalog query
-- List databases in the Iceberg Catalog
SHOW DATABASES FROM iceberg_hms;
-- Query an Iceberg table
SELECT
event_type,
COUNT(*) AS event_count,
AVG(duration_ms) AS avg_duration
FROM iceberg_hms.lakehouse.user_events
WHERE event_date >= '2024-01-01'
GROUP BY event_type
ORDER BY event_count DESC;
Delta Catalog query
-- List databases in the Delta Catalog
SHOW DATABASES FROM delta_hms;
-- Query a Delta table
SELECT
product_category,
SUM(revenue) AS total_revenue,
COUNT(DISTINCT customer_id) AS unique_customers
FROM delta_hms.finance_db.transactions
WHERE transaction_date BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY product_category
ORDER BY total_revenue DESC;
Paimon Catalog query
SELECT
l_returnflag,
l_linestatus,
SUM(l_quantity) AS sum_qty,
SUM(l_extendedprice) AS sum_base_price,
SUM(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
SUM(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
AVG(l_quantity) AS avg_qty,
AVG(l_extendedprice) AS avg_price,
AVG(l_discount) AS avg_disc,
COUNT(*) AS count_order
FROM paimon_oss.tpch.lineitem
WHERE l_shipdate <= DATE '1998-12-01' - INTERVAL '90' DAY
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;
MySQL Catalog query
-- List databases in the MySQL Catalog
SHOW DATABASES FROM rds_mysql_catalog;
-- Query an RDS MySQL table directly
SELECT
user_id,
username,
email,
created_at
FROM rds_mysql_catalog.user_db.users
WHERE created_at >= '2024-01-01'
ORDER BY created_at DESC
LIMIT 100;
Federated lakehouse query
Join an internal table with an external one in the same statement:
SELECT
u.username,
u.email,
COUNT(o.order_id) AS order_count,
SUM(o.total_amount) AS total_spent
FROM
rds_mysql_catalog.user_db.users u
JOIN
analytics_db.orders o ON u.user_id = o.user_id
GROUP BY u.username, u.email
ORDER BY total_spent DESC
LIMIT 50;
FAQ
Q: Do External Catalogs support writes and DDL?
A: Paimon, Hive, and Iceberg Catalogs support write and DDL operations from 3.2.8. Delta Catalog and MySQL Catalog are query-only at present. To persist data from a query-only catalog into AnalyticDB for MySQL, use INSERT INTO ... SELECT into an internal table.
Q: Can Hive / Iceberg / Delta Catalogs use FileSystem or DLF as the metastore?
A: Not today—HMS only. Use Paimon Catalog if you need FileSystem or DLF.
Q: HMS connection fails. What should I check?
A: Verify ENI access is on, that the cluster and HMS share a subnet, that the HMS security group allows the subnet's IP range, and that the address and port in metastore-uri are correct.
Q: Does dropping a catalog delete the external data?
A: No. DROP CATALOG removes the catalog definition and metadata cache from AnalyticDB for MySQL only; the source data is untouched.
Q: Why does MySQL Catalog need the MariaDB driver for StarRocks?
A: StarRocks's MySQL protocol implementation is based on MariaDB. The default MySQL JDBC driver can hit compatibility issues. Setting mysql.use-mariadb-driver to true avoids them.
Q: CREATE EXTERNAL CATALOG returned success but later queries fail. Why?
A: Catalog creation only validates SQL syntax and parameter format; it never actually contacts the underlying source. Run SHOW DATABASES FROM <catalog_name> right after creation to confirm connectivity. Common causes include a missing or unauthorized OSS bucket, a wrong Metastore address, ENI access being off, or the target database firewall blocking the cluster.
Related documents
-
Catalog overview: learn about the catalog model and its categories.
-
Managed Catalog: manage internal data of an AnalyticDB for MySQL instance.