Data Catalog

Data Catalog centralizes metadata from MaxCompute, Hologres, DLF, and other sources, letting you create tables, manage views, and prepare data without leaving DataWorks.

Key features

  • Unified management: Centrally manage and search tables, views, functions, and resources from multiple data sources.

  • Quick table creation: Create table schemas directly in DataWorks without switching consoles.

    • DDL-based creation: Use native DDL statements for flexible, precise control.

    • Visual creation: Create tables through a form-based interface.

    • Copilot-assisted creation: Describe your requirements in natural language and let AI generate the table schema.

  • One-click synchronization: Start data synchronization tasks between sources such as MaxCompute and Hologres.

  • Quick exploration: Preview table schemas to understand your data at a glance.

Supported data catalogs

The following table lists the supported data catalog types and whether each can be added from a workspace or from your account.

| Data catalog | From workspace | From account |
| --- | --- | --- |
| MaxCompute (internal and external projects) | Supported | Supported |
| Hologres (internal and external databases) | Supported | Supported |
| DLF Catalog (DLF 1.0, DLF 2.0, DLF 2.5, and later versions) | Supported | Supported |
| Hive (EMR Hive) | Supported | Not supported |
| Lindorm | Supported | Not supported |
| AnalyticDB MySQL | Supported | Not supported |
| AnalyticDB PostgreSQL | Supported | Not supported |
| StarRocks | Supported | Not supported |
| AI Catalog (AI datasets and AI models) | Automatically reads from the AI workspace with the same name as the current DataWorks workspace. | Not supported |
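If you script checks around catalog setup, the support matrix above can be captured as a small lookup. The following Python sketch is illustrative only: `SUPPORT` and `can_add` are hypothetical names, not part of any DataWorks API, and the values are transcribed from the table above. AI Catalog is left out because it is read automatically rather than added.

```python
# Support matrix transcribed from the table above.
# Each value is (from_workspace, from_account).
# AI Catalog is omitted: it is read automatically from the AI workspace
# that has the same name as the current DataWorks workspace.
SUPPORT = {
    "MaxCompute": (True, True),
    "Hologres": (True, True),
    "DLF Catalog": (True, True),
    "Hive": (True, False),
    "Lindorm": (True, False),
    "AnalyticDB MySQL": (True, False),
    "AnalyticDB PostgreSQL": (True, False),
    "StarRocks": (True, False),
}


def can_add(catalog: str, scope: str) -> bool:
    """Return True if `catalog` can be added from `scope`.

    `scope` is "workspace" or "account" (hypothetical labels for the
    two table columns). Unknown catalogs raise KeyError.
    """
    if scope not in ("workspace", "account"):
        raise ValueError(f"unknown scope: {scope!r}")
    from_workspace, from_account = SUPPORT[catalog]
    return from_workspace if scope == "workspace" else from_account


if __name__ == "__main__":
    for name in ("MaxCompute", "Hive"):
        print(name, "from account:", can_add(name, "account"))
```
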

Authentication and authorization

How Data Catalog accesses a data source depends on how the source is added:

  • For workspace-attached data sources, Data Catalog uses the identity configured in the data source to access metadata.

  • For account-level data sources, Data Catalog uses your personal identity to access metadata.

  • To view MaxCompute data in Data Catalog as a RAM user or RAM role, first obtain the required MaxCompute permissions. If the three-layer model (project.schema.table) is enabled for the MaxCompute data source or project, also grant the RAM user or RAM role permission to view schema metadata.

    Note

    If a MaxCompute project contains multiple schemas, grant metadata permissions for all schemas to view the complete schema list on the project details page.

    • Grant permissions to a RAM user:

      GRANT DESCRIBE ON SCHEMA <schema_name> TO USER RAM$<alibaba_cloud_account_name>:<ram_user_name>;

    • Grant permissions to a RAM role:

      GRANT DESCRIBE ON SCHEMA <schema_name> TO USER `RAM$<alibaba_cloud_account_name>:role/<ram_role_name>`;
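Because a project with several schemas needs one grant per schema, it can help to generate the statements rather than type them by hand. The following Python sketch builds statements in the two forms shown above. The function names and the schema, account, and user names are hypothetical, and the script only produces text: you still run the resulting statements in MaxCompute yourself.

```python
def ram_principal(account_name: str, name: str, is_role: bool = False) -> str:
    """Build the principal string in the form used by the statements above."""
    if is_role:
        # RAM role principals are wrapped in backticks, as in the
        # role statement above.
        return f"`RAM${account_name}:role/{name}`"
    return f"RAM${account_name}:{name}"


def describe_grants(schemas, account_name, name, is_role=False):
    """Return one GRANT DESCRIBE statement per schema, in input order."""
    principal = ram_principal(account_name, name, is_role)
    return [
        f"GRANT DESCRIBE ON SCHEMA {schema} TO USER {principal};"
        for schema in schemas
    ]


if __name__ == "__main__":
    # Hypothetical schema, account, and user names for illustration only.
    for stmt in describe_grants(["default", "sales"], "example_account", "analyst"):
        print(stmt)
```

Passing `is_role=True` produces the backtick-quoted role form instead of the RAM user form.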

Access Data Catalog

Important

This feature is available only in workspaces that have Use Data Studio (New Version) enabled.

  1. Go to the Workspaces page in the DataWorks console. In the top navigation bar, select the region. Find the workspace and choose Shortcuts > Data Studio in the Actions column.

  2. In the left-side navigation pane, click the Data Catalog icon to open Data Catalog.

Add a data catalog

To add a data catalog:

  1. In Data Catalog, find the data source type to add and click the icon next to its name.

  2. On the Add Data Catalog page, find the target instance and click Add in the Operation column.

Note

  • A workspace-level data catalog is visible to all workspace members.

  • An account-level data catalog is visible only to you.

  • When you add a data catalog from your account, only instances that you can access and that are in the same region as the DataWorks workspace are listed.

Manage data catalogs

Hide a data catalog

To hide irrelevant data catalogs:

  1. In the left directory tree, find the data catalog and click the icon in the upper-right corner.

  2. In the pop-up, click the icon next to an engine name to hide its data catalogs.

    Note

    To unhide, click the empty space next to the engine name in the pop-up.

Remove a data catalog

To remove a data catalog you no longer need:

In the left directory tree, find the target data catalog and click Remove or Disassociate Data Catalog in the Operation column.

Create and manage data objects

Expand a catalog to create or manage its data objects. The following table links to related documentation.

| Data catalog | Description | References |
| --- | --- | --- |
| MaxCompute | Create and manage data objects such as tables, views, external tables, resources, and functions. | MaxCompute data management |
| Hologres | Create and manage data objects such as tables and views. | Hologres data management |
| DLF Catalog | Create and manage table metadata in the database. | DLF Catalog data management |
| Hive | Create and manage table data objects. | Hive data management |
| AnalyticDB MySQL | Create and manage table data objects. | AnalyticDB MySQL data management |
| AnalyticDB PostgreSQL | Create and manage table data objects. | AnalyticDB PostgreSQL data management |
| StarRocks | Create and manage table and view data objects. | StarRocks data management |
| AI Catalog | Manage dataset and model metadata in AI Catalog. | AI Catalog data management |