MaxCompute table data

After you create a MaxCompute data source and bind it to DataWorks Data Studio, you can perform various operations on MaxCompute tables directly in Data Map. These operations include searching for data, previewing data, viewing metadata details and lineage, managing tables by category, and using Data Insight. This topic explains how to view and manage MaxCompute tables in Data Map.

Prerequisites

A MaxCompute data source must be created and bound to DataWorks Data Studio. After the data source is bound, DataWorks automatically collects metadata from the compute engine. The system performs a one-time full collection of existing metadata and then collects incremental metadata daily, which is aggregated in Data Map. The system automatically maintains the metadata collector.

In Data Map, use the global metadata search feature to find your target MaxCompute table.

Important

If you cannot find the target table in Data Map, go to My Data > My Tools > Refresh Table Metadata to manually refresh the table's metadata.

Table details

In the list of search results, click the name of the target table to go to its details page.

Note

After you open the table details page, if the system detects a more suitable similar table or a recommended table, a recommendation banner appears in the page header. Click the Recommended table name in the banner to go directly to that table. The Recommendation reason explains why the table is suggested.

Feature

Description

Related topics

Quick actions

In the upper part of the page, you can perform quick actions on the table: Apply for Permissions, add the table to a Data Album or view the albums that contain it, add the table to your favorites, use API Generation to create an API in DataService Studio, or write SQL statements in SQL Query to query and analyze data.

Table Basic Information

In the Table Basic Information section, you can view the table's Views, Number of Reads, Number of Favorites, Lifecycle, Approver, Health Status, Table Owner, and Table Type.

View basic information about a table

Table model information

View the table's data layer, business category, and storage policy.

Click View Model to navigate to the Data Modeling > Dimensional Modeling page to view the created model table. On the table editing page, you can modify table information, publish the table, view operation logs, or perform data development for the table model.

Note

Model information is displayed only for tables created in DataWorks Data Modeling.

Overview of dimensional modeling

Permission information

View your current permissions on the table. You can click Click to View to go to the Apply for Table Permissions page to apply for permissions.

Request and manage table permissions

Technical information

View the DDL Statement Updated At, Data Updated At, and Last Viewed At.

Note

Last Viewed At:

  • Indicates the last time the table was accessed. Access includes executing manual commands and running scheduled nodes that access the table data.

  • This data is for reference only and may not reflect the exact time of the last access.

  • This data is collected offline and has a T+1 delay.

-

Details

View the table's Field Information, Partition Key Column, and Change Records.

View details about a table

Output information

If the table's data is periodically updated by a corresponding node, you can click Output information to view the running information of the production node. This data is collected offline and has a T+1 delay.

-

Lineage

View the data lineage within or between compute engine nodes. You can also view the lineage between the engine as a data source and the data APIs it produces. Additionally, MaxCompute supports viewing the complete end-to-end lineage of offline synchronization tasks. This data is collected in real time.

Note

To view the complete end-to-end lineage from an API perspective, including upstream data sources and downstream applications, see View API details.

View lineage information

Usage notes

Read the table's usage notes to understand the business context of the data. You can also Edit the notes, View Versions, and View Markdown Syntax.

-

Data Health

Data Asset Governance

Displays the table's governance health score, the trend of issues that require governance, and the details of governance issues. If the table has governance issues, you can quickly resolve them.

View data health

Data Quality

Displays the details of data quality monitoring rules configured for the table and a list of DQC alerts. You can click Configure Rule on the right to go to the Data Quality page and configure monitoring rules for the table.

Configure rules for a single table

Usage Records

Displays the table's usage records based on Frequently Associated and Access Statistics.

  • Frequently Associated: Shows how frequently the table is used in join operations.

  • Access Statistics: Displays the table's usage records through charts, such as read trends, field reference details, and top readers.

View usage records

Data Preview

You can preview 20 random data records from the table.

Important
  • You must have the required permissions to preview tables in the production environment. If you do not have the required permissions, see Request permissions on tables to apply for them.

  • If table preview permission is enabled in the workspace configuration, you can preview the data here even if you have not applied for table query permissions in Security Center.

  • If you have configured data masking rules and set them to active, they also take effect on the Data Preview page. For more information about how to configure data masking rules, see Create a data masking rule.

  • Data preview is not supported for MaxCompute external tables or MaxCompute tables that contain fields of the JSON data type.

Preview data

Data Insight

You can create a Data Insight report for a table to obtain data statistics and distribution through in-depth data analysis.

View Data Insight reports

Header buttons

The header of the table details page shows the table name, type tag, table description, and a group of buttons. Button visibility depends on your permissions, the current table type, and tenant activation:

  • Request Permissions: Appears when your current account does not have access to this table. Click it to jump to Security Center and submit a permission request.

  • Add to Album: Add the current table to a data album. If the table is already in an album, the button becomes an entry to view the albums it has joined.

  • Favorite/Unfavorite: Add the current table to My Favorites, or remove it from favorites.

  • API Generation: Publish the table's query capability as a DataService API. Visible only for supported data source types (such as MaxCompute, Hologres, AnalyticDB, PostgreSQL, MySQL, SQLServer, Oracle, OTS, SelectDB, MongoDB); not supported for tables in a lakehouse project.

  • DataAnalysis: Jump to Data Analysis with a new query page automatically created for the current table. Supported on data source types such as MaxCompute (non-lakehouse) and public-cloud DLF.

  • AI Enhancement: Use Copilot to generate or update the table's notes intelligently; while running, the button shows as Generating. Requires the AI enhancement permission and Copilot availability for your tenant.

  • Refresh Metadata: Manually trigger metadata synchronization for the table; equivalent to the single-table action under My Data > My Tools > Refresh Table Metadata. Visible only for data sources such as MaxCompute, E-MapReduce, and public-cloud DLF in non-lakehouse projects.

Basic information

The Table Basic Information section on the left side of the table details page displays statistics and attributes such as Views, Number of Reads, and Number of Favorites.

  • Views: The number of times the table details page has been viewed in Data Map in the last 30 days. This data is collected offline and has a T+1 delay.

  • Number of Reads: The number of tasks that read the MaxCompute table from the production environment in the last 30 days. Read tasks include SQL, Tunnel Download, Data Integration, and DataService Studio API calls. Currently, only reads from scheduled DataWorks tasks are counted. This data is collected offline and has a T+1 delay.

  • Number of Favorites: The number of users who have added the table to their favorites. This data is collected in near real time.

  • Output Nodes: The ID of the auto-triggered DataWorks node that writes to the current table. If the table is periodically updated but no node ID is displayed, a scheduled node outside DataWorks might be writing to it. Contact the table owner for details. This data is collected offline and has a T+1 delay.

    Note

    If you do not have permission to view the code of the output node, contact the administrator of the workspace where the node is located to grant the required permissions. For more information, see Enable code and log isolation for security.

  • Storage Capacity: The logical storage size of the table. This data is collected offline and has a T+1 delay.

  • Health Status: Displays the rating of the table's Governance Health Score, which is found in Data Health > Data Asset Governance. You can use this score to determine whether the table requires governance.

  • Table Description: The description of the table, which Copilot can automatically generate.

If an administrator has configured custom attributes for table-type entities, a Custom Attributes card appears at the bottom of the left basic information panel, showing the current values of the configured attributes. Custom attributes can be inherited from the workspace; for more information, see Custom attributes.

Details

Click Details to view the table's Field Information, Partition Key Column, and Change Records:

  • Field Information

    You can view the field information of a table. If the table is partitioned, you can also view its Partition Fields.

    Actions

    Description

    Edit

    Click to edit the field's Description, Business Description, Security Level, and Primary Key. You can use Copilot to automatically generate field descriptions.

    Note
    • Only a Workspace Administrator or the table owner can edit table fields. To grant this permission to other users, assign them the Workspace Administrator role. For more information, see Manage permissions for global services.

    • The Security Level column is displayed only for tables where security levels are set for individual fields.

    • You can set the security level for table fields only after you enable the field security level feature for the MaxCompute engine. For information about how to enable this feature, see Label-based access control.

    Recommended Field Desc

    Generates descriptions for multiple fields that lack one.

    Batch Edit Security Level

    Sets the security level for multiple table fields at once to enhance data security.

    Upload

    Click this button and drag the local file that you want to upload into the Batch Upload Field Information dialog box.

    Note
    • Only a Workspace Administrator or the table owner can upload data to the target table. To allow other users to upload data, grant them the Workspace Administrator role. For more information, see Manage permissions for global services.

    • Only .xlsx files (Excel 2007 format) are supported. You can also Download Template File.

    • This feature does not support model tables created in Data Modeling.

    Download

    Click to download the field information for the current table.

    Generate SELECT

    Click to view or Copy the SELECT statement for the current table in the Generate SELECT Statement dialog box.

    Generate DDL

    Click to view or Copy the table creation statement in the Generate DDL Statement dialog box.

    Note
    • Field access frequency: Shows the number of times the field was used in a JOIN clause in SQL on the previous day. The number is converted to a star rating, with a maximum of 5 stars and a minimum of 0 stars.

    • Associated Metric: Displays the model metrics associated with the field. To create or update the association, go to Dimensional Modeling, use field management on the editing page of the target table to maintain the field-to-metric association, and then publish the model table to apply the changes.

  • Partition Information

    View the table's Partition Name, Number of Records, Storage Capacity, and other partition information.

    Note
    • The number of records and storage size of a partition are for reference only. Data updates may be delayed. The data in the compute engine is the source of truth.

    • For MaxCompute transactional tables, the Number of Records is not available and is always displayed as -1. To get the accurate value, run SELECT COUNT(*) FROM <table_name> WHERE <partition>; in the compute engine.

  • Change records

    View the table's Description, Change Type, Granularity, and other change records.

    In the upper-left corner of the Change Records tab, you can select a change type from the drop-down list to view its specific change records.
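The COUNT(*) statement mentioned in the Partition Information note can be assembled programmatically. The following is a minimal sketch, not part of Data Map: the helper name `partition_count_sql`, the table name `sales_orders`, and the partition column `ds` are hypothetical, and the sketch assumes partition values are simple strings. It only builds the SQL text; running it requires a MaxCompute client of your choice.

```python
import re

# Conservative patterns (an assumption of this sketch): plain identifiers,
# optionally prefixed with a project name, and simple partition values.
_NAME = r"[A-Za-z_][A-Za-z0-9_]*"
_TABLE = re.compile(rf"{_NAME}(\.{_NAME})*")
_COLUMN = re.compile(_NAME)
_VALUE = re.compile(r"[A-Za-z0-9_\-]+")


def partition_count_sql(table_name, partition_spec):
    """Build SELECT COUNT(*) FROM <table_name> WHERE <partition>;

    partition_spec maps partition column names to values,
    for example {"ds": "20240101"}.
    """
    if not _TABLE.fullmatch(table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    if not partition_spec:
        raise ValueError("at least one partition column is required")

    conditions = []
    for column, value in partition_spec.items():
        value = str(value)
        if not _COLUMN.fullmatch(column) or not _VALUE.fullmatch(value):
            raise ValueError(f"unexpected partition filter: {column}={value!r}")
        conditions.append(f"{column} = '{value}'")

    return f"SELECT COUNT(*) FROM {table_name} WHERE {' AND '.join(conditions)};"


# Hypothetical table and partition, for illustration only.
sql = partition_count_sql("sales_orders", {"ds": "20240101"})
print(sql)
```

The helper rejects anything that is not a plain identifier or simple value instead of trying to escape it, which keeps the sketch short and avoids guessing at quoting rules.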

Lineage

Lineage information shows the relationships between tables and fields, derived from the actual data flow in operations such as job scheduling and data synchronization. On the lineage page, you can view the upstream and downstream nodes of tables or table fields, trace the original data sources, and see the final destinations of the data. You can also perform impact analysis across different lineage levels as needed.

Note
  • Viewing lineage information is available only in DataWorks Standard Edition and above.

  • Data Map derives table and field lineage by parsing the actual data flow from operations like job scheduling. This data is collected in real time.

  • Data Map does not support lineage generated by manual operations, such as ad hoc queries.

  • If Data Map cannot properly display data lineage generated by SQL executed in a PyODPS node, you can resolve this issue by manually setting the relevant DataWorks scheduling parameters in the PyODPS task code. For more information, see Develop a PyODPS 3 task and Develop a PyODPS 2 task.

  • View Table Lineage

    On the Table Lineage tab, you can view the details of the current table's lineage. You can perform the following operations:

    • View the number of upstream and downstream nodes for each node in the lineage. Hover over a table or node type to see its basic information.

    • Click a node. In the pop-up panel under Lineage Association, enter a keyword to display all downstream nodes that contain that keyword. You can also enter @Username to display all downstream tables owned by the specified user.

    • Click the + or - icons in the lineage graph to expand or collapse the corresponding upstream and downstream nodes.

  • View Field Lineage

    On the Field Lineage tab, you can view the details of a target field's lineage. You can perform the following operations:

    • Use the Change Field area to switch between fields of the current table and view the corresponding field lineage graph.

    • View the number of upstream and downstream nodes for each node in the field lineage. Hover over a field or node type to see its basic information.

    • Click the + or - icons in the lineage graph to expand or collapse the corresponding upstream and downstream nodes.

  • Impact Analysis

    If a table's structure or data changes, its downstream nodes are affected. You can use Impact Analysis to identify which downstream tables a change will affect. On this page, you can filter by lineage level, node type, and table type to display relevant downstream tables and download the analysis results.

    Note

    Impact analysis supports up to 50 levels of table lineage.
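Conceptually, impact analysis is a level-limited walk over downstream lineage. The following is an illustrative sketch only and does not represent how Data Map is implemented: the function `downstream_tables`, the `lineage` dictionary, and the table names are hypothetical. It uses a breadth-first traversal and stops at the 50-level limit stated in the note.

```python
from collections import deque

MAX_LEVELS = 50  # limit stated in the note on impact analysis


def downstream_tables(lineage, start, max_levels=MAX_LEVELS):
    """Return {table: level} for tables reachable downstream of `start`.

    `lineage` maps a table name to the tables that directly consume it.
    Level 1 is a direct downstream table; traversal stops at `max_levels`.
    """
    levels = {}
    queue = deque([(start, 0)])
    while queue:
        table, level = queue.popleft()
        if level == max_levels:
            continue  # do not expand beyond the level limit
        for child in lineage.get(table, []):
            if child != start and child not in levels:
                levels[child] = level + 1
                queue.append((child, level + 1))
    return levels


# Hypothetical lineage: ods_orders feeds two tables, one of which feeds a report.
lineage = {
    "ods_orders": ["dwd_orders", "dwd_refunds"],
    "dwd_orders": ["ads_sales_report"],
}
impact = downstream_tables(lineage, "ods_orders")
direct_only = downstream_tables(lineage, "ods_orders", max_levels=1)
```

Because the traversal is breadth-first, each table is recorded at its shortest distance from the changed table, which matches the idea of filtering results by lineage level.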

Data health

You can view the data governance details and data quality status of a table.

  • In the data governance details, you can view the table's governance health score, the trend of issues that require governance, and the details of governance issues. You can also quickly address the issues that need to be governed.

    • Health Score: A quantitative assessment of the table's health in five domains: storage, computing, development, quality, and security. The score reflects the effectiveness of governance on the table. A higher score indicates better governance.

    • Trend of to-Be-Governed Issues: Shows the trend of unresolved governance issues over time, helping you understand the table's governance history.

    • Governance Issue: Displays a list of current issues that require governance. You can click the issue name in the Governance Issue column or the action button in the Actions column to navigate to the Data Asset Governance > Overview > To-Do List page to handle governance issues.

  • In the data quality details, you can view the details of the table's rules and any resulting data quality alerts.

Usage records

This section displays the table's Usage Records, organized into Frequently Associated and Access Statistics.

  • Frequently Associated: Shows how frequently the table is used in join operations.

    Note

    This statistic counts the number of times the table was used in join operations in the last 30 days. This data is collected offline and has a T+1 delay.

  • Access Statistics: Displays the table's usage records in chart format.

    • Trend for Reads: The line chart shows the number of reads on each date and distinguishes reads from the development environment and the production environment. The count for a field depends on the number of node executions and the number of times the field appears in the code. This data is collected offline and has a T+1 delay.

      For example, if a field appears once in a node and the node is executed twice, the count is two. If a field appears twice in the code, a single node execution results in a count of two.

    • Field Popularity Details: Statistics on the number of times a field is used in SQL clauses (such as WHERE, SELECT, JOIN, and GROUP BY). This data is collected offline and has a T+1 delay.

    • Top 10 Readers: Statistics on users who have read the table in SQL within the last 30 days. This includes access from both production accounts for scheduling and personal accounts. The read operations include WHERE, SELECT, JOIN, and GROUP BY. This data is collected offline and has a T+1 delay.
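The counting example under Trend for Reads can be expressed as simple arithmetic. The sketch below assumes that the count is the number of occurrences in a node's code multiplied by the number of node executions, summed over nodes. This assumption is consistent with the two examples given there, but the exact formula used by Data Map is not documented here, and the function name `field_reference_count` is hypothetical.

```python
def field_reference_count(nodes):
    """Sum occurrences * executions over nodes.

    `nodes` is an iterable of (occurrences_in_code, executions) pairs,
    one pair per node that references the field.
    """
    return sum(occurrences * executions for occurrences, executions in nodes)


# The field appears once in a node that is executed twice.
once_run_twice = field_reference_count([(1, 2)])

# The field appears twice in the code and the node is executed once.
twice_run_once = field_reference_count([(2, 1)])

print(once_run_twice, twice_run_once)
```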

Data preview

Data Preview displays details of the selected table and 20 random data records. You can also preview and analyze the table in a workbook or by using an SQL query.

  • Preview in Workbook: On the Data Preview page, click Preview in Workbook. You are redirected to DataAnalysis > Spreadsheet, and a new workbook is automatically created to display the selected table's data.

  • Data Analysis: On the Data Preview page, click DataAnalysis. You are redirected to DataAnalysis > SQL Query, and an SQL Query (Legacy) file is automatically created and populated with a query for the selected table.

    After you query the table in SQL Query, you can perform Data Insight on the results and generate a workbook, cards and reports, or other items.

Data insight

Important

You cannot use Data Insight for tables that have schema syntax enabled.

Data Insight analyzes the structure and values of your data to display statistics and distribution information. You can create a new Data Insight report directly on the current page or go to the Data Analysis module to use Data Insight.

Output information

On the Output tab, view the list of DataWorks scheduled tasks that produce the current table. This helps you trace the data source and the most recent runs.

The list typically includes the task's Node ID, Node name, Owner, Schedule, and Latest status. Click the node ID to jump to the node details page in Operation Center, where you can further inspect task instances, schedule configuration, and run logs.

Note

Output information is collected offline and has a T+1 delay. Tasks outside DataWorks scheduling that write to the table are not listed here.

Notes

On the Usage Notes tab, view or maintain the usage notes of the table to help downstream users quickly understand its business meaning, column descriptions, and usage caveats.

  • Read the notes: By default, the current notes are displayed in read mode.

  • Edit the notes: Click Edit to enter edit mode. Depending on your tenant configuration, the editor can be the Yuque editor (rich text plus collaborative editing) or the DataWorks built-in editor.

  • Generate notes with AI: Click the AI Enhancement button in the page header or use the entry inside the Usage Notes tab to let Copilot analyze the table metadata and produce a draft of the notes, which you can review and save.

Note

Generating notes with AI relies on Copilot and requires your tenant to have AI enhancement enabled. The editor type and AI entries visible may vary by tenant.

Related information

When the current table has cross-catalog associations (for example, StarRocks External Catalog scenarios), the table details page shows an Association Information tab that lists the other catalogs related to the current table.

  • View related catalogs: Inspect the names of the related catalogs and the cluster they belong to, to understand the cross-catalog relationships of the current table.

  • Jump to search the related assets: Click a link in the list to jump to the metadata search page with the related catalog pre-filtered, so you can quickly look up related tables.

Note

This tab appears only when the backend returns related catalogs for the table; otherwise, the tab is not displayed.

Request and manage table permissions

You can use DataWorks Security Center to apply for query and operation permissions on MaxCompute tables and view your application records in Data Map.

  • Request table permissions

    1. On the table details page, click Apply for Permissions.

      Note

      If the table is hidden, the Apply for Permissions button is not displayed.

    2. You are redirected to the Permission Application page in the new Security Center. For more information, see MaxCompute data access control.

  • Manage table permissions

    1. In the navigation pane on the left, click My Data.

    2. In the navigation pane on the left, click Managed by Me. On this page, you can update the table's lifecycle and Visibility, and perform operations such as Delete, Transfer, and Modify Category.

  • Permission approval: Go to the Security Center > Data access control page to view Permission Approval Details and Approval Records. For more information, see Data Access Control.

Manage MaxCompute tables

Manage tables with data albums

You can add the current table to a target data album and manage it from the data album's details page. You can also view the data albums that contain the current table. For more information, see Data Albums.

Manage tables with category navigation

In the navigation pane on the left of Data Map, you can click Manage Configurations > Manage Categories to configure category navigation for managing MaxCompute tables. For more information, see Configuration Management.