Create and edit offline physical tables

更新时间:
复制 MD 格式

Offline physical tables help you centrally configure and manage the tables used in compute task development to improve development efficiency.

Limits

  • If you have not purchased the Data Standard module, you cannot set the Data Standard field for tables.

  • If you have not purchased the Asset Security module, you cannot set the Data Sensitivity Level or Data Classification fields for tables.

  • Only the MaxCompute, , and compute engines are supported.

Create an offline physical table

Step 1: Configure basic information

  1. On the Dataphin home page, from the top menu bar, choose Develop > Data Development.

  2. In the top menu bar, select a Project. If you are in Dev-Prod mode, you must also select an Environment.

  3. In the navigation pane on the left, select Data Processing > Tables.

  4. In the list of tables, click the image icon and select Offline Physical Table.

  5. In the Create Physical Table wizard, configure the following parameters. The required parameters vary depending on the compute engine.

    MaxCompute compute engine

    Parameter

    Description

    Table Name

    Enter a name for the offline physical table. The name can contain only letters, digits, and underscores (_). The name can be up to 128 characters in length.

    Directory

    Select the folder where the offline physical table is stored.

    If no folder is created, create a new folder as follows:

    1. Above the compute node list on the left, click the image icon to open the Create Folder dialog box.

    2. In the Create Folder dialog box, enter a folder Name, set Type to Offline, and Select Directory as needed.

    3. Click OK.

    Data Domain (Optional)

    Select the subject area to which the table belongs. If no subject areas are available, create one. For more information, see Create a subject area.

    Description (Optional)

    Enter a brief description. The description can be up to 1,000 characters in length.

  6. When the configuration is complete, click Next.

Step 2: Configure the field list

Different compute engines support different storage formats. For more information, see Table management settings. The supported configurations for the field list vary based on the Storage File Format, as shown in the following tables.

Storage format is not hudi, delta (Delta Lake), iceberg, or paimon

  1. On the Field List configuration page, you can configure the physical table's fields, data types, data classification, and other structural information.

    image

    Area

    Description

    Field list operations

    • Search: Search for a field by its name.

    • Edit DDL: Edit the DDL statement for the current physical table.

    • Add Data Field: Click Add Field, select a field type, such as Data Field, Partition Field, or Quick Add Date Partition. In the new row, enter the Column Name, Data Type, and Remark.

      • Add Data Field: Adds a data field row to the table fields.

      • Partition Field: Adds a partition field row to the table fields.

      • Quick Add Date Partition: Adds a date partition row to the table fields. The default value is ds.

    • Import from Create Table Statement: Use a create table statement to import new fields. Click Import From Create Table Statement. In the Import From Create Table Statement dialog box, follow the instructions in the figure below, enter the create table statement, and click Parse SQL. Select the fields that you want to import from the parsed fields and click Add to import them.

    • Import from Table: Click Import From Table, and in the Import From Table dialog box, select the source table that contains the fields to import, select the desired fields, and click Add to import them.

    Field list

    The field list displays details about each field, including Serial Number, Column Name, Data Type, Remark, Recommend Standards, Data Class, and Data Sensitivity Level.

    • Serial Number: The ordinal number of the table field. The number auto-increments by 1 for each new field.

    • Column Name: The name of the table field. You can enter the full name of a root word to search. The system automatically matches the root words that are configured in Administration > Data Standard > Root Word.

    • Data Type: Supported data types include string, bigint, double, timestamp, decimal, Text, Numeric, Datetime, and Other.

      • Text: varchar and char.

      • Numeric: int, Smallint, tinyint, and float.

      • Datetime: date. The MaxCompute compute engine supports datetime.

      • Other: boolean and binary.

    • Remark: The description of the table field. The description can be up to 512 characters in length.

    • Recommend Standards: Select a data standard for the field. To create a standard, see Create and manage data standards.

    • Data Class: Select a data classification for the field. To create a data classification, see Create a data classification.

    • Data Sensitivity Level: After you select a data classification, the system automatically identifies the data sensitivity level.

    You can also Delete a field in the Actions column.

    Note

    You cannot undo the deletion of a field.

    Batch operations

    You can select multiple table fields and perform the following batch operations.

    • Delete: Click the image icon to delete the selected data fields in a batch.

    • Recommend Names: Click the image icon. The system tokenizes the content in the Remark column, matches the tokens with existing root words, and recommends field names. In the Root Word Naming dialog box, you can replace the names of the selected fields with the recommended names.

      Note
      • If a recommended field name does not meet your requirements, you can modify it in the Corrected Field Name box.

      • Click Reset to change the Corrected Field Name back to the system-matched root word.

    • Recommend Standards: Click the image icon. The system recommends data standards based on the field names. In the Data Standard dialog box, you can apply the recommended data standards to the fields.

  2. After you add the fields, click Next.

Step 3: Configure storage

The supported storage configurations for the MaxCompute compute engines vary based on the Storage File Format, as shown in the following tables.

MaxCompute compute engine

  • MaxCompute internal table

    Parameter

    Description

    Storage Type

    Select Internal Table.

    Entity Table

    Select Yes or No. If you select Yes, you can also configure a Primary Key to create the table as a Delta table.

    Primary Key (Optional)

    Select one or more primary keys. You can select any field from the field list. If you select multiple primary keys, they are sorted in the order of selection.

    Lifecycle (Optional)

    The retention period of the table. You can enter a positive integer in days, or select 7, 14, 30, or 360 days.

  • MaxCompute external table

    Parameter

    Description

    Storage Type

    Select External Table.

    Storage File Format

    Select a storage format. The default format is the same as the default storage format for external tables specified in the table management settings of the development platform. You can select parquet, avro, rcfile, orc, textfile, or sequencefile.

    Location

    Enter the Location (storage address). You can use global variables. The value can be up to 512 characters in length.

After you configure the parameters, click Submit. In the Submit dialog box that opens, review the SQL statement for creating the table and click Confirm And Submit.

Import data

After you submit the offline physical table, you can import data into it.

  1. In the table list, click the destination offline physical table. You can search for the table by its name.

  2. On the table details page, click Import Data. The Import Data dialog box opens.

  3. In the Import Data dialog box, you can upload the data file and configure the import parameters in the Basic Configuration step.

    Parameter

    Description

    Upload File

    Click Select File to upload the data file. Only .txt and .csv files are supported. The file size cannot exceed 10 MB.

    Separator

    The data separator. Supported separators include comma (,), tab character (\t), VERTICAL LINE (|), and forward slash (/). You can also specify a different separator.

    Character Set Encoding

    Select the character set encoding of the uploaded data file. Supported character sets include UTF-8 (without BOM), UTF-8 (with BOM), GBK, Big5, GB2312, ASCII, and UTF-16.

    First Row As Header

    Specify whether the first row of the uploaded data file is a header.

    Target Partition

    If the table is a partitioned table, enter the name of the Target Partition for the imported data.

  4. Click Next.

  5. In the Import Data step, configure the field mapping for the table.

    • Mapping:

      • Row Mapping: Binds fields by row number.

      • Name Mapping: Maps fields by name.

    • Import File Data Column: The value for the data column, which can be a data column, NULL, or a static field.

  6. Click Start Import to load the data into the table.

Edit an offline physical table

After you submit an offline physical table, you can edit it.

  1. In the table list, click the destination offline physical table. You can search for the table by its name.

  2. Click Edit on the table details page to open the Edit Physical Table page.

    The editable parameters vary based on the storage format, as shown in the following tables.

    MaxCompute compute engine

    • MaxCompute internal table

      Tab

      Description

      Basic Information

      You can edit only the Table Name and Description. The parameter requirements are the same as when creating an offline physical table.

      Field List

      You can edit and delete existing fields, and add new fields. The parameter requirements are the same as when creating an offline physical table. A DDL statement is generated when you modify a data type.

      You can clear the Not Null check box for a field.

      • If the current table is an internal transactional table, you cannot clear the Not Null check box for fields that are set as primary keys.

      • You can clear the Not Null check box only for fields for which it is already selected.

      Storage Configuration

      You can edit the Lifecycle. The requirements are the same as when creating an offline physical table.

    • MaxCompute external table

      Tab

      Description

      Basic Information

      You can edit only the Table Name and Description. The parameter requirements are the same as when creating an offline physical table.

      Field List

      You can edit and delete existing fields, and add new fields. The parameter requirements are the same as when creating an offline physical table. A DDL statement is generated when you modify a data type.

      You can clear the Not Null check box only for fields for which it is already selected.

      Storage Configuration

      You can edit the Storage Address (Location). The parameter requirements are the same as when creating an offline physical table.

    After you finish editing, click Submit. In the Submit dialog box, review the SQL statement for the table and click Confirm And Submit.

    Note

    If you delete fields or modify data types, you must first run the following command to enable schema evolution at the project level in MaxCompute: setproject odps.schema.evolution.enable=true;.

What to do next

  • If your development mode is Dev-Prod, you must publish the offline physical table. For more information, see Manage publish tasks.

  • If your development mode is Basic, you can manage the submitted offline physical table in the asset directory. For more information, see Asset directoryView the asset checklist.