Create and configure a standard dimension table-Dataphin(Dataphin)-阿里云帮助中心

Limitations

If you have not subscribed to the Data Standard module, you cannot set the field standard for fields in the table.
If you have not subscribed to the Asset Security module, you cannot set the data security level or data classification for fields in the table.
If you have not subscribed to the Data Quality module, the Unique and Not Null constraints for primary key fields are not enforced.

Prerequisites

A business entity must be created. For more information, see Create and manage business entities.

Procedure

Step 1: Create a standard dimension table

In the top navigation bar of the Dataphin homepage, choose Develop > data development.
In the top navigation bar, select a Project. If you are using the Dev-Prod mode, you must also select an Environment.
In the left-side navigation pane, choose standard modeling > dimension table.
In the dimension table list on the right, click the New icon.

In the Create dimension table dialog box, configure the following parameters.

Parameter	Description
Business object	Select Standard Object.
Table type	If you select Standard Object for Business object, the Table type is automatically set to standard dimension table and cannot be changed.
Data segment	Defaults to the data segment associated with the project. This setting cannot be changed.
Subject area	Defaults to the subject area of the business object. This setting cannot be changed.
Data timeliness	The data timeliness of the dimension table. Options include T+1 (daily), T+h (hourly), and T+m (minutely).
Logical table name	Enter a name for the logical table. The name can be up to 100 characters long. After you select a business object, the system automatically generates a name in the format `{data_segment_name}.dim_{business_object_code}_{data_timeliness}`. Important The name can contain only letters, digits, and underscores (_). It must start with a letter. The name is not case-sensitive, and any uppercase letters are automatically converted to lowercase. `label_` is a system-reserved prefix. Do not start names with `label_`. The `{data_timeliness}` suffix depends on the selected data timeliness: df: Daily full snapshot (T+1). Stores all historical data up to the current day. hf: Hourly full snapshot (T+h). Stores all historical data up to the current hour. mf: Minutely full snapshot (T+m). Stores all historical data up to the last 15-minute interval every 15 minutes.
Display name	The naming rules are as follows: Must be 128 characters or less. Can contain any character.
Description	Enter a description for the dimension table, up to 1,000 characters.

Click OK to create the standard dimension table.

Step 2: Configure fields

On the Table Structure configuration page, define the table structure, including fields, data types, and field categories.

image..png

Note

Click the icon next to the logical table name to view its basic information.

Area	Description
① Locate table fields	Click to navigate to the Basic Configuration or data governance section of the field list.
② Search and filter	You can search for fields by name. Click to filter fields by data type, field category, has associated dimension, associated dimension, field constraint, and data security level.
③ Field list operations	Adjust field order: To prevent data errors, ensure that no downstream jobs that depend on this logical table use a `select ` query before you reorder fields. Import fields: You can add fields by using Import from DDL statement* or Import from Table. For more information, see Import fields for a dimension table. Add field: Add a data field or partition field. You can edit the name, description, data type, field category, associated dimension, field standard, field constraint, data classification, data security level, and Remarks. Note For the MaxCompute engine, you can create up to six levels of partition fields.
④ Field list	The field list displays detailed information for each field, including sequence number, field name, description, data type, field category, associated dimension, field standard, field constraint, data classification, data security level, and Remarks. Sequence number: The sequence number of the field. It auto-increments by one for each new field. Field name: The name of the field. You can enter a field name or a keyword, and the system automatically suggests matching standard field names. Description: The field description, limited to 512 characters. Data type: Supported data types include string, bigint, double, timestamp, decimal, Text, Numeric, Date/Time, and Other. Field category: Can be set to Primary key, Partition, or Attribute. Note Only one primary key field is allowed. Only fields of type string, varchar, bigint, int, tinyint, or smallint can be used as partition fields. Associated dimension: For details, see Add an associated dimension. Field standard: Select a field standard for the field. To create a standard, see Create and manage data standards. Field constraint: Select a constraint for the field. Supported constraints are Unique and Not Null. Data classification: Select a data classification for the field. To create a data classification, see Create a data classification. Data security level: The system automatically determines the data security level after you select a data classification. Remarks: Enter remarks for the field, limited to 2,048 characters. You can also Delete fields in the Actions column. Note Deleting a field cannot be undone. The primary key and system partition fields of a dimension table cannot be deleted.
⑤ Batch operations	You can perform the following batch operations on selected fields. Delete: Click the icon to delete the selected data fields in bulk. Data type: Click the icon to modify the data type for the selected fields in bulk. Field category: Click the icon to modify the field category for the selected fields in bulk. Radix-based Naming: Click the icon. The system tokenizes the description content, matches it with existing radices, and recommends field names. In the Radix-based Naming dialog box, you can replace the names of selected fields with the suggested values. Note If the recommended field names do not meet your needs, you can edit them in the Corrected field name input box. Click Reset to revert the Modified field name to the name suggested by the system's radix match. Field standard: Click the icon. The system recommends a field standard based on the field name. In the Field Standard dialog box, you can apply the recommended standard to the field. Field constraint: Click the icon to set constraints for the selected fields in bulk. Important You cannot set field constraints for child dimension tables.

Click Save and Next.

Import fields

Import from Table

On the configuration page of the dimension table, click Import from Table.

In the Import from Table dialog box, select a source table, then select the fields you want to add.

Parameter	Description
Source table	You can select any physical table, logical table, or view (excluding parameterized views) within the current tenant for which you have read permissions. Physical tables automatically generated by Dataphin are excluded. To learn how to obtain read permissions for a physical table, see Apply for, renew, and release table permissions.
Field list	Select the fields you want to add. Note You can switch between different source tables to select fields from multiple tables.
Selected fields	The fields you add appear here. You can Delete fields from this list.

Click Add to import the fields from the physical table into the dimension table.

Once added, you can edit the new fields in the New Fields area, including names, data types, field categories, and associated dimensions.

Import from DDL statement
1. On the configuration page of the dimension table, click Import from DDL statement.
2. In the Import from DDL statement dialog box, enter the DDL statement and click Parse SQL.
3. From the field list, select the fields you need and click Add to create them in the dimension table.

Add an associated dimension

On the dimension table configuration page, click the icon in the Associated Dimension column to open the Edit Model Relationship dialog box.

In the Edit Model Relationship dialog box, configure the parameters.

Area	Parameter	Description
Null replacement value		If the main table (the current dimension table) cannot be joined with the associated dimension table, Dataphin automatically populates the join field with -110.
Dimension table	Associated entity, dimension table	Select an existing associated entity and dimension table.
Edit join logic	Join logic	Displays the join between the source field and the primary key of the associated dimension table. This field is read-only.
	Dimension table version policy	Specifies which partition of the associated dimension table to use when joining with the main table. By default, the partition from the same scheduling cycle is used. Use dimension of the same cycle: The main table and the associated dimension table use the same time partition for computation. For example, if the business date is 20220101 and you need to query the `ds=20220101` partition of both the main table and the associated dimension table, select this option. Use the latest dimension table: Uses the latest partition of the associated dimension table for computation. For example, a product category is frequently changed. 10 days ago, it was the mobile phone category, and today it is the home appliances category. If you need to reprocess data from 10 days ago by using the current home appliances category, the dimension table versioning policy must be set to Use latest dimension table (Uses latest partition).
	Join failure policy	Defines the computing logic for data that exists in the source table (left table) but not in the dimension table (right table). You can choose to Retain original non-joined data or Replace non-joined data with default values. Retain original non-joined data: Keeps the original data from the left table when creating derived metrics. Replace non-joined data with default values: If a field in the main table does not have a matching entry in the dimension table, it is replaced with the default value of -110.
Edit dimension role	Role English name, Role name	A dimension role is an alias for a dimension, allowing you to reference the same dimension multiple times with different names. You must define a Role English name and a Role name. The Role English name must have the prefix `dim`. The custom part must adhere to the following rules: Contain only letters, digits, or underscores (_). Be 64 characters or less. The Role name must adhere to the following rules: Contain only Chinese characters, digits, letters, underscores (_), or hyphens (-). Be 64 characters or less.

Click OK.

Step 3: Configure the computing logic

On the Computing logic configuration page, map source data to the fields of the dimension table.

Click Source Configuration, and in the Source Configuration dialog box, click +Add Source Object to configure the source parameters.

Note

We recommend that you do not add extra event time filters in the filter conditions or custom SQL.

Parameter	Description
Source type	Supported source types include physical table, custom SQL, and logical table. Source table type descriptions: Source table with a primary key: A logical table with a primary key can have multiple sources. The first source is always the primary source, which determines the total number of rows in the logical table. Source table without a primary key: A logical table without a primary key can have only one source. If the source involves multiple tables, you must join them in advance by using custom SQL. Note To configure multiple source objects, click Add Source Object.
Source object	Select Physical table: You can select any physical table or physical view (excluding parameterized views) within the current tenant for which you have read permissions. Physical tables automatically generated by Dataphin are excluded. To learn how to obtain read permissions for a physical table, see Apply for, renew, and release table permissions. Select Custom SQL: Click the icon and enter your SQL statement in the editor. Example: `select id, name from project_name_dev.table_name1 t1 join project_name2_dev.table_name2 t2 on t1.id = t2.id` Select Logical table: You can select any logical table within the current tenant for which you have read permissions. To learn how to obtain read permissions for a table, see Apply for, renew, and release table permissions. Important Using a logical table as the data source for another increases the complexity of computing logic and maintenance.
Object alias	Define an alias for the source table. For example: t1, t2.
Object description	Enter a description for the object, up to 1,000 characters.
Filter condition	Define filter conditions for the custom SQL. Click the icon and enter the content in the editor. Example: `ds=${bizdate} and condition1=value1`
Join field	The field in the source object that corresponds to the logical table's primary key, used for an equijoin.
Delete	The primary source cannot be deleted. For a logical table without a primary key, deleting the source clears the computing logic for all fields.

Click OK to save the source configuration.
After configuring the source, drag a Source Field into the computing logic expression. You can also click Quickly Map Fields with the Same Name to automatically map source fields to logical table fields with the same name.

Click the icon to edit the computing logic expression in the editor. Aggregate functions such as sum, count, and min are not supported. Examples:
- Example 1: substr(t1.column2, 3, 10)
- Example 2: case when t1.column2 != '1' then 'Y' else 'N' end
- Example 3: t1.column2 + t2.column1
After configuring the computing logic, click the icon at the bottom to validate the expressions.

Click Preview SQL to view the generated SQL for the computing logic.
Click Save and Next.

Step 4: Configure constraints

Based on the field constraints, the system creates quality rules for the current logical table in the Data Quality module. Set the rule severity for each field, choosing between Strong rule and Weak rule. For more information, see Data table quality rules.

Note
The rule severity for logical table field constraints can only be configured here and cannot be edited in the Data Quality module.
Click Save and Next.

Step 5: Configure scheduling

On the Scheduling and parameter configuration page, configure the data delay, scheduling properties, scheduling dependencies, scheduling parameters, and runtime configuration for the dimension table.

Parameter	Description
Data delay	If you enable data delay, the system automatically reruns all data for this logical table within the period specified by Maximum Delay Days. For details, see Configure data delay for a logical table.
Scheduling properties	Specifies how the dimension table is scheduled in the production environment. Configure properties such as scheduling type, cycle, and execution logic. For details, see Configure scheduling properties for a logical table.
Upstream dependencies	Defines the logical table's scheduling node. Dataphin uses these dependencies to run nodes in order, ensuring timely and valid data output. For details, see Configure upstream dependencies for a logical table.
Parameter configuration	Assigns values to variables in your code. These variables are automatically replaced with their corresponding values during node scheduling. On the scheduling parameter configuration page, you can choose to Ignore a parameter or Convert to Global Variable. For details, see Configure runtime parameters for a logical table.
Runtime configuration	Configure task-level runtime timeout periods and rerun policies for task failures. This prevents resource waste from long-running tasks and improves task execution reliability. For details, see Configure runtime settings for a logical table.
Resource configuration	Configure a scheduling resource group for the current logical table task. The task consumes the resource quota of this group during scheduling. For configuration details, see Configure resources for a logical table.

Click Save and Commit.

Step 6: Save and commit

After configuring the standard dimension table, click Save and Commit.

The system validates the settings for table structure, computing logic, scheduling dependencies, and runtime parameters. Review the Check Results and correct any failed configurations.
After all checks pass, enter a submission comment and click OK and Commit.

When you commit, Dataphin performs data lineage analysis and pre-commit checks. For more information, see Notes on committing standard modeling tasks.

Next steps

If your project uses Dev-Prod mode, publish the logical table to the production environment. For more information, see Manage publishing tasks.

After you publish the logical table to the production environment, you can view and manage its tasks in the O&M Center. For more information, see O&M Center.