Dimensional modeling

更新时间:
复制 MD 格式

The data modeling feature in DataWorks follows the Kimball dimensional modeling methodology. Design and create dimension tables, fact tables, aggregate tables, and application tables, publish models to development engines, and reverse-model existing physical tables into logical models.

Note

Data modeling is for data warehouse planning and design. Models create physical tables in compute resources. Deleting a model does not affect the corresponding physical table.

Modeling perspectives

Organize model tables into layers: the Data Import Layer, the Common Layer, and the Application Layer. Each layer supports different management perspectives.

  • Common Layer: Processes and integrates data from the import layer into unified dimensions and metrics, building reusable fact and aggregate tables. Manage tables from a Data Domain or Business Category perspective.

  • Application Layer: Produces customized statistics from common-layer data for specific application scenarios or products. Manage tables only from a Business Category perspective.

  • Unlayered: Model tables not assigned to the Common Layer or Application Layer appear in the Unlayered section.

After you select a layer, you can only create tables within it. Switch between management perspectives in the directory tree to view and manage tables.

Dimension

Dimensions provide context for a business process, enabling analysis from various angles. For example, in an e-commerce transaction, dimensions include products, regions, and time.

To create a dimension, follow Create a conceptual model: Dimension.

Dimension table

Extract all the dimensions that possibly exist in each data domain, and store the dimensions and attributes of the dimensions in dimension tables. For example, when you analyze e-commerce business data, possible dimensions (attributes of each dimension) include order (order ID, order creation time, buyer ID, and seller ID), user (gender and birthdate), and commodity (commodity ID, commodity name, and commodity put-on-shelf time). You can create the following dimension tables: order dimension table, user dimension table, and commodity dimension table. The attributes of each dimension are used as the fields in the dimension table. You can deploy the dimension tables in a data warehouse and perform extract, transform, and load (ETL) operations to store dimension data in the format defined in the dimension table. This allows business personnel to access the data for subsequent data analysis.

To create a dimension table, follow Create a logical model: Dimension table.

Fact table

Sort and analyze data that is generated in each business process, and store the data in fact tables as fields. For example, you can create a fact table for the business process of placing an order, and record the following information as fields in the fact table: order ID, order creation time, commodity ID, number of commodities, and sales amount. You can deploy the fact tables in a data warehouse and perform ETL operations to summarize and store data in the format defined in the fact table. This allows business personnel to access the data for subsequent data analysis.

To create a fact table, follow Create a logical model: Fact table.

Aggregate table

An aggregate table organizes statistical data for multiple derived metrics within a data domain that share the same time period and dimensions. It supports business queries, OLAP analysis, and data distribution. Create a logical model: Aggregate table.

Application table

An application table serves a specific business scenario, organizing atomic metrics, derived metrics, or statistics at different granularities that share the same time period and dimensions. It supports business queries, OLAP analysis, and data distribution. Create a logical model: Application table.

Reverse modeling

Reverse modeling generates a logical model from an existing physical table, saving you from repeating the modeling steps. Reverse-model physical tables.