You can create Hologres internal tables by using Data Definition Language (DDL) statements or by using the visual interface provided in DataWorks. This topic describes how to use the visual interface in DataWorks to create a Hologres internal table.
Prerequisites
You have added a Hologres computing resource to your workspace and bound it to DataStudio. For more information, see Old version of DataStudio: Bind a Hologres computing resource.
You have the Workspace Administrator or Development role. For more information about how to grant roles, see Manage permissions on workspace-level services.
Background information
Hologres supports two types of tables: internal tables and external tables. The following list describes the differences:
An internal table stores its data directly in Hologres. You can synchronize data from a MaxCompute source table to a Hologres internal table for fast queries and analysis. This method offers better query performance than using an external table.
An external table does not store data directly. Instead, it maps to a MaxCompute source table to accelerate data queries and analysis. This method avoids data redundancy and eliminates the need for data import and export, allowing you to get query results quickly.
As a data development and processing platform, DataWorks provides a convenient visual interface for creating tables. You can also create tables directly in Hologres by using DDL statements. For more information, see CREATE TABLE.
Limitations
This feature is available only in the China (Shanghai) and China (Beijing) regions.
Procedure
Go to the DataStudio page.
Log on to the DataWorks console. In the target region, click in the left-side navigation pane. Select a workspace from the drop-down list and click Go to Data Integration.
Create a Workflow.
If you already have a Workflow, skip this step.
Hover over the icon and select Create Workflow.
In the Create Workflow dialog box, enter a Workflow Name.
Click Create.
Create a Hologres internal table.
Hover over the icon and select the option to create a table. In the Create Table dialog box, set Table Type to Internal Table, and configure the engine, path, and name for the table.
For Engine Instance, select Hologres. After you complete the configuration, click Create.
Configure the Hologres internal table.
On the configuration page, configure the properties of the Hologres internal table.
Configure basic properties.
The following table describes the main basic properties.
Parameter
Description
Storage Mode
The storage mode of the table in Hologres. The default is Column-oriented Storage.
Column-oriented Storage: Suitable for online analytical processing (OLAP) scenarios. It is ideal for complex queries, data joins, scans, filters, and aggregations. The write and update performance is lower than that of row-oriented storage.
Row-oriented Storage: Suitable for key-value (KV) scenarios. It is ideal for point lookups and scans based on primary keys. It provides better write and update performance.
Hybrid Row-column Storage: Suitable for scenarios where both columnar and row-oriented storage are required. It supports both high-performance point lookups and OLAP analysis. This mode incurs higher storage overhead and internal data synchronization costs.
Note: For more information about storage formats, see the description of the orientation parameter in CREATE TABLE.
Lifecycle
The life cycle of the table, in seconds. By default, the life cycle is permanent.
Note: The life cycle countdown starts when data is first written to the table. After the specified life cycle ends, the table data is cleared within a random period of time.
Binlog
Specifies whether to enable binlog for the table. If you enable binlog, you must specify its life cycle. By default, the binlog life cycle is permanent.
Note: Only Hologres V0.9 and later versions support the table-level binlog feature. For more information about binlog, see Subscribe to Hologres binlogs.
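The basic properties above also have DDL equivalents. The following Python sketch shows one possible way to map the visual options to `set_table_property` calls. The helper name `basic_property_calls` and the sample table name are hypothetical. The property names (`orientation`, `time_to_live_in_seconds`, `binlog.level`, `binlog.ttl`) are assumptions based on the Hologres CREATE TABLE and binlog documentation, so verify them against your Hologres version. The code only builds SQL text and does not connect to Hologres.

```python
# Assumed mapping from the Storage Mode options in the visual interface
# to values of the Hologres "orientation" table property.
ORIENTATION = {
    "Column-oriented Storage": "column",
    "Row-oriented Storage": "row",
    "Hybrid Row-column Storage": "row,column",
}


def basic_property_calls(table, storage_mode="Column-oriented Storage",
                         ttl_seconds=None, binlog=False, binlog_ttl_seconds=None):
    """Return set_table_property statements for the basic properties.

    Hypothetical helper: ttl_seconds=None keeps the permanent default life cycle,
    and binlog_ttl_seconds=None keeps the permanent default binlog life cycle.
    """
    if storage_mode not in ORIENTATION:
        raise ValueError(f"unknown storage mode: {storage_mode}")

    props = [("orientation", ORIENTATION[storage_mode])]
    if ttl_seconds is not None:
        props.append(("time_to_live_in_seconds", str(ttl_seconds)))
    if binlog:
        props.append(("binlog.level", "replica"))
        if binlog_ttl_seconds is not None:
            props.append(("binlog.ttl", str(binlog_ttl_seconds)))

    return [f"CALL set_table_property('{table}', '{key}', '{value}');"
            for key, value in props]


# Example: hybrid storage, 30-day table life cycle, binlog kept for 7 days.
for statement in basic_property_calls(
        "orders", "Hybrid Row-column Storage",
        ttl_seconds=30 * 24 * 3600, binlog=True, binlog_ttl_seconds=7 * 24 * 3600):
    print(statement)
```

In Hologres, statements like these are typically run in the same transaction as the CREATE TABLE statement for the table.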
Configure business information.
Note: A table's business information is for management purposes only and does not affect the underlying logic.
Parameter
Description
Theme
The primary and secondary folders to which the table belongs. You can classify tables based on business purposes and assign tables of the same type to the same folder.
Note: The primary and secondary themes are only folders in DataWorks that help you manage your tables.
Layer
The table's physical data warehouse layer. Layers define and manage data warehouse tiers, which typically include the ODS layer, common layer, and analytics layer. You can assign the table to a suitable layer based on your business needs.
Note: Click the icon to customize layers. For more information, see Manage tables.
Physical Category
The physical category of the table, used for more detailed classification from a business perspective. Common categories include Basic Business, Advanced Business, and Other.
Note: Click the icon to customize physical categories. For more information, see Manage categories.
Configure the table schema.
Parameter
Description
Field Design
Add and define the fields for the table. For more information about the data types that Hologres supports, see Data type summary.
Storage Design
Define the storage methods for the table fields.
Distribution Column: Specifies the distribution strategy for the table. Data is distributed to different shards based on the distribution column. Subsequent operations such as computing and scanning are performed at the shard level.
Segment Key: Typically, you specify a time-related column as the segment key. When a query filters on this column, the system can quickly locate where the data is stored. This is suitable for data that is strongly time-related, such as logs and traffic data.
Clustering Key: Used to create a clustered index on a specified column. Hologres sorts the data based on the clustered index to accelerate range and filter queries on the indexed column.
Dictionary Encoding Columns: Used to build a dictionary mapping for the values in a specified column. Dictionary encoding can convert string comparisons into numeric comparisons, accelerating queries such as GROUP BY and FILTER.
Bitmap Column: Also known as a bit-encoded column, it allows for fast equality filtering of data within a storage file. Therefore, we recommend that you create a bitmap index for columns that are frequently used in equality filter conditions.
For more information about storage methods, see CREATE TABLE.
Partition
Define the partition field for the table.
Note: If a partitioned table has a primary key, the primary key must include the partition field.
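The schema settings above can also be expressed as DDL. The following Python sketch renders a CREATE TABLE statement with the storage design options and enforces the partition rule from the note. The function name `create_table_ddl`, the sample table, and its columns are hypothetical. The property names (`distribution_key`, `segment_key`, `clustering_key`, `dictionary_encoding_columns`, `bitmap_columns`) and the `PARTITION BY LIST` clause are assumptions based on the Hologres CREATE TABLE documentation, so check them against your Hologres version. The code only produces SQL text.

```python
# Storage Design options, written as the table properties they are assumed to map to.
STORAGE_PROPERTIES = (
    "distribution_key",
    "segment_key",
    "clustering_key",
    "dictionary_encoding_columns",
    "bitmap_columns",
)


def create_table_ddl(table, columns, primary_key=(), partition_column=None, **storage):
    """Render DDL text for an internal table.

    columns is a list of (name, data_type) pairs. Each storage option is
    passed as a list of column names, for example distribution_key=["id"].
    """
    unknown = set(storage) - set(STORAGE_PROPERTIES)
    if unknown:
        raise ValueError(f"unsupported storage options: {sorted(unknown)}")
    # Rule from the Partition note: the primary key must include the partition field.
    if partition_column and primary_key and partition_column not in primary_key:
        raise ValueError("the primary key must include the partition field")

    body = [f"  {name} {dtype}" for name, dtype in columns]
    if primary_key:
        body.append(f"  PRIMARY KEY ({', '.join(primary_key)})")
    create = f"CREATE TABLE {table} (\n" + ",\n".join(body) + "\n)"
    if partition_column:
        create += f" PARTITION BY LIST ({partition_column})"

    statements = ["BEGIN;", create + ";"]
    for prop in STORAGE_PROPERTIES:
        if prop in storage:
            value = ",".join(storage[prop])
            statements.append(f"CALL set_table_property('{table}', '{prop}', '{value}');")
    statements.append("COMMIT;")
    return "\n".join(statements)


print(create_table_ddl(
    "access_log",
    [("id", "bigint"), ("event_time", "timestamptz"), ("url", "text"), ("ds", "text")],
    primary_key=("id", "ds"),
    partition_column="ds",
    distribution_key=["id"],
    segment_key=["event_time"],
    clustering_key=["event_time"],
    dictionary_encoding_columns=["url"],
    bitmap_columns=["url"],
))
```

Calling the function with a partition column that is missing from the primary key raises a `ValueError`, which mirrors the check described in the Partition note.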
Commit and publish the Hologres internal table.
After you define the table schema, you must commit it to the development environment and the production environment. After a successful commit, you can view the table in the engine project of that environment.
Note: If your workspace is in basic mode, you only need to commit the table to the production environment. For more information about the differences between basic mode and standard mode workspaces, see Differences between workspace modes.
Actions
Description
Load from Development Environment
Loads the table's information from the development environment and displays it on the current page.
Note: You can perform this operation only after the table has been committed to the development environment. After you perform this operation, the table's information from the development environment overwrites the information on the current page.
Commit to Development Environment
Commits the table to the development environment of DataWorks. This creates the current table in the development environment's Hologres database.
After the commit, you can view the table schema in the Hologres directory of the corresponding workflow in DataStudio. The workflow is specified by the path you selected when creating the table.
Load from Production Environment
Loads the table's information from the production environment and displays it on the current page.
Note: You can perform this operation only after the table has been committed to the production environment. After you perform this operation, the table's information from the production environment overwrites the information on the current page.
Commit to Production Environment
Commits the table to the production environment of DataWorks. This creates the current table in the production environment's Hologres database.
Next steps
After you create the Hologres internal table, you can perform the following operations:
Perform data development tasks in Hologres. For more information, see Hologres SQL node and Hologres SQL.
Periodically import MaxCompute data into the Hologres internal table by using a Hologres external table:
Import data by using commands. For more information, see Import data from MaxCompute by using SQL.
Import data by using the visual interface in DataWorks. For more information, see One-click synchronization node for MaxCompute.
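As a rough illustration of the command-based import path, the following Python sketch renders two statements: one that maps a MaxCompute table to a Hologres external table, and one that copies its rows into the internal table. The function name `import_statements` and all project, table, and column names are hypothetical. The `IMPORT FOREIGN SCHEMA` form and the `odps_server` server name are assumptions based on the Hologres documentation for MaxCompute external tables, so confirm them before use. The code only builds SQL text and does not run it.

```python
def import_statements(project, source_table, target_table, columns,
                      partition_filter=None, schema="public"):
    """Return (setup, load) SQL text for importing MaxCompute data.

    setup creates an external table named source_table in the given schema.
    load copies the listed columns into the internal table target_table.
    """
    setup = (f"IMPORT FOREIGN SCHEMA {project} LIMIT TO ({source_table}) "
             f"FROM SERVER odps_server INTO {schema};")

    column_list = ", ".join(columns)
    load = (f"INSERT INTO {target_table} ({column_list}) "
            f"SELECT {column_list} FROM {schema}.{source_table}")
    if partition_filter:
        # Restrict each periodic run to the newly arrived partition.
        load += f" WHERE {partition_filter}"
    return setup, load + ";"


setup_sql, load_sql = import_statements(
    "mc_project", "src_orders", "orders", ["id", "amount"],
    partition_filter="ds = '20240101'")
print(setup_sql)
print(load_sql)
```

The setup statement is typically needed only once, while the load statement can be scheduled, for example in a Hologres SQL node, with the partition filter supplied by a scheduling parameter.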