Quick start with columnar table

更新时间:
复制 MD 格式

This topic describes how to enable and try out the PolarDB for MySQL columnar table analytics feature.

Applicable scope

Your cluster must run MySQL 8.0.2, with a minor version of 8.0.2.2.34 or later.

Enable Column-oriented Table Analytics

Enable for an existing cluster

  1. Log on to the PolarDB console, click Clusters in the navigation pane on the left, select the Region where the cluster is deployed, and then click the cluster ID to open the cluster details page.

  2. On the Basic Information page, click Enable next to Column-Oriented Table Analysis Acceleration. After the cluster restarts, Column-oriented Table Analytics is enabled.

Enable for a new cluster

  1. Log on to the PolarDB console. In the left-side navigation pane, click Clusters. Select the Region where the cluster resides, and then click Create Cluster.

  2. On the purchase page, configure the following key settings:

    1. Creation Method: Create Primary Cluster

    2. Engine Version: MySQL 8.0.2

    3. Column-Oriented Table Analysis Acceleration: Enable

  3. Configure other settings based on your business requirements and complete the purchase. For more information, see Custom purchase.

Quick experience

After you enable columnar table analytics, tables that you create with the specified storage engine and format use the column-oriented format by default. The following example shows how to create a column-oriented table, bulk-load data, and run an analytical query.

Step 1: Create a column-oriented table

Run the following SQL statement to create a column-oriented table:

CREATE TABLE sales_records (
  sale_id INT PRIMARY KEY AUTO_INCREMENT,
  product_id INT NOT NULL,
  region VARCHAR(50),
  sale_date DATE,
  amount DECIMAL(10, 2)
);

Step 2: Create a stored procedure for bulk loading

Create a stored procedure that inserts one million rows and commits every 5,000 rows:

DELIMITER //

CREATE PROCEDURE InsertMillionRows()
BEGIN
    DECLARE i INT DEFAULT 1;

    SET autocommit = 0;

    WHILE i <= 1000000 DO
        INSERT INTO users (username, email, age)
        VALUES (
            CONCAT('user_', LPAD(i, 7, '0')),
            CONCAT('user_', LPAD(i, 7, '0'), '@test.com'),
            FLOOR(18 + RAND() * 50)
        );

        SET i = i + 1;

        IF i % 5000 = 0 THEN
            COMMIT;
        END IF;
    END WHILE;

    COMMIT;
    SET autocommit = 1;
END //

DELIMITER ;

Step 3: Run the stored procedure

Call the stored procedure to bulk-load data:

CALL InsertMillionRows();

Step 4: Run an aggregate query

Run the following aggregate query to verify that the data is loaded:

SELECT COUNT(*) FROM users;

Expected output:

+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.01 sec)

Step 5: View the columnar execution plan

Run the EXPLAIN command to verify that the query uses the columnar execution engine:

EXPLAIN SELECT COUNT(*) FROM users;

The expected output is as follows. The Extra Info column displays IMCI Execution Plan, which indicates that the query uses the columnar execution engine:

+----+--------------------+-------+--------+--------+---------------------------------------------------------------+
| ID | Operator           | Name  | E-Rows | E-Cost | Extra Info                                                    |
+----+--------------------+-------+--------+--------+---------------------------------------------------------------+
|  1 | Select Statement   |       |        |        | IMCI Execution Plan (max_dop = 12, max_query_mem = 858993459) |
|  2 | └─Aggregation      |       |        |        |                                                               |
|  3 |   └─Table Scan     | users |        |        |                                                               |
+----+--------------------+-------+--------+--------+---------------------------------------------------------------+

Configure row-column execution routing

columnar table are designed for large-scale data analytics. They combine columnar storage, vectorized execution, and parallel processing to deliver high-performance SQL analytics at low cost. Read-only nodes use this columnar execution model by default. However, this model cannot fully leverage primary key indexes for high-concurrency point lookups.

If your workloads require both high-concurrency point lookups and complex analytical queries, create separate cluster endpoints to isolate resources. This ensures the two workload types do not affect each other:

  • Complex analytical queries: Add standard read-only nodes and set the loose_use_imci_engine parameter to ON (the default value). Queries automatically use the columnar execution engine.

  • High-concurrency point queries: Set the loose_use_imci_engine parameter to OFF on the corresponding read-only nodes. Queries use the row-based execution engine to take advantage of primary key indexes.

Cluster parameters for columnar table

After you enable Column-oriented Table Acceleration, X-Engine is enabled by default and the default table format is column-oriented. The following table describes the default parameter settings:

Parameter

Description

Valid values

Default

Restart

loose_xengine

Specifies whether to enable X-Engine.

[0-1]

1

Yes

default_storage_engine

The default storage engine for the cluster.

[InnoDB|xengine]

xengine

Yes

loose_xengine_default_storage_table_format

The default table format for the X-Engine storage engine.

[default|row|column]

column

No

loose_polar_enable_imci_with_ctable

Specifies whether to enable the integration between columnar table and the IMCI execution engine.

[ON|OFF]

ON

Yes

loose_imci_oss_enable_lru

Specifies whether to enable LRU caching for IMCI OSS data.

[ON|OFF]

ON

No

loose_xengine_use_memory_pct

The percentage of memory allocated to the X-Engine storage engine.

[5-95]

90

No

loose_xengine_ctable_use_memory_pct

The percentage of X-Engine memory allocated to Column-oriented Table Analytics.

[0-100]

80

No