Use Spark SQL to create an internal table
更新时间:
复制 MD 格式
AnalyticDB for MySQL supports Spark SQL for creating C-Store tables, which use the XUANWU storage engine. The Spark SQL syntax for C-Store tables includes AnalyticDB-specific properties not found in open source Spark SQL.
Syntax differences from open source Spark SQL
Creating C-Store tables in AnalyticDB for MySQL requires AnalyticDB-specific TBLPROPERTIES keys (see Parameters). For creating databases and querying data, AnalyticDB for MySQL uses the same syntax as open source Spark SQL. For details, see the open source Spark SQL reference.
Limitations
Spark SQL can only create tables that use the XUANWU storage engine.
Syntax
CREATE TABLE [ IF NOT EXISTS ] table_identifier
[ ( column_name column_type [ COMMENT col_comment1 ], ... ) ]
USING adb
[ COMMENT table_comment ]
[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ]Parameters
Core parameters
| Parameter | Required | Description |
|---|---|---|
table_identifier | Yes | The table name. Use the db_name.table_identifier format to distinguish tables with the same name across databases. For naming conventions, see Naming conventions. |
column_name | Yes | The column name. For naming conventions, see Naming conventions. |
column_type | Yes | The column data type. For supported types, see Data type mappings. |
USING adb | Yes | Specifies that the table is an AnalyticDB for MySQL table. |
COMMENT | No | A comment for the table. |
TBLPROPERTIES | No | Key-value pairs that define table properties. See the TBLPROPERTIES parameters below. |
TBLPROPERTIES: required properties
| Property | Required | Description |
|---|---|---|
distributeType | Yes | The distribution type. Valid values: HASH (distributes table data across shards based on the hash values of the distribution key columns) and BROADCAST (stores a full copy of the table in each shard; keep replicated tables small). |
distributeColumns | Yes | The distribution key columns. Accepts multiple columns. The distribution key cannot be changed after the table is created. |
storagePolicy | Yes | The storage policy. Valid values: HOT (default), COLD, and MIXED. Case-insensitive. See Storage policies for details. |
TBLPROPERTIES: optional properties
| Property | Required | Description |
|---|---|---|
hotPartitionCount | Required when storagePolicy=MIXED | The number of hot partitions for hybrid storage. Partitions are sorted by partition key value in descending order — the first N partitions are hot partitions stored on SSDs, and the remaining partitions are cold partitions stored in OSS. N must be a positive integer. Do not set this property for HOT or COLD storage policies. |
primaryKey | No | The primary key columns. A primary key must include the distribution key and the partition key. Place the distribution key and partition key at the beginning of a composite primary key. |
partitionType | No | The partition type. Set to VALUE. |
partitionColumn | No | The partition key column (subpartition). |
partitionCount | No | The partition lifecycle — the maximum number of subpartitions per shard. If you specify a partition key, you must also set partitionCount, otherwise an error is returned. For example, 'partitionCount'='365' means each shard retains at most 365 subpartitions. If a date column is the partition key, data from day 1 is automatically deleted when new data arrives on day 366. |
indexName | No | The index names, comma-separated. Example: 'indexName'='index1,index2'. |
indexType | No | The index type for a named index. Valid values: FULLTEXT (full-text index) and NORMAL (regular index). Example: 'indexType.index1'='FULLTEXT'. |
indexColumn | No | The indexed column for a named index. Example: 'indexColumn.index1'='c1'. |
indexAll | No | Specifies whether to index all columns. Valid values: Y (default) and N. |
tableEngineName | No | The storage engine. Defaults to XUANWU. |
Storage policies
| Policy | Data location | Performance | Cost | Use when |
|---|---|---|---|---|
HOT (default) | All partitions on SSDs | Best query performance | Highest | Data is frequently accessed |
COLD | All partitions in OSS with zone-redundant storage (multi-AZ) | Lower | Cost-effective | Data is rarely accessed |
MIXED | Hot partitions on SSDs; cold partitions in OSS | High for recent data | Balanced | Data access frequency varies by age; requires hotPartitionCount |
Examples
Create a non-partitioned table
-- Non-partitioned table with HASH distribution and COLD storage
CREATE TABLE orders (
order_id BIGINT NOT NULL COMMENT 'Order ID',
customer_id INT NOT NULL COMMENT 'Customer ID',
order_status STRING NOT NULL COMMENT 'Order status',
total_price BIGINT NOT NULL COMMENT 'Order amount',
order_date TIMESTAMP NOT NULL COMMENT 'Order date'
)
USING adb
TBLPROPERTIES (
'distributeType' = 'HASH',
'distributeColumns' = 'order_status',
'storagePolicy' = 'COLD'
);Create a partitioned table with a lifecycle
-- Partitioned table with primary key, selective indexing, and partition lifecycle
CREATE TABLE customer (
customer_id BIGINT NOT NULL COMMENT 'Customer ID',
customer_name STRING NOT NULL COMMENT 'Customer name',
phone_num BIGINT NOT NULL COMMENT 'Phone number',
city_name STRING NOT NULL COMMENT 'City',
sex INT NOT NULL COMMENT 'Gender',
id_number STRING NOT NULL COMMENT 'ID card number',
home_address STRING NOT NULL COMMENT 'Home address',
office_address STRING NOT NULL COMMENT 'Office address',
age INT NOT NULL COMMENT 'Age',
login_time TIMESTAMP NOT NULL COMMENT 'Logon time'
)
USING adb
TBLPROPERTIES (
'distributeType' = 'HASH',
'primaryKey' = 'customer_id,phone_num,city_name',
'distributeColumns' = 'city_name',
'partitionType' = 'value',
'partitionColumn' = 'city_name',
'partitionCount' = 100,
'indexAll' = 'N',
'storagePolicy' = 'COLD'
);Data type mappings
| Spark SQL type | XIHE SQL type |
|---|---|
| BOOLEAN | BOOLEAN |
| TINYINT | TINYINT |
| SMALLINT | SMALLINT |
| INT | INT |
| INTEGER | INTEGER |
| BIGINT | BIGINT |
| FLOAT | FLOAT |
| DOUBLE | DOUBLE |
| BINARY | BINARY |
| DATE | DATE |
| STRING | VARCHAR, TIME, POINT, JSON, ARRAY, MAP |
| TIMESTAMP | DATETIME, TIMESTAMP |
Next steps
该文章对您有帮助吗?