全局索引(Global Index)是分区表上的一种索引技术,可以创建在分区表的非分区键上,也支持提供唯一约束。
背景信息
分区表的本地索引(Local Index),指每个分区都以一个Local Index分区,Local Index的分区规则和表的分区规则保持一致,因此Local Index只有包含分区键时才能作为唯一索引。
Global Index与Local Index不同,Global Index是分区表父表上的单个索引,该索引映射到许多基础表分区。 父表本身没有单个统一的基础存储,因此,它必须从物理分布式表中检索满足索引约束的数据。Global Index将数据存储在一个位置,因此可以一次访问跨多个分区的数据,而不是分别查询每个分区。
Global Index是建立在分区表父表上的一个Btree Index,分区上则没有。它支持创建在分区表的非分区键上,也支持提供唯一约束。
全局分区索引(Global Partitioned Index)
Global Partitioned Index(GPI)旨在分区表上创建一个全局的分区索引。GPI常用于数据量较大的分区表,其特点是数据量大,构建索引更复杂。相比于全局索引,GPI有自己的分区策略,和分区表的分区并不一致。传统的本地分区索引与子表一一对应,相当于子表的本地索引,而GPI既有本地索引的分区特性,又有全局索引的全局特性。
创建Global Index
语法
CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON [ ONLY ] table_name [ USING method ]
( { column_name | ( expression ) } [ COLLATE collation ] [ opclass [ ( opclass_parameter = value [, ... ] ) ] ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
[ INCLUDE ( column_name [, ...] ) ]
[ WITH ( storage_parameter [= value] [, ... ] ) ]
[ GLOBAL/LOCAL/global_partitioned_index ]
[ TABLESPACE tablespace_name ]
[ WHERE predicate ]
global_partitioned_index:
GLOBAL PARTITION BY {RANGE (column_list) (index_partitioning_clause) |
HASH (column_list) (hash_partitions_by_quantity)}
在分区表上创建一个Global Index。
CREATE unique index m_city_id_idx_global ON measurement(city_id) global;
在分区表上创建一个Global Partitioned Index。
CREATE UNIQUE INDEX m_peaktemp_idx_global ON measurement(peaktemp) global
PARTITION BY range (peaktemp)
(
PARTITION m_peaktemp_idx_global_p1 values less than(1000),
PARTITION m_peaktemp_idx_global_p2 values less than(5000),
PARTITION m_peaktemp_idx_global_p3 values less than(MAXVALUE)
);
说明
GLOBAL/LOCAL
参数指定为GLOBAL
即创建Global Index。如果不指定创建
GLOBAL/LOCAL
参数,则默认创建Local Index。Global Index的CREATE语法支持使用CONCURRENTLY模式创建。
非分区表,包括分区表的子表上不支持创建Global Index。
Global Index不支持表达式索引。
Global Partitioned Index支持将Global Index进行range或hash分区,分区的语法和partition table类似。
Global Partitioned Index不支持按照List分区。
Global Partitioned Index只支持在分区表主表上创建,支持创建在分区键的列上。
Global Partitioned Index支持全局唯一约束。
Global Partitioned Index分区列必须指定索引列的左前缀。
Global Partitioned Range Index支持有序扫描。
Global Partitioned Hash Index不支持有序扫描。
Global Partitioned Index支持索引剪枝优化,即优化器剪枝和执行器剪枝。
Global Index Scan
Global Index和普通的Btree Index一样,当使用索引键进行查询时,Index Scan将会提升查询的性能。
Global Index支持以下几种Index Scan:
Global Index Scan
Global Partitioned Index Scan
Global Index Only Scan
Global Partitioned Index Only Scan
Global Index Bitmap Scan
Global Partitioned Index Bitmap Scan
Global Index Parallel Scan
Global Only Index Parallel Scan
示例
Global Index Scan
explain select * from measurement where city_id = 5; QUERY PLAN ------------------------------------------------------------------------------------------------ Global Index Scan using m_city_id_idx_global on measurement (cost=0.12..8.14 rows=1 width=20) Index Cond: (city_id = 5) (2 rows)
Global Partitioned Index Scan
explain select * from measurement where peaktemp = 5; QUERY PLAN ------------------------------------------------------------------------------------------------------------------ Global Partitioned Index Scan(1/3) using m_peaktemp_idx_global on measurement (cost=0.50..8.52 rows=1 width=20) Index Cond: (peaktemp = 5) (2 rows)
Global Index Only Scan
explain select city_id from measurement where city_id = 5; QUERY PLAN ---------------------------------------------------------------------------------------------------- Global Index Only Scan using m_city_id_idx_global on measurement (cost=0.12..8.14 rows=1 width=4) Index Cond: (city_id = 5) (2 rows)
Global Partitioned Index Only Scan
explain select peaktemp from measurement where peaktemp = 5; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Global Partitioned Index Only Scan(1/3) using m_peaktemp_idx_global on measurement (cost=0.12..8.14 rows=1 width=4) Index Cond: (peaktemp = 5) (2 rows)
Global Index Bitmap Scan
explain select * from measurement where city_id > 5 and city_id < 10000; QUERY PLAN ------------------------------------------------------------------------------------------------ Append (cost=1.03..14.47 rows=32 width=20) -> Bitmap Heap Scan on measurement_y2023q1 (cost=1.03..3.58 rows=8 width=20) Recheck Cond: ((city_id > 5) AND (city_id < 10000)) Filter: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Global Index Scan on m_city_id_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Heap Scan on measurement_y2023q2 (cost=1.03..3.58 rows=8 width=20) Recheck Cond: ((city_id > 5) AND (city_id < 10000)) Filter: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Global Index Scan on m_city_id_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Heap Scan on measurement_y2023q3 (cost=1.03..3.58 rows=8 width=20) Recheck Cond: ((city_id > 5) AND (city_id < 10000)) Filter: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Global Index Scan on m_city_id_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Heap Scan on measurement_y2023q4 (cost=1.03..3.58 rows=8 width=20) Recheck Cond: ((city_id > 5) AND (city_id < 10000)) Filter: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Global Index Scan on m_city_id_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((city_id > 5) AND (city_id < 10000)) (21 rows)
Global Partitioned Index Bitmap Scan
explain select peaktemp from measurement where peaktemp > 5 and peaktemp< 1000; QUERY PLAN ------------------------------------------------------------------------------------------------------------------ Append (cost=1.03..14.47 rows=32 width=4) -> Bitmap Heap Scan on measurement_y2023q1 (cost=1.03..3.58 rows=8 width=4) Recheck Cond: ((peaktemp > 5) AND (peaktemp < 1000)) Filter: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Global Partitioned Index Scan(1/3) on m_peaktemp_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Heap Scan on measurement_y2023q2 (cost=1.03..3.58 rows=8 width=4) Recheck Cond: ((peaktemp > 5) AND (peaktemp < 1000)) Filter: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Global Partitioned Index Scan(1/3) on m_peaktemp_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Heap Scan on measurement_y2023q3 (cost=1.03..3.58 rows=8 width=4) Recheck Cond: ((peaktemp > 5) AND (peaktemp < 1000)) Filter: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Global Partitioned Index Scan(1/3) on m_peaktemp_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Heap Scan on measurement_y2023q4 (cost=1.03..3.58 rows=8 width=4) Recheck Cond: ((peaktemp > 5) AND (peaktemp < 1000)) Filter: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Global Partitioned Index Scan(1/3) on m_peaktemp_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((peaktemp > 5) AND (peaktemp < 1000)) (21 rows)
Global Index作为主键/外键引用
PolarDB PostgreSQL版(兼容Oracle)支持分区表中的任意一列作为主键或者外键引用,该优势依赖于Global Index。无论主键还是外键引用都需要有唯一索引,而Local Index作为唯一索引的前提是它的索引键必须包括分区键。如果没有Global Index,主键必须包含分区键,外键引用也必须包括分区键。但Global Index没有该限制,它支持在任意一列上创建主键或者作为外键引用。
您可以在创建分区表时指定主键,PolarDB PostgreSQL版(兼容Oracle)将根据您指定的主键,智能的选择使用Local Index还是Global Index。简单来说,当您指定主键包含分区键时,PolarDB PostgreSQL版(兼容Oracle)会使用Local Index为您创建唯一索引,否则将使用Global Index创建唯一索引。
示例
分区表创建主键。
-- local index CREATE TABLE pk_rel(a int primary key, b int, c int, d int) PARTITION BY RANGE (a); postgres=# \d pk_rel Table "public.pk_rel" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- a | integer | | not null | b | integer | | | c | integer | | | d | integer | | | Partition key: RANGE (a) Indexes: "pk_rel_pkey" PRIMARY KEY, btree (a) Number of partitions: 0 --- global index CREATE TABLE pk_rel(a int, b int primary key, c int, d int) PARTITION BY RANGE (a); postgres=# \d pk_rel Table "public.pk_rel" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- a | integer | | | b | integer | | not null | c | integer | | | d | integer | | | Partition key: RANGE (a) Indexes: "pk_rel_pkey" PRIMARY KEY, btree (b) GLOBAL Number of partitions: 0
说明您也可以使用语句
alter table xxx add primary key using index xxx
显式的指定某种索引作为主键。PolarDB PostgreSQL版(兼容Oracle)可以使用分区表的任意列作为外键引用,像一个普通表一样,前提是需要为这些列创建唯一索引或者主键。同样的,如果这些列是分区键,则使用Local Index,否则使用Global Index。
分区表引用外键。
CREATE TABLE fk_rel(a int, b int REFERENCES pk_rel(b), c int, d int) PARTITION BY RANGE (a); postgres=# \d fk_rel Table "public.fk_rel" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- a | integer | | | b | integer | | | c | integer | | | d | integer | | | Partition key: RANGE (a) Foreign-key constraints: "fk_rel_b_fkey" FOREIGN KEY (b) REFERENCES pk_rel(b) Number of partitions: 0
性能测试
使用pgbench工具产生scale为80000
的数据,分别创建分区表与非分区表。
TPS结果不能作为绝对标准,根据不同物理环境的不同而有差异,这里的数据用于对比Global Index和Local Index的性能。
非分区键上的点查性能
类别 | TPS | |||||
Prepared Statement | 不使用 | 使用 | ||||
并发数 | 1 | 32 | 64 | 1 | 32 | 64 |
普通表 | 27,732 | 494,433 | 430,848 | 53,935 | 985,880 | 886,882 |
分区表+Local Index | 367 | 4,155 | 3,688 | 856 | 8,742 | 6,790 |
分区表+Global Index | 19,006 | 308,128 | 262,941 | 45,090 | 820,924 | 731,557 |
非分区键上的TPC-B性能
包含了点查和DML。
类别 | TPS | |||||
Prepared Statement | 不使用 | 使用 | ||||
并发数 | 1 | 32 | 64 | 1 | 32 | 64 |
普通表 | 1,115 | 51,025 | 60,409 | 4,822 | 90,312 | 100,802 |
分区表+Local Index | 271 | 2,903 | 2,524 | 550 | 5,276 | 4,237 |
分区表+Global Index | 3,453 | 36,320 | 39,941 | 4,334 | 69,040 | 75,232 |
结论
Global Index对分区表的点查和DML能够带来一个数量级的性能提升。