如何在Hologres中使用RuntimeFilter_实时数仓 Hologres(Hologres)-阿里云帮助中心

Hologres从V2.0版本开始支持Runtime Filter，在多表Join场景下自动优化Join过程的过滤行为，提升Join的查询性能。本文为您介绍在Hologres中Runtime Filter的使用。

背景信息

应用场景

Hologres从V2.0版本开始支持Runtime Filter，通常应用在多表（两表及以上）Join的Hash Join场景，尤其是大表Join小表的场景中，无需手动设置，优化器和执行引擎会在查询时自动优化Join过程的过滤行为，从而降低I/O开销，以提升Join的查询性能。

原理介绍

在了解Runtime Filter原理之前，需先了解Join过程。两个表Join的SQL示例如下：

select * from test1 join test2 on test1.x = test2.x;

其对应的执行计划如下。

image..png

如上执行计划，两个表Join时，会通过test2表构建Hash表，然后匹配test1表的数据，最后返回结果。在这个过程中，Join时会涉及到两个名词：

build端：两表（或者子查询）做Hash Join时，其中一张表（子查询）的数据会构建成Hash表，这一部分称为build端，对应计划里的Hash节点。
probe端：Hash Join的另一边，主要是读取数据然后和build端的Hash表进行匹配，这一部分称为probe端。

通常来说，在执行计划正确的情况下，小表是build端，大表是probe端。

Runtime Filter的原理就是在HashJoin过程中，利用build端的数据分布，生成一个轻量的过滤器（filter），发送给probe端，对probe端的数据进行裁剪，从而减少probe端真正参与Hash Join以及网络传输的数据量，以此来提升Hash Join性能。

因此Runtime Filter更适用于大小表Join，且表数据量相差较大的场景，性能将会比普通Join有更多的提升。

使用限制和触发条件

使用限制

仅Hologres V2.0及以上版本支持Runtime Filter。
仅支持Join条件中只有一个字段，如果有多个字段将不会触发Runtime Filter。从Hologres V2.1版本开始，Runtime Filter支持多个字段Join，如果多个Join字段满足触发条件，也会触发Runtime Filter。

触发条件

Hologres本身支持高性能的Join，因此Runtime Filter会根据查询条件在底层自动触发，但是需要SQL满足下述所有条件才能触发：

probe端的数据量在100000行及以上。
扫描的数据量比例：build端 / probe端 <= 0.1（比例越小，越容易触发Runtime Filter）。
Join出的数据量比例：build端 / probe端 <= 0.1（比例越小，越容易触发Runtime Filter）。

Runtime Filter的类型

可以根据以下两个维度对Runtime Filter进行分类。

按照Hash Join的probe端是否需要进行Shuffle，可分为Local和Global类型。
- Local类型：Hologres V2.0及以上版本支持。当Hash Join的probe端不需要Shuffle时，build端数据有如下三种情况，均可以使用Local类型的Runtime Filter：
  - build端和probe端的Join Key是同一种分布方式。
  - build端数据broadcast给probe端。
  - build端数据按照probe端数据的分布方式Shuffle给Probe端。
- Global类型：Hologres V2.2及以上版本支持。当probe端数据需要Shuffle时，Runtime Filter需要合并后才可以使用，这种情况需要使用Global类型的Runtime Filter。
Local类型的Runtime Filer仅可能减少数据扫描量以及参与Hash Join计算的数据量，Global类型的Runtime Filter由于probe端数据会Shuffle，在数据Shuffle之前做过滤还可以减少数据的网络传输量。类型都无需手动指定，引擎会自适应。
按照Filter类型，可分为Bloom Filter、In Filter和MinMAX Filter。
- Bloom Filter：Hologres V2.0及以上版本支持。Bloom Filter具有一定假阳性，导致少过滤一些数据，但其应用范围广，在build端数据量较多是仍能有较高的过滤效率，提升查询性能。
- In Filter：Hologres V2.0及以上版本支持。In Filter在build端数据NDV（Number of Distinct Value，列的非重复值的个数）较小时使用，其会使用build端数据构建一个HashSet发送给probe端进行过滤，In Filter的优势是可以过滤所有应该过滤的数据，且可以和Bitmap索引结合使用。
- MinMAX Filter：Hologres V2.0及以上版本支持。MinMAX Filter会根据build端数据得到最大值和最小值，发送给probe端做过滤，其优势为可能根据元数据信息直接过滤掉文件或一个Batch的数据，减少I/O成本。
三种Filter类型无需您手动指定，Hologres会根据运行时Join情况自适应使用各种类型的Filter。

验证Runtime Filter

如下示例帮助您更好地理解Runtime Filter。

示例1：Join条件中只有1列，使用Local类型Runtime Filter

示例代码：

begin; 
create table test1(x int, y int);
call set_table_property('test1', 'distribution_key', 'x');

create table test2(x int, y int);
call set_table_property('test2', 'distribution_key', 'x');
end;

insert into test1 select t, t from generate_series(1, 100000) t;
insert into test2 select t, t from generate_series(1, 1000) t;
analyze test1;
analyze test2;

explain analyze select * from test1 join test2 on test1.x = test2.x;

执行计划：
- test2表只有1000行，test1表有100000行，build端和probe端的数据量比例是0.01，小于0.1，且Join出来的数据量build端和probe端比例是0.01，小于0.1，满足Runtime Filter的默认触发条件，因此引擎会自动使用Runtime Filter。
- probe端的test1表有Runtime Filter Target Expr节点，表示probe端使用了Runtime Filter下推。
- probe端的scan_rows代表从存储中读取的数据，有100000行，rows代表使用Runtime Filter过滤后，scan算子的行数，为1000行，可以从这两个数据上看Runtime Filter的过滤效果。

示例2：Join条件中有多列（Hologres V2.1版本支持），使用Local类型Runtime Filter

示例代码：

drop table if exists test1, test2;
begin;
create table test1(x int, y int);
create table test2(x int, y int);
end;
insert into test1 select t, t from generate_series(1, 1000000) t;
insert into test2 select t, t from generate_series(1, 1000) t;
analyze test1;
analyze test2;

explain analyze select * from test1 join test2 on test1.x = test2.x and test1.y = test2.y;

执行计划：
- Join条件有多列，Runtime Filter也生成了多列。
- build端broadcast，可以使用Local类型的Runtime Filter。

示例3：Global类型Runtime Filter支持Shuffle Join（Hologres V2.2版本支持）

示例代码：

SET hg_experimental_enable_result_cache = OFF;

drop table if exists test1, test2;
begin;
create table test1(x int, y int);
create table test2(x int, y int);
end;
insert into test1 select t, t from generate_series(1, 100000) t;
insert into test2 select t, t from generate_series(1, 1000) t;
analyze test1;
analyze test2;

explain analyze select * from test1 join test2 on test1.x = test2.x;

执行计划：
从上述执行计划可以看出，probe端数据被Shuffle到Hash Join算子，引擎会自动使用Global Runtime Filter来加速查询。

示例4：In类型的Filter结合bitmap索引（Hologres V2.2版本支持）

示例代码：

set hg_experimental_enable_result_cache=off;

drop table if exists test1, test2;

begin;
create table test1(x text, y text);
call set_table_property('test1', 'distribution_key', 'x');
call set_table_property('test1', 'bitmap_columns', 'x');
call set_table_property('test1', 'dictionary_encoding_columns', '');

create table test2(x text, y text);
call set_table_property('test2', 'distribution_key', 'x');
end;

insert into test1 select t::text, t::text from generate_series(1, 10000000) t;

insert into test2 select t::text, t::text from generate_series(1, 50) t;

analyze test1;
analyze test2;

explain analyze select * from test1 join test2 on test1.x = test2.x;

执行计划：
从上述执行计划可以看出，在probe端的scan算子上，使用了bitmap，因为In Filter可以精确过滤，因此过滤后还剩50行，scan算子中的scan_rows为700多万，比原始行数1000万少，这是因为In Filter可以推到存储引擎，有可能减少I/O成本，最终结果是从存储引擎中读取的数据变少了，In类型的Runtime Filter结合bitmap通常在Join Key为STRING类型时，有明显作用。

示例5：MinMax类型的Filter减少I/O（Hologres V2.2版本支持）

示例代码：

set hg_experimental_enable_result_cache=off;

drop table if exists test1, test2;

begin;
create table test1(x int, y int);
call set_table_property('test1', 'distribution_key', 'x');

create table test2(x int, y int);
call set_table_property('test2', 'distribution_key', 'x');
end;

insert into test1 select t::int, t::int from generate_series(1, 10000000) t;
insert into test2 select t::int, t::int from generate_series(1, 100000) t;

analyze test1;
analyze test2;

explain analyze select * from test1 join test2 on test1.x = test2.x;

执行计划：
从上述执行计划可以看出，probe端scan算子从存储引擎读取的行数为32万多，比原始行数1000万少了很多，这是因为Runtime Filter被下推到存储引擎，利用一个batch数据的meta信息整批过滤数据，有可能大量减少I/O成本。通常在Join Key为数值类型，且build端值域范围比probe端的值域范围小时，有明显效果。