Use the fetch_summary clause to fetch summaries - OpenSearch - Alibaba Cloud

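Clause description

Adding a fetch_summary clause to the query string lets you run only the second phase of a query, that is, fetching the summary. The engine currently supports three ways to fetch a summary: by docid, by pk hash value, and by raw pk value.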

Clause syntax

Fetch summary by docid

  config=fetch_summary_type:docid&&fetch_summary=gid[, gid]

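Set fetch_summary_type to docid in the config clause, and list the gids whose summaries you want in the fetch_summary clause. You generally do not need to understand what a gid means internally; just take the gid values from the first-phase query results.

Example: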

config=format:xml,fetch_summary_type:docid,cluster:daogou&&query=test&&fetch_summary=daogou|6|0|0|0|00000000000000004cd645cfd1c63041|184140777,daogou|6|0|0|1|00000000000000005b3ceae33e5ab800|184140777 

Example response:
<?xml version="1.0" encoding="UTF-8"?>
<Root>
<TotalTime>0.003</TotalTime>
<hits numhits="2" totalhits="0" coveredPercent="0.00">
<hit cluster_name="daogou" hash_id="0" docid="0" gid="daogou|6|0|0|0|00000000000000004cd645cfd1c63041|184140777">
<fields>
<id>1</id>
</fields>
<property>
</property>
<sortExprValues></sortExprValues>
</hit>
<hit cluster_name="daogou" hash_id="0" docid="1" gid="daogou|6|0|0|1|00000000000000005b3ceae33e5ab800|184140777">
<fields>
<id>2</id>
</fields>
<property>
</property>
<sortExprValues></sortExprValues>
</hit>
</hits>
<AggregateResults>
</AggregateResults>
<Error>
<ErrorCode>0</ErrorCode>
<ErrorDescription></ErrorDescription>
</Error>
</Root>

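Note: in the response to a standalone summary query, totalhits is shown as 0 and coveredPercent is shown as 0. The hits appear in the same order as the gids in the fetch_summary clause of the query string. If a summary cannot be fetched (for example, because the hashid or docid does not exist), the fields element of the corresponding hit is empty.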
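
The ordering guarantee in the note above can be used to match each requested gid with its hit. The following is a minimal Python sketch, assuming the response uses the XML format shown above. The function name pair_hits_with_gids is hypothetical and is not part of the engine's API.

```python
import xml.etree.ElementTree as ET


def pair_hits_with_gids(requested_gids, response_xml):
    """Pair each requested gid with the fields of the hit at the same position.

    Relies on the documented guarantee that hits are returned in the same
    order as the gids in the fetch_summary clause. A hit whose <fields>
    element is empty maps to an empty dict, meaning no summary was found.
    """
    root = ET.fromstring(response_xml)
    hits = root.findall("./hits/hit")
    result = []
    for gid, hit in zip(requested_gids, hits):
        fields = hit.find("fields")
        values = {}
        if fields is not None:
            # Each child of <fields> is one summary field, e.g. <id>1</id>.
            values = {child.tag: (child.text or "") for child in fields}
        result.append((gid, values))
    return result
```

An empty dict for a gid signals that its summary was not returned, so the caller can decide whether to retry or skip that document.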

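Fetch summary by pk hash value

Fetching a summary by pk hash value works almost the same way as fetching by docid: the documents are still identified by gid. The differences are:

fetch_summary_type must be set to pk in the config clause.
Although both methods identify a document by gid, pk and docid behave differently. A pk is generally taken to identify a document uniquely, whereas a docid does not. Fetching by docid therefore also needs the full version and incremental version numbers to locate the document, while fetching by pk can ignore version information. As a result, the full version, incremental version, and docid fields of the gid have no effect when fetching by pk.
To fetch a summary by pk hash value, the schema of the cluster must configure a primary key index and set "has_primary_key_attribute" : true

Example: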

config=fetch_summary_type:pk&&fetch_summary=daogou|6|100|100|100|00000000000000004cd645cfd1c63041|184140777,daogou|6|200|200|200|00000000000000005b3ceae33e5ab800|184140777
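
Because the docid and pk modes both identify documents by gid and differ only in fetch_summary_type, one helper can assemble either query string. This is a minimal Python sketch; build_fetch_summary_query and its parameters are hypothetical names introduced for illustration, and URL encoding of the resulting string is not handled here.

```python
def build_fetch_summary_query(gids, fetch_type="docid", extra_config=None):
    """Build a query string that fetches summaries for the given gids.

    fetch_type must be "docid" or "pk"; both modes identify documents by gid.
    extra_config holds optional config items such as {"format": "xml"}.
    """
    if fetch_type not in ("docid", "pk"):
        raise ValueError("fetch_type must be 'docid' or 'pk'")
    if not gids:
        raise ValueError("at least one gid is required")
    # Copy the caller's config so the input dict is not modified.
    config_items = dict(extra_config or {})
    config_items["fetch_summary_type"] = fetch_type
    config = ",".join(f"{key}:{value}" for key, value in config_items.items())
    # gids are comma-separated inside the fetch_summary clause.
    return f"config={config}&&fetch_summary={','.join(gids)}"
```

Calling it with the two gids from the example above and fetch_type="pk" reproduces that example query string.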

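Fetch summary by raw pk value

Fetching by raw pk differs from both methods above: it does not use a gid to locate a document, but uses the document's raw pk value directly. To use this method:

Set fetch_summary_type to rawpk in the config clause.
The schema of the target cluster must configure a primary key index, and the hash field configured for the cluster must be the same field as the primary key.

Example: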

config=fetch_summary_type:rawpk&&fetch_summary=cluster1:pk1,pk2;cluster2:pk3,pk4
Response:
<?xml version="1.0" encoding="UTF-8"?>
<Root>
<TotalTime>0.003</TotalTime>
<hits numhits="2" totalhits="0" coveredPercent="0.00">
<hit cluster_name="daogou" hash_id="17871" docid="0" gid="daogou|0|0|17871|0|4cd645cfd1c63041391f27d3272cfeeb|4294967295">
<fields>
<id>1</id>
</fields>
<property>
</property>
<sortExprValues></sortExprValues>
<raw_pk>111</raw_pk>
</hit>
<hit cluster_name="daogou" hash_id="60131" docid="0" gid="daogou|0|0|60131|0|5b3ceae33e5ab800352f040b4d9c05e9|4294967295">
<fields>
<id>2</id>
</fields>
<property>
</property>
<sortExprValues></sortExprValues>
<raw_pk>112</raw_pk>
</hit>
</hits>
<AggregateResults>
</AggregateResults>
<Error>
<ErrorCode>0</ErrorCode>
<ErrorDescription></ErrorDescription>
</Error>
</Root>

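Note: cluster is the name of the cluster to query. The response contains an additional rawpk field (shown as raw_pk in the XML above). Because a raw pk can contain arbitrary characters, it may conflict with the reserved characters of the query string, so you must escape every reserved character by putting a \ (backslash) in front of it. The characters to escape are: comma, colon, semicolon, ampersand (&), equals sign, and the backslash itself. For example, if the raw pk is abc,d:e\, the value passed to the engine must be escaped as abc\,d\:e\\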
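
The escaping rule above can be sketched in Python as follows. The names escape_raw_pk and build_rawpk_query are hypothetical helpers introduced for illustration. The set of reserved characters is taken from the note above, and URL encoding of the final string is not handled here.

```python
# Reserved characters listed in the note: , : ; & = and the backslash itself.
RESERVED_CHARS = set(",:;&=\\")


def escape_raw_pk(raw_pk):
    """Prefix every reserved character in a raw pk with a backslash."""
    return "".join("\\" + ch if ch in RESERVED_CHARS else ch for ch in raw_pk)


def build_rawpk_query(pks_by_cluster):
    """Build a rawpk query string from a {cluster_name: [raw_pk, ...]} mapping.

    Raw pks of one cluster are comma-separated; clusters are separated by
    semicolons, following the cluster1:pk1,pk2;cluster2:pk3,pk4 form.
    """
    groups = []
    for cluster, pks in pks_by_cluster.items():
        escaped = ",".join(escape_raw_pk(pk) for pk in pks)
        groups.append(f"{cluster}:{escaped}")
    return "config=fetch_summary_type:rawpk&&fetch_summary=" + ";".join(groups)
```

Escaping is applied to each raw pk before joining, so the commas, colons, and semicolons that act as separators in the clause are never escaped themselves.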

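Notes

The fetch_summary clause is optional.
A summary fetch may report that the summary does not exist. Possible causes are a summary fetch timeout due to an unstable cluster, or a real-time data update that leaves the document momentarily deleted (an update deletes the document first and then adds it again).
Fetching summaries by docid is not recommended, because docid is not a stable value: it can change when an incremental index is switched in or when real-time data is updated.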