Performance white paper: cloud-native multi-model database Lindorm

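This document describes performance benchmarking of the Lindorm search engine with OpenSearch Benchmark. It covers the instance configuration, test data, test metrics, test tasks, and test results.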

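Lindorm instance configuration

Item	Configuration
Search engine version	3.9.30.2
Data node specification	8 cores, 32 GB
Number of data nodes	3
Storage type	Performance cloud storage
Storage capacity	320 GiB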

Test data

The tests use the official workloads geonames, http_logs, and nyc_taxis. The datasets are described below:

Workload	Description	Test items	Test results
geonames	Geographic location data from around the world. For details, see https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/geonames	Term queries; geospatial queries (point-in-polygon, bounding box); geodistance queries; aggregations	geonames test results
http_logs	HTTP access logs from the 1998 World Cup website. For details, see https://github.com/opensearch-project/opensearch-benchmark-workloads/blob/main/http_logs/README.md	Time range queries; term queries; aggregations for metrics such as request count and average response size; cardinality aggregations	http_logs test results
nyc_taxis	Ride records of New York City yellow taxis in 2015. For details, see https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/nyc_taxis	Range queries; term queries on various fields; geodistance queries; aggregations	nyc_taxis test results
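
As a rough illustration of how the three workloads above might be launched, the sketch below assembles OpenSearch Benchmark command lines in Python. It assumes the `execute-test` subcommand of OpenSearch Benchmark 1.x (newer releases may name this subcommand differently) and the `benchmark-only` pipeline, which targets an already running cluster. The host name `lindorm-search.example.com:9200` and the helper `build_osb_command` are hypothetical and are not taken from the original test setup.

```python
import shlex

# Workload names as listed in the table above.
WORKLOADS = ["geonames", "http_logs", "nyc_taxis"]


def build_osb_command(workload, target_hosts, include_tasks=None):
    """Return the argv list for one benchmark run against an existing cluster.

    target_hosts is a placeholder "host:port" string; replace it with the
    real search endpoint of the instance under test.
    """
    cmd = [
        "opensearch-benchmark",
        "execute-test",
        f"--workload={workload}",
        f"--target-hosts={target_hosts}",
        # benchmark-only: do not provision a cluster, only send load to it.
        "--pipeline=benchmark-only",
    ]
    if include_tasks:
        # Optionally restrict the run to selected tasks, for example "term".
        cmd.append("--include-tasks=" + ",".join(include_tasks))
    return cmd


if __name__ == "__main__":
    for workload in WORKLOADS:
        # Hypothetical endpoint, shown only to make the output concrete.
        argv = build_osb_command(workload, "lindorm-search.example.com:9200")
        print(shlex.join(argv))
```

Building the command as a list keeps quoting explicit, so the same list can be passed to `subprocess.run` without a shell.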

Test metrics

Some of the important metrics are listed below. For more information, see https://opensearch.org/docs/latest/benchmark/reference/metrics/metric-keys/.

Metric type	Metric	Description
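Primary shard metrics	Cumulative indexing time of primary shards	Total cumulative indexing time of all primary shards
	Min cumulative indexing time across primary shards	Minimum cumulative indexing time across primary shards
	Median cumulative indexing time across primary shards	Median cumulative indexing time across primary shards
	Max cumulative indexing time across primary shards	Maximum cumulative indexing time across primary shards
	Cumulative indexing throttle time of primary shards	Cumulative time that indexing was throttled on primary shards
	Min cumulative indexing throttle time across primary shards	Minimum cumulative indexing throttle time across primary shards
	Median cumulative indexing throttle time across primary shards	Median cumulative indexing throttle time across primary shards
	Max cumulative indexing throttle time across primary shards	Maximum cumulative indexing throttle time across primary shards
	Cumulative merge time of primary shards	Cumulative running time of merges on primary shards
	Cumulative merge count of primary shards	Cumulative number of merges on primary shards
	Min cumulative merge time across primary shards	Minimum cumulative merge time across primary shards
	Median cumulative merge time across primary shards	Median cumulative merge time across primary shards
	Max cumulative merge time across primary shards	Maximum cumulative merge time across primary shards
	Cumulative merge throttle time of primary shards	Cumulative time that merges were throttled on primary shards
	Min cumulative merge throttle time across primary shards	Minimum cumulative merge throttle time across primary shards
	Median cumulative merge throttle time across primary shards	Median cumulative merge throttle time across primary shards
	Max cumulative merge throttle time across primary shards	Maximum cumulative merge throttle time across primary shards
	Cumulative refresh time of primary shards	Cumulative refresh time of primary shards
	Cumulative refresh count of primary shards	Cumulative number of refreshes on primary shards
	Min cumulative refresh time across primary shards	Minimum cumulative refresh time across primary shards
	Median cumulative refresh time across primary shards	Median cumulative refresh time across primary shards
	Max cumulative refresh time across primary shards	Maximum cumulative refresh time across primary shards
	Cumulative flush time of primary shards	Cumulative time of flushes (writing buffered data to disk) on primary shards
	Cumulative flush count of primary shards	Cumulative number of flushes (writing buffered data to disk) on primary shards
	Min cumulative flush time across primary shards	Minimum cumulative flush time across primary shards
	Median cumulative flush time across primary shards	Median cumulative flush time across primary shards
	Max cumulative flush time across primary shards	Maximum cumulative flush time across primary shards
	Store size	Index size
	Translog size	Transaction log (translog) size
	Segment count	Number of segments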
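Garbage collection metrics	Total Young Gen GC time	Total running time of young-generation garbage collection
	Total Young Gen GC count	Total number of young-generation garbage collections
	Total Old Gen GC time	Total running time of old-generation garbage collection
	Total Old Gen GC count	Total number of old-generation garbage collections
Heap metrics	Heap used for segments	Heap memory used by segments
	Heap used for doc values	Heap memory used by doc values
	Heap used for terms	Heap memory used by terms
	Heap used for norms	Heap memory used by norms
	Heap used for points	Heap memory used by points
	Heap used for stored fields	Heap memory used by stored fields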
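Throughput	Min Throughput	Minimum throughput of each task
	Median Throughput	Median throughput of each task
	Max Throughput	Maximum throughput of each task
Latency	50th percentile latency	Latency within which 50% of requests completed
	90th percentile latency	Latency within which 90% of requests completed
	99.9th percentile latency	Latency within which 99.9% of requests completed
	100th percentile latency	Maximum latency observed across all requests
Service time	50th percentile service time	Service time within which 50% of requests completed
	90th percentile service time	Service time within which 90% of requests completed
	99.9th percentile service time	Service time within which 99.9% of requests completed
	100th percentile service time	Maximum service time observed across all requests
Error rate	error rate	Response error rate of each task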
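
To make the throughput, latency, and service time rows above more concrete, the following sketch computes the same kinds of summary values from a handful of made-up samples. It assumes the nearest-rank percentile definition, which is not necessarily how OpenSearch Benchmark computes its percentiles, and it models latency as service time plus the time a request waits before it is sent, following the definitions in the OpenSearch Benchmark documentation. All sample values and the helper `percentile` are hypothetical.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical per-request measurements in milliseconds.
service_times = [4.0, 5.0, 5.0, 6.0, 7.0, 8.0, 9.0, 12.0, 20.0, 50.0]
# Hypothetical time each request waited before being sent.
wait_times = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 10.0]
# Latency includes the wait, so it is never lower than the service time.
latencies = [s + w for s, w in zip(service_times, wait_times)]

# Hypothetical throughput samples for one task, in operations per second.
throughput = [950.0, 1000.0, 1020.0, 1100.0, 980.0]

if __name__ == "__main__":
    for pct in (50, 90, 99.9, 100):
        print(pct, percentile(service_times, pct), percentile(latencies, pct))
    print(min(throughput), statistics.median(throughput), max(throughput))
```

With these samples the median service time and median latency coincide, while the tail percentiles diverge because only the slowest requests had to wait. This is why the two metric groups are reported separately.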

Test tasks

The metrics of the instance can be viewed per task. The main test tasks are as follows:

Task	Description
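index-append	Append-only index writes
index-stats	Retrieves index statistics
node-stats	Retrieves node statistics
default	Default search request
term	Exact term match
phrase	Exact phrase query
country_agg_uncached	Aggregation without caching
country_agg_cached	Aggregation with caching
scroll	Scroll query over a large result set
expression	Expression query
painless_static	Static Painless script
painless_dynamic	Dynamic Painless script
decay_geo_gauss_function_score	Function score query that adjusts scores with a distance-based Gaussian decay
decay_geo_gauss_script_score	Script score query that adjusts scores with a distance-based Gaussian decay
field_value_function_score	Function score query that scores by field value
field_value_script_score	Script score query that scores by field value
large_terms	Query with a large number of terms
large_filtered_terms	Filtered query with a large number of terms
large_prohibited_terms	Query that excludes a large number of terms
desc_sort_population	Sort by population in descending order
asc_sort_population	Sort by population in ascending order
asc_sort_with_after_population	Continue the ascending sort by population after a given starting point
desc_sort_geonameid	Sort by geoname ID in descending order
desc_sort_with_after_geonameid	Continue the descending sort by geoname ID after a given starting point
asc_sort_geonameid	Sort by geoname ID in ascending order
asc_sort_with_after_geonameid	Continue the ascending sort by geoname ID after a given starting point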
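range	Range query
status-200s-in-range	Query for status codes in the 200 range
status-400s-in-range	Query for status codes in the 400 range
desc_sort_timestamp	Sort by timestamp in descending order
asc_sort_timestamp	Sort by timestamp in ascending order
desc_sort_with_after_timestamp	Continue the descending sort by timestamp after a given starting point
asc_sort_with_after_timestamp	Continue the ascending sort by timestamp after a given starting point
desc-sort-timestamp-after-force-merge-1-seg	Sort by timestamp in descending order after a force merge to one segment
asc-sort-timestamp-after-force-merge-1-seg	Sort by timestamp in ascending order after a force merge to one segment
desc-sort-with-after-timestamp-after-force-merge-1-seg	After a force merge to one segment, sort in descending order from a given timestamp using search_after
asc-sort-with-after-timestamp-after-force-merge-1-seg	After a force merge to one segment, sort in ascending order from a given timestamp using search_after
autohisto_agg	Auto histogram aggregation query
date_histogram_agg	Date histogram aggregation query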
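
For readers unfamiliar with the "with_after" and histogram tasks listed above, the sketch below shows what request bodies of this kind might look like in the OpenSearch query DSL, written as Python dictionaries. These are illustrative bodies only, not the operation definitions shipped with the workloads. The field names `@timestamp` and `dropoff_datetime`, the starting value `1998-06-10`, the aggregation name `by_interval`, and both helper functions are assumptions made for the example.

```python
import json


def sort_with_after_body(field, order, after_value):
    """Search body that sorts on one field and resumes after a given
    sort value, using the search_after parameter."""
    return {
        "query": {"match_all": {}},
        "sort": [{field: order}],
        # search_after takes one value per sort key, in the same order.
        "search_after": [after_value],
    }


def date_histogram_body(field, calendar_interval):
    """Search body that buckets documents by calendar interval on a date
    field and returns no hits, only the aggregation result."""
    return {
        "size": 0,
        "aggs": {
            "by_interval": {
                "date_histogram": {
                    "field": field,
                    "calendar_interval": calendar_interval,
                }
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical field names and values, chosen only for illustration.
    print(json.dumps(sort_with_after_body("@timestamp", "asc", "1998-06-10"), indent=2))
    print(json.dumps(date_histogram_body("dropoff_datetime", "month"), indent=2))
```

Unlike scroll, search_after pages through results without keeping a server-side cursor, which is presumably why the workloads measure the two separately.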