Use vector optimization to reduce resource consumption and improve search efficiency - Alibaba Cloud Elasticsearch

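The Alibaba Cloud ES team analyzed how vector fields are used during index building and querying. Based on that analysis, the Serverless product provides default optimizations and configuration options for indexed vector fields, such as excluding vector fields from stored data and compressing vectors. These reduce the resource consumption of vector fields and improve search efficiency.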

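Go to the Vector Optimization page

Go to the application details page.
1. Log on to the Elasticsearch Serverless console and switch to the target region in the top menu bar.
2. In the left-side navigation pane, click Application Management, then click the name of the application you created to open its details page.
In the left-side navigation pane, choose Search Tuning Center > Vector Optimization. On this tab, you can tune search for the application by configuring optimization items such as intelligent vector field filtering, the default vector quantization strategy, and adaptive vector warm-up.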

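Intelligent vector field filtering

Overview

Excludes vector fields from the stored _source field. This saves storage and transfer resources, speeds up searches, and avoids exposing data unnecessarily.

Background

ES stores the original JSON document passed at index time in the _source field. By default, every hit in a search response contains the full content of that field. When a document contains high-dimensional dense vector fields (dense_vector), _source can become very large. Loading and serializing this vector data incurs expensive I/O and network overhead, which seriously degrades kNN search performance and response time.

Note

Vector fields (for example, "image_vector": [0.1, 0.5, ..., 0.9]) are usually long and take up a large amount of space.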
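
To make the size claim concrete, the following minimal sketch compares the serialized JSON size of one document with and without its vector field. The 1024-dimension vector, the random values, and the helper name source_size_bytes are assumptions chosen for illustration; real sizes depend on your embedding model and on how the values are formatted.

```python
import json
import random


def source_size_bytes(doc: dict) -> int:
    """Return the size in bytes of the compact JSON serialization of a document."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


random.seed(0)  # fixed seed so repeated runs report the same sizes
dims = 1024     # hypothetical embedding dimension, not taken from this document

doc = {
    "my_text": "red running shoes",
    "my_vector": [random.uniform(-1.0, 1.0) for _ in range(dims)],
}
# The same document as it would look with the vector excluded from _source.
doc_without_vector = {k: v for k, v in doc.items() if k != "my_vector"}

with_vector = source_size_bytes(doc)
without_vector = source_size_bytes(doc_without_vector)

print(f"_source with vector:    {with_vector} bytes")
print(f"_source without vector: {without_vector} bytes")
```

In this sketch almost all of the serialized document is the vector itself, which is the data that every search hit would otherwise load and return.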

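How it works

When intelligent vector field filtering is enabled, the system uses the excludes mapping parameter to exclude dense vector fields from the stored _source. This avoids loading and returning large amounts of raw vector data during searches, which lowers network overhead and reduces index size. This optimization item is enabled by default, and you can enable or disable it as needed.

Important

If you specify the excludes parameter yourself, your configuration takes precedence. For example, if you exclude the my_text field in excludes, query results exclude only my_text and still include the vector fields.
Because kNN search relies on separate data structures, vectors excluded from _source can still be used in kNN searches.
Reindex, Update, and Update By Query operations usually need the _source field. Excluding vector fields from _source may cause data loss or errors in these operations. For example, after a reindex, documents in the new index may be missing the dense_vector field.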
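
The precedence rule above can be modeled in a few lines. The following is only a sketch of the behavior as this section describes it, not the service's implementation. The function name effective_source_excludes is hypothetical, only top-level properties are inspected, and the handling of an empty user-specified list is an assumption because this document does not cover that case.

```python
from typing import Any


def effective_source_excludes(mappings: dict[str, Any], filtering_enabled: bool) -> list[str]:
    """Model which fields end up in mappings._source.excludes.

    Illustrative only: a reading of the rules in this section.
    """
    user_excludes = mappings.get("_source", {}).get("excludes")
    if user_excludes:
        # A user-specified list takes precedence, even if it leaves vector
        # fields in _source. (Assumption: an empty list counts as unspecified.)
        return list(user_excludes)
    if not filtering_enabled:
        return []
    # Assumption: this sketch inspects top-level properties only.
    return [
        name
        for name, spec in mappings.get("properties", {}).items()
        if spec.get("type") == "dense_vector"
    ]


properties = {
    "my_vector": {"type": "dense_vector", "dims": 3},
    "my_text": {"type": "keyword"},
}

# Filtering enabled, no user-specified excludes.
print(effective_source_excludes({"properties": properties}, True))   # ['my_vector']
# Filtering enabled, user excludes my_text.
print(effective_source_excludes(
    {"_source": {"excludes": ["my_text"]}, "properties": properties}, True
))                                                                    # ['my_text']
# Filtering disabled.
print(effective_source_excludes({"properties": properties}, False))  # []
```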

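Examples

The following scenarios and sample code show the effect of enabling or disabling intelligent vector field filtering.

Scenario 1: Intelligent vector field filtering is enabled and excludes is not specified.

Create an example index named my_vector_index.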

PUT /my_vector_index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 3
      },
      "my_text" : {
        "type" : "keyword"
      }
    }
  }
}

View the created index.

GET /my_vector_index

The response is as follows. The system automatically excludes the dense vector field, so my_vector appears in mappings._source.excludes.

Sample code

{
  "my_vector_index": {
    "aliases": {},
    "mappings": {
      "_source": {
        "excludes": [
          "my_vector"
        ]
      },
      "properties": {
        "my_text": {
          "type": "keyword"
        },
        "my_vector": {
          "type": "dense_vector",
          "dims": 3,
          "index": true,
          "similarity": "cosine",
          "index_options": {
            "type": "int8_hnsw",
            "m": 16,
            "ef_construction": 100
          }
        }
      }
    },
    "settings": {
      "index": {
        "max_prefix_length": "50",
        "mapping": {
          "nested_objects": {
            "limit": "100"
          },
          "field_name_length": {
            "limit": "100"
          }
        },
        "refresh_interval": "1s",
        "number_of_shards": "1",
        "max_wildcard_length": "50",
        "max_refresh_listeners": "20",
        "max_regex_length": "50",
        "max_terms_count": "1024",
        "number_of_replicas": "1"
      }
    }
  }
}

Scenario 2: Intelligent vector field filtering is enabled and excludes is specified.

Create an example index named my_vector_index and exclude the my_text field in excludes.

PUT /my_vector_index
{
  "mappings": {
    "_source": {
        "excludes":["my_text"]
    },
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 3
      },
      "my_text" : {
        "type" : "keyword"
      }
    }
  }
}

View the created index.

GET /my_vector_index

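The response is as follows. The system does not exclude the dense vector field and excludes only the specified field: mappings._source.excludes contains my_text, which was specified when the index was created, and does not contain my_vector.

Sample code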

{
  "my_vector_index": {
    "aliases": {},
    "mappings": {
      "_source": {
        "excludes": [
          "my_text"
        ]
      },
      "properties": {
        "my_text": {
          "type": "keyword"
        },
        "my_vector": {
          "type": "dense_vector",
          "dims": 3,
          "index": true,
          "similarity": "cosine",
          "index_options": {
            "type": "int8_hnsw",
            "m": 16,
            "ef_construction": 100
          }
        }
      }
    },
    "settings": {
      "index": {
        "apack": {
          "metadata": {
            "index_dense_vector_value_count_limit": "1000000",
            "index_max_storage": "20.0",
            "index_dense_vector_mem_limit": "1.0"
          }
        },
        "max_prefix_length": "50",
        "mapping": {
          "nested_objects": {
            "limit": "100"
          },
          "field_name_length": {
            "limit": "100"
          }
        },
        "refresh_interval": "1s",
        "number_of_shards": "1",
        "blocks": {
          "read_only_allow_delete": "false"
        },
        "max_wildcard_length": "50",
        "max_refresh_listeners": "20",
        "max_regex_length": "50",
        "max_terms_count": "1024",
        "number_of_replicas": "1"
      }
    }
  }
}

Scenario 3: Intelligent vector field filtering is disabled.

Create an example index named my_vector_index.

PUT /my_vector_index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 3
      },
      "my_text" : {
        "type" : "keyword"
      }
    }
  }
}

View the created index.

GET /my_vector_index

The response is as follows. The system does not exclude any field, so the response contains no mappings._source.excludes entry.

Sample code

{
  "my_vector_index": {
    "aliases": {},
    "mappings": {
      "properties": {
        "my_text": {
          "type": "keyword"
        },
        "my_vector": {
          "type": "dense_vector",
          "dims": 3,
          "index": true,
          "similarity": "cosine",
          "index_options": {
            "type": "int8_hnsw",
            "m": 16,
            "ef_construction": 100
          }
        }
      }
    },
    "settings": {
      "index": {
        "max_prefix_length": "50",
        "mapping": {
          "nested_objects": {
            "limit": "100"
          },
          "field_name_length": {
            "limit": "100"
          }
        },
        "refresh_interval": "1s",
        "number_of_shards": "1",
        "max_wildcard_length": "50",
        "max_refresh_listeners": "20",
        "max_regex_length": "50",
        "max_terms_count": "1024",
        "number_of_replicas": "1"
      }
    }
  }
}

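Default vector quantization strategy

Overview

Converts vectors stored as float32 (4 bytes per value) into a lower-precision format (for example, int8, 1 byte per value) to compress them. Quantization substantially reduces storage and memory usage and speeds up computation while keeping accuracy high in most scenarios, balancing efficiency and accuracy.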
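
To show what converting float32 values to int8 involves, the following is a minimal sketch of plain min-max scalar quantization. It is a generic technique used here for illustration only. This document does not describe how the service computes its quantization bounds, so the actual quantizer may differ, and the function names quantize_int8 and dequantize_int8 are hypothetical.

```python
def quantize_int8(vector: list[float]) -> tuple[list[int], float, float]:
    """Map each float onto one of 256 evenly spaced levels stored as a signed byte."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 when all values are equal
    codes = [round((x - lo) / scale) - 128 for x in vector]
    return codes, lo, scale


def dequantize_int8(codes: list[int], lo: float, scale: float) -> list[float]:
    """Approximately reconstruct the original floats from the int8 codes."""
    return [(c + 128) * scale + lo for c in codes]


vector = [0.1, 0.5, 0.9]
codes, lo, scale = quantize_int8(vector)
restored = dequantize_int8(codes, lo, scale)

print("int8 codes:", codes)
print("restored:  ", restored)
print("max error: ", max(abs(a - b) for a, b in zip(vector, restored)))
```

Each value now needs 1 byte instead of 4, at the cost of a reconstruction error of at most half a quantization step in this sketch.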

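Strategies

You can choose a quantization strategy as needed, but you cannot disable this feature. The following strategies are supported:

int8 (default): each value takes 1 byte, which reduces vector memory usage by a factor of 4.
int4: each value takes half a byte, which reduces vector memory usage by a factor of 8.
BBQ: each value takes 1 bit, so 8 values fit in 1 byte, which reduces vector memory usage by a factor of 32.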
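
The reduction factors above follow directly from the number of bits per value. The following minimal sketch works through the arithmetic for a hypothetical 1024-dimension vector. It counts raw vector values only and ignores any per-vector metadata or graph overhead, which this document does not quantify.

```python
# Bits used per vector value under each storage format listed above.
BITS_PER_VALUE = {"float32": 32, "int8": 8, "int4": 4, "bbq": 1}


def vector_bytes(dims: int, strategy: str) -> float:
    """Raw size in bytes of one vector's values under the given strategy."""
    return dims * BITS_PER_VALUE[strategy] / 8


dims = 1024  # hypothetical dimension chosen for illustration
baseline = vector_bytes(dims, "float32")
for strategy in BITS_PER_VALUE:
    size = vector_bytes(dims, strategy)
    print(f"{strategy:8s} {size:7.0f} bytes  ({baseline / size:.0f}x smaller than float32)")
```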

Note

If you specify a quantization type when you create an index, the index-level configuration takes precedence.
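
The rule in this note can be sketched as a small lookup. This is a model of the documented behavior rather than the service's code. The function name resolve_index_type is hypothetical. The int8_hnsw and int4_hnsw values appear in the responses in this document, while the bbq_hnsw value for the BBQ strategy is an assumption.

```python
# Application-level strategy -> index_options.type written into the mapping.
# "bbq_hnsw" is an assumption; the other two values appear in this document.
STRATEGY_TO_INDEX_TYPE = {
    "int8": "int8_hnsw",
    "int4": "int4_hnsw",
    "bbq": "bbq_hnsw",
}


def resolve_index_type(field_mapping: dict, app_default: str = "int8") -> str:
    """Return the quantization type a dense_vector field would end up with."""
    explicit = field_mapping.get("index_options", {}).get("type")
    if explicit is not None:
        # A type given at index creation overrides the application default.
        return explicit
    return STRATEGY_TO_INDEX_TYPE[app_default]


# No type specified at index creation: the application default applies.
print(resolve_index_type({"type": "dense_vector", "dims": 3}))  # int8_hnsw
# Type specified at index creation: it takes precedence.
print(resolve_index_type(
    {"type": "dense_vector", "dims": 4, "index_options": {"type": "int4_hnsw"}}
))                                                              # int4_hnsw
```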

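Examples

The following scenarios and sample code show the effect of the default vector quantization strategy.

Scenario 1: The application's default quantization strategy is int8, and no quantization type is specified when the index is created.

Create an example index named my_vector_index.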

PUT /my_vector_index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 3
      },
      "my_text" : {
        "type" : "keyword"
      }
    }
  }
}

View the created index.

GET /my_vector_index

The response is as follows. The quantization type of the my_vector field is set to int8_hnsw.

Sample code

{
  "my_vector_index": {
    "aliases": {},
    "mappings": {
      "properties": {
        "my_text": {
          "type": "keyword"
        },
        "my_vector": {
          "type": "dense_vector",
          "dims": 3,
          "index": true,
          "similarity": "cosine",
          "index_options": {
            "type": "int8_hnsw",
            "m": 16,
            "ef_construction": 100
          }
        }
      }
    },
    "settings": {
      "index": {
        "max_prefix_length": "50",
        "mapping": {
          "nested_objects": {
            "limit": "100"
          },
          "field_name_length": {
            "limit": "100"
          }
        },
        "refresh_interval": "1s",
        "number_of_shards": "1",
        "max_wildcard_length": "50",
        "max_refresh_listeners": "20",
        "max_regex_length": "50",
        "max_terms_count": "1024",
        "number_of_replicas": "1"
      }
    }
  }
}

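Scenario 2: The application's default quantization strategy is int8, and a quantization type is specified when the index is created.

Create an example index named my_vector_index and set the quantization type of the my_vector field to int4_hnsw.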

PUT /my_vector_index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 4,
        "index_options": {
            "type": "int4_hnsw"
        }
      },
      "my_text" : {
        "type" : "keyword"
      }
    }
  }
}

View the created index.

GET /my_vector_index

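The response is as follows. The quantization type of the my_vector field is set to int4_hnsw, as specified when the index was created.

Sample code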

{
  "my_vector_index": {
    "aliases": {},
    "mappings": {
      "properties": {
        "my_text": {
          "type": "keyword"
        },
        "my_vector": {
          "type": "dense_vector",
          "dims": 4,
          "index": true,
          "similarity": "cosine",
          "index_options": {
            "type": "int4_hnsw",
            "m": 16,
            "ef_construction": 100,
            "confidence_interval": 0
          }
        }
      }
    },
    "settings": {
      "index": {
        "apack": {
          "metadata": {
            "index_dense_vector_value_count_limit": "1000000",
            "index_max_storage": "20.0",
            "index_dense_vector_mem_limit": "1.0"
          }
        },
        "max_prefix_length": "50",
        "mapping": {
          "nested_objects": {
            "limit": "100"
          },
          "field_name_length": {
            "limit": "100"
          }
        },
        "refresh_interval": "1s",
        "number_of_shards": "1",
        "blocks": {
          "read_only_allow_delete": "false"
        },
        "max_wildcard_length": "50",
        "max_refresh_listeners": "20",
        "max_regex_length": "50",
        "max_terms_count": "1024",
        "number_of_replicas": "1"
      }
    }
  }
}

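Adaptive vector warm-up

After the service starts or an index is loaded, vector indexes are automatically preloaded into memory. This reduces the high latency that a cold start causes on the first search and improves response stability and throughput under high concurrency. This optimization item is enabled by default and cannot be disabled or edited.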