Update instance configuration in the console - Vector Retrieval Service for Milvus - Alibaba Cloud

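Vector Retrieval Service for Milvus (Milvus for short) lets you view and modify instance configuration in the console. This topic describes how to update the configuration of a Milvus instance in the console to meet different business requirements.
Procedure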

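Go to the instance configuration page.
1. Log on to the Alibaba Cloud Milvus console.
2. In the left-side navigation pane, click Milvus Instances.
3. In the top menu bar, select a region as needed.
4. On the Milvus Instances page, click the name of the target instance.
5. Click the Instance Configuration tab.
In the instance configuration input box, enter the parameters that should override the default configuration, and then click Save Configuration.
Parameters must be written in YAML format. A sample of the related configuration parameters is shown below.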
# Related configuration of rootCoord, used to handle data definition language (DDL) and data control language (DCL) requests
rootCoord:
  maxDatabaseNum: 64 # Maximum number of databases
  maxPartitionNum: 4096 # Maximum number of partitions in a collection
  minSegmentSizeToEnableIndex: 1024 # It's a threshold. When the segment size is less than this value, the segment will not be indexed
  importTaskExpiration: 900 # (in seconds) Duration after which an import task will expire (be killed). Default 900 seconds (15 minutes).
  importTaskRetention: 86400 # (in seconds) Milvus will keep the record of import tasks for at least `importTaskRetention` seconds. Default 86400 seconds (24 hours).
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912
    
# Related configuration of proxy, used to validate client requests and reduce the returned results.
proxy:
  timeTickInterval: 200 # ms, the interval at which the proxy synchronizes the time tick
  healthCheckTimeout: 3000 # ms, the interval at which component health checks are performed
  maxNameLength: 255 # Maximum length of name for a collection or alias
  # Maximum number of fields in a collection.
  # As of today (2.2.0 and after) it is strongly DISCOURAGED to set maxFieldNum >= 64.
  # So adjust at your own risk!
  maxFieldNum: 64
  maxTaskNum: 1024 # max task number of proxy task queue
  grpc:
    serverMaxSendSize: 268435456
    serverMaxRecvSize: 67108864
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 67108864

# Related configuration of queryCoord, used to manage topology and load balancing for the query nodes, and handoff from growing segments to sealed segments.
queryCoord:
  autoHandoff: true # Enable auto handoff
  autoBalance: true # Enable auto balance
  balancer: ScoreBasedBalancer # Balancer to use
  overloadedMemoryThresholdPercentage: 90 # Memory usage percentage threshold above which a node is considered overloaded
  balanceIntervalSeconds: 60
  memoryUsageMaxDifferencePercentage: 30
  checkInterval: 1000
  channelTaskTimeout: 60000 # 1 minute
  segmentTaskTimeout: 120000 # 2 minutes
  distPullInterval: 500
  heartbeatAvailableInterval: 10000 # 10s, Only QueryNodes which fetched heartbeats within the duration are available
  loadTimeoutSeconds: 600
  checkHandoffInterval: 5000
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

# Related configuration of queryNode, used to run hybrid search between vector and scalar data.
queryNode:
  dataSync:
    flowGraph:
      maxQueueLength: 16 # Maximum length of task queue in flowgraph
      maxParallelism: 1024 # Maximum number of tasks executed in parallel in the flowgraph
  stats:
    publishInterval: 1000 # Interval for querynode to report node information (milliseconds)
  segcore:
    cgoPoolSizeRatio: 2.0 # cgo pool size ratio to max read concurrency
    # Use more threads to make better use of SSD throughput in disk index.
    # This parameter is only useful when enable-disk = true.
    # And this value should be a number greater than 1 and less than 32.
    knowhereThreadPoolNumRatio: 4
    chunkRows: 128 # The number of vectors in a chunk.
    exprEvalBatchSize: 8192 # The batch size for executor get next
    interimIndex: # build a temporary vector index for growing segments or binlogs to accelerate search
      enableIndex: true
      nlist: 128 # segment index nlist
      nprobe: 16 # nprobe to search segment, based on your accuracy requirement, must be smaller than nlist
      memExpansionRate: 1.15 # the ratio of building interim index memory usage to raw data
  loadMemoryUsageFactor: 1 # The multiplication factor for calculating the memory usage while loading segments
  enableDisk: false # enable querynode load disk index, and search on disk index
  maxDiskUsagePercentage: 95
  grouping:
    enabled: true
    maxNQ: 1000
    topKMergeRatio: 20
  scheduler:
    receiveChanSize: 10240
    unsolvedQueueSize: 10240
    # maxReadConcurrentRatio is the concurrency ratio of read task (search task and query task).
    # Max read concurrency would be the value of runtime.NumCPU * maxReadConcurrentRatio.
    # It defaults to 2.0, which means max read concurrency would be the value of runtime.NumCPU * 2.
    # Max read concurrency must be greater than or equal to 1, and less than or equal to runtime.NumCPU * 100.
    # (0, 100]
    maxReadConcurrentRatio: 1
    cpuRatio: 10 # ratio used to estimate read task cpu usage.
    maxTimestampLag: 86400
    # read task schedule policy: fifo(by default), user-task-polling.
    scheduleReadPolicy:
      # fifo: A FIFO queue supports the scheduling.
      # user-task-polling:
      #     The user's tasks will be polled one by one and scheduled.
      #     Scheduling is fair on task granularity.
      #     The policy is based on the username for authentication.
      #     And an empty username is considered the same user.
      #     When there is only a single user, the policy decays into FIFO
      name: fifo
      maxPendingTask: 10240
      # user-task-polling configure:
      taskQueueExpire: 60 # 1 min by default, expire time of inner user task queue since queue is empty.
      enableCrossUserGrouping: false # false by default. Enables cross-user grouping when using the user-task-polling policy (disable it if the tasks of any user cannot be merged with those of other users).
      maxPendingTaskPerUser: 1024 # 50 by default, max pending task in scheduler per user.
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

indexCoord:
  bindIndexNodeMode:
    enable: false
    withCred: false
  segment:
    minSegmentNumRowsToEnableIndex: 1024 # It's a threshold. When the segment num rows is less than this value, the segment will not be indexed

indexNode:
  scheduler:
    buildParallel: 1
  enableDisk: true # enable index node build disk vector index
  maxDiskUsagePercentage: 95
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

dataCoord:
  channel:
    watchTimeoutInterval: 300 # Timeout on watching channels (in seconds). A DataNode tickler update of the watch progress resets the timeout timer.
    balanceSilentDuration: 300 # The duration before the channelBalancer on datacoord to run
    balanceInterval: 360 # The interval for the channelBalancer on datacoord to check balance status
  segment:
    maxSize: 1024 # Maximum size of a segment in MB
    diskSegmentMaxSize: 2048 # Maximum size of a segment in MB for collection which has Disk index
    sealProportion: 0.12
    # The time of the assignment expiration in ms
    # Warning! this parameter is an expert variable and closely related to data integrity. Without specific
    # target and solid understanding of the scenarios, it should not be changed. If it's necessary to alter
    # this parameter, make sure that the newly changed value is larger than the previous value used before restart
    # otherwise there could be a large possibility of data loss
    assignmentExpiration: 2000
    maxLife: 86400 # The max lifetime of segment in seconds, 24*60*60
    # If a segment didn't accept dml records in maxIdleTime and the size of segment is greater than
    # minSizeFromIdleToSealed, Milvus will automatically seal it.
    # The max idle time of segment in seconds, 10*60.
    maxIdleTime: 600
    minSizeFromIdleToSealed: 16 # The min size in MB that a segment must reach before it can go from idle to sealed.
    # The max number of binlog file for one segment, the segment will be sealed if
    # the number of binlog file reaches to max value.
    maxBinlogFileNumber: 32
    # A segment is considered a "small segment" when its number of rows is smaller than
    # (smallProportion * segment max number of rows).
    smallProportion: 0.5
    # A compaction will happen on small segments if the segment after compaction will have
    # over (compactableProportion * segment max number of rows) rows.
    # MUST BE GREATER THAN OR EQUAL TO <smallProportion>!!!
    compactableProportion: 0.85
    # During compaction, the number of rows in a segment is able to exceed the segment max number of rows by (expansionRate-1) * 100%.
    expansionRate: 1.25
    # Whether to enable levelzero segment
    enableLevelZero: false
  enableCompaction: true # Enable data segment compaction
  compaction:
    enableAutoCompaction: true
    rpcTimeout: 10 # compaction rpc request timeout in seconds
    maxParallelTaskNum: 10 # max parallel compaction task number
    indexBasedCompaction: true

    levelzero:
      forceTrigger:
        minSize: 8 # The minimum size in MB to force trigger a LevelZero Compaction
        deltalogMinNum: 10 # the minimum number of deltalog files to force trigger a LevelZero Compaction

  enableGarbageCollection: true
  gc:
    interval: 3600 # gc interval in seconds
    missingTolerance: 3600 # file meta missing tolerance duration in seconds, 3600
    dropTolerance: 10800 # file belongs to dropped entity tolerance duration in seconds. 10800
  enableActiveStandby: false
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

dataNode:
  dataSync:
    flowGraph:
      maxQueueLength: 16 # Maximum length of task queue in flowgraph
      maxParallelism: 1024 # Maximum number of tasks executed in parallel in the flowgraph
    maxParallelSyncMgrTasks: 256 # The max concurrent sync task number of datanode sync mgr globally
    skipMode:
      # when there are only timetick messages in the flowgraph for a while (longer than coldTime),
      # the flowGraph turns on skip mode to skip most timeticks and reduce cost, especially when there are a lot of channels
      enable: true
      skipNum: 4
      coldTime: 60
  segment:
    insertBufSize: 16777216 # Max buffer size to flush for a single segment.
    deleteBufBytes: 67108864 # Max buffer size to flush del for a single channel
    syncPeriod: 600 # The period to sync segments if buffer is not empty.
  # can specify ip for example
  # ip: 127.0.0.1
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912
  memory:
    forceSyncEnable: true # `true` to force sync if memory usage is too high
    forceSyncSegmentNum: 1 # number of segments to sync, segments with top largest buffer will be synced.
    watermarkStandalone: 0.2 # memory watermark for standalone, upon reaching this watermark, segments will be synced.
    watermarkCluster: 0.5 # memory watermark for cluster, upon reaching this watermark, segments will be synced.
  timetick:
    byRPC: true
  channel:
    # specify the size of global work pool of all channels
    # if this parameter <= 0, will set it as the maximum number of CPUs that can be executing
    # suggest to set it bigger on large collection numbers to avoid blocking
    workPoolSize: -1
    # specify the size of global work pool for channel checkpoint updating
    # if this parameter <= 0, will set it as 1000
    # suggest to set it bigger on large collection numbers to avoid blocking
    updateChannelCheckpointMaxParallel: 1000

grpc:
  client:
    compressionEnabled: false
    dialTimeout: 200
    keepAliveTime: 10000
    keepAliveTimeout: 20000
    maxMaxAttempts: 10
    initialBackOff: 0.2 # seconds
    maxBackoff: 10 # seconds
    
quotaAndLimits:
  enabled: true # `true` to enable quota and limits, `false` to disable.
  limits:
    maxCollectionNum: 65536
    maxCollectionNumPerDB: 65536
  # quotaCenterCollectInterval is the time interval that quotaCenter
  # collects metrics from Proxies, Query cluster and Data cluster.
  # seconds, (0 ~ 65536)
  quotaCenterCollectInterval: 3
  ddl:
    enabled: false
    collectionRate: -1 # qps, default no limit, rate for CreateCollection, DropCollection, LoadCollection, ReleaseCollection
    partitionRate: -1 # qps, default no limit, rate for CreatePartition, DropPartition, LoadPartition, ReleasePartition
  indexRate:
    enabled: false
    max: -1 # qps, default no limit, rate for CreateIndex, DropIndex
  flushRate:
    enabled: false
    max: -1 # qps, default no limit, rate for flush
  compactionRate:
    enabled: false
    max: -1 # qps, default no limit, rate for manualCompaction
  dml:
    # dml limit rates, default no limit.
    # The maximum rate will not be greater than max.
    enabled: false
    insertRate:
      collection:
        max: -1 # MB/s, default no limit
      max: -1 # MB/s, default no limit
    upsertRate:
      collection:
        max: -1 # MB/s, default no limit
      max: -1 # MB/s, default no limit
    deleteRate:
      collection:
        max: -1 # MB/s, default no limit
      max: -1 # MB/s, default no limit
    bulkLoadRate:
      collection:
        max: -1 # MB/s, default no limit, not supported yet. TODO: limit bulkLoad rate
      max: -1 # MB/s, default no limit, not supported yet. TODO: limit bulkLoad rate
  dql:
    # dql limit rates, default no limit.
    # The maximum rate will not be greater than max.
    enabled: false
    searchRate:
      collection:
        max: -1 # vps (vectors per second), default no limit
      max: -1 # vps (vectors per second), default no limit
    queryRate:
      collection:
        max: -1 # qps, default no limit
      max: -1 # qps, default no limit
  limitWriting:
    # forceDeny false means dml requests are allowed (except for some
    # specific conditions, such as node memory reaching the water mark), true means always reject all dml requests.
    forceDeny: false
    ttProtection:
      enabled: false
      # maxTimeTickDelay indicates the backpressure for DML Operations.
      # DML rates would be reduced according to the ratio of time tick delay to maxTimeTickDelay,
      # if time tick delay is greater than maxTimeTickDelay, all DML requests would be rejected.
      # seconds
      maxTimeTickDelay: 300
    memProtection:
      # When memory usage > memoryHighWaterLevel, all dml requests would be rejected;
      # When memoryLowWaterLevel < memory usage < memoryHighWaterLevel, reduce the dml rate;
      # When memory usage < memoryLowWaterLevel, no action.
      enabled: true
      dataNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in DataNodes
      dataNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in DataNodes
      queryNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in QueryNodes
      queryNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in QueryNodes
    growingSegmentsSizeProtection:
      # No action will be taken if the growing segments size is less than the low watermark.
      # When the growing segments size exceeds the low watermark, the dml rate will be reduced,
      # but the rate will not be lower than `minRateRatio * dmlRate`.
      enabled: false
      minRateRatio: 0.5
      lowWaterLevel: 0.2
      highWaterLevel: 0.4
    diskProtection:
      enabled: true # When the total file size of object storage is greater than `diskQuota`, all dml requests would be rejected;
      diskQuota: -1 # MB, (0, +inf), default no limit
      diskQuotaPerCollection: -1 # MB, (0, +inf), default no limit
  limitReading:
    # forceDeny false means dql requests are allowed (except for some
    # specific conditions, such as collection has been dropped), true means always reject all dql requests.
    forceDeny: false
    queueProtection:
      enabled: false
      # nqInQueueThreshold indicates that the system is under backpressure on the Search/Query path.
      # If NQ in any QueryNode's queue is greater than nqInQueueThreshold, search&query rates gradually cool off
      # until the NQ in queue no longer exceeds nqInQueueThreshold. The NQ of a query request is counted as 1.
      # int, default no limit
      nqInQueueThreshold: -1
      # queueLatencyThreshold indicates that the system is under backpressure on the Search/Query path.
      # If dql latency of queuing is greater than queueLatencyThreshold, search&query rates gradually cool off
      # until the latency of queuing no longer exceeds queueLatencyThreshold.
      # The latency here refers to the averaged latency over a period of time.
      # milliseconds, default no limit
      queueLatencyThreshold: -1
    resultProtection:
      enabled: false
      # maxReadResultRate indicates that the system is under backpressure on the Search/Query path.
      # If dql result rate is greater than maxReadResultRate, search&query rates gradually cool off
      # until the read result rate no longer exceeds maxReadResultRate.
      # MB/s, default no limit
      maxReadResultRate: -1
    # coolOffSpeed is the speed of search&query rates cool off.
    # (0, 1]
    coolOffSpeed: 0.9
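In the dialog box that appears, enter the reason for the change and click OK.
After the configuration change request is submitted, if any modified configuration item requires a restart to take effect, the instance is restarted once the configuration change is complete. The instance then enters the Upgrading state, and the cluster returns to the Running state after the configuration update finishes.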
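Several comments in the sample configuration state constraints between parameters, for example that nprobe must be smaller than nlist and that compactableProportion must be greater than or equal to smallProportion. The following is a minimal Python sketch of checking an override against a few of those constraints before pasting it into the console. It is an illustration only: the helper names `get_path` and `check_overrides` are hypothetical, the override is represented as an already-parsed nested dict rather than YAML text, and the fallback values are simply the ones shown in the sample above, which are assumed here to be the values in effect when a key is not overridden. The console may apply its own validation, which this topic does not describe.

```python
from typing import Any, Dict, List


def get_path(cfg: Dict[str, Any], path: str, fallback: Any) -> Any:
    """Walk a dotted path through nested dicts; return fallback if any key is missing."""
    node: Any = cfg
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return fallback
        node = node[key]
    return node


def check_overrides(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means none of the checks failed."""
    problems: List[str] = []

    # Fallbacks are the values from the sample configuration (an assumption).
    nlist = get_path(cfg, "queryNode.segcore.interimIndex.nlist", 128)
    nprobe = get_path(cfg, "queryNode.segcore.interimIndex.nprobe", 16)
    if not nprobe < nlist:
        problems.append("interimIndex.nprobe must be smaller than nlist")

    small = get_path(cfg, "dataCoord.segment.smallProportion", 0.5)
    compactable = get_path(cfg, "dataCoord.segment.compactableProportion", 0.85)
    if compactable < small:
        problems.append("compactableProportion must be >= smallProportion")

    ratio = get_path(cfg, "queryNode.scheduler.maxReadConcurrentRatio", 1)
    if not 0 < ratio <= 100:
        problems.append("maxReadConcurrentRatio must be in (0, 100]")

    speed = get_path(cfg, "quotaAndLimits.limitReading.coolOffSpeed", 0.9)
    if not 0 < speed <= 1:
        problems.append("coolOffSpeed must be in (0, 1]")

    return problems


# Example: an override that raises nprobe above the sample nlist of 128.
overrides = {"queryNode": {"segcore": {"interimIndex": {"nprobe": 256}}}}
for problem in check_overrides(overrides):
    print(problem)
```

Only the keys that differ from the default need to appear in the override, so the check falls back to the sample values for anything that is missing. The list of checks is deliberately short and can be extended with other ranges noted in the comments, such as the (0, 1] range of the memory water levels.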