Update instance configuration


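Alibaba Cloud Milvus lets you view and modify the configuration of an instance in the console. This topic describes how to update the configuration of a Milvus instance in the console to meet different business requirements.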

Procedure

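  1. Go to the instance configuration page.

    1. Log on to the Alibaba Cloud Milvus console.

    2. In the left-side navigation pane, click Milvus Instances.

    3. In the top menu bar, select a region as needed.

    4. On the Milvus Instances page, click the name of the target instance.

    5. Click the Instance Configuration tab.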

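  2. In the Instance Configuration input box, enter the parameters that should override the default configuration, and then click Save Configuration.

    The parameters must be in YAML format. Example configuration parameters are shown below.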

    # Related configuration of rootCoord, used to handle data definition language (DDL) and data control language (DCL) requests
    rootCoord:
      maxDatabaseNum: 64 # Maximum number of databases
      maxPartitionNum: 4096 # Maximum number of partitions in a collection
      minSegmentSizeToEnableIndex: 1024 # It's a threshold. When the segment size is less than this value, the segment will not be indexed
      importTaskExpiration: 900 # (in seconds) Duration after which an import task will expire (be killed). Default 900 seconds (15 minutes).
      importTaskRetention: 86400 # (in seconds) Milvus will keep the record of import tasks for at least `importTaskRetention` seconds. Default 86400 seconds (24 hours).
      grpc:
        serverMaxSendSize: 536870912
        serverMaxRecvSize: 268435456
        clientMaxSendSize: 268435456
        clientMaxRecvSize: 536870912
        
    # Related configuration of proxy, used to validate client requests and reduce the returned results.
    proxy:
      timeTickInterval: 200 # ms, the interval at which the proxy synchronizes the time tick
      healthCheckTimeout: 3000 # ms, the interval for the component health check
      maxNameLength: 255 # Maximum length of name for a collection or alias
      # Maximum number of fields in a collection.
      # As of today (2.2.0 and after) it is strongly DISCOURAGED to set maxFieldNum >= 64.
      # So adjust at your risk!
      maxFieldNum: 64
      maxTaskNum: 1024 # max task number of proxy task queue
      grpc:
        serverMaxSendSize: 268435456
        serverMaxRecvSize: 67108864
        clientMaxSendSize: 268435456
        clientMaxRecvSize: 67108864
    
    # Related configuration of queryCoord, used to manage topology and load balancing for the query nodes, and handoff from growing segments to sealed segments.
    queryCoord:
      autoHandoff: true # Enable auto handoff
      autoBalance: true # Enable auto balance
      balancer: ScoreBasedBalancer # Balancer to use
      overloadedMemoryThresholdPercentage: 90 # The threshold percentage that memory overload
      balanceIntervalSeconds: 60
      memoryUsageMaxDifferencePercentage: 30
      checkInterval: 1000
      channelTaskTimeout: 60000 # 1 minute
      segmentTaskTimeout: 120000 # 2 minutes
      distPullInterval: 500
      heartbeatAvailableInterval: 10000 # 10s, Only QueryNodes which fetched heartbeats within the duration are available
      loadTimeoutSeconds: 600
      checkHandoffInterval: 5000
      grpc:
        serverMaxSendSize: 536870912
        serverMaxRecvSize: 268435456
        clientMaxSendSize: 268435456
        clientMaxRecvSize: 536870912
    
    # Related configuration of queryNode, used to run hybrid search between vector and scalar data.
    queryNode:
      dataSync:
        flowGraph:
          maxQueueLength: 16 # Maximum length of task queue in flowgraph
          maxParallelism: 1024 # Maximum number of tasks executed in parallel in the flowgraph
      stats:
        publishInterval: 1000 # Interval for querynode to report node information (milliseconds)
      segcore:
        cgoPoolSizeRatio: 2.0 # cgo pool size ratio to max read concurrency
        # Use more threads to make better use of SSD throughput in disk index.
        # This parameter is only useful when enable-disk = true.
        # And this value should be a number greater than 1 and less than 32.
        knowhereThreadPoolNumRatio: 4
        chunkRows: 128 # The number of vectors in a chunk.
        exprEvalBatchSize: 8192 # The batch size for executor get next
        interimIndex: # build a temporary vector index for growing segments or binlogs to accelerate search
          enableIndex: true
          nlist: 128 # segment index nlist
          nprobe: 16 # nprobe used to search the segment; choose it based on your accuracy requirement; must be smaller than nlist
          memExpansionRate: 1.15 # the ratio of building interim index memory usage to raw data
      loadMemoryUsageFactor: 1 # The multiply factor of calculating the memory usage while loading segments
      enableDisk: false # enable querynode load disk index, and search on disk index
      maxDiskUsagePercentage: 95
      grouping:
        enabled: true
        maxNQ: 1000
        topKMergeRatio: 20
      scheduler:
        receiveChanSize: 10240
        unsolvedQueueSize: 10240
        # maxReadConcurrentRatio is the concurrency ratio of read task (search task and query task).
        # Max read concurrency would be the value of runtime.NumCPU * maxReadConcurrentRatio.
        # It defaults to 2.0, which means max read concurrency would be the value of runtime.NumCPU * 2.
        # Max read concurrency must be greater than or equal to 1, and less than or equal to runtime.NumCPU * 100.
        # (0, 100]
        maxReadConcurrentRatio: 1
        cpuRatio: 10 # ratio used to estimate read task cpu usage.
        maxTimestampLag: 86400
        # read task schedule policy: fifo(by default), user-task-polling.
        scheduleReadPolicy:
          # fifo: A FIFO queue support the schedule.
          # user-task-polling:
          #     The user's tasks will be polled one by one and scheduled.
          #     Scheduling is fair on task granularity.
          #     The policy is based on the username for authentication.
          #     And an empty username is considered the same user.
          #     When there is only one user, the policy degrades to FIFO.
          name: fifo
          maxPendingTask: 10240
          # user-task-polling configure:
          taskQueueExpire: 60 # 1 min by default; expiration time of an inner user task queue once the queue is empty.
          enableCrossUserGrouping: false # false by default. Enables cross-user grouping when using the user-task-polling policy (disable it if the tasks of any user cannot be merged with those of other users).
          maxPendingTaskPerUser: 1024 # 50 by default; maximum number of pending tasks in the scheduler per user.
      grpc:
        serverMaxSendSize: 536870912
        serverMaxRecvSize: 268435456
        clientMaxSendSize: 268435456
        clientMaxRecvSize: 536870912
    
    indexCoord:
      bindIndexNodeMode:
        enable: false
        withCred: false
      segment:
        minSegmentNumRowsToEnableIndex: 1024 # It's a threshold. When the segment num rows is less than this value, the segment will not be indexed
    
    indexNode:
      scheduler:
        buildParallel: 1
      enableDisk: true # enable index node build disk vector index
      maxDiskUsagePercentage: 95
      grpc:
        serverMaxSendSize: 536870912
        serverMaxRecvSize: 268435456
        clientMaxSendSize: 268435456
        clientMaxRecvSize: 536870912
    
    dataCoord:
      channel:
        watchTimeoutInterval: 300 # Timeout on watching channels (in seconds). Datanode tickler update watch progress will reset timeout timer.
        balanceSilentDuration: 300 # The duration to wait before the channelBalancer on datacoord runs
        balanceInterval: 360 # The interval at which the channelBalancer on datacoord checks the balance status
      segment:
        maxSize: 1024 # Maximum size of a segment in MB
        diskSegmentMaxSize: 2048 # Maximum size of a segment in MB for collection which has Disk index
        sealProportion: 0.12
        # The time of the assignment expiration in ms
        # Warning! this parameter is an expert variable and closely related to data integrity. Without specific
        # target and solid understanding of the scenarios, it should not be changed. If it's necessary to alter
        # this parameter, make sure that the newly changed value is larger than the previous value used before restart
        # otherwise there could be a large possibility of data loss
        assignmentExpiration: 2000
        maxLife: 86400 # The max lifetime of segment in seconds, 24*60*60
        # If a segment didn't accept dml records in maxIdleTime and the size of segment is greater than
        # minSizeFromIdleToSealed, Milvus will automatically seal it.
        # The max idle time of segment in seconds, 10*60.
        maxIdleTime: 600
        minSizeFromIdleToSealed: 16 # The minimum size in MB that an idle segment must exceed before it can be sealed.
        # The max number of binlog file for one segment, the segment will be sealed if
        # the number of binlog file reaches to max value.
        maxBinlogFileNumber: 32
        # A segment is considered a "small segment" when its # of rows is smaller than
        # (smallProportion * segment max # of rows).
        smallProportion: 0.5
        # A compaction will happen on small segments if the segment after compaction will have
        # over (compactableProportion * segment max # of rows) rows.
        # MUST BE GREATER THAN OR EQUAL TO <smallProportion>!!!
        compactableProportion: 0.85
        # During compaction, the number of rows in a segment may exceed the segment max # of rows by (expansionRate-1) * 100%.
        expansionRate: 1.25
        # Whether to enable levelzero segment
        enableLevelZero: false
      enableCompaction: true # Enable data segment compaction
      compaction:
        enableAutoCompaction: true
        rpcTimeout: 10 # compaction rpc request timeout in seconds
        maxParallelTaskNum: 10 # max parallel compaction task number
        indexBasedCompaction: true
    
        levelzero:
          forceTrigger:
            minSize: 8 # The minimum size in MB to force trigger a LevelZero Compaction
            deltalogMinNum: 10 # The minimum number of deltalog files to force trigger a LevelZero Compaction
    
      enableGarbageCollection: true
      gc:
        interval: 3600 # gc interval in seconds
        missingTolerance: 3600 # file meta missing tolerance duration in seconds, 3600
        dropTolerance: 10800 # file belongs to dropped entity tolerance duration in seconds. 10800
      enableActiveStandby: false
      grpc:
        serverMaxSendSize: 536870912
        serverMaxRecvSize: 268435456
        clientMaxSendSize: 268435456
        clientMaxRecvSize: 536870912
    
    dataNode:
      dataSync:
        flowGraph:
          maxQueueLength: 16 # Maximum length of task queue in flowgraph
          maxParallelism: 1024 # Maximum number of tasks executed in parallel in the flowgraph
        maxParallelSyncMgrTasks: 256 # The maximum number of concurrent sync tasks of the datanode sync manager, globally
        skipMode:
          # When only timetick messages pass through the flowgraph for a while (longer than coldTime),
          # the flowgraph turns on skip mode to skip most timeticks and reduce cost, especially when there are many channels.
          enable: true
          skipNum: 4
          coldTime: 60
      segment:
        insertBufSize: 16777216 # Max buffer size to flush for a single segment.
        deleteBufBytes: 67108864 # Max buffer size to flush del for a single channel
        syncPeriod: 600 # The period to sync segments if buffer is not empty.
      # can specify ip for example
      # ip: 127.0.0.1
      grpc:
        serverMaxSendSize: 536870912
        serverMaxRecvSize: 268435456
        clientMaxSendSize: 268435456
        clientMaxRecvSize: 536870912
      memory:
        forceSyncEnable: true # `true` to force sync if memory usage is too high
        forceSyncSegmentNum: 1 # number of segments to sync, segments with top largest buffer will be synced.
        watermarkStandalone: 0.2 # memory watermark for standalone, upon reaching this watermark, segments will be synced.
        watermarkCluster: 0.5 # memory watermark for cluster, upon reaching this watermark, segments will be synced.
      timetick:
        byRPC: true
      channel:
        # specify the size of global work pool of all channels
        # if this parameter <= 0, will set it as the maximum number of CPUs that can be executing
        # suggest to set it bigger on large collection numbers to avoid blocking
        workPoolSize: -1
        # specify the size of global work pool for channel checkpoint updating
        # if this parameter <= 0, will set it as 1000
        # suggest to set it bigger on large collection numbers to avoid blocking
        updateChannelCheckpointMaxParallel: 1000
    
    grpc:
      client:
        compressionEnabled: false
        dialTimeout: 200
        keepAliveTime: 10000
        keepAliveTimeout: 20000
        maxMaxAttempts: 10
        initialBackOff: 0.2 # seconds
        maxBackoff: 10 # seconds
        
    quotaAndLimits:
      enabled: true # `true` to enable quota and limits, `false` to disable.
      limits:
        maxCollectionNum: 65536
        maxCollectionNumPerDB: 65536
      # quotaCenterCollectInterval is the time interval that quotaCenter
      # collects metrics from Proxies, Query cluster and Data cluster.
      # seconds, (0 ~ 65536)
      quotaCenterCollectInterval: 3
      ddl:
        enabled: false
        collectionRate: -1 # qps, default no limit, rate for CreateCollection, DropCollection, LoadCollection, ReleaseCollection
        partitionRate: -1 # qps, default no limit, rate for CreatePartition, DropPartition, LoadPartition, ReleasePartition
      indexRate:
        enabled: false
        max: -1 # qps, default no limit, rate for CreateIndex, DropIndex
      flushRate:
        enabled: false
        max: -1 # qps, default no limit, rate for flush
      compactionRate:
        enabled: false
        max: -1 # qps, default no limit, rate for manualCompaction
      dml:
        # dml limit rates, default no limit.
        # The maximum rate will not be greater than max.
        enabled: false
        insertRate:
          collection:
            max: -1 # MB/s, default no limit
          max: -1 # MB/s, default no limit
        upsertRate:
          collection:
            max: -1 # MB/s, default no limit
          max: -1 # MB/s, default no limit
        deleteRate:
          collection:
            max: -1 # MB/s, default no limit
          max: -1 # MB/s, default no limit
        bulkLoadRate:
          collection:
            max: -1 # MB/s, default no limit, not supported yet. TODO: limit bulkLoad rate
          max: -1 # MB/s, default no limit, not supported yet. TODO: limit bulkLoad rate
      dql:
        # dql limit rates, default no limit.
        # The maximum rate will not be greater than max.
        enabled: false
        searchRate:
          collection:
            max: -1 # vps (vectors per second), default no limit
          max: -1 # vps (vectors per second), default no limit
        queryRate:
          collection:
            max: -1 # qps, default no limit
          max: -1 # qps, default no limit
      limitWriting:
        # forceDeny false means dml requests are allowed (except under some
        # specific conditions, such as node memory reaching the watermark), true means always reject all dml requests.
        forceDeny: false
        ttProtection:
          enabled: false
          # maxTimeTickDelay indicates the backpressure for DML Operations.
          # DML rates would be reduced according to the ratio of time tick delay to maxTimeTickDelay,
          # if time tick delay is greater than maxTimeTickDelay, all DML requests would be rejected.
          # seconds
          maxTimeTickDelay: 300
        memProtection:
          # When memory usage > memoryHighWaterLevel, all dml requests would be rejected;
          # When memoryLowWaterLevel < memory usage < memoryHighWaterLevel, reduce the dml rate;
          # When memory usage < memoryLowWaterLevel, no action.
          enabled: true
          dataNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in DataNodes
          dataNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in DataNodes
          queryNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in QueryNodes
          queryNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in QueryNodes
        growingSegmentsSizeProtection:
          # No action will be taken if the growing segments size is less than the low watermark.
          # When the growing segments size exceeds the low watermark, the dml rate will be reduced,
          # but the rate will not be lower than `minRateRatio * dmlRate`.
          enabled: false
          minRateRatio: 0.5
          lowWaterLevel: 0.2
          highWaterLevel: 0.4
        diskProtection:
          enabled: true # When the total file size of object storage is greater than `diskQuota`, all dml requests would be rejected;
          diskQuota: -1 # MB, (0, +inf), default no limit
          diskQuotaPerCollection: -1 # MB, (0, +inf), default no limit
      limitReading:
        # forceDeny false means dql requests are allowed (except for some
        # specific conditions, such as collection has been dropped), true means always reject all dql requests.
        forceDeny: false
        queueProtection:
          enabled: false
          # nqInQueueThreshold indicated that the system was under backpressure for Search/Query path.
          # If NQ in any QueryNode's queue is greater than nqInQueueThreshold, search&query rates would gradually cool off
          # until the NQ in queue no longer exceeds nqInQueueThreshold. We think of the NQ of query request as 1.
          # int, default no limit
          nqInQueueThreshold: -1
          # queueLatencyThreshold indicated that the system was under backpressure for Search/Query path.
          # If dql latency of queuing is greater than queueLatencyThreshold, search&query rates would gradually cool off
          # until the latency of queuing no longer exceeds queueLatencyThreshold.
          # The latency here refers to the averaged latency over a period of time.
          # milliseconds, default no limit
          queueLatencyThreshold: -1
        resultProtection:
          enabled: false
          # maxReadResultRate indicated that the system was under backpressure for Search/Query path.
          # If dql result rate is greater than maxReadResultRate, search&query rates would gradually cool off
          # until the read result rate no longer exceeds maxReadResultRate.
          # MB/s, default no limit
          maxReadResultRate: -1
        # coolOffSpeed is the speed at which search&query rates cool off.
        # (0, 1]
        coolOffSpeed: 0.9
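  3. In the dialog box that appears, enter the reason for the change and click OK.

    After you submit the change request, if any modified parameter requires a restart to take effect, the instance restarts once the configuration change completes. During this time the instance enters the Upgrading state. When the configuration update finishes, the cluster returns to the Running state.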
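
As a rough illustration of step 2, the input box takes only the parameters that should override the defaults, so a short fragment may be enough. The sketch below is not a recommended configuration. The values are illustrative assumptions, and it assumes the console merges the entered keys with the instance defaults, which the procedure implies but does not state outright. The keys, and the constraints noted in the comments, come from the sample configuration.

```yaml
# Hypothetical override: only the keys being changed are listed.
queryNode:
  segcore:
    interimIndex:
      nlist: 256   # illustrative value
      nprobe: 32   # must stay smaller than nlist
dataCoord:
  segment:
    smallProportion: 0.4        # illustrative value
    compactableProportion: 0.8  # must be >= smallProportion
quotaAndLimits:
  limitWriting:
    memProtection:
      dataNodeMemoryLowWaterLevel: 0.8   # in (0, 1], kept below the high water level
      dataNodeMemoryHighWaterLevel: 0.9  # in (0, 1]
```

Keep the nesting exactly as it appears in the sample, because YAML relies on indentation to place each key under the correct component. This page does not list which parameters trigger a restart, so it is safest to plan for the instance to enter the Upgrading state after saving.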