Adjust parameters

If a deployed model's configuration does not meet your current business needs, you can adjust the parameters online using a RESTful API operation. This method dynamically updates the model configuration file without affecting the original model or the model service. This topic describes how to use this API operation to adjust some parameter settings.

Prerequisites

A model has been created or imported, and its status is READY. For more information, see View model details.

API operations

POST v1/ai/models/${MODEL_NAME}/update_config

Request parameters

| Parameter | Type | Description |
| --- | --- | --- |
| instance_count | INT | The number of model instances to update. Increasing the number of model instances improves inference performance but also increases GPU memory usage. |
| max_batch_size | INT | The maximum batch size to support. The default value is 1024. The value range is [1,1024]. |

Important

Only embedding models support updating the max_batch_size value that was set during deployment.
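The constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the service's API: the function name `validate_update_params` and the `is_embedding_model` flag are hypothetical, and the lower bound of 1 for instance_count is an assumption, because this topic does not state a range for that parameter.

```python
def validate_update_params(instance_count=None, max_batch_size=None,
                           is_embedding_model=False):
    """Check update_config parameters against the documented constraints.

    Hypothetical helper: returns a dict of the supplied parameters,
    or raises ValueError if a constraint is violated.
    """
    params = {}

    if instance_count is not None:
        # Assumption: at least one instance is required. The documentation
        # gives the type (INT) but no explicit range for instance_count.
        if type(instance_count) is not int or instance_count < 1:
            raise ValueError("instance_count must be a positive integer")
        params["instance_count"] = instance_count

    if max_batch_size is not None:
        # Documented: only embedding models support updating max_batch_size.
        if not is_embedding_model:
            raise ValueError("max_batch_size can be updated only for embedding models")
        # Documented value range: [1,1024].
        if type(max_batch_size) is not int or not 1 <= max_batch_size <= 1024:
            raise ValueError("max_batch_size must be an integer in [1,1024]")
        params["max_batch_size"] = max_batch_size

    if not params:
        raise ValueError("specify at least one parameter to update")
    return params
```

The service remains the authority on which values it accepts. A check like this only catches obvious mistakes earlier.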

Example

Request:

POST v1/ai/models/bge_m3_model/update_config HTTP/1.1
Content-Type: application/json

{
    "instance_count": "4",
    "max_batch_size": "1024"
}

Response:

HTTP/1.1 200 OK
Date: Tue, 28 Nov 2023 03:18:55 GMT
Content-type: application/json
Content-length: 17

{
  "code": 0,
  "msg": "SUCCESS",
  "data": null,
  "success": true
}
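
The same request can be issued from a script. The following is a minimal sketch that uses only the Python standard library. The base URL (host and port) is a placeholder, authentication is not shown because this topic does not describe it, and the function names are hypothetical. The values are sent as JSON strings to mirror the example request above.

```python
import json
import urllib.request


def build_update_config_request(base_url, model_name,
                                instance_count=None, max_batch_size=None):
    """Build (but do not send) the POST request for update_config."""
    body = {}
    # The example request in this topic passes values as JSON strings,
    # so this sketch does the same.
    if instance_count is not None:
        body["instance_count"] = str(instance_count)
    if max_batch_size is not None:
        body["max_batch_size"] = str(max_batch_size)

    url = f"{base_url.rstrip('/')}/v1/ai/models/{model_name}/update_config"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_success(response_body):
    """Return True if the body matches the success response shown above."""
    result = json.loads(response_body)
    return result.get("code") == 0 and result.get("success") is True


def update_config(base_url, model_name, **params):
    """Send the request and report whether the update was accepted."""
    request = build_update_config_request(base_url, model_name, **params)
    with urllib.request.urlopen(request, timeout=30) as response:
        return is_success(response.read().decode("utf-8"))
```

For example, `update_config("http://<host>:<port>", "bge_m3_model", instance_count=4, max_batch_size=1024)` sends the request shown above, where `<host>:<port>` stands for your service endpoint.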