Call the GetIndexMonitor operation to query monitoring data for a specified knowledge base within a given time range. This data supports application performance analysis, capacity planning, and cost management. The monitoring data covers two dimensions: storage and retrieval. Storage monitoring returns the index storage limit and the current usage of the knowledge base. Retrieval monitoring returns performance metrics for the query period, such as peak queries per second (QPS), total requests, and average QPS. These metrics are provided as totals for the whole period and are also broken down by time window. Requests are categorized as successful, failed, or rate-limited.
Operation description
Before calling this operation, a RAM user must be granted the required API permissions for Alibaba Cloud Model Studio (the AliyunBailianDataFullAccess permission) and must join a workspace. An Alibaba Cloud account can call this operation directly without authorization. You can call this operation by using the latest version of the Alibaba Cloud Model Studio software development kit (SDK).
The specified knowledge base must already exist and must not have been deleted. In other words, the knowledge base ID (IndexId) must be valid.
This operation is idempotent.
The maximum query time range (EndTimestamp - StartTimestamp) is 30 days.
The granularity of the time windows in the returned data is adjusted dynamically based on the query time range.
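The 30-day limit on the query range can be checked on the client before a request is sent. The following is a minimal Python sketch of such a check, not part of the official SDK. The helper names `validate_time_range` and `recent_range` are hypothetical, and rejecting an end time earlier than the start time is an assumption, because the service's behavior for that case is not documented here.

```python
import time

# 30-day limit on EndTimestamp - StartTimestamp, expressed in seconds.
MAX_RANGE_SECONDS = 30 * 24 * 60 * 60


def validate_time_range(start_ts: int, end_ts: int) -> int:
    """Return the range length in seconds, or raise ValueError.

    Both arguments are UNIX timestamps in seconds.
    """
    # Assumption: an end time before the start time is treated as invalid.
    if end_ts < start_ts:
        raise ValueError("EndTimestamp must not be earlier than StartTimestamp")
    span = end_ts - start_ts
    if span > MAX_RANGE_SECONDS:
        raise ValueError("query range exceeds the 30-day maximum")
    return span


def recent_range(days: int, now=None):
    """Return (start_ts, end_ts) covering the last `days` days.

    `now` can be passed explicitly to make the result reproducible.
    """
    end_ts = int(time.time()) if now is None else int(now)
    start_ts = end_ts - days * 24 * 60 * 60
    validate_time_range(start_ts, end_ts)
    return start_ts, end_ts
```

For example, `recent_range(7)` yields a pair of timestamps that can be passed as StartTimestamp and EndTimestamp, while `recent_range(31)` raises `ValueError` before any request is made.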
RAM authorization
| Action | Access level | Resource type | Condition key | Dependent action |
|---|---|---|---|---|
| sfm:GetIndexMonitor | get | All resources (*) | None | None |
Request syntax
GET /{WorkspaceId}/rag/index/monitor HTTP/1.1
Path Parameters
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| WorkspaceId | string | Yes | The ID of the workspace where the knowledge base is located. | llm-3shx2gu255oqxxxx |
Request parameters
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| IndexId | string | Yes | The unique ID of the target knowledge base. | kb-123456xxxx |
| StartTimestamp | integer | Yes | The start of the time range to query. This is a UNIX timestamp in seconds. | 1767604500 |
| EndTimestamp | integer | Yes | The end of the time range to query. The end time can be at most 30 days after the start time. This is a UNIX timestamp in seconds. | 1767604500 |
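To show how the path parameter and the request parameters fit together, here is a minimal Python sketch that builds the request target for this operation. It assumes that the request parameters are sent as URL query parameters, which is the usual convention for a GET request but is not stated explicitly here. The function name `build_monitor_request` is hypothetical, and the endpoint host, authentication headers, and request signing are deliberately left out.

```python
from urllib.parse import quote, urlencode


def build_monitor_request(workspace_id: str, index_id: str,
                          start_ts: int, end_ts: int) -> str:
    """Return the path and query string for GET /{WorkspaceId}/rag/index/monitor."""
    # The workspace ID is a path segment, so percent-encode it fully.
    path = f"/{quote(workspace_id, safe='')}/rag/index/monitor"
    # Assumption: IndexId and the two timestamps travel in the query string.
    query = urlencode({
        "IndexId": index_id,
        "StartTimestamp": int(start_ts),  # UNIX timestamp in seconds
        "EndTimestamp": int(end_ts),      # UNIX timestamp in seconds
    })
    return f"{path}?{query}"


if __name__ == "__main__":
    # Values taken from the Example columns of the parameter tables.
    print(build_monitor_request("llm-3shx2gu255oqxxxx", "kb-123456xxxx",
                                1767604500, 1767604500))
```

In practice the SDK performs this step together with signing, so a sketch like this is mainly useful for logging or for checking what a request will look like.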
Response elements
| Element | Type | Description | Example |
|---|---|---|---|
| | object | Schema of Response | |
| RequestId | string | The request ID. | 778C0B3B-xxxx-5FC1-A947-36EDD13606AB |
| Code | string | The status code. | 200 |
| Data | any | The core data object of the response. Fields: pipelineCommercialType (String): the edition of the knowledge base; storageMonitorData (Object): the storage monitoring data of the knowledge base; pipelineCommercialCu (Integer): the number of RCUs for an Ultimate Edition knowledge base, for example 2; qpsMonitorData (Object): the aggregated retrieval monitoring data for the knowledge base over the entire query period. | { "code": "Success", "status_code": 200, "data": { "pipelineCommercialType": "standard", "storageMonitorData": Object{...}, "qpsMonitorData": Object{...} }, "success": true, "message": "success", "request_id": "65d34b79-b97e-478e-a0a3-xxx", "status": "SUCCESS" } |
| Message | string | The status message. | success |
| Success | boolean | Indicates whether the request was successful. | true |
| Status | integer | The status code returned by the operation. | SUCCESS |
Examples
Success response
JSON format
{
"RequestId": "778C0B3B-xxxx-5FC1-A947-36EDD13606AB",
"Code": "200",
"Data": "{\n \"code\": \"Success\",\n \"status_code\": 200,\n \"data\": {\n\"pipelineCommercialType\": \"standard\", \"storageMonitorData\": Object{...},\n \"qpsMonitorData\": Object{...}\n },\n \"success\": true,\n \"message\": \"success\",\n \"request_id\": \"65d34b79-b97e-478e-a0a3-xxx\",\n \"status\": \"SUCCESS\"\n}",
"Message": "success",
"Success": true,
"Status": 0
}
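In the sample above, Data arrives as a JSON-encoded string that wraps the monitoring fields in an inner "data" object. The following Python sketch shows one way to unpack such a response. It is an illustration under assumptions: the function name `extract_monitor_data` is hypothetical, the sample payload uses empty objects as placeholders for the elided storageMonitorData and qpsMonitorData contents, and the code accepts Data either as a string or as an already-parsed object, because its type is declared as any.

```python
import json


def extract_monitor_data(response: dict) -> dict:
    """Pull the monitoring fields out of a GetIndexMonitor response."""
    if not response.get("Success"):
        raise RuntimeError(
            f"GetIndexMonitor failed: {response.get('Code')} {response.get('Message')}"
        )
    data = response.get("Data")
    # Data may be a JSON string (as in the sample) or an already-parsed object.
    if isinstance(data, str):
        data = json.loads(data)
    if not isinstance(data, dict):
        raise ValueError("unexpected Data payload")
    # The sample nests the fields under an inner "data" key; fall back to
    # the top level if that wrapper is absent.
    inner = data.get("data", data)
    if not isinstance(inner, dict):
        raise ValueError("unexpected inner data payload")
    return {
        "edition": inner.get("pipelineCommercialType"),
        "rcu": inner.get("pipelineCommercialCu"),  # may be absent
        "storage": inner.get("storageMonitorData"),
        "qps": inner.get("qpsMonitorData"),
    }


# Placeholder payload: the empty objects stand in for contents that the
# sample response elides.
SAMPLE_RESPONSE = {
    "RequestId": "778C0B3B-xxxx-5FC1-A947-36EDD13606AB",
    "Code": "200",
    "Data": json.dumps({
        "code": "Success",
        "status_code": 200,
        "data": {
            "pipelineCommercialType": "standard",
            "storageMonitorData": {},
            "qpsMonitorData": {},
        },
        "success": True,
    }),
    "Message": "success",
    "Success": True,
    "Status": 0,
}

if __name__ == "__main__":
    print(extract_monitor_data(SAMPLE_RESPONSE)["edition"])
```

Checking Success before touching Data keeps failure handling in one place, and reading optional fields with `get` avoids a `KeyError` when pipelineCommercialCu is not returned.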
Error codes
See Error Codes for a complete list.
Release notes
See Release Notes for a complete list.