This document explains how to use the API to calculate the token count for service calls to AI Search Open Platform.
Token calculation
In a language model, a token is the smallest unit of text that a model can process. A token can be a word, part of a word, a punctuation mark, or a single character. Different models use different tokenization methods, so the number of characters does not always correspond to the number of tokens. For example, in AI Search Open Platform:
- "Apple" corresponds to 1 token.
- "test case" corresponds to 3 tokens.
- "OpenSearch" corresponds to 2 tokens.
AI Search Open Platform meters and bills its large language model services based on the number of input and output tokens. Use the token calculation API to estimate the potential cost of a service call.
Supported models
You can calculate the token count for the following models.
| Model classification | Service ID (service_id) |
|---|---|
| OpenSearch SFT model | ops-qwen-turbo |
| Qwen model | qwen-turbo, qwen-plus, qwen-max |
API reference
Prerequisites
Obtain authentication credentials
To obtain an API key, see Get an API key.
Obtain a service endpoint
You can call the service over the public network or via a VPC. For more information, see Obtain a service endpoint.
General notes
- The request body cannot exceed 8 MB.
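A client can check the size of the serialized body before sending it. The following is a minimal sketch, assuming that "8 MB" means 8 × 1024 × 1024 bytes and that the limit applies to the UTF-8 encoded JSON body. Neither detail is stated here, and the name `body_within_limit` is illustrative only.

```python
import json

# Assumption: "8 MB" is interpreted as 8 * 1024 * 1024 bytes.
MAX_BODY_BYTES = 8 * 1024 * 1024


def body_within_limit(payload: dict, limit: int = MAX_BODY_BYTES) -> bool:
    """Return True if the UTF-8 encoded JSON body does not exceed the limit."""
    encoded = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return len(encoded) <= limit


payload = {"messages": [{"role": "user", "content": "Test token calculation API"}]}
print(body_within_limit(payload))
```

If the service interprets the limit differently, adjust `MAX_BODY_BYTES` accordingly.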
Request method
POST
URL
{host}/v3/openapi/workspaces/{workspace_name}/text-generation/{service_id}/tokenizer
- host: The service endpoint. You can call the API service over the public network or via a VPC. For more information, see Obtain a service endpoint.
  On the API Keys page, select a target workspace at the top, such as the default workspace. In the access domain section, you can find the public API domain and private API domain.
- workspace_name: The name of the workspace, for example, default.
- service_id: The ID of the built-in service, for example, ops-qwen-turbo.
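The URL template above can be filled in programmatically. The sketch below is one possible helper. The function name `tokenizer_url` and the host `http://example-host` are placeholders, and percent-encoding the path segments is a defensive choice of this example rather than a documented requirement.

```python
from urllib.parse import quote


def tokenizer_url(host: str, workspace_name: str, service_id: str) -> str:
    """Build the tokenizer URL from host, workspace_name, and service_id."""
    base = host.rstrip("/")  # avoid a double slash if host ends with "/"
    return (
        f"{base}/v3/openapi/workspaces/{quote(workspace_name, safe='')}"
        f"/text-generation/{quote(service_id, safe='')}/tokenizer"
    )


# Placeholder host: substitute the endpoint shown on the API Keys page.
print(tokenizer_url("http://example-host", "default", "ops-qwen-turbo"))
```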
Request parameters
Header parameters
API key authentication
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| Content-Type | String | Yes | The format of the request body. Set the value to application/json. | application/json |
| Authorization | String | Yes | Your API key, prefixed with Bearer. | Bearer OS-d1**2a |
Body parameters
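| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| messages | List | Yes | The conversation history. Each element in the list is an object with a role field and a content field. | [{"role": "user", "content": "Test token calculation API"}] |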
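The header and body parameters can be assembled as follows. This is a minimal sketch: the helper names `build_headers` and `build_body` are hypothetical, and the check that each message carries `role` and `content` is inferred from the example value above, not from a documented validation rule.

```python
import json


def build_headers(api_key: str) -> dict:
    """Return the two required headers for API key authentication."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }


def build_body(messages: list) -> bytes:
    """Serialize the messages list as a UTF-8 JSON request body."""
    for message in messages:
        # Assumption: every element carries role and content, as in the example.
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")
    return json.dumps({"messages": messages}, ensure_ascii=False).encode("utf-8")


headers = build_headers("YOUR_API_KEY")
body = build_body([{"role": "user", "content": "Test token calculation API"}])
print(headers["Authorization"])
print(body.decode("utf-8"))
```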
Response parameters
| Parameter | Type | Description | Example |
|---|---|---|---|
| request_id | String | A unique identifier for the API request. | 310032DA-****-46CC-94D1-0FE789BAE3A7 |
| latency | Float/Int | The request latency in milliseconds (ms). | 10 |
| usage | Object | The metering information for the call. | "usage":{"input_tokens":4} |
| usage.input_tokens | Integer | The number of tokens in the input text. | 4 |
| result.token_ids | List<Integer> | The token IDs corresponding to the input text. | [81705,5839,100768,107736] |
| result.tokens | List<String> | The tokens corresponding to the input text. | ["Test","token","calculation","API"] |
cURL request example
curl -X POST \
  "http://****-shanghai.opensearch.aliyuncs.com/v3/openapi/workspaces/default/text-generation/ops-qwen-turbo/tokenizer" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Test token calculation API"
      }
    ]
  }'
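The same request can be issued from Python using only the standard library. The sketch below mirrors the cURL call. The function names are illustrative, the 10-second timeout is an arbitrary choice, and the URL and API key must be supplied by the caller.

```python
import json
import urllib.request


def make_tokenizer_request(url: str, api_key: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) the POST request for the tokenizer endpoint."""
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def count_tokens(url: str, api_key: str, messages: list, timeout: float = 10.0) -> int:
    """Send the request and return usage.input_tokens from the JSON response."""
    request = make_tokenizer_request(url, api_key, messages)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["usage"]["input_tokens"]
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` when the server returns an HTTP error status, so callers may want to catch that exception and inspect the response body.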
Response examples
Successful response example
{
  "request_id": "9d197d47-d6b5-****-964e-12b893c47a8b",
  "latency": 11,
  "usage": {
    "input_tokens": 4
  },
  "result": {
    "token_ids": [81705, 5839, 100768, 107736],
    "tokens": ["Test", "token", "calculation", "API"]
  }
}
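To illustrate how the response fields relate, the following sketch parses the sample response above and pairs each token ID with its token. The variable names are illustrative, and the response text is copied from the example rather than fetched from the service.

```python
import json

response_text = """
{
  "request_id": "9d197d47-d6b5-****-964e-12b893c47a8b",
  "latency": 11,
  "usage": {"input_tokens": 4},
  "result": {
    "token_ids": [81705, 5839, 100768, 107736],
    "tokens": ["Test", "token", "calculation", "API"]
  }
}
"""

data = json.loads(response_text)
input_tokens = data["usage"]["input_tokens"]
# Pair each token ID with the token text at the same position.
pairs = list(zip(data["result"]["token_ids"], data["result"]["tokens"]))

print(input_tokens)
for token_id, token in pairs:
    print(token_id, token)
```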
Error response example
If an error occurs, the response includes the code and message fields to explain the error.
{
  "request_id": "388476DB-C4D4-****-A7A6-7594F92885FA",
  "latency": 0,
  "code": "InvalidParameter",
  "message": "Messages must be end with role[user]."
}
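A client can separate the two response shapes by checking for the code field. The sketch below assumes that successful responses never contain a code field, which matches the examples here but is not stated explicitly. `TokenizerError` and `input_tokens_or_raise` are hypothetical names.

```python
class TokenizerError(Exception):
    """Hypothetical exception type carrying the API's code and message."""

    def __init__(self, code: str, message: str, request_id: str = ""):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.request_id = request_id


def input_tokens_or_raise(data: dict) -> int:
    """Return usage.input_tokens, or raise if the payload has a code field."""
    if "code" in data:
        # Assumption: the presence of "code" marks an error response.
        raise TokenizerError(
            data["code"], data.get("message", ""), data.get("request_id", "")
        )
    return data["usage"]["input_tokens"]
```

Keeping request_id on the exception makes it easier to quote the failing request when looking up the status code description.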
Status codes
For more information, see the status code description for AI Search Open Platform.