AI Search Open Platform token calculation API details

This document explains how to use the API to calculate the token count for service calls to AI Search Open Platform.

Token calculation

In a language model, a token is the smallest unit of text that a model can process. A token can be a word, part of a word, a punctuation mark, or a single character. Different models use different tokenization methods, so the number of characters does not always correspond to the number of tokens. For example, in AI Search Open Platform:

"Apple" corresponds to 1 token.
"test case" corresponds to 3 tokens.
"OpenSearch" corresponds to 2 tokens.

AI Search Open Platform meters and bills its large language model services based on the number of input and output tokens. Use the token calculation API to estimate the potential cost of a service call.
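
As a rough illustration of the estimate described above, the sketch below multiplies a token count by a per-1,000-token rate. Both the function name and the `price_per_1k_tokens` parameter are hypothetical; the source does not state actual prices or the billing unit, so substitute the values from your own billing page.

```python
def estimate_input_cost(input_tokens: int, price_per_1k_tokens: float) -> float:
    """Estimate the input-side cost of a call.

    price_per_1k_tokens is a placeholder rate (assumption): the real
    price and billing unit are not given in this document.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1000 * price_per_1k_tokens
```

The token calculation API reports input tokens only, so output tokens would have to be estimated separately.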

Supported models

You can calculate the token count for the following models.

Model classification	Service ID (service_id)
OpenSearch SFT model	ops-qwen-turbo
Qwen model	qwen-turbo, qwen-plus, qwen-max

API reference

Prerequisites

Obtain authentication credentials
To obtain an API key, see Get an API key.
Obtain a service endpoint
You can call the service over the public network or via a VPC. For more information, see Obtain a service endpoint.

General notes

The request body cannot exceed 8 MB.

Request method

POST

URL

{host}/v3/openapi/workspaces/{workspace_name}/text-generation/{service_id}/tokenizer

host: The service endpoint. You can call the API service over the public network or via a VPC. To find the endpoints, open the API Keys page and select a target workspace at the top, such as the default workspace. The access domain section lists the public API domain and private API domain. For more information, see Obtain a service endpoint.
workspace_name: The name of the workspace, for example, default.
service_id: The ID of the built-in service, for example, ops-qwen-turbo.
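
The URL template above can be filled in programmatically. The following is a minimal sketch; the helper name `build_tokenizer_url` is hypothetical, and percent-encoding the path segments is a defensive assumption rather than a documented requirement.

```python
from urllib.parse import quote


def build_tokenizer_url(host: str, workspace_name: str, service_id: str) -> str:
    """Fill in {host}/v3/openapi/workspaces/{workspace_name}/text-generation/{service_id}/tokenizer."""
    base = host.rstrip("/")  # avoid a double slash when host ends with "/"
    workspace = quote(workspace_name, safe="")
    service = quote(service_id, safe="")
    return f"{base}/v3/openapi/workspaces/{workspace}/text-generation/{service}/tokenizer"
```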

Request parameters

Header parameters

API key authentication

Parameter	Type	Required	Description	Example
Content-Type	String	Yes	The format of the request body. Set the value to `application/json`.	application/json
Authorization	String	Yes	Your API key.	Bearer OS-d1**2a

Body parameters

Parameter	Type	Required	Description	Example
messages	List	Yes	The conversation history. Each element in the list is an object with a role and content. Valid values for role are system, user, and assistant.	[{"role":"user","content":"Test token calculation API"}]

system: A system-level message. This role is optional. If used, it must be the first message (messages[0]) in the conversation history.
user and assistant: The conversation between the user and the model. Messages with these roles must alternate to simulate a natural conversation flow.

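The role rules for messages can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not the service's own validation. The function name `validate_messages` is hypothetical. The rule that the last message must have the user role is inferred from the error message shown in the error response example, and the non-empty check is an assumption.

```python
VALID_ROLES = {"system", "user", "assistant"}


def validate_messages(messages: list) -> list:
    """Return a list of problems; an empty list means the documented rules hold."""
    if not messages:
        return ["messages must not be empty"]  # assumption: the field is required
    problems = []
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role not in VALID_ROLES:
            problems.append(f"messages[{i}]: invalid role {role!r}")
        elif role == "system" and i != 0:
            problems.append(f"messages[{i}]: system must be messages[0]")
        if "content" not in msg:
            problems.append(f"messages[{i}]: missing content")
    # user and assistant messages must alternate
    turns = [m.get("role") for m in messages if m.get("role") in ("user", "assistant")]
    for prev, cur in zip(turns, turns[1:]):
        if prev == cur:
            problems.append(f"consecutive {cur} messages; user and assistant must alternate")
    # inferred from the error example "Messages must be end with role[user]."
    if messages[-1].get("role") != "user":
        problems.append("the last message must have role user")
    return problems
```
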
Response parameters

Parameter	Type	Description	Example
request_id	String	A unique identifier for the API request.	310032DA-****-46CC-94D1-0FE789BAE3A7
latency	Float/Int	The request latency in milliseconds (ms).	10
usage	Object	The metering information for the call.	"usage":{"input_tokens":4}
usage.input_tokens	Integer	The number of tokens in the input text.	4
result.token_ids	List<Integer>	The token IDs corresponding to the input text.	[81705,5839,100768,107736]
result.tokens	List<String>	The tokens corresponding to the input text.	["Test","token","calculation","API"]

cURL request example

curl -X POST \
  "http://****-shanghai.opensearch.aliyuncs.com/v3/openapi/workspaces/default/text-generation/ops-qwen-turbo/tokenizer" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Test token calculation API"
      }
    ]
  }'
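
For callers who prefer Python to cURL, the same request can be assembled with the standard library. This is a sketch under stated assumptions: the helper name `build_tokenizer_request` is hypothetical, and the 8 MB body limit is interpreted here as 8 * 1024 * 1024 bytes, which the source does not specify.

```python
import json
import urllib.request

MAX_BODY_BYTES = 8 * 1024 * 1024  # assumption: "8 MB" read as 8 MiB


def build_tokenizer_request(url: str, api_key: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the cURL example."""
    body = json.dumps({"messages": messages}).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError("request body exceeds the 8 MB limit")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and decode the response body with `json.load`.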

Response examples

Successful response example

{
  "request_id":"9d197d47-d6b5-****-964e-12b893c47a8b",
  "latency":11,
  "usage":{
    "input_tokens":4
  },
  "result":{
    "token_ids":[81705,5839,100768,107736],
    "tokens":["Test","token","calculation","API"]
  }
}
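
A successful response like the one above can be unpacked as follows. The helper name `summarize_tokenizer_response` is hypothetical. In this example the length of tokens equals usage.input_tokens, but the source does not state that this always holds, so the sketch reports the comparison rather than enforcing it.

```python
def summarize_tokenizer_response(payload: dict) -> dict:
    """Pair each token ID with its token text and report the billed input count."""
    usage = payload.get("usage", {})
    result = payload.get("result", {})
    token_ids = result.get("token_ids", [])
    tokens = result.get("tokens", [])
    input_tokens = usage.get("input_tokens")
    return {
        "input_tokens": input_tokens,
        "pairs": list(zip(token_ids, tokens)),
        # True in the documented example; not guaranteed by the source
        "counts_match": input_tokens == len(tokens),
    }
```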

Error response example

If an error occurs, the response includes the code and message fields to explain the error.

{
  "request_id":"388476DB-C4D4-****-A7A6-7594F92885FA",
  "latency":0,
  "code":"InvalidParameter",
  "message":"Messages must be end with role[user]."
}
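
One way to surface such errors in client code is to raise an exception whenever the decoded response carries a code field. This is a sketch: the class `TokenizerAPIError` and the function `raise_for_error` are hypothetical names, and treating the presence of code as the error signal is an assumption based on the two response examples.

```python
class TokenizerAPIError(Exception):
    """Carries the code, message, and request_id fields of an error response."""

    def __init__(self, code, message, request_id=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.request_id = request_id


def raise_for_error(payload: dict) -> dict:
    """Return the payload unchanged, or raise if it looks like an error response."""
    if "code" in payload:  # assumption: only error responses include "code"
        raise TokenizerAPIError(
            payload["code"],
            payload.get("message", ""),
            payload.get("request_id"),
        )
    return payload
```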

Status codes

For more information, see the status code description for AI Search Open Platform.