Token calculation


This document explains how to use the API to calculate the token count for service calls to AI Search Open Platform.

Token calculation

In a language model, a token is the smallest unit of text that a model can process. A token can be a word, part of a word, a punctuation mark, or a single character. Different models use different tokenization methods, so the number of characters does not always correspond to the number of tokens. For example, in AI Search Open Platform:

  • "Apple" corresponds to 1 token.

  • "test case" corresponds to 3 tokens.

  • "OpenSearch" corresponds to 2 tokens.

AI Search Open Platform meters and bills its large language model services based on the number of input and output tokens. Use the token calculation API to estimate the potential cost of a service call.
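
As a rough illustration of how a token count turns into a cost estimate, the sketch below multiplies an input token count by a per-1,000-token price. The function name and the price used in the example call are hypothetical placeholders, not actual AI Search Open Platform rates; take the real unit price from your billing documentation.

```python
def estimate_input_cost(input_tokens: int, price_per_1k_tokens: float) -> float:
    """Estimate the input cost of a call, in the currency of the given price.

    price_per_1k_tokens is an assumption supplied by the caller; this
    document does not state any actual price.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1000 * price_per_1k_tokens


# 4 input tokens at a made-up price of 0.002 per 1,000 tokens.
print(estimate_input_cost(4, 0.002))
```

Output tokens are billed as well, and the tokenizer endpoint only reports input tokens, so this gives a lower bound on the cost of a real call.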

Supported models

You can calculate the token count for the following models.

| Model classification | Service ID (service_id) |
| --- | --- |
| OpenSearch SFT model | ops-qwen-turbo |
| Qwen model | qwen-turbo, qwen-plus, qwen-max |

API reference

Prerequisites

  • Obtain authentication credentials

    To obtain an API key, see Get an API key.

  • Obtain a service endpoint

    You can call the service over the public network or via a VPC. For more information, see Obtain a service endpoint.

General notes

  • The request body cannot exceed 8 MB.
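
A client can check the size limit before sending a request. The sketch below serializes the body and rejects it if it is too large. It assumes that 8 MB means 8 × 1024 × 1024 bytes and that the limit applies to the UTF-8 encoded JSON body; the document does not specify either point, and `encode_body` is a hypothetical helper name.

```python
import json

# Assumption: "8 MB" is read as binary megabytes (8 * 1024 * 1024 bytes).
MAX_BODY_BYTES = 8 * 1024 * 1024


def encode_body(messages: list) -> bytes:
    """Serialize the request body and enforce the size limit locally."""
    body = json.dumps({"messages": messages}, ensure_ascii=False).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(
            f"request body is {len(body)} bytes, above the {MAX_BODY_BYTES}-byte limit"
        )
    return body


body = encode_body([{"role": "user", "content": "Test token calculation API"}])
print(len(body), "bytes")
```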

Request method

POST

URL

{host}/v3/openapi/workspaces/{workspace_name}/text-generation/{service_id}/tokenizer
  • host: The service endpoint. You can call the API service over the public network or via a VPC. For more information, see Obtain a service endpoint.

    On the API Keys page, select a target workspace at the top, such as the default workspace. In the access domain section, you can find the public API domain and private API domain.

  • workspace_name: The name of the workspace, for example, default.

  • service_id: The ID of the built-in service, for example, ops-qwen-turbo.
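
The URL template above can be filled in programmatically. The following sketch is one way to do it, assuming that `host` already includes the protocol (as in the cURL example later in this document). The helper name `build_tokenizer_url` is hypothetical, and the set of service IDs is copied from the supported models table and may not stay current.

```python
from urllib.parse import quote

# Service IDs listed under "Supported models" in this document.
SUPPORTED_SERVICE_IDS = {"ops-qwen-turbo", "qwen-turbo", "qwen-plus", "qwen-max"}


def build_tokenizer_url(host: str, workspace_name: str, service_id: str) -> str:
    """Fill in the tokenizer URL template for one workspace and service."""
    if service_id not in SUPPORTED_SERVICE_IDS:
        raise ValueError(f"unsupported service_id: {service_id!r}")
    base = host.rstrip("/")  # avoid a double slash if host ends with "/"
    workspace = quote(workspace_name, safe="")
    service = quote(service_id, safe="")
    return f"{base}/v3/openapi/workspaces/{workspace}/text-generation/{service}/tokenizer"


# "http://example.com" stands in for your real service endpoint.
print(build_tokenizer_url("http://example.com", "default", "ops-qwen-turbo"))
```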

Request parameters

Header parameters

API key authentication

| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| Content-Type | String | Yes | The format of the request body. Set the value to application/json. | application/json |
| Authorization | String | Yes | Your API key. | Bearer OS-d1**2a |

Body parameters

| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| messages | List | Yes | The conversation history. Each element in the list is an object with a role and content. Valid values for role are system, user, and assistant, as described below the table. | [{"role": "user", "content": "Test token calculation API"}] |

  • system: A system-level message. This role is optional. If used, it must be the first message (messages[0]) in the conversation history.

  • user and assistant: The conversation between the user and the model. Messages with these roles must alternate to simulate a natural conversation flow.

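The rules for messages can be checked on the client side before a request is sent. The sketch below is a hypothetical validator, not part of the API. The checks for a non-empty list and for a final user message are assumptions; the latter is inferred from the "Messages must be end with role[user]." message in the error response example.

```python
VALID_ROLES = {"system", "user", "assistant"}


def validate_messages(messages: list) -> list:
    """Return a list of problems; an empty list means all checks passed."""
    if not messages:
        return ["messages must contain at least one message"]  # assumption

    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
            problems.append(f"messages[{i}] must be an object with role and content")
            continue
        role = msg["role"]
        if not isinstance(role, str) or role not in VALID_ROLES:
            problems.append(f"messages[{i}] has an invalid role: {role!r}")
        elif role == "system" and i != 0:
            problems.append("a system message must be messages[0]")
    if problems:
        return problems

    # Ignore the optional system message and check the user/assistant turns.
    turns = [m["role"] for m in messages if m["role"] != "system"]
    if any(prev == cur for prev, cur in zip(turns, turns[1:])):
        problems.append("user and assistant messages must alternate")
    if not turns or turns[-1] != "user":
        # Inferred from the error response example in this document.
        problems.append("the last message must have the user role")
    return problems


print(validate_messages([{"role": "user", "content": "Test token calculation API"}]))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.
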
Response parameters

| Parameter | Type | Description | Example |
| --- | --- | --- | --- |
| request_id | String | A unique identifier for the API request. | 310032DA-****-46CC-94D1-0FE789BAE3A7 |
| latency | Float/Int | The request latency in milliseconds (ms). | 10 |
| usage | Object | The metering information for the call. | "usage":{"input_tokens":4} |
| usage.input_tokens | Integer | The number of tokens in the input text. | 4 |
| result.token_ids | List<Integer> | The token IDs corresponding to the input text. | [81705,5839,100768,107736] |
| result.tokens | List<String> | The tokens corresponding to the input text. | ["Test","token","calculation","API"] |

cURL request example

curl -X POST \
  "http://****-shanghai.opensearch.aliyuncs.com/v3/openapi/workspaces/default/text-generation/ops-qwen-turbo/tokenizer" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Test token calculation API"
      }
    ]
  }'
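
The same request can be sketched in Python with only the standard library. The function names below are hypothetical, and the URL and API key are placeholders you must supply. Building the request is kept separate from sending it so that the headers and body can be inspected without network access.

```python
import json
import urllib.request


def build_tokenizer_request(url: str, api_key: str, messages: list) -> urllib.request.Request:
    """Build the POST request with the two required headers and a JSON body."""
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def count_tokens(url: str, api_key: str, messages: list, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_tokenizer_request(url, api_key, messages)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (needs a real endpoint and key, so it is left commented out):
# reply = count_tokens(
#     "http://<your-endpoint>/v3/openapi/workspaces/default/text-generation/ops-qwen-turbo/tokenizer",
#     "YOUR_API_KEY",
#     [{"role": "user", "content": "Test token calculation API"}],
# )
# print(reply["usage"]["input_tokens"])
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for non-2xx status codes, so a caller that wants to read an error body has to catch that exception.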

Response examples

Successful response example

{
  "request_id":"9d197d47-d6b5-****-964e-12b893c47a8b",
  "latency":11,
  "usage":{
    "input_tokens":4
  },
  "result":{
    "token_ids":[81705,5839,100768,107736],
    "tokens":["Test","token","calculation","API"]
  }
}
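
To show how the fields of a successful response fit together, the sketch below parses the example above and pairs each token with its ID. The helper name `summarize_token_response` is hypothetical. It assumes that `result.tokens` and `result.token_ids` are parallel lists, as they are in the example.

```python
import json

# The successful response example from this document.
SAMPLE_RESPONSE = json.loads("""
{
  "request_id": "9d197d47-d6b5-****-964e-12b893c47a8b",
  "latency": 11,
  "usage": {"input_tokens": 4},
  "result": {
    "token_ids": [81705, 5839, 100768, 107736],
    "tokens": ["Test", "token", "calculation", "API"]
  }
}
""")


def summarize_token_response(payload: dict) -> dict:
    """Extract the token count and pair each token with its ID."""
    result = payload["result"]
    return {
        "input_tokens": payload["usage"]["input_tokens"],
        "pairs": list(zip(result["tokens"], result["token_ids"])),
    }


summary = summarize_token_response(SAMPLE_RESPONSE)
print(summary["input_tokens"])
for token, token_id in summary["pairs"]:
    print(token_id, token)
```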

Error response example

If an error occurs, the response includes the code and message fields to explain the error.

{
  "request_id":"388476DB-C4D4-****-A7A6-7594F92885FA",
  "latency":0,
  "code":"InvalidParameter",
  "message":"Messages must be end with role[user]."
}
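
A client can turn an error response into an exception by checking for the code field. The sketch below assumes that successful responses never contain a top-level code field, which matches the two examples in this document but is not stated explicitly. `TokenizerApiError` and `raise_for_error` are hypothetical names.

```python
import json


class TokenizerApiError(Exception):
    """Raised when a response body carries an error code."""

    def __init__(self, code: str, message: str, request_id: str):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id


def raise_for_error(payload: dict) -> dict:
    """Return the payload unchanged, or raise if it describes an error."""
    if "code" in payload:  # assumption: only error responses include "code"
        raise TokenizerApiError(
            payload["code"],
            payload.get("message", ""),
            payload.get("request_id", ""),
        )
    return payload


# The error response example from this document.
error_payload = json.loads("""
{
  "request_id": "388476DB-C4D4-****-A7A6-7594F92885FA",
  "latency": 0,
  "code": "InvalidParameter",
  "message": "Messages must be end with role[user]."
}
""")

try:
    raise_for_error(error_payload)
except TokenizerApiError as err:
    print(err.code, "-", err.message)
```

Keeping request_id on the exception makes it easier to quote the identifier when reporting a failed call.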

Status codes

For more information, see the status code description for AI Search Open Platform.