Input and output parameters for calling Qwen-MT through the OpenAI compatible interface or the DashScope API.
References: Machine translation (Qwen-MT)
OpenAI compatible
Beijing region
base_url for SDK: https://dashscope.aliyuncs.com/compatible-mode/v1
HTTP endpoint: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
Singapore region
base_url for SDK: https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1
HTTP endpoint: POST https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1/chat/completions
Virginia region
base_url for SDK: https://dashscope-us.aliyuncs.com/compatible-mode/v1
HTTP endpoint: POST https://dashscope-us.aliyuncs.com/compatible-mode/v1/chat/completions
Singapore region
base_url for SDK: https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1
HTTP endpoint: POST https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1/chat/completions
Virginia region
base_url for SDK: https://dashscope-us.aliyuncs.com/compatible-mode/v1
HTTP endpoint: POST https://dashscope-us.aliyuncs.com/compatible-mode/v1/chat/completions
Beijing region
base_url for SDK: https://dashscope.aliyuncs.com/compatible-mode/v1
HTTP endpoint: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
First, create an API key and configure it as an environment variable . If you use the OpenAI SDK, install the SDK .
Request body |
Basic usagePython
Node.js
curlEndpoints and API keys differ by region. The following is the Beijing endpoint.
Term interventionPython
Node.js
curlEndpoints and API keys differ by region. The following is the Beijing endpoint.
Translation memoryPython
Node.js
curlEndpoints and API keys differ by region. The following is the Beijing endpoint.
Domain promptingPython
Node.js
curlEndpoints and API keys differ by region. The following is the Beijing endpoint.
|
|
model Model name. Supported: qwen-mt-plus, qwen-mt-flash, qwen-mt-lite, qwen-mt-turbo. |
|
|
messages Array of messages providing context to the model. Only user messages are supported. |
|
|
stream Enable streaming output mode. Valid values:
Note
qwen-mt-flash and qwen-mt-lite return data incrementally (each chunk contains only new content). qwen-mt-plus and qwen-mt-turbo return data non-incrementally (each chunk contains entire sequence to date). This behavior cannot be changed. Example: I I didn I didn't I didn't laugh I didn't laugh after ... |
|
|
stream_options The configuration items for streaming output. This parameter takes effect only when |
|
|
max_tokens Maximum number of tokens to generate. If the output exceeds this value, the response is truncated. The default and maximum values are the maximum output length of the model. For more information, see Model selection. |
|
|
seed Random number seed for reproducible results. Using the same Value range: |
|
|
temperature Sampling temperature that controls the diversity of generated text. Higher values produce more diverse text. Lower values produce more deterministic text. Value range: [0, 2) Both |
|
|
top_p Probability threshold for nucleus sampling that controls the diversity of generated text. Higher values produce more diverse text. Lower values produce more deterministic text. Value range: (0, 1.0] Both |
|
|
top_k Size of the candidate set for sampling during generation. For example, setting this to 50 means only the top 50 tokens by score form the sampling pool. Larger values increase randomness; smaller values increase determinism. If the value is None or greater than 100, top_k is disabled and only top_p takes effect. The value must be greater than or equal to 0. Non-standard OpenAI parameter. Python SDK: place in extra_body object |
|
|
repetition_penalty Penalty for repetition in consecutive sequences. Higher values reduce repetition. A value of 1.0 applies no penalty. Must be greater than 0, with no strict upper limit. Non-standard OpenAI parameter. Python SDK: place in extra_body object |
|
|
translation_options Translation parameters. Non-standard OpenAI parameter. Python SDK: place in extra_body object |
Chat response object (non-streaming output) |
|
|
id Unique request ID. |
|
|
choices Array of model-generated content. |
|
|
created The UNIX timestamp when the request was created. |
|
|
model The model used for the request. |
|
|
object This is always |
|
|
service_tier Currently fixed to |
|
|
system_fingerprint Currently fixed to |
|
|
usage Token consumption for the request. |
Chat response chunk object (streaming output) |
Incremental output
Non-incremental output
|
|
id The unique ID of the call. Each chunk object has the same ID. |
|
|
choices An array of content generated by the model. If |
|
|
created The UNIX timestamp when the request was created. Each chunk has the same timestamp. |
|
|
model The model used for the request. |
|
|
object This is always |
|
|
service_tier Currently fixed to |
|
|
system_fingerprint Currently fixed to |
|
|
usage The tokens consumed by the request. This is returned in the last chunk only when |
DashScope
Beijing
HTTP endpoint: POST https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation
You do not need to configure base_url for SDK calls. The default value is https://dashscope.aliyuncs.com/api/v1.
Singapore
HTTP endpoint: POST https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1/services/aigc/text-generation/generation
base_url to:
Python code
dashscope.base_http_api_url = 'https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1'
Java code
-
Method 1:
import com.alibaba.dashscope.protocol.Protocol; Generation gen = new Generation(Protocol.HTTP.getValue(), "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1"); -
Method 2:
import com.alibaba.dashscope.utils.Constants; Constants.baseHttpApiUrl="https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1";
Virginia
HTTP endpoint: POST https://dashscope-us.aliyuncs.com/api/v1/services/aigc/text-generation/generation
base_url to:
Python code
dashscope.base_http_api_url = 'https://dashscope-us.aliyuncs.com/api/v1'
Java code
-
Method 1:
import com.alibaba.dashscope.protocol.Protocol; Generation gen = new Generation(Protocol.HTTP.getValue(), "https://dashscope-us.aliyuncs.com/api/v1"); -
Method 2:
import com.alibaba.dashscope.utils.Constants; Constants.baseHttpApiUrl="https://dashscope-us.aliyuncs.com/api/v1";
Singapore
HTTP endpoint: POST https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1/services/aigc/text-generation/generation
Set base_url to:
Python code
dashscope.base_http_api_url = 'https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1'
Java code
-
Method 1:
import com.alibaba.dashscope.protocol.Protocol; Generation gen = new Generation(Protocol.HTTP.getValue(), "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1"); -
Method 2:
import com.alibaba.dashscope.utils.Constants; Constants.baseHttpApiUrl="https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1";
Virginia
HTTP endpoint: POST https://dashscope-us.aliyuncs.com/api/v1/services/aigc/text-generation/generation
Set base_url to:
Python code
dashscope.base_http_api_url = 'https://dashscope-us.aliyuncs.com/api/v1'
Java code
-
Method 1:
import com.alibaba.dashscope.protocol.Protocol; Generation gen = new Generation(Protocol.HTTP.getValue(), "https://dashscope-us.aliyuncs.com/api/v1"); -
Method 2:
import com.alibaba.dashscope.utils.Constants; Constants.baseHttpApiUrl="https://dashscope-us.aliyuncs.com/api/v1";
Beijing
HTTP endpoint: POST https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation
You do not need to configure base_url for SDK calls. The default value is https://dashscope.aliyuncs.com/api/v1.
You must create an API key and export the API key as an environment variable . If using the DashScope SDK, install the DashScope SDK .
Request body |
Basic usagePython
Java
curlEach region's endpoint and API key are different. The following is the endpoint for the Beijing region.
Term interventionPython
Java
curlEach region's endpoint and API key are different. The following is the endpoint for the Beijing region.
Translation memoryPython
Java
curlEach region's endpoint and API key are different. The following is the endpoint for the Beijing region.
Domain promptingPython
Java
curlEach region's endpoint and API key are different. The following is the endpoint for the Beijing region.
|
|
model Model name. Supported: qwen-mt-plus, qwen-mt-flash, qwen-mt-lite, qwen-mt-turbo. |
|
|
messages Array of messages providing context to the model. Only user messages are supported. |
|
|
max_tokens Maximum number of tokens to generate. If the output exceeds this value, the response is truncated. The default and maximum values are the maximum output length of the model. For more information, see Model selection. In the Java SDK, the parameter is maxTokens. For HTTP calls, place max_tokens in the parameters object. |
|
|
seed Random number seed for reproducible results. Using the same Value range: When you make an HTTP call, place seed in the parameters object. |
|
|
temperature Sampling temperature that controls the diversity of generated text. Higher values produce more diverse text. Lower values produce more deterministic text. Value range: [0, 2) Both When you make an HTTP call, place temperature in the parameters object. |
|
|
top_p Probability threshold for nucleus sampling that controls the diversity of generated text. Higher values produce more diverse text. Lower values produce more deterministic text. Value range: (0, 1.0] Both In the Java SDK, the parameter is topPparameters object. |
|
|
repetition_penalty Penalty for repetition in consecutive sequences. Higher values reduce repetition. A value of 1.0 applies no penalty. Must be greater than 0, with no strict upper limit. In the Java SDK, the parameter is repetitionPenalty. For HTTP calls, add repetition_penalty to the parameters object. |
|
|
top_k Size of the candidate set for sampling during generation. For example, setting this to 50 means only the top 50 tokens by score form the sampling pool. Larger values increase randomness; smaller values increase determinism. If the value is None or greater than 100, top_k is disabled and only top_p takes effect. The value must be greater than or equal to 0. In the Java SDK, the parameter is topK. When you make an HTTP call, set top_k in the parameters object. |
|
|
stream Enable streaming output mode. Valid values:
Note
qwen-mt-flash and qwen-mt-lite return data incrementally (each chunk contains only new content). qwen-mt-plus and qwen-mt-turbo return data non-incrementally (each chunk contains entire sequence to date). This behavior cannot be changed. Example: I I didn I didn't I didn't laugh I didn't laugh after ... This parameter is supported only by the Python SDK. To implement streaming output with the Java SDK, call the |
|
|
translation_options Translation parameters. In the Java SDK, the parameter is
|
Chat response object (same for streaming and non-streaming output) |
|
|
status_code Request status code. 200 indicates success; other values indicate failure. The Java SDK does not return this parameter. If the call fails, an exception is thrown. The exception message contains the content of status_code and message. |
|
|
request_id Unique request ID. In the Java SDK, the returned parameter is requestId. |
|
|
code Error code. Empty on success. Only the Python SDK returns this parameter. |
|
|
output Call result. |
|
|
usage Token usage for the request. |
Error codes
If the call fails, see Error codes to resolve the issue.