DashScope API reference


This topic describes how to call models through the DashScope API. It covers the request and response parameters and provides code examples.

China (Beijing)

HTTP request endpoint:

  • Plain text models (such as qwen-plus): POST https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation

  • Multimodal models (such as qwen3.7-plus or qwen3-vl-plus): POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation

You do not need to configure the base_url for SDK calls.

Singapore

HTTP request endpoint:

  • Plain text models (such as qwen-plus): POST https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1/services/aigc/text-generation/generation

  • Multimodal models (such as qwen3.7-plus or qwen3-vl-plus): POST https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation

Replace WorkspaceId with your actual workspace ID.

base_url configuration for SDK calls:

Replace WorkspaceId with your actual workspace ID.

Python code

dashscope.base_http_api_url = 'https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1'

Java code

  • Method 1:

    import com.alibaba.dashscope.protocol.Protocol;
    Generation gen = new Generation(Protocol.HTTP.getValue(), "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1");
  • Method 2:

    import com.alibaba.dashscope.utils.Constants;
    Constants.baseHttpApiUrl="https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1";

US (Virginia)

HTTP request endpoint:

  • Plain text models: POST https://dashscope-us.aliyuncs.com/api/v1/services/aigc/text-generation/generation

  • Qwen-VL models: POST https://dashscope-us.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation

base_url configuration for SDK calls:

Python code

dashscope.base_http_api_url = 'https://dashscope-us.aliyuncs.com/api/v1'

Java code

  • Method 1:

    import com.alibaba.dashscope.protocol.Protocol;
    Generation gen = new Generation(Protocol.HTTP.getValue(), "https://dashscope-us.aliyuncs.com/api/v1");
  • Method 2:

    import com.alibaba.dashscope.utils.Constants;
    Constants.baseHttpApiUrl="https://dashscope-us.aliyuncs.com/api/v1";

Germany (Frankfurt)

HTTP request endpoint:

  • Plain text models (such as qwen-plus): POST https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1/services/aigc/text-generation/generation

  • Multimodal models (such as qwen3.7-plus or qwen3-vl-plus): POST https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation

Replace WorkspaceId with your actual workspace ID.

base_url configuration for SDK calls:

Python code

Replace WorkspaceId with your actual workspace ID.

dashscope.base_http_api_url = 'https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1'

Java code

Replace WorkspaceId with your actual workspace ID.

  • Method 1:

    import com.alibaba.dashscope.protocol.Protocol;
    Generation gen = new Generation(Protocol.HTTP.getValue(), "https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1");
  • Method 2:

    import com.alibaba.dashscope.utils.Constants;
    Constants.baseHttpApiUrl="https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1";

Japan (Tokyo)

HTTP request endpoint:

  • Plain text models (such as qwen-plus): POST https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/api/v1/services/aigc/text-generation/generation

  • Multimodal models (such as qwen3.7-plus or qwen3-vl-plus): POST https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation

Replace WorkspaceId with your actual workspace ID.

base_url configuration for SDK calls:

Python code

Replace WorkspaceId with your actual workspace ID.

dashscope.base_http_api_url = 'https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/api/v1'

Java code

Replace WorkspaceId with your actual workspace ID.

  • Method 1:

    import com.alibaba.dashscope.protocol.Protocol;
    Generation gen = new Generation(Protocol.HTTP.getValue(), "https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/api/v1");
  • Method 2:

    import com.alibaba.dashscope.utils.Constants;
    Constants.baseHttpApiUrl="https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/api/v1";

Important

Model Studio has released a workspace-specific domain for the Singapore region. The new dedicated domain delivers better performance and higher stability for inference requests. We recommend that you migrate to the new domain:

  • Singapore: from https://dashscope-intl.aliyuncs.com to https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com

{WorkspaceId} is your workspace ID, which can be found on the Workspace Details page in the Model Studio console. The existing domain remains fully functional.
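
As a minimal sketch of how the regional endpoints above fit together, the following Python helper assembles the request URL from a region name, an optional workspace ID, and the model type. The helper name build_endpoint and the region keys are illustrative assumptions and are not part of the DashScope SDK. The URL patterns are copied from the lists above.

```python
# Hypothetical helper (not part of the DashScope SDK): builds the HTTP
# endpoint for a region from the URL patterns listed in this topic.
BASE_URLS = {
    "beijing": "https://dashscope.aliyuncs.com",
    "virginia": "https://dashscope-us.aliyuncs.com",
    "singapore": "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com",
    "frankfurt": "https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com",
    "tokyo": "https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com",
}


def build_endpoint(region, workspace_id=None, multimodal=False):
    """Return the full generation endpoint for the given region."""
    base = BASE_URLS[region]
    if "{WorkspaceId}" in base:
        # Workspace-specific domains need the real workspace ID.
        if not workspace_id:
            raise ValueError(f"region {region!r} requires a workspace ID")
        base = base.replace("{WorkspaceId}", workspace_id)
    service = "multimodal-generation" if multimodal else "text-generation"
    return f"{base}/api/v1/services/aigc/{service}/generation"


# "ws-example" is a placeholder, not a real workspace ID.
print(build_endpoint("beijing"))
print(build_endpoint("singapore", workspace_id="ws-example", multimodal=True))
```

For SDK calls, only the part up to /api/v1 is needed, as shown in the base_url settings above.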

Before you begin, make sure that you have obtained an API key and configured it as an environment variable. If you use the DashScope SDK, you must also install the SDK.
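
The examples in this topic read the key from the DASHSCOPE_API_KEY environment variable. A small check such as the following sketch can fail fast when the variable is missing. The function name require_api_key is a hypothetical helper, not an SDK call.

```python
import os


def require_api_key(env_var="DASHSCOPE_API_KEY"):
    """Return the API key from the environment, or raise if it is missing."""
    key = os.getenv(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Configure it before calling the API."
        )
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("API key found.")
    except RuntimeError as exc:
        print(exc)
```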

Request body

Text input

Python

import os
import dashscope

messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': 'Who are you?'}
]
response = dashscope.Generation.call(
    # If you have not configured the environment variable, replace the following line with: api_key="sk-xxx"
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    model="qwen-plus", # This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
    messages=messages,
    result_format='message'
    )
print(response)

Java

// DashScope SDK V2.12.0 or later is recommended.
import java.util.Arrays;
import java.lang.System;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.JsonUtils;

public class Main {
    public static GenerationResult callWithMessage() throws ApiException, NoApiKeyException, InputRequiredException {
        Generation gen = new Generation();
        Message systemMsg = Message.builder()
                .role(Role.SYSTEM.getValue())
                .content("You are a helpful assistant.")
                .build();
        Message userMsg = Message.builder()
                .role(Role.USER.getValue())
                .content("Who are you?")
                .build();
        GenerationParam param = GenerationParam.builder()
                // If you have not configured the environment variable, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                // This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
                .model("qwen-plus")
                .messages(Arrays.asList(systemMsg, userMsg))
                .resultFormat(GenerationParam.ResultFormat.MESSAGE)
                .build();
        return gen.call(param);
    }
    public static void main(String[] args) {
        try {
            GenerationResult result = callWithMessage();
            System.out.println(JsonUtils.toJson(result));
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            // Use a logging framework to record the exception information.
            System.err.println("An error occurred while calling the generation service: " + e.getMessage());
        }
        System.exit(0);
    }
}

PHP (HTTP)

<?php

$url = "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation";
$apiKey = getenv('DASHSCOPE_API_KEY');

$data = [
    // This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
    "model" => "qwen-plus",
    "input" => [
        "messages" => [
            [
                "role" => "system",
                "content" => "You are a helpful assistant."
            ],
            [
                "role" => "user",
                "content" => "Who are you?"
            ]
        ]
    ],
    "parameters" => [
        "result_format" => "message"
    ]
];

$jsonData = json_encode($data);

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $jsonData);
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    "Authorization: Bearer $apiKey",
    "Content-Type: application/json"
]);

$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);

if ($httpCode == 200) {
    echo "Response: " . $response;
} else {
    echo "Error: " . $httpCode . " - " . $response;
}

curl_close($ch);
?>

Node.js (HTTP)

DashScope does not provide an SDK for Node.js. To make calls using the OpenAI Node.js SDK, see the OpenAI section in this topic.

import fetch from 'node-fetch';

const apiKey = process.env.DASHSCOPE_API_KEY;

const data = {
    model: "qwen-plus", // This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
    input: {
        messages: [
            {
                role: "system",
                content: "You are a helpful assistant."
            },
            {
                role: "user",
                content: "Who are you?"
            }
        ]
    },
    parameters: {
        result_format: "message"
    }
};

fetch('https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation', {
    method: 'POST',
    headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
    },
    body: JSON.stringify(data)
})
.then(response => response.json())
.then(data => {
    console.log(JSON.stringify(data));
})
.catch(error => {
    console.error('Error:', error);
});

C# (HTTP)

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

class Program
{
    private static readonly HttpClient httpClient = new HttpClient();

    static async Task Main(string[] args)
    {
        // If you have not configured the environment variable, replace the following line with: string? apiKey = "sk-xxx";
        string? apiKey = Environment.GetEnvironmentVariable("DASHSCOPE_API_KEY");

        if (string.IsNullOrEmpty(apiKey))
        {
            Console.WriteLine("API key not set. Make sure the 'DASHSCOPE_API_KEY' environment variable is set.");
            return;
        }

        // Set the request URL and content.
        string url = "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation";
        // This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
        string jsonContent = @"{
            ""model"": ""qwen-plus"", 
            ""input"": {
                ""messages"": [
                    {
                        ""role"": ""system"",
                        ""content"": ""You are a helpful assistant.""
                    },
                    {
                        ""role"": ""user"",
                        ""content"": ""Who are you?""
                    }
                ]
            },
            ""parameters"": {
                ""result_format"": ""message""
            }
        }";

        // Send the request and get the response.
        string result = await SendPostRequestAsync(url, jsonContent, apiKey);

        // Print the result.
        Console.WriteLine(result);
    }

    private static async Task<string> SendPostRequestAsync(string url, string jsonContent, string apiKey)
    {
        using (var content = new StringContent(jsonContent, Encoding.UTF8, "application/json"))
        {
            // Set the request headers.
            httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
            httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

            // Send the request and get the response.
            HttpResponseMessage response = await httpClient.PostAsync(url, content);

            // Process the response.
            if (response.IsSuccessStatusCode)
            {
                return await response.Content.ReadAsStringAsync();
            }
            else
            {
                return $"Request failed: {response.StatusCode}";
            }
        }
    }
}

Go (HTTP)

DashScope does not provide an SDK for Go. To make calls using the OpenAI Go SDK, see the OpenAI-Go section in this topic.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
)

type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type Input struct {
	Messages []Message `json:"messages"`
}

type Parameters struct {
	ResultFormat string `json:"result_format"`
}

type RequestBody struct {
	Model      string     `json:"model"`
	Input      Input      `json:"input"`
	Parameters Parameters `json:"parameters"`
}

func main() {
	// Create an HTTP client.
	client := &http.Client{}

	// Build the request body.
	requestBody := RequestBody{
		// This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
		Model: "qwen-plus",
		Input: Input{
			Messages: []Message{
				{
					Role:    "system",
					Content: "You are a helpful assistant.",
				},
				{
					Role:    "user",
					Content: "Who are you?",
				},
			},
		},
		Parameters: Parameters{
			ResultFormat: "message",
		},
	}

	jsonData, err := json.Marshal(requestBody)
	if err != nil {
		log.Fatal(err)
	}

	// Create a POST request.
	req, err := http.NewRequest("POST", "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation", bytes.NewBuffer(jsonData))
	if err != nil {
		log.Fatal(err)
	}

	// Set the request headers.
	// If you have not configured the environment variable, replace the following line with: apiKey := "sk-xxx"
	apiKey := os.Getenv("DASHSCOPE_API_KEY")
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")

	// Send the request.
	resp, err := client.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Read the response body.
	bodyText, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	// Print the response content.
	fmt.Printf("%s\n", bodyText)
}

curl

curl --location "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--data '{
    "model": "qwen-plus",
    "input":{
        "messages":[      
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters": {
        "result_format": "message"
    }
}'
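
The HTTP examples above all send the same JSON body with the same two headers. As a rough sketch, the same request can be assembled in Python with only the standard library. The function names build_payload, build_request, and send are hypothetical helpers. The example only prints the request line; call send(request) to actually contact the service.

```python
import json
import os
import urllib.request

# China (Beijing) endpoint, as used by the HTTP examples in this section.
URL = "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation"


def build_payload(user_text, model="qwen-plus",
                  system_text="You are a helpful assistant."):
    """Build the request body: model, input.messages, and parameters."""
    return {
        "model": model,
        "input": {
            "messages": [
                {"role": "system", "content": system_text},
                {"role": "user", "content": user_text},
            ]
        },
        "parameters": {"result_format": "message"},
    }


def build_request(payload, api_key, url=URL):
    """Wrap the payload in a POST request with the required headers."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# "sk-xxx" is a placeholder used only when the environment variable is unset.
request = build_request(build_payload("Who are you?"),
                        os.getenv("DASHSCOPE_API_KEY", "sk-xxx"))
print(request.get_method(), request.full_url)
```

Keeping the payload construction separate from the network call makes the body easy to inspect or log before it is sent.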

Streaming output

For more information, see Streaming output.

Text generation models

Python

import os
import dashscope

messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user','content': 'Who are you?'}
]
responses = dashscope.Generation.call(
    # If you have not configured the environment variable, replace the following line with: api_key="sk-xxx"
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    model="qwen-plus", # This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
    messages=messages,
    result_format='message',
    stream=True,
    incremental_output=True
    )
for response in responses:
    print(response)  

Java

import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.JsonUtils;
import io.reactivex.Flowable;
import java.lang.System;

public class Main {
    private static final Logger logger = LoggerFactory.getLogger(Main.class);
    private static void handleGenerationResult(GenerationResult message) {
        System.out.println(JsonUtils.toJson(message));
    }
    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
    }
    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If you have not configured the environment variable, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                // This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
                .model("qwen-plus")
                .messages(Arrays.asList(userMsg))
                .resultFormat(GenerationParam.ResultFormat.MESSAGE)
                .incrementalOutput(true)
                .build();
    }
    public static void main(String[] args) {
        try {
            Generation gen = new Generation();
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
        } catch (ApiException | NoApiKeyException | InputRequiredException  e) {
            logger.error("An exception occurred: {}", e.getMessage());
        }
        System.exit(0);
    }
}

curl

curl --location "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--header "X-DashScope-SSE: enable" \
--data '{
    "model": "qwen-plus",
    "input":{
        "messages":[      
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters": {
        "result_format": "message",
        "incremental_output":true
    }
}'
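
With the X-DashScope-SSE header enabled, the response arrives as a server-sent event stream rather than a single JSON document. The following sketch shows one way to reassemble the incremental text from such a stream. The sample lines are made up for illustration, and the assumption that each data: line carries a JSON object with output.choices[0].message.content mirrors the response shape used in the SDK examples; check an actual response before relying on it.

```python
import json


def iter_sse_data(lines):
    """Yield the decoded JSON payload of every `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.rstrip("\r\n")
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


def collect_text(lines):
    """Concatenate the incremental message content of all events."""
    parts = []
    for event in iter_sse_data(lines):
        choices = event.get("output", {}).get("choices") or []
        if choices:
            # With incremental_output enabled, each chunk is assumed to
            # carry only the newly generated text.
            parts.append(choices[0].get("message", {}).get("content") or "")
    return "".join(parts)


# Illustrative events only; real responses contain additional fields.
sample = [
    'id:1',
    'event:result',
    'data:{"output":{"choices":[{"message":{"role":"assistant","content":"I am "}}]}}',
    '',
    'id:2',
    'event:result',
    'data:{"output":{"choices":[{"message":{"role":"assistant","content":"Qwen."}}]}}',
    '',
]
print(collect_text(sample))
```

Lines that do not start with data: (such as id: and event: fields or comments) are skipped, so the helper can be fed the raw lines of an HTTP response body.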

Multimodal models

Python

import os
from dashscope import MultiModalConversation
import dashscope

# If you use a model in the Singapore region, uncomment the following line and replace {WorkspaceId} with your workspace ID.
# dashscope.base_http_api_url = "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1"

messages = [
    {
        "role": "user",
        "content": [
            {"image": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"},
            {"text": "What is depicted in the image?"}
        ]
    }
]

responses = MultiModalConversation.call(
    # If you have not configured the environment variable, replace the following line with: api_key="sk-xxx"
    # The API keys for the Singapore and China (Beijing) regions are different. To obtain an API key, see https://help.aliyun.com/en/model-studio/get-api-key
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model='qwen3-vl-plus',  # You can replace this with another multimodal model and modify the messages accordingly.
    messages=messages,
    stream=True,
    incremental_output=True
    )
    
full_content = ""
print("Streaming output content:")
for response in responses:
    if response.output.choices[0].message.content:
        print(response.output.choices[0].message.content[0]['text'])
        full_content += response.output.choices[0].message.content[0]['text']
print(f"Full content: {full_content}")

Java

import java.util.Arrays;
import java.util.Collections;

import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import io.reactivex.Flowable;
import com.alibaba.dashscope.utils.Constants;

public class Main {

    // If you use a model in the Singapore region, uncomment the following line and replace {WorkspaceId} with your workspace ID.
    // static {Constants.baseHttpApiUrl="https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1";}

    public static void streamCall()
            throws ApiException, NoApiKeyException, UploadFileException {
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
                .content(Arrays.asList(Collections.singletonMap("image", "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"),
                        Collections.singletonMap("text", "What is depicted in the image?"))).build();
        MultiModalConversationParam param = MultiModalConversationParam.builder()
                // If you have not configured the environment variable, replace the following line with: .apiKey("sk-xxx")
                // The API keys for the Singapore and China (Beijing) regions are different. To obtain an API key, see https://help.aliyun.com/en/model-studio/get-api-key
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("qwen3-vl-plus")  // You can replace this with another multimodal model and modify the messages accordingly.
                .messages(Arrays.asList(userMessage))
                .incrementalOutput(true)
                .build();
        Flowable<MultiModalConversationResult> result = conv.streamCall(param);
        result.blockingForEach(item -> {
            try {
                var content = item.getOutput().getChoices().get(0).getMessage().getContent();
                // Check if the content exists and is not empty.
                if (content != null && !content.isEmpty()) {
                    System.out.println(content.get(0).get("text"));
                }
            } catch (Exception e) {
                System.out.println(e.getMessage());
            }
        });
    }

    public static void main(String[] args) {
        try {
            streamCall();
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}

curl

# ======= Important notes =======
# The API keys for the Singapore and China (Beijing) regions are different. To obtain an API key, see https://help.aliyun.com/en/model-studio/get-api-key
# The following URL is for the China (Beijing) region. If you use a model in the Singapore region, replace the URL with: https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# === Delete this comment before execution ===

curl -X POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-H 'X-DashScope-SSE: enable' \
-d '{
    "model": "qwen3-vl-plus",
    "input":{
        "messages":[
            {
                "role": "user",
                "content": [
                    {"image": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241022/emyrja/dog_and_girl.jpeg"},
                    {"text": "What is depicted in the image?"}
                ]
            }
        ]
    },
    "parameters": {
        "incremental_output": true
    }
}'
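
In the multimodal streaming examples above, message.content is read as a list of parts (content[0]['text']) and can be empty for some chunks, whereas text generation models return a plain string. The helper below is a small sketch that normalizes both shapes to text. The name content_to_text is hypothetical, and the handling of parts without a text key is an assumption.

```python
def content_to_text(content):
    """Return the text of a message content that is a string or a list of parts."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    # Multimodal content: a list of dicts such as {"text": "..."}.
    # Parts without a "text" key are ignored.
    return "".join(
        part.get("text", "") for part in content if isinstance(part, dict)
    )


# Illustrative chunks only, shaped like the content read in the examples above.
chunks = [[{"text": "A dog"}], [], [{"text": " and a girl."}]]
full_content = "".join(content_to_text(chunk) for chunk in chunks)
print(full_content)
```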

Image input

For more information about how to use models to analyze images, see Image and video understanding.

Python

import os
import dashscope

messages = [
    {
        "role": "user",
        "content": [
            {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
            {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"},
            {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/rabbit.png"},
            {"text": "What are these?"}
        ]
    }
]
response = dashscope.MultiModalConversation.call(
    # If you have not configured the environment variable, replace the following line with: api_key="sk-xxx"
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    model='qwen-vl-max', # This example uses qwen-vl-max. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
    messages=messages
    )
print(response)

Java

// Copyright (c) Alibaba, Inc. and its affiliates.

import java.util.Arrays;
import java.util.Collections;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.JsonUtils;
public class Main {
    public static void simpleMultiModalConversationCall()
            throws ApiException, NoApiKeyException, UploadFileException {
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
                .content(Arrays.asList(
                        Collections.singletonMap("image", "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"),
                        Collections.singletonMap("image", "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"),
                        Collections.singletonMap("image", "https://dashscope.oss-cn-beijing.aliyuncs.com/images/rabbit.png"),
                        Collections.singletonMap("text", "What are these?"))).build();
        MultiModalConversationParam param = MultiModalConversationParam.builder()
                // If you have not configured the environment variable, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                // This example uses qwen-vl-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
                .model("qwen-vl-plus")
                .message(userMessage)
                .build();
        MultiModalConversationResult result = conv.call(param);
        System.out.println(JsonUtils.toJson(result));
    }

    public static void main(String[] args) {
        try {
            simpleMultiModalConversationCall();
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}

curl

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen-vl-plus",
    "input":{
        "messages":[
            {
                "role": "user",
                "content": [
                    {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
                    {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"},
                    {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/rabbit.png"},
                    {"text": "What are these?"}
                ]
            }
        ]
    }
}'
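
The image examples above share one message shape: a content list with one {"image": ...} entry per image, followed by a {"text": ...} entry for the prompt. As a sketch, the following hypothetical helpers (build_image_message and build_image_payload are not SDK functions) build that structure from a list of URLs.

```python
import json


def build_image_message(image_urls, prompt, role="user"):
    """Build one multimodal message: image parts first, then the text prompt."""
    if not image_urls:
        raise ValueError("at least one image URL is required")
    content = [{"image": url} for url in image_urls]
    content.append({"text": prompt})
    return {"role": role, "content": content}


def build_image_payload(image_urls, prompt, model="qwen-vl-plus"):
    """Build the HTTP request body for the multimodal-generation endpoint."""
    return {
        "model": model,
        "input": {"messages": [build_image_message(image_urls, prompt)]},
    }


urls = [
    "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg",
    "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png",
]
print(json.dumps(build_image_payload(urls, "What are these?"), indent=2))
```

The message returned by build_image_message has the same structure as the messages list items passed to MultiModalConversation.call in the Python example.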

Video input

The following examples pass a video to the model as a list of frame images. For other input methods, such as passing a video file, see Visual understanding.

Python

from http import HTTPStatus
import os
# DashScope SDK V1.20.10 or later is required.
import dashscope

messages = [{"role": "user",
             "content": [
                 {"video":["https://img.alicdn.com/imgextra/i3/O1CN01K3SgGo1eqmlUgeE9b_!!6000000003923-0-tps-3840-2160.jpg",
                           "https://img.alicdn.com/imgextra/i4/O1CN01BjZvwg1Y23CF5qIRB_!!6000000003000-0-tps-3840-2160.jpg",
                           "https://img.alicdn.com/imgextra/i4/O1CN01Ib0clU27vTgBdbVLQ_!!6000000007859-0-tps-3840-2160.jpg",
                           "https://img.alicdn.com/imgextra/i1/O1CN01aygPLW1s3EXCdSN4X_!!6000000005710-0-tps-3840-2160.jpg"]},
                 {"text": "Describe the process in this video"}]}]
response = dashscope.MultiModalConversation.call(
    # If you have not configured the environment variable, replace the following line with: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model='qwen-vl-max',  # This example uses qwen-vl-max. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
    messages=messages
)
if response.status_code == HTTPStatus.OK:
    print(response)
else:
    print(response.code)
    print(response.message)

Java

// DashScope SDK V2.16.7 or later is required.
import java.util.Arrays;
import java.util.Collections;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.JsonUtils;
public class Main {
    // This example uses qwen-vl-max. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
    private static final String MODEL_NAME = "qwen-vl-max";
    public static void videoImageListSample() throws ApiException, NoApiKeyException, UploadFileException {
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalMessage systemMessage = MultiModalMessage.builder()
                .role(Role.SYSTEM.getValue())
                .content(Arrays.asList(Collections.singletonMap("text", "You are a helpful assistant.")))
                .build();
        MultiModalMessage userMessage = MultiModalMessage.builder()
                .role(Role.USER.getValue())
                .content(Arrays.asList(Collections.singletonMap("video", Arrays.asList("https://img.alicdn.com/imgextra/i3/O1CN01K3SgGo1eqmlUgeE9b_!!6000000003923-0-tps-3840-2160.jpg",
                                "https://img.alicdn.com/imgextra/i4/O1CN01BjZvwg1Y23CF5qIRB_!!6000000003000-0-tps-3840-2160.jpg",
                                "https://img.alicdn.com/imgextra/i4/O1CN01Ib0clU27vTgBdbVLQ_!!6000000007859-0-tps-3840-2160.jpg",
                                "https://img.alicdn.com/imgextra/i1/O1CN01aygPLW1s3EXCdSN4X_!!6000000005710-0-tps-3840-2160.jpg")),
                        Collections.singletonMap("text", "Describe the process in this video")))
                .build();
        MultiModalConversationParam param = MultiModalConversationParam.builder()
                .model(MODEL_NAME).message(systemMessage)
                .message(userMessage).build();
        MultiModalConversationResult result = conv.call(param);
        System.out.print(JsonUtils.toJson(result));
    }
    public static void main(String[] args) {
        try {
            videoImageListSample();
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}

curl

curl -X POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
  "model": "qwen-vl-max",
  "input": {
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "video": [
              "https://img.alicdn.com/imgextra/i3/O1CN01K3SgGo1eqmlUgeE9b_!!6000000003923-0-tps-3840-2160.jpg",
              "https://img.alicdn.com/imgextra/i4/O1CN01BjZvwg1Y23CF5qIRB_!!6000000003000-0-tps-3840-2160.jpg",
              "https://img.alicdn.com/imgextra/i4/O1CN01Ib0clU27vTgBdbVLQ_!!6000000007859-0-tps-3840-2160.jpg",
              "https://img.alicdn.com/imgextra/i1/O1CN01aygPLW1s3EXCdSN4X_!!6000000005710-0-tps-3840-2160.jpg"
            ]
          },
          {
            "text": "Describe the process in this video"
          }
        ]
      }
    ]
  }
}'

Audio input

Audio understanding

For more information about how to use models to analyze audio, see Audio understanding - Qwen-Audio.

Python

import os
import dashscope

messages = [
    {
        "role": "user",
        "content": [
            {"audio": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"},
            {"text": "What is being said in this audio?"}
        ]
    }
]
response = dashscope.MultiModalConversation.call(
    # If you have not configured the environment variable, replace the following line with: api_key="sk-xxx"
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    model='qwen-audio-turbo', # This example uses qwen-audio-turbo. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
    messages=messages
    )
print(response)

Java

import java.util.Arrays;
import java.util.Collections;
import java.lang.System;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.JsonUtils;
public class Main {
    public static void simpleMultiModalConversationCall()
            throws ApiException, NoApiKeyException, UploadFileException {
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
                .content(Arrays.asList(Collections.singletonMap("audio", "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"),
                        Collections.singletonMap("text", "What is being said in this audio?"))).build();
        MultiModalConversationParam param = MultiModalConversationParam.builder()
                // If you have not configured the environment variable, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                // This example uses qwen-audio-turbo. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
                .model("qwen-audio-turbo")
                .message(userMessage)
                .build();
        MultiModalConversationResult result = conv.call(param);
        System.out.println(JsonUtils.toJson(result));
    }

    public static void main(String[] args) {
        try {
            simpleMultiModalConversationCall();
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}

curl

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen-audio-turbo",
    "input":{
        "messages":[
            {
                "role": "system",
                "content": [
                    {"text": "You are a helpful assistant."}
                ]
            },
            {
                "role": "user",
                "content": [
                    {"audio": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"},
                    {"text": "What is being said in this audio?"}
                ]
            }
        ]
    }
}'

Web search

Python

import os
import dashscope

messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': 'What is the weather in Hangzhou tomorrow?'}
    ]
response = dashscope.Generation.call(
    # If you have not configured the environment variable, replace the following line with: api_key="sk-xxx"
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    model="qwen-plus", # This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
    messages=messages,
    enable_search=True,
    result_format='message'
    )
print(response)

Java

// DashScope SDK V2.12.0 or later is recommended.
import java.util.Arrays;
import java.lang.System;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.JsonUtils;

public class Main {
    public static GenerationResult callWithMessage() throws ApiException, NoApiKeyException, InputRequiredException {
        Generation gen = new Generation();
        Message systemMsg = Message.builder()
                .role(Role.SYSTEM.getValue())
                .content("You are a helpful assistant.")
                .build();
        Message userMsg = Message.builder()
                .role(Role.USER.getValue())
                .content("What is the weather in Hangzhou tomorrow?")
                .build();
        GenerationParam param = GenerationParam.builder()
                // If you have not configured the environment variable, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                // This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
                .model("qwen-plus")
                .messages(Arrays.asList(systemMsg, userMsg))
                .resultFormat(GenerationParam.ResultFormat.MESSAGE)
                .enableSearch(true)
                .build();
        return gen.call(param);
    }
    public static void main(String[] args) {
        try {
            GenerationResult result = callWithMessage();
            System.out.println(JsonUtils.toJson(result));
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            // Use a logging framework to record the exception information.
            System.err.println("An error occurred while calling the generation service: " + e.getMessage());
        }
        System.exit(0);
    }
}

curl

curl -X POST https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-plus",
    "input":{
        "messages":[      
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "What is the weather in Hangzhou tomorrow?"
            }
        ]
    },
    "parameters": {
        "enable_search": true,
        "result_format": "message"
    }
}'

Tool calling

For the complete code of the Function Calling flow, see Function Calling.

Python

import os
import dashscope

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Useful for when you want to know the current time.",
            "parameters": {}
        }
    },  
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Useful for when you want to query the weather in a specific city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "A city or district, such as Beijing, Hangzhou, or Yuhang."
                    }
                },
                # "required" is part of the JSON Schema, so it belongs inside "parameters".
                "required": [
                    "location"
                ]
            }
        }
    }
]
messages = [{"role": "user", "content": "What's the weather like in Hangzhou?"}]
response = dashscope.Generation.call(
    # If you have not configured the environment variable, replace the following line with: api_key="sk-xxx"
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    model='qwen-plus',  # This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
    messages=messages,
    tools=tools,
    result_format='message'
)
print(response)

Java

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import com.alibaba.dashscope.aigc.conversation.ConversationParam.ResultFormat;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.tools.FunctionDefinition;
import com.alibaba.dashscope.tools.ToolFunction;
import com.alibaba.dashscope.utils.JsonUtils;
import com.fasterxml.jackson.databind.node.ObjectNode;
import com.github.victools.jsonschema.generator.Option;
import com.github.victools.jsonschema.generator.OptionPreset;
import com.github.victools.jsonschema.generator.SchemaGenerator;
import com.github.victools.jsonschema.generator.SchemaGeneratorConfig;
import com.github.victools.jsonschema.generator.SchemaGeneratorConfigBuilder;
import com.github.victools.jsonschema.generator.SchemaVersion;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class Main {
    public class GetWeatherTool {
        private String location;
        public GetWeatherTool(String location) {
            this.location = location;
        }
        public String call() {
            return location + " is sunny today.";
        }
    }
    public class GetTimeTool {
        public GetTimeTool() {
        }
        public String call() {
            LocalDateTime now = LocalDateTime.now();
            DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
            String currentTime = "Current time: " + now.format(formatter) + ".";
            return currentTime;
        }
    }
    public static void SelectTool()
            throws NoApiKeyException, ApiException, InputRequiredException {
        SchemaGeneratorConfigBuilder configBuilder =
                new SchemaGeneratorConfigBuilder(SchemaVersion.DRAFT_2020_12, OptionPreset.PLAIN_JSON);
        SchemaGeneratorConfig config = configBuilder.with(Option.EXTRA_OPEN_API_FORMAT_VALUES)
                .without(Option.FLATTENED_ENUMS_FROM_TOSTRING).build();
        SchemaGenerator generator = new SchemaGenerator(config);
        ObjectNode jsonSchema_weather = generator.generateSchema(GetWeatherTool.class);
        ObjectNode jsonSchema_time = generator.generateSchema(GetTimeTool.class);
        FunctionDefinition fdWeather = FunctionDefinition.builder().name("get_current_weather").description("Get the weather for a specified area")
                .parameters(JsonUtils.parseString(jsonSchema_weather.toString()).getAsJsonObject()).build();
        FunctionDefinition fdTime = FunctionDefinition.builder().name("get_current_time").description("Get the current time")
                .parameters(JsonUtils.parseString(jsonSchema_time.toString()).getAsJsonObject()).build();
        Message systemMsg = Message.builder().role(Role.SYSTEM.getValue())
                .content("You are a helpful assistant. When asked a question, use tools wherever possible.")
                .build();
        Message userMsg = Message.builder().role(Role.USER.getValue()).content("Weather in Hangzhou").build();
        List<Message> messages = new ArrayList<>();
        messages.addAll(Arrays.asList(systemMsg, userMsg));
        GenerationParam param = GenerationParam.builder()
                // If you have not configured the environment variable, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                // This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
                .model("qwen-plus")
                .messages(messages)
                .resultFormat(ResultFormat.MESSAGE)
                .tools(Arrays.asList(
                        ToolFunction.builder().function(fdWeather).build(),
                        ToolFunction.builder().function(fdTime).build()))
                .build();
        Generation gen = new Generation();
        GenerationResult result = gen.call(param);
        System.out.println(JsonUtils.toJson(result));
    }
    public static void main(String[] args) {
        try {
            SelectTool();
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.out.println(String.format("Exception %s", e.getMessage()));
        }
        System.exit(0);
    }
}

curl

curl --location "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--data '{
    "model": "qwen-plus",
    "input": {
        "messages": [{
            "role": "user",
            "content": "What's the weather like in Hangzhou?"
        }]
    },
    "parameters": {
        "result_format": "message",
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_current_time",
                "description": "Useful for when you want to know the current time.",
                "parameters": {}
            }
        },{
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Useful for when you want to query the weather in a specific city.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "A city or district, such as Beijing, Hangzhou, or Yuhang."
                        }
                    },
                    "required": ["location"]
                }
            }
        }]
    }
}'

Asynchronous invocation

# DashScope Python SDK V1.19.0 or later is required.
import asyncio
import platform
import os
from dashscope.aigc.generation import AioGeneration

async def main():
    response = await AioGeneration.call(
        # If you have not configured the environment variable, replace the following line with: api_key="sk-xxx"
        api_key=os.getenv('DASHSCOPE_API_KEY'),
        model="qwen-plus",  # This example uses qwen-plus. You can replace it with another model name as needed. For a list of models, see https://help.aliyun.com/en/model-studio/getting-started/models
        messages=[{"role": "user", "content": "Who are you"}],
        result_format="message",
    )
    print(response)

if platform.system() == "Windows":
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
asyncio.run(main())

Document understanding

Python

import os
import dashscope

messages = [
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        # Replace {FILE_ID} with the file ID used in your actual conversation scenario.
        {'role': 'system', 'content': 'fileid://{FILE_ID}'},
        {'role': 'user', 'content': 'What is this article about?'}]
response = dashscope.Generation.call(
    # If you have not configured the environment variable, replace the following line with: api_key="sk-xxx"
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    model="qwen-long",
    messages=messages,
    result_format='message'
)
print(response)

Java

import java.util.Arrays;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.JsonUtils;

public class Main {

    public static GenerationResult callWithFile() throws ApiException, NoApiKeyException, InputRequiredException {
        Generation gen = new Generation();

        Message systemMsg = Message.builder()
                .role(Role.SYSTEM.getValue())
                .content("you are a helpful assistant")
                .build();

        Message fileSystemMsg = Message.builder()
                .role(Role.SYSTEM.getValue())
                // Replace {FILE_ID} with the file ID used in your actual conversation scenario.
                .content("fileid://{FILE_ID}")
                .build();

        Message userMsg = Message.builder()
                .role(Role.USER.getValue())
                .content("What is this article about?")
                .build();

        GenerationParam param = GenerationParam.builder()
                // If you have not configured the environment variable, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("qwen-long")
                .messages(Arrays.asList(systemMsg, fileSystemMsg, userMsg))
                .resultFormat(GenerationParam.ResultFormat.MESSAGE)
                .build();

        return gen.call(param);
    }

    public static void main(String[] args) {
        try {
            GenerationResult result = callWithFile();
            System.out.println(JsonUtils.toJson(result));
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.err.println("Error calling DashScope API: " + e.getMessage());
            e.printStackTrace();
        }
    }
}


curl

Replace {FILE_ID} with the file ID used in your actual conversation scenario.
curl --location "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation" \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--data '{
    "model": "qwen-long",
    "input":{
        "messages":[      
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "system",
                "content": "fileid://{FILE_ID}"
            },
            {
                "role": "user",
                "content": "What is this article about?"
            }
        ]
    },
    "parameters": {
        "result_format": "message"
    }
}'

PPT generation

PPT generation is supported only by the qwen-doc-turbo model. For detailed usage, see Generate PPT.

Python

import os
import dashscope

response = dashscope.Generation.call(
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    model='qwen-doc-turbo',
    messages=[
        {"role": "system", "content": "you are a helpful assistant."},
        {"role": "system", "content": "Your document content"},
        {"role": "user", "content": "Generate a 10 to 20 page PPT"}
    ],
    skill=[{"type": "ppt", "mode": "general", "template_id": "news_01"}]
)
try:
    if response.status_code == 200:
        print(response.output.choices[0].message.content)
    else:
        print(f"Request failed, status code: {response.status_code}")
        print(f"Error message: {response.message}")
        print("For more information, see: https://help.aliyun.com/en/model-studio/developer-reference/error-code")
except Exception as e:
    print(f"An error occurred: {e}")
    print("For more information, see: https://help.aliyun.com/en/model-studio/developer-reference/error-code")

curl

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'X-DashScope-SSE: enable' \
--data '{
    "model": "qwen-doc-turbo",
    "input": {
        "messages": [
            {
                "role": "system",
                "content": "you are a helpful assistant."
            },
            {
                "role": "system",
                "content": "Your document content"
            },
            {
                "role": "user",
                "content": "Generate a 10 to 20 page PPT"
            }
        ]
    },
    "parameters": {
        "skill": [
            {
                "type": "ppt",
                "mode": "general",
                "template_id": "news_01"
            }
        ]
    }
}'

model string (Required)

The model name.

Supported models include Qwen large language models (Commercial Edition and Open Source Edition), Qwen-VL, Qwen-Coder, Qwen-Audio, and mathematical models, as well as DeepSeek (Alibaba Cloud direct, SiliconFlow direct), Kimi (Alibaba Cloud direct), GLM (Alibaba Cloud direct), and MiniMax (Alibaba Cloud direct, Xiyu Technology direct).

For specific model names and billing information, see Recommended models.

messages array (Required)

The context passed to the model, arranged in conversational order.

When you make an HTTP call, place messages in the input object.
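
As a minimal sketch of this placement, the following builds the JSON body for an HTTP call with messages nested inside the input object. The helper name build_request_body is hypothetical and is not part of any SDK. The field layout follows the curl examples in this document.

```python
import json

def build_request_body(model, messages, **parameters):
    """Build the JSON body for an HTTP call (hypothetical helper).

    messages is nested under "input"; other options go under "parameters".
    """
    body = {"model": model, "input": {"messages": messages}}
    if parameters:
        body["parameters"] = parameters
    return body

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]
body = build_request_body("qwen-plus", messages, result_format="message")
print(json.dumps(body, indent=2))
```

When you use the SDK, you pass messages directly as an argument, as in the SDK examples in this document.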

Message types

System message object (Optional)

A system message defines the model's role, tone, task objectives, or constraints. It is typically placed first in the messages array.

We do not recommend setting a system message for QwQ models. Setting a system message for QVQ models has no effect.

Properties

content string (Required)

The message content.

role string (Required)

The role for a system message. The value is fixed as system.

User message object (Required)

A user message passes questions, instructions, or context to the model.

Properties

content string or array (Required)

The message content. The value is a string if the input contains only text. The value is an array if the input contains multimodal data such as images, or if explicit caching is enabled.

Properties

text string (Required)

The input text.

image string (Optional)

Specifies the image file for image understanding. You can provide the image in one of the following three ways:

  • Public URL: A publicly accessible image link.

  • Base64 encoding: The Base64-encoded image, in the format data:image/<format>;base64,<data>

  • Local file: The absolute path of a local file.

Applicable models: Qwen-VL, QVQ

Example: {"image":"https://xxxx.jpeg"}
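
The Base64 option can be sketched as follows, using only the Python standard library. The helper name to_data_url and the placeholder bytes are assumptions for illustration. In practice you would read the bytes from your own image file.

```python
import base64

def to_data_url(image_bytes, image_format):
    """Encode raw image bytes as data:image/<format>;base64,<data> (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image/{image_format};base64,{encoded}"

# Placeholder bytes stand in for real file contents, for example open(path, "rb").read().
sample_bytes = b"\x89PNG\r\n\x1a\n"
content_item = {"image": to_data_url(sample_bytes, "png")}
print(content_item)
```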

video array or string (Optional)

The video input for Qwen-VL models or QVQ models.

  • If you pass in an image list, it is of type array.

  • If you pass in a video file, the type is string.

To input a local file, see Local file (Qwen-VL) or Local file (QVQ).

Examples:

  • Image list: {"video":["https://xx1.jpg",...,"https://xxn.jpg"]}

  • Video file: {"video":"https://xxx.mp4"}
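
The two accepted shapes can be illustrated with a small helper that builds the content item from either a single URL or a list of URLs. The function name video_item and the example.com URLs are placeholders, not part of the SDK.

```python
def video_item(source):
    """Build a content item for the video field (hypothetical helper).

    source: a str for a video file, or a list of str for an image list.
    """
    if isinstance(source, str):
        return {"video": source}
    if isinstance(source, (list, tuple)) and all(isinstance(url, str) for url in source):
        return {"video": list(source)}
    raise TypeError("source must be a URL string or a list of URL strings")

# Image list input: the value is an array.
frames = ["https://example.com/frame1.jpg", "https://example.com/frame2.jpg"]
user_message = {
    "role": "user",
    "content": [video_item(frames), {"text": "Describe the process in this video"}],
}
print(user_message)

# Video file input: the value is a string.
print(video_item("https://example.com/clip.mp4"))
```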

fps float (Optional)

The number of frames extracted per second. Valid values: [0.1, 10]. Default value: 2.0.

Description

The fps parameter has two functions:

  • When you input a video file, this parameter controls the frame extraction frequency. One frame is extracted every 1/fps seconds.

    This is applicable to Qwen-VL models and QVQ models.
  • This parameter informs the model of the time interval between adjacent frames, helping it better understand the video's temporal dynamics. This applies to both video file and image list inputs and is suitable for scenarios such as event time localization or segmented content summarization.

    Supports Qwen3.7, Qwen3.6, Qwen3.5, Qwen3-VL, Qwen2.5-VL, and QVQ models.

A higher fps is suitable for scenarios with high-speed motion (such as sports events or action movies), while a lower fps is suitable for long videos or scenarios with relatively static content.

Examples

  • Passing in a list of images: {"video":["https://xx1.jpg",...,"https://xxn.jpg"],"fps":2}

  • Video file input: {"video": "https://xx1.mp4", "fps":2}
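
Because fps is the number of frames extracted per second, the spacing between extracted frames is 1/fps seconds. The following sketch shows that arithmetic together with the documented valid range. The function name is hypothetical.

```python
def frame_interval_seconds(fps):
    """Return the time between extracted frames for a given fps value."""
    if not 0.1 <= fps <= 10:
        raise ValueError("fps must be within [0.1, 10]")
    return 1.0 / fps

print(frame_interval_seconds(2.0))  # default fps: one frame every 0.5 seconds
print(frame_interval_seconds(0.1))  # lowest fps: one frame every 10 seconds
```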

max_frames integer (Optional)

The maximum number of frames to extract from a video. If the number of frames calculated based on fps exceeds max_frames, the system automatically samples frames evenly to ensure the total count does not exceed the max_frames limit.

Valid values

  • For the qwen3.7 series, qwen3.6 series, and qwen3.5 series, the maximum and default values are both 8000.

  • For the qwen3-vl-plus series, qwen3-vl-flash series, qwen3-vl-235b-a22b-thinking, and qwen3-vl-235b-a22b-instruct, the maximum and default values are both 2000.

  • qwen-vl-max, qwen-vl-max-0813, qwen-vl-plus, qwen-vl-plus-0815, qwen-vl-plus-0710: The maximum value and default value are both 512.

Sample value

{"type": "video_url","video_url": {"url":"https://xxxx.mp4"},"max_frames": 2000}

When you use OpenAI-compatible API calls, the max_frames parameter is not supported. The API uses the default value for each model.
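
To illustrate what even sampling under a frame cap can look like, here is a minimal sketch. It is only one possible approach. The service's actual sampling method is not described in this document, and the function name sample_evenly is hypothetical.

```python
def sample_evenly(frames, max_frames):
    """Return at most max_frames items from frames, evenly spaced.

    Illustrative only: this is an assumed approach, not the service's algorithm.
    """
    if max_frames <= 0:
        raise ValueError("max_frames must be positive")
    total = len(frames)
    if total <= max_frames:
        return list(frames)
    step = total / max_frames  # greater than 1, so selected indices never repeat
    return [frames[int(i * step)] for i in range(max_frames)]

# A 10-frame clip limited to 5 frames keeps every second frame.
print(sample_evenly(list(range(10)), 5))  # [0, 2, 4, 6, 8]
```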

min_pixels integer (Optional)

Sets the minimum pixel threshold for the input image or video frame. When the pixel count of the input image or video frame is less than min_pixels, the input is upscaled until its total pixel count exceeds min_pixels.

Valid values

  • Image input:

    • Qwen3.7, Qwen3.6, Qwen3.5, Qwen3-VL: The default value and minimum value are both 65536.

    • qwen-vl-max, qwen-vl-max-0813, qwen-vl-plus, qwen-vl-plus-0815, and qwen-vl-plus-0710: The default and minimum values are both 4096.

    • Other qwen-vl-plus models, other qwen-vl-max models, the Qwen2.5-VL open source series, and the QVQ series: the default and minimum values are both 3136.

  • Video file or image list input:

    • Qwen3.7, Qwen3.6, Qwen3.5, Qwen3-VL (including the commercial edition and open source edition), qwen-vl-max, qwen-vl-max-0813, qwen-vl-plus, qwen-vl-plus-0815, qwen-vl-plus-0710: Default value: 65536. Minimum value: 4096.

    • Other qwen-vl-plus models, other qwen-vl-max models, the Qwen2.5-VL open source series, and the QVQ series models: the default value is 50176, and the minimum is 3136.

Examples

  • Input image: {"type": "image_url","image_url": {"url":"https://xxxx.jpg"},"min_pixels": 65536}

  • When you input a video file: {"type": "video_url","video_url": {"url":"https://xxxx.mp4"},"min_pixels": 65536}

  • When inputting a list of images: {"type": "video","video": ["https://xx1.jpg",...,"https://xxn.jpg"],"min_pixels": 65536}

max_pixels integer (Optional)

Sets the maximum pixel threshold for input images or video frames. When the pixel count of an input image or video frame is within the [min_pixels, max_pixels] range, the model processes the original image. When the pixel count exceeds max_pixels, the image or frame is downscaled until its total pixel count falls below max_pixels.

Valid values

  • Image input:

    The value of max_pixels depends on whether the vl_high_resolution_images parameter is enabled.

    • When vl_high_resolution_images is False:

      • Qwen3.7, Qwen3.6, Qwen3.5, Qwen3-VL: default: 2621440, maximum: 16777216

      • qwen-vl-max, qwen-vl-max-0813, qwen-vl-plus, qwen-vl-plus-0815, and qwen-vl-plus-0710: the default value is 1310720, and the maximum value is 16777216.

      • Other qwen-vl-plus models, other qwen-vl-max models, the Qwen2.5-VL open-source series, and the QVQ series models: default value is 1003520, maximum value is 12845056

    • When vl_high_resolution_images is True:

      • Qwen3.7, Qwen3.6, Qwen3.5, Qwen3-VL, qwen-vl-max, qwen-vl-max-0813, qwen-vl-plus, qwen-vl-plus-0815, qwen-vl-plus-0710: max_pixels is ineffective, and the maximum pixels for input images are fixed at 16777216

      • Other qwen-vl-plus models, other qwen-vl-max models, the Qwen2.5-VL open source series, and the QVQ series models: max_pixels has no effect, and the maximum number of pixels for input images is fixed at 12845056.

  • Video file or image list input:

    • qwen3.7 series, qwen3.6 series, qwen3.5 series, qwen3-vl-plus series, qwen3-vl-flash series, qwen3-vl-235b-a22b-thinking, qwen3-vl-235b-a22b-instruct: default value is 655360, maximum value is 2048000

    • For other Qwen3-VL open source models, qwen-vl-max, qwen-vl-max-0813, qwen-vl-plus, qwen-vl-plus-0815, and qwen-vl-plus-0710, the default value is 655360, and the maximum value is 786432.

    • For other qwen-vl-plus models, other qwen-vl-max models, the Qwen2.5-VL open-source series, and QVQ series models, the default value is 501760 and the maximum value is 602112.

Examples

  • Input image: {"type": "image_url","image_url": {"url":"https://xxxx.jpg"},"max_pixels": 8388608}

  • When you input a video file: {"type": "video_url","video_url": {"url":"https://xxxx.mp4"},"max_pixels": 655360}

  • When you input a list of images: {"type": "video","video": ["https://xx1.jpg",...,"https://xxn.jpg"],"max_pixels": 655360}

total_pixels integer (Optional)

Used to limit the total number of pixels of all frames extracted from a video (pixels per frame × total number of frames). If the total number of pixels in the video exceeds this limit, the system scales the video frames but still ensures that the pixel count of each frame remains within the [min_pixels, max_pixels] range. Applies to Qwen-VL and QVQ models.

For long videos with many extracted frames, you can lower this value to reduce token consumption and processing time. However, this may result in a loss of image detail.

Valid values

  • qwen3.7 series, qwen3.6 series, and qwen3.5 series: The default and maximum values are both 819200000, which corresponds to 800000 image tokens (1 image token per 32×32 pixels).

  • qwen3-vl-plus series, qwen3-vl-flash series, qwen3-vl-235b-a22b-thinking, and qwen3-vl-235b-a22b-instruct: The default and maximum values are both 134217728, which corresponds to 131072 image tokens (1 image token per 32×32 pixels).

  • Other Qwen3-VL open source models, qwen-vl-max, qwen-vl-max-0813, qwen-vl-plus, qwen-vl-plus-0815, and qwen-vl-plus-0710: The default and minimum values are both 67108864, which corresponds to 65536 image tokens (1 image token per 32×32 pixels).

  • For other qwen-vl-plus models, other qwen-vl-max models, the Qwen2.5-VL open-source series, and the QVQ series models: the default and minimum values are both 51380224. This value corresponds to 65536 image tokens (1 image token for every 28×28 pixels).
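The image token figures in the list above follow from dividing the pixel budget by the pixel area covered by one image token. The following is a minimal sketch of that arithmetic; the helper name pixels_to_image_tokens is hypothetical and is not part of any SDK.

```python
def pixels_to_image_tokens(total_pixels: int, patch_side: int) -> int:
    """Convert a pixel budget to image tokens.

    patch_side is the side length of the square pixel area that maps
    to one image token (32 or 28 in the list above).
    """
    return total_pixels // (patch_side * patch_side)


# Pixel budgets taken from the list above.
print(pixels_to_image_tokens(819200000, 32))  # 800000
print(pixels_to_image_tokens(134217728, 32))  # 131072
print(pixels_to_image_tokens(67108864, 32))   # 65536
print(pixels_to_image_tokens(51380224, 28))   # 65536
```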

Examples

  • When you input a video file: {"type": "video_url","video_url": {"url":"https://xxxx.mp4"},"total_pixels": 134217728}

  • When you enter an image list: {"type": "video","video": ["https://xx1.jpg",...,"https://xxn.jpg"],"total_pixels": 134217728}

audio string

This parameter is required for audio understanding models, such as qwen-audio-turbo.

The audio file to input when using the audio understanding feature.

Example: {"audio":"https://xxx.mp3"}

cache_control object (Optional)

This parameter is supported only by models that support explicit caching. It is used to enable explicit caching.

Properties

type string (Required)

Fixed to ephemeral.

role string (Required)

Fixed to user.

Assistant message object (Optional)

The model's reply to the user message.

Properties

content string (Optional)

The message content. This parameter is required for assistant messages unless the tool_calls parameter is specified.

role string (Required)

Fixed to assistant.

partial boolean (Optional)

Specifies whether to enable partial mode. For more information and a list of supported models, see Partial mode.

tool_calls array (Optional)

The tool and input parameter information that the model returns after Function Calling is initiated. It consists of one or more objects and is taken from the tool_calls field of the previous model response.

Properties

id string

The ID of the tool response.

type string

The tool type. The only supported value is function.

function object

The tool and input parameter information.

Properties

name string

The tool name.

arguments string

The input parameter information, which is a JSON-formatted string.

index integer

The index of the current tool information in the tool_calls array.

Tool message object (Optional)

The output information of the tool.

Properties

content string (Required)

The output content of the tool function. It must be a string.

role string (Required)

Must be set to tool.

tool_call_id string (Optional)

The ID returned when you initiate Function Calling. It identifies the tool call that this tool message answers and can be obtained from response.output.choices[0].message.tool_calls[$index]["id"].
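The assistant message and tool message described above are usually appended to the conversation together. The following is a minimal sketch that uses a simulated assistant message instead of a live response; the tool get_current_weather, the ID call_example_1, and the helper build_tool_messages are all hypothetical.

```python
import json

# Simulated assistant message from a previous model response (illustrative data only).
previous_assistant_message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {
            "id": "call_example_1",  # hypothetical ID
            "type": "function",
            "index": 0,
            "function": {
                "name": "get_current_weather",  # hypothetical tool
                "arguments": "{\"location\": \"Hangzhou\"}",
            },
        }
    ],
}


def get_current_weather(location: str) -> str:
    """Stand-in for a real tool; returns a canned string."""
    return f"It is sunny in {location}."


def build_tool_messages(assistant_message: dict) -> list:
    """Return the assistant message followed by one tool message per tool call."""
    messages = [assistant_message]
    for call in assistant_message.get("tool_calls", []):
        # arguments is a JSON-formatted string, so it must be parsed first.
        args = json.loads(call["function"]["arguments"])
        result = get_current_weather(**args)
        messages.append({
            "role": "tool",
            "content": result,           # the tool output must be a string
            "tool_call_id": call["id"],  # links the output to the tool call
        })
    return messages


messages = build_tool_messages(previous_assistant_message)
print(json.dumps(messages[-1], ensure_ascii=False))
```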

temperature float (Optional)

The sampling temperature, which controls the diversity of the generated text.

A higher temperature value results in more diverse text, while a lower value results in more deterministic text.

Valid values: [0, 2)

Default temperature values

  • Qwen3.7 (non-thinking mode), Qwen3.6 (non-thinking mode), Qwen3.5-Omni, Qwen3.5 (non-thinking mode), Qwen3 (non-thinking mode), Qwen3-Instruct series, Qwen3-Coder series, qwen-max series, qwen-plus series (non-thinking mode), qwen-flash series (non-thinking mode), qwen-turbo series (non-thinking mode), Qwen open source series, qwen-coder series, qwen2-audio-instruct, qwen-doc-turbo, and Qwen3-VL (non-thinking mode): 0.7

  • QVQ series: 0.5

  • qwen-audio-turbo series: 0.00001

  • qwen-vl series and qwen2.5-omni-7b: 0.01

  • qwen-math series: 0

  • Qwen3.7 (thinking mode), Qwen3.6 (thinking mode), Qwen3.5 (thinking mode), Qwen3 (thinking mode), Qwen3-Thinking, Qwen3-Omni-Captioner, and QwQ series: 0.6

  • qwen3-max-preview (thinking mode) and qwen-long series: 1.0

  • qwen-plus-character: 0.92

  • qwen3-omni-flash series: 0.9

  • Qwen3-VL (thinking mode): 0.8

  • DeepSeek series (Alibaba Cloud direct): deepseek-v4-pro, deepseek-v4-flash, deepseek-v3.2 (non-thinking mode): 1.0; deepseek-v3.2 (thinking mode), deepseek-v3.2-exp, deepseek-v3.1, deepseek-r1, deepseek-r1-0528, deepseek-r1-distill-qwen distill series: 0.6; deepseek-v3: 0.7;

  • DeepSeek series (SiliconFlow direct): siliconflow/deepseek-v3.2, siliconflow/deepseek-v3.1-terminus, siliconflow/deepseek-r1-0528, siliconflow/deepseek-v3-0324: 1.0;

  • DeepSeek series (Kuaishou Wanqing direct): vanchin/deepseek-v3.2-think (thinking mode): 0.6; vanchin/deepseek-v3.1-terminus: 0.7; vanchin/deepseek-v3.2-speciale, vanchin/deepseek-r1, vanchin/deepseek-v3, vanchin/deepseek-ocr: 1.0;

  • Kimi series (Alibaba Cloud direct): kimi-k2.7-code, kimi-k2.6 (thinking mode), kimi-k2.5 (thinking mode), kimi-k2-thinking: 1.0; kimi-k2.6 (non-thinking mode), kimi-k2.5 (non-thinking mode), Moonshot-Kimi-K2-Instruct: 0.6;

  • Kimi series (Moonshot AI direct): kimi/kimi-k2.7-code, kimi/kimi-k2.6 (thinking mode), kimi/kimi-k2.5 (thinking mode): 1.0; kimi/kimi-k2.6 (non-thinking mode), kimi/kimi-k2.5 (non-thinking mode): 0.6;

  • GLM series (Alibaba Cloud direct): glm-5.1, glm-5, glm-4.7, glm-4.6: 1.0; glm-4.5, glm-4.5-air: 0.6;

  • GLM series (Zhipu direct): ZHIPU/GLM-5.1, ZHIPU/GLM-5: 0.6;

  • MiniMax series (Alibaba Cloud direct): MiniMax-M2.5, MiniMax-M2.1: 1.0;

  • MiniMax series (Xiyu Technology direct): MiniMax/MiniMax-M3, MiniMax/MiniMax-M2.7, MiniMax/MiniMax-M2.5, MiniMax/MiniMax-M2.1: 1.0.

  • MiMo series (Xiaomi direct): mimo-v2.5-pro: 1.0, range [0, 1.5].

When you make an HTTP call, place temperature in the parameters object.
We do not recommend changing the default temperature value for QVQ models.

top_p float (Optional)

The probability threshold for nucleus sampling, which controls the diversity of the generated text.

A higher top_p value results in more diverse text, while a lower value results in more deterministic text.

Valid values: (0, 1.0].

Default top_p values

Qwen3.7 (non-thinking mode), Qwen3.6 (non-thinking mode), Qwen3.5-Omni, Qwen3.5 (non-thinking mode), Qwen3 (non-thinking mode), Qwen3-Instruct series, Qwen3-Coder series, qwen-max series, qwen-plus series (non-thinking mode), qwen-flash series (non-thinking mode), qwen-turbo series (non-thinking mode), Qwen 2.5 open source series, qwen-coder series, qwen-long, qwen-doc-turbo, Qwen3-VL (non-thinking): 0.8

qwen2-vl-72b-instruct, qwen-omni-turbo series: 0.01

qwen-vl-plus series, qwen-vl-max, qwen2-vl-2b-instruct, qwen2-vl-7b-instruct, qwen2.5-omni-7b: 0.001

QVQ series, qwen2-audio-instruct: 0.5

qwen3-max-preview (thinking mode), qwen-math series, Qwen3-Omni-Flash series: 1.0

Qwen3.7 (thinking mode), Qwen3.6 (thinking mode), Qwen3.5 (thinking mode), Qwen3 (thinking mode), Qwen3-VL (thinking mode), Qwen3-Thinking, QwQ series, Qwen3-Omni-Captioner, qwen-plus-character: 0.95

DeepSeek series (Alibaba Cloud direct): deepseek-v4-pro, deepseek-v4-flash, deepseek-v3.2, deepseek-v3.2-exp, deepseek-v3.1, deepseek-r1, deepseek-r1-0528, deepseek-r1-distill-qwen distill series: 0.95; deepseek-v3: 0.6;

DeepSeek series (SiliconFlow direct): siliconflow/deepseek-v3.2, siliconflow/deepseek-v3.1-terminus, siliconflow/deepseek-r1-0528, siliconflow/deepseek-v3-0324: 1.0;

DeepSeek series (Kuaishou Wanqing direct): vanchin/deepseek-v3.2-think, vanchin/deepseek-v3.1-terminus: 0.95; vanchin/deepseek-v3.2-speciale: 0.9; vanchin/deepseek-r1: 0.8; vanchin/deepseek-v3, vanchin/deepseek-ocr: 1.0;

Kimi series (Alibaba Cloud direct): kimi-k2.7-code, kimi-k2.6, kimi-k2.5, kimi-k2-thinking: 0.95; Moonshot-Kimi-K2-Instruct: 1.0;

Kimi series (Moonshot AI direct): kimi/kimi-k2.7-code, kimi/kimi-k2.6, kimi/kimi-k2.5: 0.95;

GLM series (Alibaba Cloud direct): 0.95;

GLM series (Zhipu direct): ZHIPU/GLM-5.1, ZHIPU/GLM-5: 0.95;

MiniMax series (Alibaba Cloud direct): MiniMax-M2.5, MiniMax-M2.1: 0.95;

MiniMax series (Xiyu Technology direct): MiniMax/MiniMax-M3: 0.95; MiniMax/MiniMax-M2.7, MiniMax/MiniMax-M2.5, MiniMax/MiniMax-M2.1: 0.9.

MiMo series (Xiaomi direct): xiaomi/mimo-v2.5-pro: 0.95, range [0.01, 1.0].

In the Java SDK, the parameter is topP. When invoking via HTTP, place top_p within the parameters object.
It is not recommended to change the default top_p value for QVQ models.

top_k integer (Optional)

This parameter defines the size of the candidate set for sampling during generation. For example, if you set this parameter to 50, the candidate set for random sampling will consist of only the 50 tokens with the highest scores from a single generation. A larger value increases randomness, while a smaller value increases determinism. If the value is null or greater than 100, the top_k policy is not enabled, and only the top_p policy takes effect.

The value must be greater than or equal to 0.
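To illustrate how top_k and top_p narrow the candidate set, the following toy sketch filters a small probability table. It is not the service's implementation: the function filter_candidates and the sample probabilities are hypothetical, and treating top_k = 0 as "not enabled" is an assumption of this sketch.

```python
def filter_candidates(probs, top_k=None, top_p=1.0):
    """Return the renormalized candidate set after top_k and top_p filtering."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    # As described above, top_k is skipped when it is null or greater than 100.
    # Treating 0 as disabled is an assumption of this sketch.
    if top_k is not None and 0 < top_k <= 100:
        ranked = ranked[:top_k]

    # Nucleus (top_p) filtering: keep the smallest prefix whose
    # cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


toy_probs = {"apple": 0.5, "banana": 0.3, "cherry": 0.15, "date": 0.05}

print(sorted(filter_candidates(toy_probs, top_k=2)))       # ['apple', 'banana']
print(sorted(filter_candidates(toy_probs, top_p=0.75)))    # ['apple', 'banana']
print(len(filter_candidates(toy_probs, top_k=200)))        # 4
```

A smaller top_k or top_p leaves fewer candidates to sample from, which is why lower values make the output more deterministic.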

Default top_k values

QVQ series: 10

QwQ series: 40

qwen-math series, other qwen-vl-plus series, models earlier than qwen-audio-turbo series: 1

All other models: 20

GLM series (Alibaba Cloud direct): 20;

DeepSeek/Kimi/MiniMax series do not support the top_k parameter.

In the Java SDK, the parameter is topK. For HTTP calls, place top_k within the parameters object.
We do not recommend changing the default top_k value for QVQ models.
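For HTTP calls, temperature, top_p, and top_k all belong in the parameters object rather than at the top level of the request body. The following is a minimal sketch that assembles such a body and checks the general value ranges stated above. It assumes the usual request layout of model, input.messages, and parameters; the helper build_request_body is hypothetical, and some models document narrower ranges than the ones checked here.

```python
import json


def build_request_body(model, messages, **parameters):
    """Assemble an HTTP request body with sampling settings under `parameters`."""
    temperature = parameters.get("temperature")
    if temperature is not None and not (0 <= temperature < 2):
        raise ValueError("temperature must be in [0, 2)")

    top_p = parameters.get("top_p")
    if top_p is not None and not (0 < top_p <= 1.0):
        raise ValueError("top_p must be in (0, 1.0]")

    top_k = parameters.get("top_k")
    if top_k is not None and top_k < 0:
        raise ValueError("top_k must be greater than or equal to 0")

    return {
        "model": model,
        "input": {"messages": messages},
        "parameters": parameters,
    }


body = build_request_body(
    "qwen-plus",
    [{"role": "user", "content": "Who are you?"}],
    temperature=0.7,
    top_p=0.8,
    top_k=20,
)
print(json.dumps(body, indent=2))
```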

enable_thinking boolean (Optional)

Specifies whether to enable thinking mode when you use a hybrid thinking model. This parameter applies to the Qwen3.7, Qwen3.6, Qwen3.5, Qwen3, and Qwen3-VL models, as well as the DeepSeek-V4-Pro/V4-Flash series (Alibaba Cloud direct), DeepSeek-V3.2/V3.2-exp/V3.1 series (Alibaba Cloud direct, SiliconFlow direct), Kimi-K2.7-code (thinking-only model), Kimi-K2.6/K2.5 series (Alibaba Cloud direct), and GLM series. The DeepSeek-V4 series defaults to thinking mode. You can use the reasoning_effort parameter to adjust reasoning intensity.

Valid values:

  • true: Enabled

    When enabled, the reasoning content is returned in the reasoning_content field.
  • false: Disabled

For default values by model, see Supported models.

In the Java SDK, this parameter is named enableThinking. When you make an HTTP call, include enable_thinking in the parameters object.

preserve_thinking boolean (Optional) Default value: false

Specifies whether to append the `reasoning_content` from assistant messages in the conversation history to the model input. This is useful in scenarios where the model needs to refer to the historical thinking process.

Currently, this is only supported for qwen3.7-max, qwen3.7-max-2026-05-20, qwen3.7-max-2026-06-08, qwen3.7-max-preview, qwen3.7-max-2026-05-17, qwen3.6-max-preview, qwen3.7-plus, qwen3.7-plus-2026-05-26, qwen3.6-plus, qwen3.6-plus-2026-04-02, kimi-k2.6 (deployed on Alibaba Cloud Model Studio), kimi-k2.7-code (deployed on Alibaba Cloud Model Studio, enabled by default), and kimi/kimi-k2.7-code (Moonshot AI direct, enabled by default).

  • If the historical messages do not contain `reasoning_content`, enabling this parameter does not cause an error, and the request is processed normally.

  • After this parameter is enabled, the `reasoning_content` from the conversation history is included in the input token count and is billed.

When you make an HTTP call, place preserve_thinking in the parameters object. This is not yet supported by the Java SDK.

thinking_budget integer (Optional)

The maximum length of the model's chain-of-thought process. This parameter applies to the commercial and open source versions of the Qwen3.7, Qwen3.6, Qwen3.5, Qwen3-VL, and Qwen3 models. For more information, see Limit the thinking length.

The default value is the model's maximum chain-of-thought length. For more information, see Model list.

In the Java SDK, this parameter is `thinkingBudget`. For HTTP calls, place thinking_budget in the parameters object.

reasoning_effort string (Optional) Defaults to: high

Controls the reasoning intensity for DeepSeek-V4 series models. Valid values: high (high-intensity reasoning), max (maximum-intensity reasoning). low and medium map to high, and xhigh maps to max.

Available for deepseek-v4-pro and deepseek-v4-flash (Alibaba Cloud direct).

When you make an HTTP call, place reasoning_effort in the parameters object.

tool_stream boolean (Optional) Defaults to: false

Only takes effect during streaming calls. Currently only supported by Qwen and GLM series.

Qwen series supported models:

  • qwen-max series: text modality of qwen3.7-max series

  • qwen-plus series: text modality of qwen3.7-plus series and qwen3.6-plus series, and all modalities of qwen3.5-plus series

  • qwen-flash series: all modalities of qwen3.6-flash series and qwen3.5-flash series

Qwen series usage reference:

tool_stream only affects complex tool arguments. Simple tool arguments are streamed as long as streaming calls are enabled. A complex tool refers to a tool whose definition contains parameters of type array or object.

  • tool_stream=false: Complex tool arguments are output all at once. This is the default behavior, which produces more accurate results for complex formats.

  • tool_stream=true: Complex tool arguments are output in a streaming manner, which eliminates timeout risks for complex formats.

GLM series supported models: glm-4.6, glm-4.7, glm-5, and glm-5.1 (Alibaba Cloud direct).

GLM series usage reference:

  • tool_stream=false: Tool arguments are output all at once. This is the default behavior, which produces more accurate results for complex formats.

  • tool_stream=true: Tool arguments are output in a streaming manner, which eliminates timeout risks for complex formats.

When you make an HTTP call, place tool_stream in the parameters object.
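When tool arguments are streamed, the client has to concatenate the argument fragments before parsing them as JSON. The following is a minimal sketch that assumes incremental output and uses simulated fragments; the fragment shape, the tool name create_order, and the helper merge_tool_call_chunks are assumptions for illustration, and real chunks carry more fields.

```python
import json

# Simulated tool_calls fragments from successive streamed chunks (illustrative only).
chunks = [
    {"index": 0, "id": "call_example_1",
     "function": {"name": "create_order", "arguments": "{\"items\": [{\"sku\""}},
    {"index": 0, "function": {"arguments": ": \"A1\", \"qty\": 2}]"}},
    {"index": 0, "function": {"arguments": "}"}},
]


def merge_tool_call_chunks(chunks):
    """Merge streamed fragments into one entry per tool call index."""
    merged = {}
    for chunk in chunks:
        slot = merged.setdefault(chunk["index"], {"id": "", "name": "", "arguments": ""})
        slot["id"] = chunk.get("id") or slot["id"]
        function = chunk.get("function", {})
        slot["name"] = function.get("name") or slot["name"]
        # Argument fragments are partial JSON text; append them in arrival order.
        slot["arguments"] += function.get("arguments", "")
    return merged


calls = merge_tool_call_chunks(chunks)
# Parse only after the stream has finished; intermediate text is not valid JSON.
arguments = json.loads(calls[0]["arguments"])
print(calls[0]["name"], arguments)
```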

enable_code_interpreter boolean (Optional) Default value: false

Specifies whether to enable the code interpreter feature. This feature is supported only for the qwen3.5 model, and for the qwen3.7-max, qwen3.7-max-2026-05-20, qwen3.7-max-2026-06-08, qwen3.7-max-preview, qwen3.7-max-2026-05-17, qwen3-max, qwen3-max-2026-01-23, and qwen3-max-preview models in thinking mode. For more information, see Code interpreter.

Valid values:

  • true: Enabled

  • false: Disabled

This parameter is not supported by the Java SDK. When you make an HTTP call, place enable_code_interpreter in the parameters object.

repetition_penalty float (Optional)

The repetition_penalty parameter controls repetition in generated sequences. A higher value reduces repetition, and a value of 1.0 indicates that no penalty is applied. The value must be greater than 0.

Default repetition_penalty values

  • qwen-max, qwen-math series, qwen-vl-max series, qwen-audio-turbo series, QVQ series, QwQ series, Qwen3-VL: 1.0.

  • qwen-coder series: 1.1.

  • qwen-vl-plus: 1.2.

  • All other models: 1.05.

  • DeepSeek series (Alibaba Cloud direct): deepseek-v3.2-exp, deepseek-v3.1: 1.0;

  • GLM series (Alibaba Cloud direct): 1.0;

In the Java SDK, this parameter is `repetitionPenalty`. When you make an HTTP call, place repetition_penalty in the parameters object.
When you use the qwen-vl-plus_2025-01-25 model for text extraction, we recommend setting `repetition_penalty` to 1.0.
We do not recommend changing the default repetition_penalty value for QVQ models.

presence_penalty float (Optional)

Controls how strongly the model avoids repeating content.

Valid values: -2.0 to 2.0. Positive values reduce repetition. Negative values increase it.

For scenarios that require diversity and creativity, such as creative writing or brainstorming, increase this value. For scenarios that require consistency and terminological accuracy, such as technical documents or formal text, decrease this value.

Default presence_penalty values

Qwen3.7 (non-thinking mode), Qwen3.6 (non-thinking mode), Qwen3.5-Omni, Qwen3.5 (non-thinking mode), qwen3-max-preview (thinking mode), Qwen3 (non-thinking mode), Qwen3-Instruct series/1.7b/4b (thinking mode), QVQ series, qwen-max, qwen2.5-vl series, qwen-vl-max series, qwen-vl-plus, qwen2-vl-72b-instruct, Qwen3-VL (non-thinking): 1.5.

qwen3-8b/14b/32b/30b-a3b/235b-a22b (thinking mode), qwen-plus/qwen-plus-latest/2025-04-28 (thinking mode), qwen-turbo/qwen-turbo/2025-04-28 (thinking mode): 0.5.

All other models: 0.0.

DeepSeek series (Alibaba Cloud direct): deepseek-r1, deepseek-r1-0528, deepseek-r1-distill-qwen distill series: 1;

Kimi series (Alibaba Cloud direct): kimi-k2.7-code, kimi-k2.6, kimi-k2.5: 0.0;

Kimi series (Moonshot AI direct): 0.0;

MiniMax series (Alibaba Cloud direct): MiniMax-M2.5, MiniMax-M2.1: 0.0;

Other DeepSeek/Kimi/GLM/MiniMax models have no default value.

How it works

When the parameter value is positive, the model penalizes tokens that already appear in the generated text. The penalty does not depend on how many times a token appears. This reduces the likelihood of those tokens reappearing, which decreases repetition and increases lexical diversity.

Example

Prompt: Translate this sentence into English: "Esta película es buena. La trama es buena, la actuación es buena, la música es buena, y en general, toda la película es simplemente buena. Es realmente buena, de hecho. La trama es tan buena, y la actuación es tan buena, y la música es tan buena."

Parameter value 2.0: This movie is very good. The plot is great, the acting is great, the music is also very good, and overall, the whole movie is incredibly good. In fact, it is truly excellent. The plot is very exciting, the acting is outstanding, and the music is so beautiful.

Parameter value 0.0: This movie is good. The plot is good, the acting is good, the music is also good, and overall, the whole movie is very good. In fact, it is really great. The plot is very good, the acting is also very outstanding, and the music is also excellent.

Parameter value -2.0: This movie is very good. The plot is very good, the acting is very good, the music is also very good, and overall, the whole movie is very good. In fact, it is really great. The plot is very good, the acting is also very good, and the music is also very good.
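The behavior described under "How it works" can be pictured as a flat adjustment to the scores of tokens that have already appeared. The following toy sketch shows one common formulation, which subtracts the penalty from the logit of every previously generated token. The service's exact computation is not documented here, so treat the function apply_presence_penalty and the sample logits as assumptions.

```python
def apply_presence_penalty(logits, generated_tokens, presence_penalty):
    """Subtract a flat penalty from the logit of each token that has already appeared."""
    # A set is used because the penalty does not depend on how often a token appeared.
    seen = set(generated_tokens)
    return {
        token: (score - presence_penalty if token in seen else score)
        for token, score in logits.items()
    }


toy_logits = {"good": 2.0, "great": 1.5}
history = ["good", "good", "good"]

print(apply_presence_penalty(toy_logits, history, 2.0))   # {'good': 0.0, 'great': 1.5}
print(apply_presence_penalty(toy_logits, history, -2.0))  # {'good': 4.0, 'great': 1.5}
```

With a positive value, "good" falls behind "great" even though it appeared three times; with a negative value, the repeated token becomes more likely.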

When using the qwen-vl-plus model for text extraction, set presence_penalty to 1.5.
Do not modify the default presence_penalty value for QVQ models.
The Java SDK does not support setting this parameter. When you make an HTTP call, place presence_penalty in the parameters object.

vl_high_resolution_images boolean (Optional. Defaults to false.)

Increases the maximum pixel limit for input images to the pixel value corresponding to 16384 tokens. See Processing high-resolution images.

  • vl_high_resolution_images: true: Uses a fixed-resolution strategy and ignores the max_pixels setting. If an image exceeds this resolution, its total pixel count is downscaled to meet the limit.

    When vl_high_resolution_images is true, different models have different pixel limits:

    • Qwen3.7 series, Qwen3.6 series, Qwen3.5 series, Qwen3-VL series, qwen-vl-max, qwen-vl-max-0813, qwen-vl-plus, qwen-vl-plus-0815, qwen-vl-plus-0710: 16,777,216 (each Token corresponds to 32×32 pixels, i.e., 16,384×32×32)

    • QVQ series and other Qwen2.5-VL series models: 12,845,056 (each Token corresponds to 28×28 pixels, i.e., 16,384×28×28)

  • If vl_high_resolution_images is false, the actual pixel limit is determined by max_pixels. If an input image exceeds max_pixels, it is downscaled to fit within max_pixels. The default pixel limits for models match the default value of max_pixels.

In the Java SDK, this parameter is vlHighResolutionImages (minimum required version is 2.20.8). When you make an HTTP call, place vl_high_resolution_images in the parameters object.

vl_enable_image_hw_output boolean (Optional) Default value: false

This parameter specifies whether to return the dimensions of the scaled image. If set to `true`, the model returns the height and width of the scaled input image. When streaming output is enabled, this information is returned in the last data packet (chunk). This is supported by Qwen-VL models.

In the Java SDK, this parameter is vlEnableImageHwOutput. The minimum required Java SDK version is 2.20.8. When you make an HTTP call, place vl_enable_image_hw_output in the parameters object.

max_tokens integer (Optional, Deprecated)

This parameter is being deprecated. Use max_completion_tokens for new integrations.

The maximum number of tokens in the response. Generation stops when this limit is reached, and the finish_reason field in the response is set to length.

The default and maximum values correspond to the model’s maximum output length. You can check these values in the console.

You can use this parameter to control output length in scenarios such as generating summaries or keywords, or to reduce costs and shorten response time.

When max_tokens is triggered, the finish_reason field in the response is set to length.

max_tokens does not limit the length of the chain-of-thought.
In the Java SDK, this parameter is named maxTokens. For Qwen-VL and Audio models, the Java SDK parameter is named maxLength; starting with version 2.18.4, maxTokens is also supported. When you make an HTTP call, place max_tokens in the parameters object.

max_completion_tokens integer (Optional)

The maximum number of tokens in this response, including the chain-of-thought tokens. Generation stops when this limit is reached, and the finish_reason field in the response is set to length.

The default and maximum values correspond to the model's maximum output length. You can check these values in the console.

Difference from max_tokens: max_completion_tokens limits the total length of both the chain-of-thought and the final response, while max_tokens does not limit the chain-of-thought length. For reasoning models, max_completion_tokens is recommended.

Supported models:

  • Qwen-Max: Qwen3.7-Max and later

  • Qwen-Plus: Qwen3.5-Plus and later

  • Qwen-Flash: Qwen3.5-Flash and later

  • Kimi: kimi-k2.5 and later

  • GLM: glm-5 and later

  • MiniMax: MiniMax-M2.5 and later

  • DeepSeek: deepseek-v3, deepseek-r1, deepseek-r1-0528, deepseek-v3.1, deepseek-v3.2, deepseek-v3.2-exp, deepseek-v4-pro, deepseek-v4-flash and later

Third-party direct-sale models are not included.
The actual number of output tokens may differ from the configured max_completion_tokens value by up to 10 tokens.
The Java SDK does not support this parameter. For HTTP calls, place max_completion_tokens in the parameters field.

seed integer (Optional)

The random number seed. This parameter ensures that results are reproducible. If you use the same seed value in a call and the other parameters remain unchanged, the model returns the same result whenever possible.

Valid values: [0, 2^31 - 1].

Default seed values

The default value is 3407 for the following models: qwen-vl-max, and the qvq-max series.

The following models have no default value: qwen-vl-max-2024-02-01, qwen2-vl-72b-instruct, qwen2-vl-2b-instruct, and qwen-vl-plus.

The default value for all other models is 1234.

When you make an HTTP call, place seed in the parameters object.

stream boolean (Optional). Default value: false

Specifies whether to stream the response. Valid values:

  • false: The model returns the complete response after generating all content.

  • true: The model generates and outputs content simultaneously. Each chunk is output immediately after it is generated.

This parameter is supported only in the Python SDK. To implement streaming output using the Java SDK, use the streamCall interface. To implement streaming output using HTTP, set the X-DashScope-SSE header to enable.
The Qwen3 commercial edition (thinking mode), Qwen3 open source edition, QwQ, and QVQ support streaming output only.

incremental_output boolean (Optional. Defaults to false. For Qwen3-Max, Qwen3-VL, Qwen3 Open Source Edition, QwQ, and QVQ, the default is true.)

Specifies whether to enable incremental output in streaming output mode. You should set this to true.

Parameter values:

  • false: Each chunk contains the full sequence generated so far. The final chunk contains the complete result.

    I
    I like
    I like apple
    I like apple.
  • true (recommended): Each chunk contains only newly generated content. You must read the chunks sequentially in real time to reconstruct the complete result.

    I
    like
    apple
    .
In the Java SDK, the parameter is named incrementalOutput. For HTTP calls, add incremental_output to the parameters object.
The QwQ model and the Qwen3 model in thinking mode support only true. Because the default value for the Qwen3 commercial model is false, you must manually set it to true in thinking mode.
The Qwen3 open source model does not support false.
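The two modes call for different client-side handling: incremental chunks must be concatenated, whereas non-incremental chunks already contain the full text so far. The following is a minimal sketch with simulated chunks based on the example above; the helper reconstruct is hypothetical, and real chunks are response objects rather than bare strings.

```python
def reconstruct(chunks, incremental_output):
    """Rebuild the complete text from streamed chunks."""
    if incremental_output:
        # Each chunk holds only new content, so join all chunks in order.
        return "".join(chunks)
    # Each chunk holds everything generated so far, so the last one is complete.
    return chunks[-1] if chunks else ""


full_sequence_chunks = ["I", "I like", "I like apple", "I like apple."]
incremental_chunks = ["I", " like", " apple", "."]

print(reconstruct(full_sequence_chunks, incremental_output=False))  # I like apple.
print(reconstruct(incremental_chunks, incremental_output=True))     # I like apple.
```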

response_format object (Optional) Default value: {"type": "text"}

This parameter specifies the format of the returned content. Valid values are:

  • {"type": "text"}: Outputs a text response.

  • {"type": "json_object"}: Outputs a standard-format JSON string.

For more information, see Structured output.
For a list of supported models, see Supported models.
If you specify {"type": "json_object"}, you must explicitly instruct the model to output JSON in the prompt, such as: "Please output in JSON format". Otherwise, an error will occur.
In the Java SDK, this parameter is `responseFormat`. When you make an HTTP call, place response_format in the parameters object.
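Because a request with {"type": "json_object"} fails when the prompt does not ask for JSON, a client-side check before sending can be useful. The following is a minimal sketch, assuming the usual request layout of model, input.messages, and parameters. The helper build_json_mode_request is hypothetical, and searching for the word "json" only approximates the requirement; whether a given model supports this format must still be checked in Supported models.

```python
import json


def build_json_mode_request(model, messages):
    """Build a request body that asks for JSON output, refusing prompts that never mention JSON."""
    mentions_json = any("json" in str(m.get("content", "")).lower() for m in messages)
    if not mentions_json:
        raise ValueError("The prompt must explicitly ask for JSON output.")
    return {
        "model": model,
        "input": {"messages": messages},
        "parameters": {
            "response_format": {"type": "json_object"},
            "result_format": "message",
        },
    }


request_body = build_json_mode_request(
    "qwen-plus",  # placeholder model name
    [{"role": "user", "content": "List three fruits. Please output in JSON format."}],
)
print(json.dumps(request_body["parameters"]))
```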

Properties

type string (Required)

This parameter specifies the format of the returned content. Valid values are:

  • text: Outputs a text response.

  • json_object: Outputs a standard-format JSON string.

result_format string (Optional) The default value is text. For the Qwen3-Max, Qwen3-VL, QwQ model, Qwen3 open source models (except for qwen3-next-80b-a3b-instruct), and the Qwen-Long model, the default value is `message`.

This parameter specifies the format of the returned data. We recommend that you set this parameter to message to facilitate multi-turn conversation.

The platform will standardize the default value to message in a future update.
In the Java SDK, this parameter is `resultFormat`. When you make an HTTP call, place result_format in the parameters object.
When the model is Qwen-VL/QVQ/Audio, setting this parameter to text has no effect.
The Qwen3-Max, Qwen3-VL, and Qwen3 models in thinking mode only support the message format. Because the default value for the Qwen3 commercial model is text, you must explicitly set this parameter to message.
If you use the Java SDK to call the Qwen3 open-source model and set this parameter to text, the response is still returned in the message format.

logprobs boolean (Optional) Defaults to false

Specifies whether to return the log probabilities of the output tokens. Valid values:

  • true: The log probabilities are returned.

  • false: The log probabilities are not returned.

The following models are supported:

  • Snapshot models of the qwen-plus series (excluding stable versions)

  • Snapshot models of the qwen-turbo series (excluding stable versions)

  • qwen3-vl-plus series (including stable versions)

  • qwen3-vl-flash series (including stable versions)

  • Qwen3 open source models

When you make an HTTP call, place the logprobs parameter in the parameters object.

top_logprobs integer (Optional, default: 0)

Specifies the number of top candidate tokens to return at each generation step.

The value must be from 0 to 5, inclusive.

This parameter takes effect only when logprobs is true.

In the Java SDK, the parameter is topLogprobs. For HTTP calls, set the top_logprobs parameter in the parameters object.
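A log probability is converted to an ordinary probability with the exponential function. The following is a minimal sketch that builds the two related parameters and converts simulated entries; the helper names and the entry field names token and logprob are assumptions, because the response layout is not described in this section.

```python
import math


def build_logprob_parameters(top_logprobs):
    """Return parameters that enable logprobs with a valid top_logprobs value."""
    if not 0 <= top_logprobs <= 5:
        raise ValueError("top_logprobs must be from 0 to 5, inclusive")
    # top_logprobs takes effect only when logprobs is true.
    return {"logprobs": True, "top_logprobs": top_logprobs}


def to_probabilities(entries):
    """Map each token to exp(logprob)."""
    return {entry["token"]: math.exp(entry["logprob"]) for entry in entries}


# Simulated candidate entries for one generation step (illustrative only).
candidates = [
    {"token": "Hello", "logprob": math.log(0.9)},
    {"token": "Hi", "logprob": math.log(0.1)},
]

print(build_logprob_parameters(2))
print(to_probabilities(candidates))
```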

n integer (Optional). Default value: 1.

The number of responses to generate. Valid values are 1-4. For scenarios that require multiple responses, such as creative writing and advertising copy, you can set a larger value for n.

Currently, this parameter is supported only for the Qwen3 (non-thinking mode) and qwen-plus-character models. If the tools parameter is specified, this value is fixed at 1.
Setting a larger n value does not increase input token consumption but does increase output token consumption.
When you make an HTTP call, place n in the parameters object.

stop string or array (Optional)

This parameter specifies stop words. If a string or token_id specified in stop appears in the text generated by the model, generation stops immediately.

For example, you can pass sensitive words as stop words to control the model's output.

If stop is an array, its elements must be either all strings or all token_ids. Do not mix the two. For example, ["Hello",104307] is not a valid value.
When you make an HTTP call, place the stop parameter in the parameters object.
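The rule about not mixing strings and token IDs can be checked on the client before a request is sent. The following is a minimal sketch; the helper validate_stop is hypothetical and only enforces the constraint described above.

```python
def validate_stop(stop):
    """Accept a string, a list of strings, or a list of token IDs; reject mixed lists."""
    if isinstance(stop, str):
        return stop
    all_strings = all(isinstance(item, str) for item in stop)
    # bool is a subclass of int in Python, so exclude it explicitly.
    all_token_ids = all(
        isinstance(item, int) and not isinstance(item, bool) for item in stop
    )
    if all_strings or all_token_ids:
        return list(stop)
    raise ValueError("stop must not mix strings and token IDs")


print(validate_stop("Hello"))
print(validate_stop(["Hello", "World"]))
print(validate_stop([104307]))
```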

tools array (Optional)

An array that contains one or more tool objects for the model to call during Function Calling. For more information, see Function Calling.

When you use tools, you must set result_format to message.

When you initiate Function Calling or submit the execution result of a tool, you must set the tools parameter.

Properties

type string (Required)

The tool type. Currently, only function is supported.

function object (Required)

Properties

name string (Required)

The name of the tool function. It must consist of letters and numbers, and can contain underscores and hyphens. The maximum length is 64 characters.

description string (Required)

A description of the tool function, which helps the model decide when and how to call the function.

parameters object (Optional) Default value: {}

The parameters of the tool function, described in the JSON Schema format. For more information, see the JSON Schema documentation. If the parameters object is empty, the tool does not require any input parameters, such as a time query tool.

To improve the accuracy of tool calling, we recommend that you specify the parameters.
When making an HTTP call, include tools in the parameters object. This parameter is not currently supported by the qwen-vl and qwen-audio models.
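A tool object combines the type, name, description, and JSON Schema parameters described above. The following is a minimal sketch that builds one and checks the name against the stated character and length rules. The helper make_function_tool, the regular expression, and the get_current_weather tool are hypothetical; the pattern is only an approximation of the naming rule.

```python
import json
import re

# Letters, digits, underscores, and hyphens, up to 64 characters.
TOOL_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def make_function_tool(name, description, parameters=None):
    """Build one element of the tools array."""
    if not TOOL_NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid tool name: {name!r}")
    return {
        "type": "function",  # currently the only supported tool type
        "function": {
            "name": name,
            "description": description,
            # An empty object means the tool takes no input parameters.
            "parameters": parameters or {},
        },
    }


weather_tool = make_function_tool(
    "get_current_weather",
    "Query the current weather for a city.",
    {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name, such as Hangzhou."}
        },
        "required": ["location"],
    },
)

# For HTTP calls, tools goes in the parameters object, with result_format set to message.
parameters = {"result_format": "message", "tools": [weather_tool]}
print(json.dumps(parameters, indent=2))
```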

tool_choice string or object (Optional) Default value: auto

The tool selection strategy. You can set this parameter to enforce a specific tool calling method for certain types of problems, such as always using a specific tool or disabling all tools.

  • auto

    The model independently selects the tool strategy.

  • none

    To temporarily disable tool calling for a specific request, set the tool_choice parameter to none.

  • {"type": "function", "function": {"name": "the_function_to_call"}}

    To force a tool call, set the tool_choice parameter to {"type": "function", "function": {"name": "the_function_to_call"}}, where the_function_to_call is the name of the specified tool function.

    Thinking mode models do not support forcing a specific tool call.
In the Java SDK, the parameter is named toolChoice. When making HTTP calls, include tool_choice in the parameters object.

parallel_tool_calls boolean (Optional) Default value: false

Indicates whether to enable parallel tool calling.

Valid values:

  • true: Enabled.

  • false: Disabled.

For more information about parallel tool calling, see Parallel tool calling.

In the Java SDK, the parameter is parallelToolCalls. For an HTTP call, set parallel_tool_calls in the parameters object.

enable_search boolean (Optional). Default value: false

Determines whether the model uses Internet search results as a reference when generating text. Valid values:

  • true: Enables web search. The model uses search results as reference information during text generation, but it decides whether to use them based on its internal logic.

    If web search is not triggered after enabling this parameter, optimize the prompt or set the forced_search parameter in search_options to enable forced search.
  • false: Disables web search.

For billing information, see Billing.

For the Java SDK, use enableSearch. For HTTP calls, place enable_search in the parameters object.
Enabling the web search feature may increase token consumption.

search_options object (Optional)

This parameter defines the web search policy. It applies only when enable_search is set to true. For more information, see web search.

When you make an HTTP call, place search_options in the parameters object. In the Java SDK, this parameter is `searchOptions`.

Properties

enable_source boolean (Optional). Default value: false

Specifies whether to display the retrieved information in the returned result.

  • If set to `true`, the information is displayed.

  • If set to `false`, the information is not displayed.

enable_citation boolean (Optional). Default value: false

Specifies whether to enable superscript citation styles, such as [1] or [ref_1]. This parameter takes effect when enable_source is set to true.

  • If set to `true`, the feature is enabled.

  • If set to `false`, the feature is disabled.

citation_format string (Optional). Default value: "[<number>]"

This parameter specifies the citation style. It takes effect when enable_citation is set to true.

  • [<number>]: Citations appear in the format [1].

  • [ref_<number>]: Citations appear in the format [ref_1].

forced_search boolean (Optional). Default value: false

Specifies whether to force a web search.

  • true: Forces a web search.

  • false: Does not force a web search.

search_strategy string (Optional). Default value: turbo

The web search strategy.

Valid values:

  • turbo (default): Balances response speed and search effectiveness. It is suitable for most scenarios.

  • max: Uses a more comprehensive search strategy. It can invoke multi-source search engines to obtain more detailed search results, but the response time may be longer.

  • agent: Invokes the web search tool and the model multiple times to perform multi-round information retrieval and content integration.

    This strategy is only applicable to qwen3.7-max, qwen3.7-max-2026-05-20, qwen3.7-max-2026-06-08, qwen3.7-max-preview, qwen3.7-max-2026-05-17, qwen3.5-plus, qwen3.5-plus-2026-02-15, qwen3.5-flash, qwen3.5-flash-2026-02-23, qwen3-max, and qwen3-max-2026-01-23 in thinking mode (streaming only), qwen3-max-2026-01-23 in non-thinking mode, and qwen3-max-2025-09-23.
    When this strategy is enabled, only returning search sources (enable_source: true) is supported. Other web search features are unavailable.
  • agent_max: Adds web scraping on top of the agent strategy. For more information, see Web Scraping.

    This strategy is only applicable to qwen3.7-max, qwen3.7-max-2026-05-20, qwen3.7-max-2026-06-08, qwen3.7-max-preview, qwen3.7-max-2026-05-17, qwen3-max, and qwen3-max-2026-01-23 in thinking mode.
    When this strategy is enabled, only returning search sources (enable_source: true) is supported. Other web search features are unavailable.

enable_search_extension boolean (Optional). Default value: false

Specifies whether to enable domain-specific enhancement.

  • true

    Enabled.

  • false (default)

    Disabled.

prepend_search_result boolean (Optional). Default value: false

When using streaming output and enable_source is set to true, you can use prepend_search_result to specify whether the first returned packet contains only search source information.

  • true

    Contains only search source information.

  • false (default)

    Contains both search source information and the model's reply.

This is not yet supported by the DashScope Java SDK.
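
The dependencies between these options (search_options applies only when enable_search is true, enable_citation takes effect only with enable_source, and citation_format takes effect only with enable_citation) can be captured in a helper. The sketch below is one possible approach: the function build_search_parameters is hypothetical, and raising an error for an ineffective combination is a choice made here, not documented service behavior.

```python
VALID_STRATEGIES = {"turbo", "max", "agent", "agent_max"}
VALID_CITATION_FORMATS = {"[<number>]", "[ref_<number>]"}


def build_search_parameters(
    enable_source: bool = False,
    enable_citation: bool = False,
    citation_format: str = "[<number>]",
    forced_search: bool = False,
    search_strategy: str = "turbo",
) -> dict:
    """Return the enable_search and search_options entries for the parameters object."""
    if search_strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown search_strategy: {search_strategy!r}")
    if citation_format not in VALID_CITATION_FORMATS:
        raise ValueError(f"unknown citation_format: {citation_format!r}")
    if enable_citation and not enable_source:
        # enable_citation only takes effect with enable_source; failing early
        # is a choice made in this sketch.
        raise ValueError("enable_citation requires enable_source=True")

    search_options = {
        "enable_source": enable_source,
        "forced_search": forced_search,
        "search_strategy": search_strategy,
    }
    if enable_citation:
        search_options["enable_citation"] = True
        search_options["citation_format"] = citation_format
    # search_options applies only when enable_search is true.
    return {"enable_search": True, "search_options": search_options}


params = build_search_parameters(
    enable_source=True, enable_citation=True, citation_format="[ref_<number>]"
)
print(params)
```

For an HTTP call, the returned entries would be merged into the parameters object.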

X-DashScope-DataInspection string (Optional)

Specifies whether to perform additional detection of non-compliant information in the input and output content using the content moderation capabilities of the Qwen API. Valid values:

  • '{"input":"cip","output":"cip"}': Enables further detection.

  • If you do not set this parameter, no further detection is performed.

When making an HTTP call, include the following request header: -H "X-DashScope-DataInspection: {\"input\": \"cip\", \"output\": \"cip\"}".

When you make a call using the Python SDK, configure the headers parameter as follows: headers={'X-DashScope-DataInspection': '{"input":"cip","output":"cip"}'}.

For detailed usage, see Input and output AI safety guardrails.

You cannot set this parameter using the Java SDK.
This parameter does not apply to Qwen-Audio series models.
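
Because the header value is itself a JSON string, building it with a JSON serializer avoids manual escaping. A minimal sketch using only the Python standard library follows. The variable names are illustrative, and the other headers a real request needs (such as authorization) are omitted.

```python
import json

# The header value is a JSON-encoded string, not a nested object.
inspection = {"input": "cip", "output": "cip"}

headers = {
    "Content-Type": "application/json",
    "X-DashScope-DataInspection": json.dumps(inspection, separators=(",", ":")),
}
print(headers["X-DashScope-DataInspection"])
```

The same headers dictionary has the shape expected by the headers parameter described above for the Python SDK.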

skill array (Optional)

Skill parameters used to enable specific generation skills (such as PPT generation). Only the qwen-doc-turbo model supports this parameter. For detailed usage, see Generate PPT.

When making an HTTP call, include skill in the parameters object.
When using skill, stream must be set to true.

Properties

type string (Required)

Skill type. Currently supported:

  • ppt

mode string (Optional)

PPT generation mode. Valid values:

  • general (Default): Template mode. Use with template_id to generate PPT in HTML format.

  • creative: Creative mode. No template required. Generates image-based PPT (each page is an image).

template_id string (Optional)

PPT template ID. Use when mode is general or when mode is not specified. Valid values:

  • news_01

  • summary_01

  • internet_01

  • thesis_01
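
As an illustration of how mode and template_id combine, the sketch below builds the skill array for the parameters object. The helper build_skill is hypothetical, and rejecting template_id in creative mode is a local choice made in this sketch. Streaming output must also be enabled on the request, which is not shown here.

```python
from typing import Optional

VALID_MODES = {"general", "creative"}
VALID_TEMPLATES = {"news_01", "summary_01", "internet_01", "thesis_01"}


def build_skill(mode: str = "general", template_id: Optional[str] = None) -> list:
    """Return the skill array for PPT generation."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    skill = {"type": "ppt", "mode": mode}
    if template_id is not None:
        if mode != "general":
            # template_id is meant for general mode; rejecting it here is a local choice.
            raise ValueError("template_id applies only to general mode")
        if template_id not in VALID_TEMPLATES:
            raise ValueError(f"unknown template_id: {template_id!r}")
        skill["template_id"] = template_id
    return [skill]


parameters = {"skill": build_skill(template_id="news_01")}
print(parameters)
```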

Chat response object (The format is the same for streaming and non-streaming output)

{
  "status_code": 200,
  "request_id": "902fee3b-f7f0-9a8c-96a1-6b4ea25af114",
  "code": "",
  "message": "",
  "output": {
    "text": null,
    "finish_reason": null,
    "choices": [
      {
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "I am a large-scale language model developed by Alibaba Cloud, and my name is Qwen."
        }
      }
    ]
  },
  "usage": {
    "input_tokens": 22,
    "output_tokens": 17,
    "total_tokens": 39
  }
}

status_code integer

The status code of the request. A value of 200 indicates success. Any other value indicates failure.

The Java SDK does not return this parameter. If the call fails, an exception is thrown that contains the status_code and message.

request_id string

A unique identifier for this call.

In the Java SDK, this parameter is named `requestId`.

code string

The error code. This field is empty if the call succeeds.

Only the Python SDK returns this parameter.

output object

Information about the call result.

Properties

text string

The reply generated by the model. This field contains the reply when the input parameter result_format is set to text.

finish_reason string

This field is populated only when the input parameter result_format is set to text.

There are four scenarios:

  • null while generating.

  • stop when the model's output ends naturally or triggers a stop condition in the input parameters.

  • length when generation is terminated because the output is too long.

  • tool_calls when a tool call occurs.

choices array

Model output information. The choices parameter is returned when result_format is set to message.

Properties

finish_reason string

There are four scenarios:

  • null while generating.

  • stop when the model's output ends naturally or triggers a stop condition in the input parameters.

  • length when generation is terminated because the output exceeds the length limit.

  • tool_calls when a tool call occurs.
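
A caller typically branches on finish_reason to decide what to do next. The sketch below maps the four scenarios to actions and walks through simulated streaming chunks. It uses the value length for output that exceeds the length limit. The function next_action and the chunk data are hypothetical, and accepting the string "null" in addition to a JSON null is a defensive assumption.

```python
def next_action(finish_reason) -> str:
    """Map a finish_reason value to what the caller should do next."""
    if finish_reason in (None, "null"):
        return "keep_reading"   # generation is still in progress
    if finish_reason == "stop":
        return "done"           # natural end or a stop condition was hit
    if finish_reason == "length":
        return "truncated"      # output hit the length limit
    if finish_reason == "tool_calls":
        return "run_tools"      # the model requested a tool call
    return "unknown"


# Simulated chunks in the documented response shape (illustrative data only).
chunks = [
    {"output": {"choices": [{"finish_reason": None,
                             "message": {"role": "assistant", "content": "Hel"}}]}},
    {"output": {"choices": [{"finish_reason": "stop",
                             "message": {"role": "assistant", "content": "Hello"}}]}},
]

status = "keep_reading"
for chunk in chunks:
    choice = chunk["output"]["choices"][0]
    status = next_action(choice["finish_reason"])
    if status != "keep_reading":
        break
print(status)
```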

message object

The message object output by the model.

Properties

role string

The role of the output message. This value is always `assistant`.

content string or array

The content of the output message. This field is an array when you use the Qwen-VL or Qwen-Audio series of models, and a string in all other cases.

If you initiate a function call, this field is empty.

Properties

text string

The content of the output message when you use the Qwen-VL or Qwen-Audio series of models.

image_hw array

Returned when the `vl_enable_image_hw_output` parameter is enabled for Qwen-VL series models. The value depends on the input type:

  • Image input: the height and width of the image, in pixels.

  • Video input: an empty array.
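
Since content can be either a string or an array, client code often normalizes it to plain text first. The following is a minimal sketch of that step. The helper content_to_text is hypothetical, and it assumes that array elements are objects that may carry a text field.

```python
def content_to_text(content) -> str:
    """Normalize message content (string, array, or empty) to a single string."""
    if not content:
        # Empty content, for example when the model initiates a function call.
        return ""
    if isinstance(content, str):
        return content
    # Array form: concatenate the text of every element that has one.
    return "".join(
        part.get("text", "") for part in content if isinstance(part, dict)
    )


print(content_to_text("Hello"))
print(content_to_text([{"text": "A cat "}, {"image_hw": []}, {"text": "on a sofa."}]))
```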

reasoning_content string

The deep thinking content of the model.

tool_calls array

If the model needs to call a tool, this parameter is included.

Properties

function object

The name of the tool being called and its input parameters.

Properties

name string

The name of the tool being called.

arguments string

The parameters to be passed to the tool, formatted as a JSON string.

Because model responses are probabilistic, the output JSON string may not always conform to your function’s expected schema. Validate the parameters before passing them to the function.
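
One way to perform that validation is sketched below. The helper parse_arguments and the required key location are hypothetical. A fuller check could validate against the JSON Schema declared in the tool's parameters.

```python
import json


def parse_arguments(arguments: str, required_keys=()) -> dict:
    """Decode a tool call's arguments string and check the required keys."""
    try:
        parsed = json.loads(arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments is not valid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("arguments must decode to a JSON object")
    missing = [key for key in required_keys if key not in parsed]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    return parsed


args = parse_arguments('{"location": "Hangzhou"}', required_keys=["location"])
print(args["location"])
```

If validation fails, the caller can return the error text to the model as the tool result or retry the request, depending on the application.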

index integer

The index of the current tool_calls object in the tool_calls array.

id string

The ID of this tool call.

type string

The tool type. This value is always function.

logprobs object

Probability information for the current choices object.

Properties

content array

An array of tokens with associated log probability information.

Properties

token string

The current token.

bytes array

A list of the raw UTF-8 bytes of the current token. This helps accurately reconstruct the output, especially for emojis and Chinese characters.

logprob float

The log probability of the current token. A null value indicates an extremely low probability.

top_logprobs array

The most likely tokens at the current position and their log probabilities. The number of elements matches the value of the input parameter top_logprobs.

Properties

token string

The current token.

bytes array

A list of the raw UTF-8 bytes of the current token. This helps accurately reconstruct the output, especially for emojis and Chinese characters.

logprob float

The log probability of the current token. A null value indicates an extremely low probability.
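
The bytes field matters when a single character is split across tokens: decoding each token separately would fail, while concatenating the bytes first reconstructs the text. The sketch below shows this along with converting a logprob to a probability. The sample entries and helper names are hypothetical, and mapping a null logprob to 0.0 is an approximation chosen here.

```python
import math


def decode_tokens(entries) -> str:
    """Concatenate the raw bytes of all tokens, then decode once as UTF-8."""
    raw = bytearray()
    for entry in entries:
        raw.extend(entry.get("bytes") or [])
    return raw.decode("utf-8", errors="replace")


def token_probability(logprob) -> float:
    """Convert a log probability to a probability; null is treated as 0.0."""
    return 0.0 if logprob is None else math.exp(logprob)


# Illustrative entries: the character "你" (bytes 228, 189, 160) spans two tokens.
entries = [
    {"token": "Hi", "bytes": [72, 105], "logprob": 0.0},
    {"token": "<partial>", "bytes": [228, 189], "logprob": -1.2},
    {"token": "<partial>", "bytes": [160], "logprob": None},
]

print(decode_tokens(entries))
print([round(token_probability(e["logprob"]), 3) for e in entries])
```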

search_info object

Web search results returned when the search_options parameter is set.

Properties

search_results array

The results retrieved from the web search.

Properties

site_name string

The name of the website from which the search result was sourced.

icon string

The URL of the source website’s icon. This field is an empty string if no icon is available.

index integer

The ordinal number of the search result, indicating its position in the search_results list.

title string

The title of the search result.

url string

The URL of the search result.
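
These fields are sufficient to render a reference list alongside the model's reply. The following is a minimal sketch: the helper format_sources and the sample search_info data, including the example.com URLs, are made up for illustration.

```python
def format_sources(search_info: dict) -> list:
    """Return one display line per search result, ordered by index."""
    results = sorted(
        search_info.get("search_results", []),
        key=lambda result: result.get("index", 0),
    )
    return [
        f"[{r.get('index')}] {r.get('title', '')} ({r.get('site_name', '')}): {r.get('url', '')}"
        for r in results
    ]


# Illustrative data only.
search_info = {
    "search_results": [
        {"site_name": "Example News", "icon": "", "index": 2,
         "title": "Second result", "url": "https://example.com/b"},
        {"site_name": "Example Wiki", "icon": "", "index": 1,
         "title": "First result", "url": "https://example.com/a"},
    ]
}

for line in format_sources(search_info):
    print(line)
```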

extra_tool_info array

Domain-enhanced information returned when the enable_search_extension parameter is enabled.

Properties

result string

The output of the domain-specific enhancement tool.

tool string

The domain-specific enhancement tool used.

usage map

Token usage information for this chat request.

Properties

input_tokens integer

The number of tokens in the input after tokenization.

output_tokens integer

The number of tokens in the model’s output after tokenization.

input_tokens_details object

Detailed token counts for the input.

Properties

text_tokens integer

The number of tokens in the input text after tokenization.

image_tokens integer

The number of tokens in the input image after tokenization.

video_tokens integer

The number of tokens in the input video file or image list after tokenization.

total_tokens integer

This field is returned when the input is plain text. It equals the sum of input_tokens and output_tokens.

image_tokens integer

This field is returned when the input includes an image. It represents the number of tokens in the user-input image after tokenization.

video_tokens integer

This field is returned when the input includes video. It represents the number of tokens in the user-input video after tokenization.

audio_tokens integer

This field is returned when the input includes audio. It represents the number of tokens in the user-input audio after tokenization.

output_tokens_details object

Detailed token count information for the output.

Properties

text_tokens integer

The number of tokens in the output text after tokenization.

reasoning_tokens integer

The number of tokens in the model’s deep thinking process after tokenization.

prompt_tokens_details object

A fine-grained breakdown of input tokens.

Properties

cached_tokens integer

The number of tokens that hit the cache. For more information about Context Cache, see Context cache.

cache_creation object

Information about the creation of an explicit cache.

Properties

ephemeral_5m_input_tokens integer

The number of tokens used to create an explicit cache with a 5-minute validity period.

cache_creation_input_tokens integer

The number of tokens used to create an explicit cache.

cache_type string

When you use explicit caching, this value is ephemeral. Otherwise, this field is not present.
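
Because several usage fields are present only in certain cases, code that reads them should tolerate missing keys. The sketch below summarizes a usage object under that assumption. The helper summarize_usage is hypothetical, and the cache hit ratio assumes that cached_tokens is counted as part of input_tokens.

```python
def summarize_usage(usage: dict) -> dict:
    """Summarize token usage, tolerating fields that are not returned."""
    input_tokens = usage.get("input_tokens", 0)
    output_tokens = usage.get("output_tokens", 0)
    cached = (usage.get("prompt_tokens_details") or {}).get("cached_tokens", 0)
    reasoning = (usage.get("output_tokens_details") or {}).get("reasoning_tokens", 0)
    return {
        # Fall back to the sum when total_tokens is not returned.
        "total_tokens": usage.get("total_tokens", input_tokens + output_tokens),
        "cached_tokens": cached,
        # Assumes cached tokens are a subset of the input tokens.
        "cache_hit_ratio": cached / input_tokens if input_tokens else 0.0,
        "reasoning_tokens": reasoning,
    }


# Values taken from the sample response object above.
summary = summarize_usage({"input_tokens": 22, "output_tokens": 17, "total_tokens": 39})
print(summary)
```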

Error codes

If a model call fails and returns an error message, see Error codes for troubleshooting.