Create a response

Use the OpenAI-compatible Responses API to call the Qwen model. This topic describes the input and output parameters and provides a call example.

Advantages over the OpenAI Chat Completions API:

  • Built-in tools: Get better results on complex tasks with built-in tools like web search, web scraping, a code interpreter, text-to-image, image-to-image, and knowledge base search. For more information, see tool calling.

  • More flexible input: Supports both direct string input and message arrays in the chat format.

  • Simplified context management: Avoid manually constructing a message history array by passing the previous_response_id from the last response.

  • Convenient context caching: Add x-dashscope-session-cache: enable to the request header to enable automatic server-side caching of the conversation context. This reduces inference latency and costs for multi-turn conversations with no code changes required. For details, see session cache.

Compatibility and limitations

This API is compatible with OpenAI to reduce developer migration cost, but differs in its parameters, functionality, and behavior.

Core Principle: Only the parameters explicitly listed in this document are processed. Any OpenAI parameters not mentioned are ignored.

The following key differences will help you adapt quickly:

  • Unsupported Parameters: This API does not support some OpenAI API parameters, such as the asynchronous execution parameter background. The API currently supports only synchronous calls.

  • Reasoning Effort Control: Use the reasoning.effort parameter to control the model's reasoning effort. For usage details, see the description of this parameter.

China (Beijing)

SDK call configuration base_url: https://dashscope.aliyuncs.com/compatible-mode/v1

HTTP request endpoint: POST https://dashscope.aliyuncs.com/compatible-mode/v1/responses

Singapore

SDK call configuration base_url: https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1

HTTP request endpoint: POST https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1/responses

Replace {WorkspaceId} with your actual workspace ID.

US (Virginia)

SDK call configuration base_url: https://dashscope-us.aliyuncs.com/compatible-mode/v1

HTTP request endpoint: POST https://dashscope-us.aliyuncs.com/compatible-mode/v1/responses

Germany (Frankfurt)

SDK call configuration base_url: https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1

HTTP request endpoint: POST https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1/responses

Replace {WorkspaceId} with your workspace ID.

Japan (Tokyo)

SDK call configuration base_url: https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/compatible-mode/v1

HTTP request endpoint: POST https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/compatible-mode/v1/responses

Replace {WorkspaceId} with your workspace ID.
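The regional endpoints above can be collected into a small lookup. The following is a minimal sketch: the region keys (`beijing`, `singapore`, and so on) and the function names are hypothetical labels chosen for illustration, and only the URL templates come from this topic.

```python
# URL templates copied from the regional endpoints listed in this topic.
# The dictionary keys are hypothetical labels, not official region IDs.
BASE_URLS = {
    "beijing": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "singapore": "https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1",
    "virginia": "https://dashscope-us.aliyuncs.com/compatible-mode/v1",
    "frankfurt": "https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1",
    "tokyo": "https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/compatible-mode/v1",
}


def resolve_base_url(region, workspace_id=None):
    """Return the SDK base_url for a region, filling in the workspace ID if needed."""
    template = BASE_URLS[region]
    if "{WorkspaceId}" in template:
        if not workspace_id:
            raise ValueError(f"Region {region!r} requires a workspace ID")
        return template.replace("{WorkspaceId}", workspace_id)
    return template


def responses_endpoint(region, workspace_id=None):
    """Return the full HTTP endpoint for POST requests."""
    return resolve_base_url(region, workspace_id) + "/responses"
```

The value returned by `resolve_base_url` is intended to be passed as `base_url` when constructing the OpenAI client, and `responses_endpoint` gives the URL for raw HTTP calls.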

Important

Model Studio has released a workspace-specific domain for the Singapore region. The new dedicated domain delivers better performance and higher stability for inference requests. We recommend migrating to the new domain:

  • Singapore: from https://dashscope-intl.aliyuncs.com to https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com

{WorkspaceId} is your workspace ID, which can be found on the Workspace Details page in the Model Studio console. The existing domain remains fully functional.

Important

The legacy URL path /api/v2/apps/protocols/compatible-mode/v1/responses for the OpenAI-compatible Responses API will be deprecated soon. Please migrate to the new path /compatible-mode/v1/responses as soon as possible.
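If endpoint URLs are stored in configuration, the path migration can be applied programmatically. The sketch below is one possible approach, not an official tool: `migrate_responses_url` is a hypothetical helper, and it rewrites only the path, leaving the host untouched.

```python
from urllib.parse import urlsplit, urlunsplit

# Both paths are taken from the notice above.
LEGACY_PATH = "/api/v2/apps/protocols/compatible-mode/v1/responses"
NEW_PATH = "/compatible-mode/v1/responses"


def migrate_responses_url(url):
    """Rewrite the legacy Responses API path to the new path.

    URLs that do not use the legacy path are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.path.rstrip("/") == LEGACY_PATH:
        parts = parts._replace(path=NEW_PATH)
    return urlunsplit(parts)
```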

Request body

Basic call

Python

import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, replace the following line with your Model Studio API Key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.7-plus",
    input="What can you do?"
)

# Get the model's response
print(response.output_text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    // If the environment variable is not set, replace the following line with your Model Studio API Key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "What can you do?"
    });

    // Get the model's response
    console.log(response.output_text);
}

main();

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": "What can you do?"
}'

Stream output

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

stream = client.responses.create(
    model="qwen3.7-plus",
    input="Briefly introduce artificial intelligence.",
    stream=True
)

print("Receiving stream output:")
for event in stream:
    if event.type == 'response.output_text.delta':
        print(event.delta, end='', flush=True)
    elif event.type == 'response.completed':
        print("\nStream completed")
        print(f"Total tokens: {event.response.usage.total_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const stream = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "Briefly introduce artificial intelligence.",
        stream: true
    });

    console.log("Receiving stream output:");
    for await (const event of stream) {
        if (event.type === 'response.output_text.delta') {
            process.stdout.write(event.delta);
        } else if (event.type === 'response.completed') {
            console.log("\nStream completed");
            console.log(`Total tokens: ${event.response.usage.total_tokens}`);
        }
    }
}

main();

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
--no-buffer \
-d '{
    "model": "qwen3.7-plus",
    "input": "Briefly introduce artificial intelligence.",
    "stream": true
}'

Multi-turn conversation

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

# First turn
response1 = client.responses.create(
    model="qwen3.7-plus",
    input="My name is John. Please remember it."
)
print(f"First response: {response1.output_text}")

# Second turn - use previous_response_id to link context. The response ID is valid for 7 days.
response2 = client.responses.create(
    model="qwen3.7-plus",
    input="Do you remember my name?",
    previous_response_id=response1.id
)
print(f"Second response: {response2.output_text}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    // First turn
    const response1 = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "My name is John. Please remember it."
    });
    console.log(`First response: ${response1.output_text}`);

    // Second turn - use previous_response_id to link context. The response ID is valid for 7 days.
    const response2 = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "Do you remember my name?",
        previous_response_id: response1.id
    });
    console.log(`Second response: ${response2.output_text}`);
}

main();

Built-in tools

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.7-plus",
    input="Find the Alibaba Cloud official website and extract key information from its homepage.",
    # For best results, we recommend enabling the built-in tools.
    tools=[
        {"type": "web_search"},
        {"type": "code_interpreter"},
        {"type": "web_extractor"}
    ],
)

# Uncomment the following line to view the intermediate output.
# print(response.output)
print(response.output_text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "Find the Alibaba Cloud official website and extract key information from its homepage.",
        tools: [
            { type: "web_search" },
            { type: "code_interpreter" },
            { type: "web_extractor" }
        ]
    });

    for (const item of response.output) {
        if (item.type === "reasoning") {
            console.log("Model is thinking...");
        } else if (item.type === "web_search_call") {
            console.log(`Search query: ${item.action.query}`);
        } else if (item.type === "web_extractor_call") {
            console.log("Extracting web content...");
        } else if (item.type === "message") {
            console.log(`Response content: ${item.content[0].text}`);
        }
    }
}

main();

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": "Find the Alibaba Cloud official website and extract key information from its homepage.",
    "tools": [
        {
            "type": "web_search"
        },
        {
            "type": "code_interpreter"
        },
        {
            "type": "web_extractor"
        }
    ]
}'

Function calling

Python

from openai import OpenAI
import json
import os
import random

# Initialize the client.
client = OpenAI(
    # If the environment variable is not set, replace the following line with your Model Studio API Key: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
# Simulate a user question.
USER_QUESTION = "What's the weather like in Beijing?"
# Define the list of tools.
tools = [
    {
        "type": "function",
        "name": "get_current_weather",
        "description": "Useful for getting the weather in a specific city.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city or district, such as Beijing or Hangzhou.",
                }
            },
            "required": ["location"],
        },
    }
]


# Simulate the weather query tool.
def get_current_weather(arguments):
    weather_conditions = ["sunny", "cloudy", "rainy"]
    random_weather = random.choice(weather_conditions)
    location = arguments["location"]
    return f"Today in {location} it is {random_weather}."


# Wrap the model response function.
def get_response(input_data):
    response = client.responses.create(
        model="qwen3.7-plus",  
        input=input_data,
        tools=tools,
    )
    return response


# Maintain the conversation context.
conversation = [{"role": "user", "content": USER_QUESTION}]

response = get_response(conversation)
function_calls = [item for item in response.output if item.type == "function_call"]
# If no tool call is needed, output the content directly.
if not function_calls:
    print(f"Final assistant response: {response.output_text}")
else:
    # Enter the tool-calling loop.
    while function_calls:
        for fc in function_calls:
            func_name = fc.name
            arguments = json.loads(fc.arguments)
            print(f"Calling tool [{func_name}] with arguments: {arguments}")
            # Execute the tool.
            tool_result = get_current_weather(arguments)
            print(f"Tool returned: {tool_result}")
            # Append the tool call and its result to the context as a pair.
            conversation.append(
                {
                    "type": "function_call",
                    "name": fc.name,
                    "arguments": fc.arguments,
                    "call_id": fc.call_id,
                }
            )
            conversation.append(
                {
                    "type": "function_call_output",
                    "call_id": fc.call_id,
                    "output": tool_result,
                }
            )
        # Call the model again with the full context.
        response = get_response(conversation)
        function_calls = [
            item for item in response.output if item.type == "function_call"
        ]
    print(f"Final assistant response: {response.output_text}")

Node.js

import OpenAI from "openai";

// Initialize the client.
const openai = new OpenAI({
  // If the environment variable is not set, replace the following line with your Model Studio API Key: apiKey: "sk-xxx",
  apiKey: process.env.DASHSCOPE_API_KEY,
  baseURL:
    "https://dashscope.aliyuncs.com/compatible-mode/v1",
});

// Define the list of tools.
const tools = [
  {
    type: "function",
    name: "get_current_weather",
    description: "Useful for getting the weather in a specific city.",
    parameters: {
      type: "object",
      properties: {
        location: {
          type: "string",
          description: "The city or district, such as Beijing or Hangzhou.",
        },
      },
      required: ["location"],
    },
  },
];

// Simulate the weather query tool.
const getCurrentWeather = (args) => {
  const weatherConditions = ["sunny", "cloudy", "rainy"];
  const randomWeather =
    weatherConditions[Math.floor(Math.random() * weatherConditions.length)];
  const location = args.location;
  return `Today in ${location} it is ${randomWeather}.`;
};

// Wrap the model response function.
const getResponse = async (inputData) => {
  const response = await openai.responses.create({
    model: "qwen3.7-plus", 
    input: inputData,
    tools: tools,
  });
  return response;
};

const main = async () => {
  const userQuestion = "What's the weather like in Beijing?";

  // Maintain the conversation context.
  const conversation = [{ role: "user", content: userQuestion }];

  let response = await getResponse(conversation);
  let functionCalls = response.output.filter(
    (item) => item.type === "function_call"
  );
  // If no tool call is needed, output the content directly.
  if (functionCalls.length === 0) {
    console.log(`Final assistant response: ${response.output_text}`);
  } else {
    // Enter the tool-calling loop.
    while (functionCalls.length > 0) {
      for (const fc of functionCalls) {
        const funcName = fc.name;
        const args = JSON.parse(fc.arguments);
        console.log(`Calling tool [${funcName}] with arguments:`, args);
        // Execute the tool.
        const toolResult = getCurrentWeather(args);
        console.log(`Tool returned: ${toolResult}`);
        // Append the tool call and its result to the context as a pair.
        conversation.push({
          type: "function_call",
          name: fc.name,
          arguments: fc.arguments,
          call_id: fc.call_id,
        });
        conversation.push({
          type: "function_call_output",
          call_id: fc.call_id,
          output: toolResult,
        });
      }
      // Call the model again with the full context.
      response = await getResponse(conversation);
      functionCalls = response.output.filter(
        (item) => item.type === "function_call"
      );
    }
    console.log(`Final assistant response: ${response.output_text}`);
  }
};

// Start the program.
main().catch(console.error);
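The examples above always call `get_current_weather`, whatever tool name the model returns. When several tools are defined, a name-to-function lookup is a common way to route each call. The sketch below assumes this pattern: `TOOL_REGISTRY` and `run_tool_call` are hypothetical names, the weather function is a fixed stub, and returning an error string for an unknown tool is one design choice among several.

```python
import json


def get_current_weather(arguments):
    # Stub tool: returns a fixed condition so the output is predictable.
    return f"Today in {arguments['location']} it is sunny."


# Maps the tool name returned by the model to a local Python function.
TOOL_REGISTRY = {"get_current_weather": get_current_weather}


def run_tool_call(name, arguments_json, call_id):
    """Execute one function call and return the pair of input items
    (function_call followed by function_call_output) to append to the context."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        # Assumption: report the problem back to the model instead of raising.
        output = f"Unknown tool: {name}"
    else:
        output = handler(json.loads(arguments_json))
    return [
        {
            "type": "function_call",
            "name": name,
            "arguments": arguments_json,
            "call_id": call_id,
        },
        {
            "type": "function_call_output",
            "call_id": call_id,
            "output": output,
        },
    ]
```

Inside the tool-calling loop, `conversation.extend(run_tool_call(fc.name, fc.arguments, fc.call_id))` would replace the two `append` calls.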

Document understanding

Python

import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, replace the following line with your Model Studio API Key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.5-ocr",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "file_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260616/qmycjl/1506.02640v5.pdf",
                },
                {
                    "type": "input_text",
                    "text": "Read all the text in the file.",
                },
            ],
        }
    ],
    extra_body={
        "ocr_options": {}
    },
)

print(response.output_text)

Node.js

import OpenAI from 'openai';

const client = new OpenAI({
    // If the environment variable is not set, replace the following line with your Model Studio API Key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1",
});

async function main() {
    const response = await client.responses.create({
        model: "qwen3.5-ocr",
        input: [{
            role: "user",
            content: [{
                type: "input_file",
                file_url: "https://example.com/your-document.pdf"
            }]
        }],
        ocr_options: { task: "document_parsing" }
    });

    // Get the result of the custom task.
    console.log(response.output[0].content[0].ocr_result);
}

main();

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-ocr",
    "input": [
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "file_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260616/qmycjl/1506.02640v5.pdf"
                },
                {
                    "type": "input_text",
                    "text": "Read all the text in the file."
                }
            ]
        }
    ],
    "ocr_options": {}
}'

Session cache

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    # Enable the session cache via default_headers.
    default_headers={"x-dashscope-session-cache": "enable"}
)

# Build a long text of more than 1024 tokens to trigger cache creation.
# With shorter input, caching starts once the cumulative context exceeds 1024 tokens.
long_context = "Artificial intelligence (AI) is a major branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence." * 50

# First turn
response1 = client.responses.create(
    model="qwen3.7-plus",
    input=long_context + "\n\nBased on this context, briefly introduce the Random Forest algorithm in machine learning.",
)
print(f"First response: {response1.output_text}")

# Second turn: Link context using previous_response_id. The cache is handled automatically by the server.
response2 = client.responses.create(
    model="qwen3.7-plus",
    input="What are the main differences between it and GBDT?",
    previous_response_id=response1.id,
)
print(f"Second response: {response2.output_text}")

# Check the cache hit status.
usage = response2.usage
print(f"Input tokens: {usage.input_tokens}")
print(f"Cached tokens: {usage.input_tokens_details.cached_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1",
    // Enable the session cache via defaultHeaders.
    defaultHeaders: {"x-dashscope-session-cache": "enable"}
});

// Build a long text of more than 1024 tokens to trigger cache creation.
// With shorter input, caching starts once the cumulative context exceeds 1024 tokens.
const longContext = "Artificial intelligence (AI) is a major branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence.".repeat(50);

async function main() {
    // First turn
    const response1 = await openai.responses.create({
        model: "qwen3.7-plus",
        input: longContext + "\n\nBased on this context, briefly introduce the Random Forest algorithm in machine learning, including its basic principles and applications."
    });
    console.log(`First response: ${response1.output_text}`);

    // Second turn: Link context using previous_response_id. The cache is handled automatically by the server.
    const response2 = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "What are the main differences between it and GBDT?",
        previous_response_id: response1.id
    });
    console.log(`Second response: ${response2.output_text}`);

    // Check the cache hit status.
    console.log(`Input tokens: ${response2.usage.input_tokens}`);
    console.log(`Cached tokens: ${response2.usage.input_tokens_details.cached_tokens}`);
}

main();

curl

# First turn
# Repeat the long text 50 times to ensure it exceeds 1024 tokens and triggers cache creation.
curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "x-dashscope-session-cache: enable" \
-d '{
    "model": "qwen3.7-plus",
    "input": "Artificial intelligence (AI) is a major branch of computer science..."
}'

# Second turn - use the ID from the previous response as previous_response_id.
curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "x-dashscope-session-cache: enable" \
-d '{
    "model": "qwen3.7-plus",
    "input": "What are the main differences between it and GBDT?",
    "previous_response_id": "<ID of the first response>"
}'
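To judge how effective the session cache is, the two usage fields printed in the examples above can be combined into a hit ratio. This is a minimal sketch: `cache_hit_ratio` is a hypothetical helper, and it assumes the usage data is available as a plain dictionary, as in the raw JSON response body.

```python
def cache_hit_ratio(usage):
    """Return the fraction of input tokens that were served from the cache.

    `usage` is expected to look like:
    {"input_tokens": 2000, "input_tokens_details": {"cached_tokens": 1500}}
    """
    input_tokens = usage.get("input_tokens", 0)
    details = usage.get("input_tokens_details") or {}
    cached_tokens = details.get("cached_tokens", 0)
    if input_tokens <= 0:
        # Avoid division by zero when no input tokens are reported.
        return 0.0
    return cached_tokens / input_tokens
```

A ratio close to 0 on the second turn suggests that the cache was not created, for example because the cumulative context stayed below the 1024-token threshold.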

model string (required)

The ID of the model to use.

Supported models

China (Beijing)

China Mainland deployment scope

qwen3.7-max, qwen3.7-max-2026-05-20, qwen3.7-max-2026-06-08, qwen3-max, qwen3-max-2026-01-23, qwen3.7-plus, qwen3.7-plus-2026-05-26, qwen3.6-plus, qwen3.6-plus-2026-04-02, qwen3.5-plus, qwen3.5-plus-2026-04-20, qwen3.5-plus-2026-02-15, qwen3.6-flash, qwen3.6-flash-2026-04-16, qwen3.5-flash, qwen3.5-flash-2026-02-23, qwen3.6-35b-a3b, qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b, qwen-plus, qwen-flash, qwen3-coder-plus, qwen3-coder-flash, qwen3.5-ocr

Singapore

International deployment scope

qwen3.7-max, qwen3.7-max-2026-05-20, qwen3.7-max-2026-06-08, qwen3-max, qwen3-max-2026-01-23, qwen3.7-plus, qwen3.7-plus-2026-05-26, qwen3.6-plus, qwen3.6-plus-2026-04-02, qwen3.5-plus, qwen3.5-plus-2026-04-20, qwen3.5-plus-2026-02-15, qwen3.6-flash, qwen3.6-flash-2026-04-16, qwen3.5-flash, qwen3.5-flash-2026-02-23, qwen3.6-35b-a3b, qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b, qwen-plus, qwen-flash, qwen3-coder-plus, qwen3-coder-flash

US (Virginia)

Global deployment scope

qwen3.7-max, qwen3.7-max-2026-05-20, qwen3.7-max-2026-06-08, qwen3.7-plus, qwen3.7-plus-2026-05-26, qwen3.6-plus, qwen3.6-plus-2026-04-02, qwen3.5-plus, qwen3.5-plus-2026-02-15, qwen3.6-flash, qwen3.6-flash-2026-04-16, qwen3.5-flash, qwen3.5-flash-2026-02-23, qwen3.6-35b-a3b, qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b

Germany (Frankfurt)

Global deployment scope

qwen3.7-max, qwen3.7-max-2026-05-20, qwen3.7-max-2026-06-08, qwen3.7-plus, qwen3.7-plus-2026-05-26, qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-35b-a3b, qwen3.5-27b

Japan (Tokyo)

Japan deployment scope

qwen3.7-plus, qwen3.7-plus-2026-05-26

Global deployment scope

qwen3.7-plus, qwen3.7-plus-2026-05-26, qwen3.7-max, qwen3.7-max-2026-05-20, qwen3.6-plus, qwen3.6-plus-2026-04-02, qwen3.6-flash, qwen3.6-flash-2026-04-16

input string or array (required)

The input for the model. The following formats are supported:

  • string: Plain text, such as "Hello".

  • array: An array of messages, ordered by conversation turn.
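The two input formats can be normalized to a single shape before further processing. The sketch below assumes that a plain string is equivalent to a single user message, which is how the OpenAI Responses API treats it; this topic does not state the equivalence explicitly. `as_message_array` is a hypothetical helper.

```python
def as_message_array(model_input):
    """Normalize the `input` parameter to the array form.

    Assumption: a plain string is treated as one message with the user role.
    """
    if isinstance(model_input, str):
        return [{"role": "user", "content": model_input}]
    if isinstance(model_input, list):
        # Return a shallow copy so the caller's list is not modified later.
        return list(model_input)
    raise TypeError("input must be a string or a list of input items")
```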

Array item types

EasyInputMessage object

An object with a role for the message author and content for the message payload.

Properties

role string (required)

The role of the message's author. Valid values: user, assistant, system, developer.

content string or array (required)

The message content. The content is a string if the input is plain text, or an array if the input is a structured content array. When the role is system or developer, the array element type is input_text. When the role is user, the array element type is input_text, input_image, or input_file. When the role is assistant, the array element type is output_text.

The Responses API does not currently support video or audio input. To pass these data types, use the Chat Completions API or DashScope API.

Content array items

type string (required)

Specifies the content type. Valid values are input_text, input_image (user role only), input_file (user role only, supports PDF and images), and output_text (assistant role only).

text string

The text content. Required when type is input_text or output_text.

image_url string

The public URL of the image. Required when type is input_image.

file_url string

The public URL of the file. Required when type is input_file. Supports PDF files (up to 50 pages, 100 MB) and image files (up to 20 MB). Currently only supported by qwen3.5-ocr.

type string (optional)

Fixed as message.
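The role and content-type rules above can be checked on the client before a request is sent. The following sketch encodes only the constraints stated in this section; `validate_message` and the two lookup tables are hypothetical names, and the server may apply additional checks.

```python
# Content types allowed for each role, as described above.
ALLOWED_CONTENT_TYPES = {
    "system": {"input_text"},
    "developer": {"input_text"},
    "user": {"input_text", "input_image", "input_file"},
    "assistant": {"output_text"},
}

# The field that must be present for each content type.
REQUIRED_FIELD = {
    "input_text": "text",
    "output_text": "text",
    "input_image": "image_url",
    "input_file": "file_url",
}


def validate_message(message):
    """Raise ValueError if the message breaks the role/content rules; otherwise return it."""
    role = message.get("role")
    if role not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"Unsupported role: {role!r}")
    content = message.get("content")
    if isinstance(content, str):
        return message  # Plain-text content needs no further checks.
    if not isinstance(content, list):
        raise ValueError("content must be a string or an array")
    for part in content:
        part_type = part.get("type")
        if part_type not in ALLOWED_CONTENT_TYPES[role]:
            raise ValueError(f"Content type {part_type!r} is not allowed for role {role!r}")
        field = REQUIRED_FIELD[part_type]
        if not part.get(field):
            raise ValueError(f"{part_type} requires the {field!r} field")
    return message
```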

ResponseOutputMessage object (optional)

The model's output message. To continue a conversation, you can pass the message object from a previous response's output array back into the input. Unlike EasyInputMessage, this object includes the full output structure, with id, status, and structured content.

Properties

type string (required)

Fixed as message.

id string (required)

The unique identifier of the output message, from the previous response.

role string (required)

Fixed as assistant.

status string (required)

The message status. Valid values: in_progress, completed, incomplete.

content array (required)

An array of content, where elements are output_text objects.

Properties

type string (required)

Fixed as output_text.

text string (required)

The response text.

annotations array (optional)

Annotation information.

Function call object (optional)

A structured instruction generated when the model decides to call an external tool.

Properties

type string (required)

Fixed as function_call.

id string (optional)

The unique identifier for the function call, from the previous response.

name string (required)

The name of the tool function.

arguments string (required)

The tool call arguments, in JSON string format.

call_id string (required)

The identifier for the tool call. This must match the call_id that is returned by the model.

status string (optional)

The status. Valid values: in_progress, completed, incomplete.

Function call output object (optional)

The output of a tool call. In the message list, this object must immediately follow its corresponding function_call message to prevent a request failure.

Properties

type string (required)

Fixed as function_call_output.

id string (optional)

The unique identifier for the function call output.

call_id string (required)

The identifier for the tool call. This must match the call_id returned by the model.

output string (required)

The execution result of the tool function.

status string (optional)

The status. Valid values: in_progress, completed, incomplete.
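Because each function_call_output must immediately follow its function_call, it can help to verify the ordering before sending the input array. The sketch below is one way to do that; `check_function_call_pairs` is a hypothetical helper and checks only the adjacency and call_id rules described above.

```python
def check_function_call_pairs(items):
    """Verify that every function_call item is immediately followed by a
    function_call_output item with the same call_id.

    Returns True when the ordering is valid; raises ValueError otherwise.
    """
    for index, item in enumerate(items):
        if item.get("type") != "function_call":
            continue  # Regular messages and outputs are not checked here.
        following = items[index + 1] if index + 1 < len(items) else None
        if (
            following is None
            or following.get("type") != "function_call_output"
            or following.get("call_id") != item.get("call_id")
        ):
            raise ValueError(
                f"function_call {item.get('call_id')!r} is not followed by its output"
            )
    return True
```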

Reasoning object (optional)

The model's reasoning process. You can pass the reasoning item from a previous response's output back into the input to continue this process in a subsequent turn.

Properties

type string (required)

Fixed as reasoning.

id string (required)

The unique identifier for the reasoning content, from the previous response.

summary array (required)

The reasoning summary content.

Properties

type string (required)

Fixed as summary_text.

text string (required)

The summary text.

status string (optional)

The status. Valid values: in_progress, completed, incomplete.

instructions string (optional)

A system instruction that is inserted at the beginning of the context. When previous_response_id is used, the instructions specified in the previous turn are not carried over to the current turn's context.

previous_response_id string (optional)

The unique ID of the previous response. A response's id is valid for 7 days. Use this parameter to create multi-turn conversations: the server automatically retrieves the input and output of that turn and combines them as the context. If you provide both an input message array and previous_response_id, the new messages in input are appended to the historical context. This parameter cannot be used with conversation.

conversation string (optional)

The conversation that the current response belongs to (see the Conversations API). The conversation's history is automatically included as context. The input and output of this request are added to the conversation upon completion. Cannot be used with previous_response_id.
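The constraints on previous_response_id and conversation can be enforced when the request body is assembled. The sketch below is illustrative only: `build_request` and `response_id_expired` are hypothetical helpers, `resp_1` style IDs are placeholders, and the 7-day window is assumed to start at the time the response was created, which this topic does not specify.

```python
from datetime import datetime, timedelta, timezone

# Validity period of a response ID, as stated above.
RESPONSE_ID_TTL = timedelta(days=7)


def build_request(model, model_input, previous_response_id=None, conversation=None):
    """Assemble a request body, rejecting the unsupported parameter combination."""
    if previous_response_id and conversation:
        raise ValueError("previous_response_id cannot be used with conversation")
    body = {"model": model, "input": model_input}
    if previous_response_id:
        body["previous_response_id"] = previous_response_id
    if conversation:
        body["conversation"] = conversation
    return body


def response_id_expired(created_at, now=None):
    """Return True if a response created at `created_at` is past the 7-day window.

    Both datetimes must be timezone-aware. Assumption: the window starts at
    the creation time of the response.
    """
    now = now or datetime.now(timezone.utc)
    return now - created_at >= RESPONSE_ID_TTL
```

When `response_id_expired` returns True, the conversation history has to be resent explicitly in `input` instead of being referenced by ID.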

stream boolean (optional) Defaults to false

Enables stream output. If set to true, the model streams the response in real time.

store boolean (optional) Defaults to true

Specifies whether to store the model response generated for this session.

  • false: The response is not stored and cannot be referenced in subsequent calls via previous_response_id.

  • true: The response is stored and can be referenced in subsequent calls via previous_response_id.

tools array (optional)

An array of tools the model can call when generating a response. Supports both built-in tools and custom function tools, which can be used together.

For best results, enable the code_interpreter, web_search, and web_extractor tools.

Properties

Web search

Searches the internet for up-to-date information. Related documentation: Web Search

Properties

type string (required)

Fixed as web_search.

Example: [{"type": "web_search"}]

Web extractor

Accesses and extracts content from web pages. It must be used with the web_search tool. For qwen3-max and qwen3-max-2026-01-23 models, reasoning mode must also be enabled. Related documentation: Web Extraction

Properties

type string (required)

Fixed as web_extractor.

Example: [{"type": "web_search"}, {"type": "web_extractor"}]
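The dependency between web_extractor and web_search can be checked before the request is sent. This is a minimal sketch with a hypothetical helper name; it covers only the pairing rule and not the reasoning-mode requirement for specific models.

```python
def check_builtin_tools(tools):
    """Raise ValueError if web_extractor is listed without web_search."""
    tool_types = {tool.get("type") for tool in tools}
    if "web_extractor" in tool_types and "web_search" not in tool_types:
        raise ValueError("web_extractor must be used together with web_search")
    return tools
```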

Code interpreter

Executes code in a sandboxed environment to perform tasks like data analysis. For qwen3-max and qwen3-max-2026-01-23 models, reasoning mode must also be enabled. Related documentation: Code Interpreter

Properties

type string (required)

Fixed as code_interpreter.

Example: [{"type": "code_interpreter"}]

Web search image

Searches for images based on a text description. Related documentation: Text-to-Image Search

Properties

type string (required)

Fixed as web_search_image.

Example: [{"type": "web_search_image"}]

Image search

Searches for similar or related images based on an input image. The input must include the image's URL. Related documentation: Image-to-Image Search

Properties

type string (required)

Fixed as image_search.

Example: [{"type": "image_search"}]

File search

Performs knowledge retrieval by searching a specified knowledge base. Related documentation: Knowledge Retrieval

Properties

type string (required)

Fixed as file_search.

vector_store_ids array (required)

The ID of the knowledge base to search. Currently, only one knowledge base ID can be provided.

Example: [{"type": "file_search", "vector_store_ids": ["your_knowledge_base_id"]}]
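The constraints on built-in tools described above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any SDK: validate_tools is a hypothetical helper name, and it covers only two of the rules stated in this section (web_extractor must accompany web_search, and file_search takes exactly one knowledge base ID).

```python
def validate_tools(tools):
    """Return a list of problems found in a tools array (empty if none)."""
    problems = []
    types = {tool.get("type") for tool in tools}

    # web_extractor must be used together with web_search.
    if "web_extractor" in types and "web_search" not in types:
        problems.append("web_extractor must be used with web_search")

    # file_search currently accepts exactly one knowledge base ID.
    for tool in tools:
        if tool.get("type") == "file_search":
            ids = tool.get("vector_store_ids") or []
            if len(ids) != 1:
                problems.append("file_search requires exactly one vector_store_ids entry")
    return problems


print(validate_tools([{"type": "web_extractor"}]))
print(validate_tools([{"type": "web_search"}, {"type": "web_extractor"}]))
```

Returning a list of messages rather than raising on the first problem makes it possible to report every issue in one pass.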

MCP invocation

Calls an external service through the Model Context Protocol (MCP). Related documentation: MCP

Properties

type string (required)

Fixed as mcp.

server_protocol string (required)

The communication protocol with the MCP service, such as "sse".

server_label string (required)

A label used to identify the MCP service.

server_description string (optional)

A description of the service. It helps the model understand its function and when to use it.

server_url string (required)

The URL of the MCP service endpoint.

headers object (optional)

Request headers, used to carry information such as authentication (e.g., Authorization).

Example:

mcp_tool = {
    "type": "mcp",
    "server_protocol": "sse",
    "server_label": "amap-maps",
    "server_description": "The AMap MCP Server provides a full suite of geographic information services, covering 15 core APIs. These include custom map generation, navigation, ride-hailing, geocoding, reverse geocoding, IP-based location, weather queries, and planning for cycling, walking, driving, and public transit routes, along with distance measurement and various search functions.",
    "server_url": "https://dashscope.aliyuncs.com/api/v1/mcps/amap-maps/sse",
    "headers": {
        "Authorization": "Bearer <your-mcp-server-token>"
    }
}

Custom tool function

Allows the model to call a developer-defined function. When the model determines that a tool needs to be called, the response returns an output item of type function_call. Related documentation: Function Calling

Properties

type string (required)

Must be set to function.

name string (required)

The name of the tool. Can contain only letters, digits, underscores (_), and hyphens (-), with a maximum length of 64 characters.

description string (required)

A description of the tool, which helps the model decide when and how to call it.

parameters object (optional)

The parameter definition for the tool, which must be a valid JSON Schema object. If parameters is empty, the tool takes no arguments (e.g., a time query tool).

To improve tool-calling accuracy, we recommend defining parameters.

Example:

[{
  "type": "function",
  "name": "get_weather",
  "description": "Get weather information for a specified city",
  "parameters": {
    "type": "object",
    "properties": {
      "city": {
        "type": "string",
        "description": "The name of the city"
      }
    },
    "required": ["city"]
  }
}]
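A definition like the one above can also be assembled programmatically. The sketch below uses a hypothetical helper, function_tool, which is not part of any SDK. It checks the naming rule before building the dictionary and assumes the 64-unit length limit is measured in characters.

```python
import re

# Letters, digits, underscores, and hyphens only.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def function_tool(name, description, parameters=None):
    """Build a custom function tool definition for the tools array."""
    if not _NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid tool name: {name!r}")
    if len(name) > 64:  # assumption: the limit counts characters
        raise ValueError("tool name is longer than 64 characters")
    tool = {"type": "function", "name": name, "description": description}
    if parameters is not None:
        tool["parameters"] = parameters  # must be a valid JSON Schema object
    return tool


weather_tool = function_tool(
    "get_weather",
    "Get weather information for a specified city",
    {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "The name of the city"}},
        "required": ["city"],
    },
)
print(weather_tool["name"])
```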

tool_choice string or object (optional) Defaults to auto

Controls how the model selects and calls tools. This parameter supports two formats: string mode and object mode.

String mode

  • auto: The model decides whether to call a tool.

  • none: Prevents the model from calling any tool.

  • required: Forces the model to call a tool. This is only available when the tools list contains exactly one tool.

Object mode

Restricts the model to a specific set of tools for selection and calling.

Properties

mode string (required)

  • auto: The model automatically decides whether to call a tool from the provided list.

  • required: Forces the model to call a tool from the provided list. This is only available when the tools list contains exactly one tool.

tools array (required)

A list of tool definitions that the model is allowed to call.

[
  { "type": "function", "name": "get_weather" }
]

type string (required)

The type of tool configuration. Fixed as allowed_tools.
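Putting the three object-mode properties together, a tool_choice value might be built as in the following sketch. allowed_tools_choice is a hypothetical helper name, and the sketch covers only custom function tools referenced by name.

```python
def allowed_tools_choice(function_names, mode="auto"):
    """Build an object-mode tool_choice restricted to the named function tools."""
    if mode not in ("auto", "required"):
        raise ValueError("mode must be 'auto' or 'required'")
    return {
        "type": "allowed_tools",
        "mode": mode,
        "tools": [{"type": "function", "name": name} for name in function_names],
    }


tool_choice = allowed_tools_choice(["get_weather"])
print(tool_choice)
```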

temperature float (optional)

The sampling temperature, which controls the diversity of the generated text.

Higher values make the output more random and diverse, while lower values make it more focused and deterministic.

Value range: [0, 2)

Both temperature and top_p control the diversity of the generated text. We recommend using only one of these parameters at a time. For more information, see Overview.

top_p float (optional)

The probability threshold for top-p sampling, which controls the diversity of the generated text.

Higher values make the output more random and diverse, while lower values make it more focused and deterministic.

Value range: (0, 1.0]

Both temperature and top_p control the diversity of the generated text. We recommend using only one of these parameters at a time. For more information, see Overview.

enable_thinking boolean (optional)

Enables or disables reasoning mode. When enabled, the model performs a reasoning step before it responds. The reasoning process is returned as an output item of type reasoning. When enabling reasoning mode, we recommend also enabling built-in tools to achieve the best results on complex tasks.

Valid values:

  • true: Enables reasoning mode.

  • false: Disables reasoning mode.

For default values for different models, see Supported models.

This parameter is not a standard OpenAI parameter. In the Python SDK, pass it using extra_body={"enable_thinking": True}. In the Node.js SDK and curl, use enable_thinking: true as a top-level parameter. We recommend using reasoning.effort instead, as enable_thinking will be deprecated.

reasoning object (optional)

Controls the model's reasoning effort. The model performs a reasoning step before replying, and the reasoning process is returned through an output item of type reasoning.

Properties

effort string (optional): The level of reasoning effort. Defaults to medium.

  • none: Disables reasoning and provides a direct answer.

  • minimal: Minimizes reasoning for the fastest response.

  • medium (default): Moderate reasoning, balancing speed and depth.

  • high: Deep reasoning for complex, specialized, and professional tasks.

reasoning.effort takes precedence over enable_thinking. We recommend using reasoning.effort, as enable_thinking will be deprecated.
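Because reasoning.effort takes precedence, client code can prefer it and fall back to enable_thinking only when no effort level is given. The sketch below builds the corresponding keyword arguments for the Python SDK without sending a request. reasoning_kwargs is a hypothetical helper name, and the accepted effort values are limited to those listed in this section.

```python
# Effort levels as listed in this section.
EFFORT_LEVELS = ("none", "minimal", "medium", "high")


def reasoning_kwargs(effort=None, enable_thinking=None):
    """Return keyword arguments that configure reasoning for a request."""
    kwargs = {}
    if effort is not None:
        if effort not in EFFORT_LEVELS:
            raise ValueError(f"unsupported effort: {effort!r}")
        kwargs["reasoning"] = {"effort": effort}
    elif enable_thinking is not None:
        # Non-standard parameter: the Python SDK passes it through extra_body.
        kwargs["extra_body"] = {"enable_thinking": bool(enable_thinking)}
    return kwargs


print(reasoning_kwargs(effort="high"))
print(reasoning_kwargs(enable_thinking=True))
```

The returned dictionary can be unpacked into the SDK call, for example client.responses.create(model=..., input=..., **reasoning_kwargs(effort="high")).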

ocr_options object (optional)

OCR built-in task parameters. Only applicable to the qwen3.5-ocr model. Use this parameter to call built-in OCR tasks (such as information extraction and text localization). Built-in task results are returned in the ocr_result field of the response.

This parameter is not a standard OpenAI parameter. In the Python SDK, pass it using extra_body={"ocr_options": {...}}. In the Node.js SDK and curl, use ocr_options as a top-level parameter.

Response object (non-streaming output)

{
    "created_at": 1771165743,
    "id": "c9f9c06b-032d-4525-a422-ac8ab5eccxxx",
    "model": "qwen3.7-plus",
    "object": "response",
    "output": [
        {
            "content": [
                {
                    "annotations": [],
                    "text": "Hello! I am Qwen3.5, the latest Qwen large language model from Alibaba Cloud. I provide strong language understanding, logical reasoning, code generation, and multimodal capabilities to deliver accurate and efficient intelligent services.",
                    "type": "output_text"
                }
            ],
            "id": "msg_544b2907-e88e-40d2-9a83-c30d6d1f9xxx",
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": false,
    "status": "completed",
    "tool_choice": "auto",
    "tools": [],
    "usage": {
        "input_tokens": 55,
        "input_tokens_details": {
            "cached_tokens": 0
        },
        "output_tokens": 43,
        "output_tokens_details": {
            "reasoning_tokens": 0
        },
        "total_tokens": 98,
        "x_details": [
            {
                "input_tokens": 55,
                "output_tokens": 43,
                "total_tokens": 98,
                "x_billing_type": "response_api"
            }
        ]
    }
}
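To read the final answer from a response like the one above, iterate over output, keep the items of type message, and join their output_text parts. The sketch below operates on the response as a plain dictionary (for example, the parsed JSON body of an HTTP call). The function name output_text is an illustrative choice.

```python
def output_text(response):
    """Concatenate all output_text parts from the message items of a response."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip reasoning items and tool call items
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)


sample = {
    "output": [
        {"type": "reasoning", "id": "msg_1", "summary": []},
        {
            "type": "message",
            "id": "msg_2",
            "role": "assistant",
            "status": "completed",
            "content": [{"type": "output_text", "text": "Hello!", "annotations": []}],
        },
    ]
}
print(output_text(sample))
```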

id string

The unique identifier of this response, in UUID format. The ID is valid for 7 days and can be passed in the previous_response_id parameter to create a multi-turn conversation.

created_at integer

The Unix timestamp (in seconds) for this request.

object string

The object type, which is always response.

status string

The status of the response generation. Valid values:

  • completed: Generation completed.

  • failed: Generation failed.

  • in_progress: Generation in progress.

  • cancelled: Generation cancelled.

  • queued: Request queued.

  • incomplete: Generation incomplete.

model string

The ID of the model used to generate the response.

output array

An array of output items generated by the model. The type and order of elements in the array depend on the model's response.

Array element properties

type string

The output item type. Valid values:

  • message: A message item that contains the model's final response content.

  • reasoning: A reasoning item. It is returned when reasoning.effort is set to a value other than none or when reasoning mode is enabled. Reasoning tokens are counted in output_tokens_details.reasoning_tokens and billed as reasoning tokens.

  • function_call: The function call type. This is returned when a custom function tool is used. Handle the function call and return the result.

  • web_search_call: The search call type. This is returned when the web_search tool is used.

  • code_interpreter_call: A code execution type that is returned when the code_interpreter tool is used.

  • web_extractor_call: The web extraction type. This is returned when the web_extractor tool is used. It must be used with the web_search tool.

  • web_search_image_call: The call type for a text-to-image search. This is returned when you use the web_search_image tool. It contains a list of the found images.

  • image_search_call: The call type for an image-to-image search. This is returned when the image_search tool is used. It contains a list of similar images.

  • mcp_call: The MCP call type. This is returned when you use the mcp tool. It contains the result of the MCP service call.

  • file_search_call: The call type for a knowledge base search, which is returned when you use the file_search tool. It contains the retrieval query and results for the knowledge base.

id string

The unique identifier of the output item. All types of output items contain this field.

role string

The role of the message is always assistant. This parameter is present only when type is message.

status string

The status of the output item. Valid values: completed and in_progress. This parameter is present when the type parameter is not set to reasoning.

name string

The name of the tool or function. This parameter is present when type is function_call, web_search_image_call, image_search_call, or mcp_call.

For web_search_image_call and image_search_call, the values are fixed to "web_search_image" and "image_search", respectively.

For mcp_call, the value is the name of the specific function called in the MCP service, such as amap-maps-maps_geo.

arguments string

The arguments of the tool call, as a JSON string. This parameter is present when type is function_call, web_search_image_call, image_search_call, or mcp_call. Parse the string as JSON before use, for example with JSON.parse() in JavaScript or json.loads() in Python. The content of arguments for each tool type is as follows:

  • web_search_image_call: {"queries": ["Search Keyword 1", "Search Keyword 2"]}, where queries is a list of search keywords automatically generated by the model based on user input.

  • image_search_call: {"img_idx": 0, "bbox": [0, 0, 1000, 1000]}, where img_idx is the index of the input image (starting from 0), and bbox is the bounding box coordinates [x1, y1, x2, y2] for the search area. The coordinate values range from 0 to 1000.

  • function_call: A parameter object generated from the user-defined function parameter schema.

  • mcp_call: The parameter object for the called function in the MCP service.
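As an illustration of the formats listed above, the sketch below parses the arguments string of an image_search_call item and converts its bbox to pixel coordinates. The conversion assumes that the 0 to 1000 range is normalized against the width and height of the input image, which this document does not state explicitly. Both helper names are hypothetical.

```python
import json


def parse_arguments(item):
    """Decode the JSON string stored in the arguments field of a tool call item."""
    return json.loads(item["arguments"])


def bbox_to_pixels(bbox, width, height):
    """Scale [x1, y1, x2, y2] from the 0-1000 range to pixel coordinates.

    Assumption: x values are relative to the image width and y values
    to the image height.
    """
    x1, y1, x2, y2 = bbox
    return (
        round(x1 * width / 1000),
        round(y1 * height / 1000),
        round(x2 * width / 1000),
        round(y2 * height / 1000),
    )


item = {
    "type": "image_search_call",
    "name": "image_search",
    "arguments": "{\"img_idx\": 0, \"bbox\": [250, 500, 750, 1000]}",
}
args = parse_arguments(item)
print(args["img_idx"], bbox_to_pixels(args["bbox"], width=640, height=480))
```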

call_id string

The unique ID for the function call. This parameter is included only when type is function_call. This ID must be included in the function call result to link the request to the response.
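The following sketch shows one way to handle a function_call item: decode arguments, run the matching local function, and build a result item that carries the same call_id. The function_call_output item shape follows the OpenAI Responses API convention and is assumed, not confirmed by this section, to be accepted here. get_weather and its return value are hypothetical placeholders.

```python
import json


def get_weather(city):
    # Hypothetical local implementation; a real one would query a weather service.
    return {"city": city, "condition": "sunny"}


REGISTRY = {"get_weather": get_weather}


def run_function_call(item, registry):
    """Execute a function_call output item and build the matching result item."""
    arguments = json.loads(item["arguments"])
    result = registry[item["name"]](**arguments)
    return {
        "type": "function_call_output",  # assumption: OpenAI-style result item
        "call_id": item["call_id"],      # links the result to the original call
        "output": json.dumps(result),
    }


call = {
    "type": "function_call",
    "name": "get_weather",
    "arguments": "{\"city\": \"Beijing\"}",
    "call_id": "call_123",
}
print(run_function_call(call, REGISTRY))
```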

content array

The array of message content. This parameter is present only if type is set to message.

Array element properties

type string

The content type. The value is fixed to output_text.

text string

The text content generated by the model.

annotations array

The array of text annotations. This is usually an empty array.

summary array

An array of reasoning summaries. This field is present only when type is reasoning. Each element contains the type field (value: summary_text) and the text field (the summary text).

action object

The information about the search action. This parameter is present only when type is web_search_call.

Properties

query string

The search query keywords.

type string

The search type. The value is always search.

sources array

A list of search sources. Each element contains the type and url fields.

code string

The code generated and executed by the model. This exists only when type is code_interpreter_call.

outputs array

The code execution output array. This is present only when type is code_interpreter_call. Each element has a type field (the value is logs) and a logs field (the code execution logs).

container_id string

The container identifier for the code interpreter. This parameter is present only when type is code_interpreter_call. This identifier associates multiple code executions within the same session.

goal string

A description of the information to extract from the webpage. This parameter is available only when type is web_extractor_call.

output string

The output of the tool call. The output is a string.

  • If type is web_extractor_call, this is a summary of the content extracted from the web page.

  • If type is web_search_image_call or image_search_call, this is a JSON string that contains an array of image search results. Each element includes the title, url, and index fields.

  • If type is mcp_call, this is the JSON string result returned by the MCP service.
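Because the meaning of output depends on the item type, a small dispatcher can decode it consistently. The sketch below is illustrative only. decode_output is a hypothetical name, and it assumes that the MCP service returned valid JSON, as described above.

```python
import json


def decode_output(item):
    """Decode the output field of a tool call item according to its type."""
    kind = item.get("type")
    raw = item.get("output")
    if kind == "web_extractor_call":
        return raw  # plain-text summary of the extracted page
    if kind in ("web_search_image_call", "image_search_call", "mcp_call"):
        return json.loads(raw)  # JSON string: image results or MCP result
    raise ValueError(f"item type {kind!r} has no output field")


image_item = {
    "type": "web_search_image_call",
    "output": "[{\"title\": \"Cute cat\", \"url\": \"https://example.com/cat.jpg\", \"index\": 1}]",
}
for result in decode_output(image_item):
    print(result["index"], result["title"], result["url"])
```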

urls array

The list of URLs for the extracted web pages. This parameter is available only when type is web_extractor_call.

server_label string

The label for the MCP service. This appears only when type is mcp_call. It shows which MCP service the call used.

queries array

A list of queries for knowledge base retrieval. This parameter exists only when type is file_search_call. The array contains strings. Each string is a search query generated by the model.

results array

An array of search results from the knowledge base. This parameter is present only when type is file_search_call.

Array element properties

file_id string

The file ID of the matched document.

filename string

The file name of the matched document.

score float

The relevance score of the match. The value ranges from 0 to 1. A larger value indicates higher relevance.

text string

The content snippet from the matched document.
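Since score increases with relevance, a client can keep only the strongest matches from a file_search_call item. The sketch below is one possible approach. The helper name and the default threshold of 0.5 are arbitrary illustrative choices, not recommended values.

```python
def top_results(item, min_score=0.5, limit=3):
    """Return the highest-scoring results of a file_search_call item."""
    results = [r for r in item.get("results", []) if r.get("score", 0.0) >= min_score]
    results.sort(key=lambda r: r["score"], reverse=True)
    return results[:limit]


search_item = {
    "type": "file_search_call",
    "queries": ["refund policy"],
    "results": [
        {"file_id": "f1", "filename": "faq.md", "score": 0.42, "text": "..."},
        {"file_id": "f2", "filename": "policy.md", "score": 0.91, "text": "..."},
        {"file_id": "f3", "filename": "terms.md", "score": 0.77, "text": "..."},
    ],
}
for result in top_results(search_item):
    print(result["filename"], result["score"])
```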

usage object

Information about the token consumption for this request.

Properties

input_tokens integer

The number of tokens in the input.

output_tokens integer

The number of tokens in the model's output.

total_tokens integer

The total number of tokens consumed, which is the sum of input_tokens and output_tokens.

input_tokens_details object

A fine-grained classification of input tokens.

Properties

cached_tokens integer

The number of tokens that hit the cache. For more information, see context caching.

output_tokens_details object

A detailed breakdown of the output tokens.

Properties

reasoning_tokens integer

The number of reasoning tokens.
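The usage fields described above can be combined into a few derived figures. The sketch below is illustrative: summarize_usage is a hypothetical helper, and answer_tokens assumes that reasoning tokens are included in output_tokens.

```python
def summarize_usage(usage):
    """Derive a few convenience figures from a usage object."""
    input_tokens = usage["input_tokens"]
    output_tokens = usage["output_tokens"]
    cached = usage.get("input_tokens_details", {}).get("cached_tokens", 0)
    reasoning = usage.get("output_tokens_details", {}).get("reasoning_tokens", 0)
    return {
        # total_tokens should equal input_tokens + output_tokens.
        "total_matches": usage["total_tokens"] == input_tokens + output_tokens,
        # Share of input tokens served from the cache.
        "cache_hit_ratio": cached / input_tokens if input_tokens else 0.0,
        # Assumption: reasoning tokens are a subset of output_tokens.
        "answer_tokens": output_tokens - reasoning,
    }


usage = {
    "input_tokens": 55,
    "input_tokens_details": {"cached_tokens": 0},
    "output_tokens": 43,
    "output_tokens_details": {"reasoning_tokens": 0},
    "total_tokens": 98,
}
print(summarize_usage(usage))
```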

x_details array

An array of billing details for the request. This provides a more granular breakdown of multimodal tokens than the top-level usage field.

Properties

input_tokens integer

The number of tokens in the input.

output_tokens integer

The number of tokens in the model's output.

total_tokens integer

The total number of tokens consumed, which is the sum of input_tokens and output_tokens.

x_billing_type string

The value is fixed to response_api.

image_tokens integer

The number of tokens for image input. This field is returned when the input includes an image and is equivalent to input_tokens_details.image_tokens.

input_tokens_details object

A granular breakdown of input tokens. This field is returned for multimodal inputs. It currently distinguishes only between text_tokens and image_tokens. It does not provide a breakdown for video or audio tokens.

Properties

text_tokens integer

The number of tokens for text input.

image_tokens integer

The number of tokens for image input.

output_tokens_details object

A granular breakdown of output tokens. This field has an additional text_tokens field compared to the top-level output_tokens_details. The text_tokens field is returned for multimodal inputs.

Properties

reasoning_tokens integer

The number of tokens for the reasoning process.

text_tokens integer

The number of tokens for text output. This field is returned for multimodal inputs.

plugins object

Statistics for built-in tool calls. This field is returned when a built-in tool such as web_search is used. Its content is the same as the top-level x_tools field.

Properties

web_search object

Statistics for web search calls.

Properties

count integer

The number of times web search was called in this response.

prompt_tokens_details object

Cache details for input tokens. This field is returned when session cache is enabled. It may return an empty object if the input includes an image but results in a cache miss.

Properties

cached_tokens integer

The number of tokens that hit the cache.

cache_creation_input_tokens integer

The number of tokens used to create a new cache in this request.

cache_creation object

Details about cache creation.

Properties

ephemeral_5m_input_tokens integer

The number of tokens used to create a new 5-minute ephemeral cache.

cache_type string

The cache type. The value is fixed to ephemeral.

x_tools object

Statistics on tool usage. This contains the number of times each built-in tool is called.

Example: {"web_search": {"count": 1}}

error object

An error object is returned when the model fails to generate a response. Otherwise, the value is null.

tools array

Echoes the full content of the tools parameter from the request, with the same structure as the tools parameter in the request body.

tool_choice string

Echoes the value of the tool_choice parameter in the request. The valid values are auto, none, and required.

Response chunk object (streaming output)

Basic call

// response.created: The response is created and queued.
{"response":{"id":"428c90e9-9cd6-90a6-9726-c02b08ebexxx","created_at":1769082930,"object":"response","status":"queued",...},"sequence_number":0,"type":"response.created"}

// response.in_progress: Processing begins.
{"response":{"id":"428c90e9-9cd6-90a6-9726-c02b08ebexxx","status":"in_progress",...},"sequence_number":1,"type":"response.in_progress"}

// response.output_item.added: A new output item is added.
{"item":{"id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","content":[],"role":"assistant","status":"in_progress","type":"message"},"output_index":0,"sequence_number":2,"type":"response.output_item.added"}

// response.content_part.added: A new content part is added.
{"content_index":0,"item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","output_index":0,"part":{"annotations":[],"text":"","type":"output_text","logprobs":null},"sequence_number":3,"type":"response.content_part.added"}

// response.output_text.delta: Incremental text (can be triggered multiple times).
{"content_index":0,"delta":"Artificial Intelligence","item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","logprobs":[],"output_index":0,"sequence_number":4,"type":"response.output_text.delta"}
{"content_index":0,"delta":" (AI) refers to the technology","item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","logprobs":[],"output_index":0,"sequence_number":6,"type":"response.output_text.delta"}

// response.output_text.done: Text generation for a content part is complete.
{"content_index":0,"item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","logprobs":[],"output_index":0,"sequence_number":53,"text":"Artificial Intelligence (AI) refers to the technology and science that enables computer systems to simulate human intelligent behaviors...","type":"response.output_text.done"}

// response.content_part.done: The content part is complete.
{"content_index":0,"item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","output_index":0,"part":{"annotations":[],"text":"...full text...","type":"output_text","logprobs":null},"sequence_number":54,"type":"response.content_part.done"}

// response.output_item.done: The output item is complete.
{"item":{"id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","content":[{"annotations":[],"text":"...full text...","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"},"output_index":0,"sequence_number":55,"type":"response.output_item.done"}

// response.completed: The response is complete (includes full response and usage).
{"response":{"id":"428c90e9-9cd6-90a6-9726-c02b08ebexxx","created_at":1769082930,"model":"qwen3.7-max","object":"response","output":[...],"status":"completed","usage":{"input_tokens":37,"output_tokens":243,"total_tokens":280,...}},"sequence_number":56,"type":"response.completed"}
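A client consuming the basic event sequence above typically appends each response.output_text.delta and reads usage from the final response.completed event. The sketch below works on events that have already been decoded into dictionaries. consume_stream is a hypothetical name, and the sample events are abbreviated.

```python
def consume_stream(events):
    """Accumulate streamed text and capture usage from the completed event."""
    text_parts = []
    usage = None
    for event in events:
        kind = event.get("type")
        if kind == "response.output_text.delta":
            text_parts.append(event["delta"])
        elif kind == "response.completed":
            usage = event["response"].get("usage")
    return "".join(text_parts), usage


events = [
    {"type": "response.created", "sequence_number": 0, "response": {"status": "queued"}},
    {"type": "response.output_text.delta", "sequence_number": 4, "delta": "Artificial Intelligence"},
    {"type": "response.output_text.delta", "sequence_number": 6, "delta": " (AI) refers to the technology"},
    {
        "type": "response.completed",
        "sequence_number": 56,
        "response": {"status": "completed", "usage": {"input_tokens": 37, "output_tokens": 243, "total_tokens": 280}},
    },
]
text, usage = consume_stream(events)
print(text)
print(usage)
```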

Web extraction

id:1
event:response.created
:HTTP_STATUS/200
data:{"sequence_number":0,"type":"response.created","response":{"output":[],"parallel_tool_calls":false,"created_at":1769435906,"tool_choice":"auto","model":"","id":"863df8d9-cb29-4239-a54f-3e15a2427xxx","tools":[],"object":"response","status":"queued"}}

id:2
event:response.in_progress
:HTTP_STATUS/200
data:{"sequence_number":1,"type":"response.in_progress","response":{"output":[],"parallel_tool_calls":false,"created_at":1769435906,"tool_choice":"auto","model":"","id":"863df8d9-cb29-4239-a54f-3e15a2427xxx","tools":[],"object":"response","status":"in_progress"}}

id:3
event:response.output_item.added
:HTTP_STATUS/200
data:{"sequence_number":2,"item":{"summary":[],"type":"reasoning","id":"msg_5bd0c6df-19b8-4a04-bc00-8042a224exxx"},"output_index":0,"type":"response.output_item.added"}

id:4
event:response.reasoning_summary_text.delta
:HTTP_STATUS/200
data:{"delta":"The user wants me to:\n1. Search for the Alibaba Cloud official website.\n2. Extract key information from the homepage.\n\nI need to first search for the URL of the Alibaba Cloud official website, and then use the web_extractor tool to access the website and extract key information.","sequence_number":3,"output_index":0,"type":"response.reasoning_summary_text.delta","item_id":"msg_5bd0c6df-19b8-4a04-bc00-8042a224exxx","summary_index":0}

id:14
event:response.reasoning_summary_text.done
:HTTP_STATUS/200
data:{"sequence_number":13,"text":"The user wants me to:\n1. Search for the Alibaba Cloud official website.\n2. Extract key information from the homepage.\n\nI need to first search for the URL of the Alibaba Cloud official website, and then use the web_extractor tool to access the website and extract key information.","output_index":0,"type":"response.reasoning_summary_text.done","item_id":"msg_5bd0c6df-19b8-4a04-bc00-8042a224exxx","summary_index":0}

id:15
event:response.output_item.done
:HTTP_STATUS/200
data:{"sequence_number":14,"item":{"summary":[{"type":"summary_text","text":"The user wants me to:\n1. Search for the Alibaba Cloud official website.\n2. Extract key information from the homepage.\n\nI need to first search for the URL of the Alibaba Cloud official website, and then use the web_extractor tool to access the website and extract key information."}],"type":"reasoning","id":"msg_5bd0c6df-19b8-4a04-bc00-8042a224exxx"},"output_index":1,"type":"response.output_item.done"}

id:16
event:response.output_item.added
:HTTP_STATUS/200
data:{"sequence_number":15,"item":{"action":{"type":"search","query":"Web search"},"id":"msg_a8a686b1-0a57-40e1-bb55-049a89cd4xxx","type":"web_search_call","status":"in_progress"},"output_index":1,"type":"response.output_item.added"}

id:17
event:response.web_search_call.in_progress
:HTTP_STATUS/200
data:{"sequence_number":16,"output_index":1,"type":"response.web_search_call.in_progress","item_id":"msg_a8a686b1-0a57-40e1-bb55-049a89cd4xxx"}

id:19
event:response.web_search_call.completed
:HTTP_STATUS/200
data:{"sequence_number":18,"output_index":1,"type":"response.web_search_call.completed","item_id":"msg_a8a686b1-0a57-40e1-bb55-049a89cd4xxx"}

id:20
event:response.output_item.done
:HTTP_STATUS/200
data:{"sequence_number":19,"item":{"action":{"sources":[{"type":"url","url":"https://cn.aliyun.com/"},{"type":"url","url":"https://www.aliyun.com/"}],"type":"search","query":"Web search"},"id":"msg_a8a686b1-0a57-40e1-bb55-049a89cd4xxx","type":"web_search_call","status":"completed"},"output_index":1,"type":"response.output_item.done"}

id:33
event:response.output_item.added
:HTTP_STATUS/200
data:{"sequence_number":32,"item":{"urls":["https://cn.aliyun.com/"],"goal":"Extract key information from the Alibaba Cloud homepage, including: company positioning/overview, core products and services, main business segments, key features/solutions, latest news/events, free trial/discount information, and navigation menu structure.","id":"msg_8c2cf651-48a5-460c-aa7a-bea5b09b4xxx","type":"web_extractor_call","status":"in_progress"},"output_index":3,"type":"response.output_item.added"}

id:34
event:response.output_item.done
:HTTP_STATUS/200
data:{"sequence_number":33,"item":{"output":"The useful information in https://cn.aliyun.com/ for user goal 'Extract key information from the Alibaba Cloud homepage, including: company positioning/overview, core products and services, main business segments, key features/solutions, latest news/events, free trial/discount information, and navigation menu structure' is as follows: \n\nEvidence in page: \n## Tongyi large model, the first choice for enterprises in the AI era\n\n## A complete product system, building a cloud for enterprise technological innovation\n\nAll cloud products## Making AI accessible through the synergistic development of large models and cloud computing\n\nAll AI solutions\n\nSummary: \nAlibaba Cloud positions itself as a leading enterprise AI solution provider centered around the Tongyi large model...","urls":["https://cn.aliyun.com/"],"goal":"Extract key information from the Alibaba Cloud homepage, including: company positioning/overview, core products and services, main business segments, key features/solutions, latest news/events, free trial/discount information, and navigation menu structure.","id":"msg_8c2cf651-48a5-460c-aa7a-bea5b09b4xxx","type":"web_extractor_call","status":"completed"},"output_index":3,"type":"response.output_item.done"}

id:50
event:response.output_item.added
:HTTP_STATUS/200
data:{"sequence_number":50,"item":{"content":[{"type":"text","text":""}],"type":"message","id":"msg_final","role":"assistant"},"output_index":5,"type":"response.output_item.added"}

id:51
event:response.output_text.delta
:HTTP_STATUS/200
data:{"delta":"I have found the Alibaba Cloud website and extracted the key information from its homepage:\n\n","sequence_number":51,"output_index":5,"type":"response.output_text.delta"}

id:60
event:response.completed
:HTTP_STATUS/200
data:{"type":"response.completed","response":{"id":"863df8d9-cb29-4239-a54f-3e15a2427xxx","status":"completed","usage":{"input_tokens":45,"output_tokens":320,"total_tokens":365}}}
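When calling the HTTP endpoint directly, the raw stream arrives in the server-sent events format shown above. The sketch below parses such text into event dictionaries. Lines that begin with a colon, such as :HTTP_STATUS/200, are comments in the SSE format and are skipped. As a simplification, the parser assumes each event carries a single data line, as in the samples in this section.

```python
import json


def parse_sse(raw):
    """Parse raw SSE text into a list of dictionaries with id, event, and data."""
    events, current = [], {}
    for line in raw.splitlines():
        if not line.strip():
            # A blank line terminates the current event.
            if "data" in current:
                events.append(current)
            current = {}
            continue
        if line.startswith(":"):
            continue  # comment line, ignored
        field, _, value = line.partition(":")
        value = value[1:] if value.startswith(" ") else value
        if field == "data":
            current["data"] = json.loads(value)
        elif field in ("id", "event"):
            current[field] = value
    if "data" in current:
        events.append(current)
    return events


SAMPLE = """id:1
event:response.created
:HTTP_STATUS/200
data:{"sequence_number":0,"type":"response.created"}

id:2
event:response.in_progress
:HTTP_STATUS/200
data:{"sequence_number":1,"type":"response.in_progress"}
"""

for event in parse_sse(SAMPLE):
    print(event["id"], event["event"], event["data"]["sequence_number"])
```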

Text-to-image search

// 1. response.created: The response is created.
id:1
event:response.created
data:{"sequence_number":0,"type":"response.created","response":{"output":[],"status":"queued",...}}

// 2. response.in_progress: The response is being processed.
id:2
event:response.in_progress
data:{"sequence_number":1,"type":"response.in_progress","response":{"status":"in_progress",...}}

// 3. response.output_item.added: Reasoning starts.
id:3
event:response.output_item.added
data:{"sequence_number":2,"item":{"summary":[],"type":"reasoning","id":"msg_xxx"},"output_index":0,"type":"response.output_item.added"}

// 4. response.reasoning_summary_text.delta: Reasoning summary delta.
id:4
event:response.reasoning_summary_text.delta
data:{"delta":"The user wants to find a picture of a cat. I need to use the web_search_image tool to search...","sequence_number":3,"output_index":0,"type":"response.reasoning_summary_text.delta","item_id":"msg_xxx","summary_index":0}

// 5. response.reasoning_summary_text.done: Reasoning summary is complete.
id:10
event:response.reasoning_summary_text.done
data:{"sequence_number":9,"text":"The user wants to find a picture of a cat. I need to use the web_search_image tool to search for cat pictures.","output_index":0,"type":"response.reasoning_summary_text.done","item_id":"msg_xxx","summary_index":0}

// 6. response.output_item.done: The reasoning item is complete.
id:11
event:response.output_item.done
data:{"sequence_number":10,"item":{"summary":[{"type":"summary_text","text":"..."}],"type":"reasoning","id":"msg_xxx"},"output_index":0,"type":"response.output_item.done"}

// 7. response.output_item.added: Text-to-image search tool call starts (status: in_progress; includes name and arguments).
id:12
event:response.output_item.added
data:{"sequence_number":11,"item":{"name":"web_search_image","arguments":"{\"queries\": [\"cat pictures\", \"cute cat\"]}","id":"msg_xxx","type":"web_search_image_call","status":"in_progress"},"output_index":1,"type":"response.output_item.added"}

// 8. response.output_item.done: Text-to-image search tool call is complete (includes the full 'output' search results).
id:13
event:response.output_item.done
data:{"sequence_number":12,"item":{"name":"web_search_image","output":"[{\"title\": \"Cute little cat...\", \"url\": \"https://example.com/cat.jpg\", \"index\": 1}, ...]","arguments":"{\"queries\": [\"cat pictures\", \"cute cat\"]}","id":"msg_xxx","type":"web_search_image_call","status":"completed"},"output_index":1,"type":"response.output_item.done"}

// 9-12. Subsequent reasoning and message output events follow, similar to the basic call flow.
// response.output_item.added (reasoning) → reasoning_summary_text.delta/done → response.output_item.done (reasoning)
// response.output_item.added (message) → response.content_part.added → response.output_text.delta → response.output_text.done → response.content_part.done → response.output_item.done (message)

// 13. response.completed: The response is complete.
id:118
event:response.completed
data:{"sequence_number":117,"type":"response.completed","response":{"output":[...],"status":"completed","usage":{"input_tokens":7895,"output_tokens":318,"total_tokens":8213,"x_tools":{"web_search_image":{"count":1}}}}}
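In tool flows such as the one above, each output item is announced by response.output_item.added and finalized by response.output_item.done at the same output_index. The sketch below keeps the latest snapshot of every item so that a client can show tool progress. track_items is a hypothetical name, and the sample events are abbreviated.

```python
def track_items(events):
    """Return the latest snapshot of each output item, ordered by output_index."""
    items = {}
    for event in events:
        if event.get("type") in ("response.output_item.added", "response.output_item.done"):
            # A later event for the same index replaces the earlier snapshot.
            items[event["output_index"]] = event["item"]
    return [items[index] for index in sorted(items)]


events = [
    {
        "type": "response.output_item.added",
        "output_index": 1,
        "item": {"type": "web_search_image_call", "name": "web_search_image", "status": "in_progress"},
    },
    {
        "type": "response.output_item.done",
        "output_index": 1,
        "item": {"type": "web_search_image_call", "name": "web_search_image", "status": "completed"},
    },
    {
        "type": "response.output_item.done",
        "output_index": 0,
        "item": {"type": "reasoning", "summary": []},
    },
]
for item in track_items(events):
    print(item["type"], item.get("status"))
```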

Image-to-image search

// 1-6. The reasoning phase is the same as in the text-to-image search flow.

// 7. response.output_item.added: Image-to-image search tool call starts.
// Note: arguments includes img_idx (image index) and bbox (bounding box).
id:29
event:response.output_item.added
data:{"sequence_number":29,"item":{"name":"image_search","arguments":"{\"img_idx\": 0, \"bbox\": [0, 0, 1000, 1000]}","id":"msg_xxx","type":"image_search_call","status":"in_progress"},"output_index":1,"type":"response.output_item.added"}

// 8. response.output_item.done: Image-to-image search tool call is complete.
id:30
event:response.output_item.done
data:{"sequence_number":30,"item":{"name":"image_search","output":"[{\"title\": \"Ink wash mountain background...\", \"url\": \"https://example.com/landscape.jpg\", \"index\": 1}, ...]","arguments":"{\"img_idx\": 0, \"bbox\": [0, 0, 1000, 1000]}","id":"msg_xxx","type":"image_search_call","status":"completed"},"output_index":1,"type":"response.output_item.done"}

// 9-12. Second round of reasoning + final message output (same as basic call).

// 13. response.completed
id:408
event:response.completed
data:{"sequence_number":407,"type":"response.completed","response":{"output":[...],"status":"completed","usage":{"input_tokens":8371,"output_tokens":417,"total_tokens":8788,"x_tools":{"image_search":{"count":1}}}}}

MCP

// 1-6. Reasoning phase (same as other tools).

// 7. response.mcp_call_arguments.delta: MCP arguments delta (MCP-specific event).
id:27
event:response.mcp_call_arguments.delta
data:{"delta":"{\"city\": \"Beijing\"}","sequence_number":26,"output_index":1,"type":"response.mcp_call_arguments.delta","item_id":"msg_xxx"}

// 8. response.mcp_call_arguments.done: MCP arguments are complete (MCP-specific event).
id:28
event:response.mcp_call_arguments.done
data:{"sequence_number":27,"arguments":"{\"city\": \"Beijing\"}","output_index":1,"type":"response.mcp_call_arguments.done","item_id":"msg_xxx"}

// 9. response.output_item.added: MCP tool call starts (includes name, server_label, arguments).
id:29
event:response.output_item.added
data:{"sequence_number":28,"item":{"name":"amap-maps-maps_weather","server_label":"MCP Server","arguments":"{\"city\": \"Beijing\"}","id":"msg_xxx","type":"mcp_call","status":"in_progress"},"output_index":1,"type":"response.output_item.added"}

// 10. response.mcp_call.completed: MCP call is complete (MCP-specific event).
id:30
event:response.mcp_call.completed
data:{"sequence_number":29,"output_index":1,"type":"response.mcp_call.completed","item_id":"msg_xxx"}

// 11. response.output_item.done: The MCP output item is complete (includes the full 'output').
id:31
event:response.output_item.done
data:{"sequence_number":30,"item":{"output":"{\"city\":\"Beijing\",\"forecasts\":[...]}","name":"amap-maps-maps_weather","server_label":"MCP Server","arguments":"{\"city\": \"Beijing\"}","id":"msg_xxx","type":"mcp_call","status":"completed"},"output_index":1,"type":"response.output_item.done"}

// 12-15. Second round of reasoning + final message output.

// 16. response.completed
id:172
event:response.completed
data:{"sequence_number":171,"type":"response.completed","response":{"output":[...],"status":"completed","usage":{"input_tokens":5019,"output_tokens":539,"total_tokens":5558}}}

Knowledge base search

// 1-6. Reasoning phase (same as other tools).

// 7. response.output_item.added: Knowledge base search starts (includes queries, no results).
id:19
event:response.output_item.added
data:{"sequence_number":18,"item":{"id":"msg_xxx","type":"file_search_call","queries":["Alibaba Cloud Model Studio X1 phone","Alibaba Cloud Model Studio X1 phone","Model Studio X1"],"status":"in_progress"},"output_index":1,"type":"response.output_item.added"}

// 8. response.file_search_call.in_progress: Search is in progress (file_search-specific event).
id:20
event:response.file_search_call.in_progress
data:{"sequence_number":19,"output_index":1,"type":"response.file_search_call.in_progress","item_id":"msg_xxx"}

// 9. response.file_search_call.searching: Searching (file_search-specific event).
id:21
event:response.file_search_call.searching
data:{"sequence_number":20,"output_index":1,"type":"response.file_search_call.searching","item_id":"msg_xxx"}

// 10. response.file_search_call.completed: Search is complete (file_search-specific event).
id:22
event:response.file_search_call.completed
data:{"sequence_number":21,"output_index":1,"type":"response.file_search_call.completed","item_id":"msg_xxx"}

// 11. response.output_item.done: Provides the complete output item, including queries and results.
id:23
event:response.output_item.done
data:{"sequence_number":22,"item":{"id":"msg_xxx","type":"file_search_call","queries":["Alibaba Cloud Model Studio X1 phone","Alibaba Cloud Model Studio X1 phone","Model Studio X1"],"results":[{"score":0.7519,"filename":"Alibaba Cloud Model Studio Series Phone Product Introduction","text":"Alibaba Cloud Model Studio X1 — Enjoy the ultimate visual experience...","file_id":"file_xxx"}],"status":"completed"},"output_index":1,"type":"response.output_item.done"}

// 12-15. Second round of reasoning + final message output.

// 16. response.completed
id:146
event:response.completed
data:{"sequence_number":145,"type":"response.completed","response":{"output":[...],"status":"completed","usage":{"input_tokens":1576,"output_tokens":722,"total_tokens":2298,"x_tools":{"file_search":{"count":1}}}}}

Streaming output returns a series of JSON objects. Each object includes a type field to specify the event type and a sequence_number field to indicate the event order. The response.completed event marks the end of the stream.
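As a minimal sketch of how such a stream might be consumed, the following Python parses raw server-sent event text into one dictionary per event. The function name parse_sse and the SAMPLE text are illustrative, not part of the API. The sample reuses two events from the MCP example above, with the elided output array replaced by an empty list so that the JSON is valid. The sketch assumes each event carries a single data: line, as in the examples in this topic.

```python
import json


def parse_sse(raw: str) -> list[dict]:
    """Split raw SSE text into events; each event keeps id, event, and parsed data."""
    events = []
    current = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            # A blank line terminates the current event block.
            if "data" in current:
                events.append(current)
            current = {}
            continue
        field, _, value = line.partition(":")
        value = value.lstrip()
        if field == "data":
            current["data"] = json.loads(value)
        elif field in ("id", "event"):
            current[field] = value
        # Any other field (including SSE comment lines) is ignored.
    if "data" in current:
        events.append(current)
    return events


SAMPLE = """id:30
event:response.mcp_call.completed
data:{"sequence_number":29,"output_index":1,"type":"response.mcp_call.completed","item_id":"msg_xxx"}

id:172
event:response.completed
data:{"sequence_number":171,"type":"response.completed","response":{"output":[],"status":"completed","usage":{"input_tokens":5019,"output_tokens":539,"total_tokens":5558}}}
"""

events = parse_sse(SAMPLE)
stream_finished = events[-1]["data"]["type"] == "response.completed"
```

In practice an SDK or HTTP client usually performs this parsing; the sketch only shows how the id, event, and data lines relate to the JSON objects described below.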

type string

The event type identifier. Possible values include:

  • response.created: The response has been created. Its status is queued.

  • response.in_progress: The response starts processing, and the status changes to in_progress.

  • response.output_item.added: A new output item (e.g., a message or a web_extractor_call) is added to the output array. When item.type is web_extractor_call, this indicates the start of a web extraction tool call.

  • response.content_part.added: A new content part is added to the content array of an output item.

  • response.output_text.delta: An incremental text segment is generated. This event is triggered multiple times, and the delta field contains the new text segment.

  • response.output_text.done: Text generation for a content part is complete. The text field contains the full text.

  • response.content_part.done: A content part is complete. The part object contains the complete content part.

  • response.output_item.done: An output item is complete. The item object contains the complete output item. When item.type is web_extractor_call, this indicates the completion of a web extraction tool call.

  • response.reasoning_summary_text.delta: (In reasoning mode) Provides an incremental update to the reasoning summary. The delta field contains the new segment.

  • response.reasoning_summary_text.done: (In reasoning mode) The reasoning summary is complete. The text field contains the full summary.

  • response.web_search_call.in_progress / searching / completed: Status change events for a web search (when you use the web_search tool).

  • response.code_interpreter_call.in_progress / interpreting / completed: Status change events for code execution (when you use the code_interpreter tool).

  • Note: The web_extractor tool has no dedicated event types. Its tool calls are reported through the general response.output_item.added and response.output_item.done events and are identified by an item.type value of web_extractor_call.

  • response.mcp_call_arguments.delta / response.mcp_call_arguments.done: These events provide the delta and completion status for MCP call arguments.

  • response.mcp_call.completed: The MCP service call is complete.

  • response.file_search_call.in_progress / searching / completed: Status change events for a knowledge base search (when you use the file_search tool).

  • Note: When using the web_search_image and image_search tools, there are no dedicated intermediate state events. Tool calls are communicated through the response.output_item.added (call start) and response.output_item.done (call complete) events.

  • response.completed: The response generation is complete. The response object contains the full response, including usage. This event marks the end of the stream.
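The event types above can be handled with a small dispatcher. The following is a sketch only: describe_event is a hypothetical helper name, and the rule that every tool call item type ends in _call is an assumption drawn from the item types shown in this topic rather than a documented guarantee.

```python
def describe_event(event: dict) -> str:
    """Return a short human-readable label for one parsed stream event."""
    etype = event["type"]
    if etype in ("response.output_item.added", "response.output_item.done"):
        item_type = event.get("item", {}).get("type", "")
        phase = etype.rsplit(".", 1)[1]  # "added" or "done"
        if item_type.endswith("_call"):
            # Tools without dedicated events (web_extractor, web_search_image,
            # image_search) are only visible here, through item.type.
            state = "started" if phase == "added" else "finished"
            return f"tool call {item_type} {state}"
        return f"output item {item_type} {phase}"
    if etype == "response.output_text.delta":
        return "text delta"
    if etype == "response.completed":
        return "stream finished"
    parts = etype.split(".")
    if len(parts) == 3 and parts[1].endswith("_call"):
        # For example response.file_search_call.searching
        return f"{parts[1]} status: {parts[2]}"
    return "other"


started = describe_event(
    {"type": "response.output_item.added",
     "item": {"type": "image_search_call", "status": "in_progress"}}
)
searching = describe_event({"type": "response.file_search_call.searching"})
```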

sequence_number integer

The event sequence number, starting at 0 and incrementing with each event. Use this number to process events in the correct order.
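If events are buffered or arrive through more than one path, sequence_number can be used to restore their order and to spot missing events. The sketch below uses hypothetical helper names (in_order, find_gaps) and assumes the number increases by exactly one per event.

```python
def in_order(events: list[dict]) -> list[dict]:
    """Return the events sorted by their sequence_number."""
    return sorted(events, key=lambda e: e["sequence_number"])


def find_gaps(events: list[dict]) -> list[int]:
    """Return the sequence numbers missing between the lowest and highest seen."""
    seen = sorted(e["sequence_number"] for e in events)
    return [n for prev, nxt in zip(seen, seen[1:]) for n in range(prev + 1, nxt)]


received = [
    {"sequence_number": 2, "type": "response.output_item.added"},
    {"sequence_number": 0, "type": "response.created"},
    {"sequence_number": 1, "type": "response.in_progress"},
    {"sequence_number": 4, "type": "response.output_text.delta"},
]
ordered_types = [e["type"] for e in in_order(received)]
missing = find_gaps(received)
```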

response object

The response object. Appears in the response.created, response.in_progress, and response.completed events. In the response.completed event, it contains the complete response data (including output and usage), and its structure is identical to the non-streaming Response object.
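For example, token and tool usage might be read from the response.completed event as sketched below. usage_summary is a hypothetical helper. The x_tools field appears in the stream examples in this topic but is absent from some of them (the MCP example has none), so the sketch treats it as optional.

```python
def usage_summary(event: dict):
    """Extract token counts and per-tool call counts from a response.completed event."""
    if event.get("type") != "response.completed":
        return None
    usage = event["response"].get("usage", {})
    tool_calls = {
        name: info.get("count", 0)
        for name, info in usage.get("x_tools", {}).items()
    }
    return {
        "input_tokens": usage.get("input_tokens"),
        "output_tokens": usage.get("output_tokens"),
        "total_tokens": usage.get("total_tokens"),
        "tool_calls": tool_calls,
    }


# Values taken from the knowledge base search example; the elided output
# array is replaced by an empty list.
completed = {
    "sequence_number": 145,
    "type": "response.completed",
    "response": {
        "output": [],
        "status": "completed",
        "usage": {
            "input_tokens": 1576,
            "output_tokens": 722,
            "total_tokens": 2298,
            "x_tools": {"file_search": {"count": 1}},
        },
    },
}
summary = usage_summary(completed)
```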

item object

An output item object. It appears in the response.output_item.added and response.output_item.done events. In the added event, it is an initial skeleton (for a message item, content is an empty array). In the done event, it is the complete object.

Properties

id string

A unique identifier for the output item (e.g., msg_xxx).

type string

The type of the output item. Possible values: message, reasoning, web_search_call, web_extractor_call (web extraction), web_search_image_call (text-to-image search), image_search_call (image-to-image search), mcp_call (MCP call), file_search_call (knowledge base search).

role string

The message role, which is always assistant. Present only when type is message.

status string

The generation status: in_progress in the added event and completed in the done event.

content array

An array of message content. In the added event, the array is empty []. In the done event, it contains complete content part objects whose structure is the same as that of the part object.

part object

The content part object. Appears in the response.content_part.added and response.content_part.done events.

Properties

type string

The type of the content part, which is always output_text.

text string

Text content. This is an empty string in the added event and the complete text in the done event.

annotations array

An array of text annotations. Usually an empty array.

logprobs object | null

Token log probabilities. This field currently always returns null.

delta string

The incremental text segment. This field appears in the response.output_text.delta event and contains the newly added text segment. Concatenate all delta values to reconstruct the full text.

text string

The complete text content. This field appears in the response.output_text.done event. You can use it to validate the text reconstructed from the delta fragments.
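The relationship between delta and text can be sketched as follows. collect_text is a hypothetical helper that assumes the stream contains a single text content part; it concatenates the delta values and checks the result against the text field of the done event.

```python
def collect_text(events: list[dict]) -> str:
    """Rebuild the output text from delta events and validate it against the done event."""
    chunks = []
    final_text = None
    for event in events:
        if event["type"] == "response.output_text.delta":
            chunks.append(event["delta"])
        elif event["type"] == "response.output_text.done":
            final_text = event["text"]
    rebuilt = "".join(chunks)
    if final_text is not None and final_text != rebuilt:
        raise ValueError("reconstructed text does not match response.output_text.done")
    return rebuilt


stream = [
    {"type": "response.output_text.delta", "delta": "Hel"},
    {"type": "response.output_text.delta", "delta": "lo"},
    {"type": "response.output_text.done", "text": "Hello"},
]
full_text = collect_text(stream)
```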

item_id string

The unique identifier for the output item. Use this ID to correlate events that belong to the same item.

output_index integer

The index of the output item in the output array.

content_index integer

The index of the content part in the content array.
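When a response contains more than one output item or content part, these identifiers can keep the text fragments apart. The sketch below is illustrative: text_by_part is a hypothetical helper, and it assumes that each response.output_text.delta event carries both item_id and content_index.

```python
from collections import defaultdict


def text_by_part(events: list[dict]) -> dict:
    """Group text deltas by (item_id, content_index) and join each group."""
    buffers = defaultdict(list)
    for event in events:
        if event.get("type") == "response.output_text.delta":
            key = (event["item_id"], event["content_index"])
            buffers[key].append(event["delta"])
    return {key: "".join(chunks) for key, chunks in buffers.items()}


deltas = [
    {"type": "response.output_text.delta", "item_id": "msg_a", "content_index": 0, "delta": "Sunny "},
    {"type": "response.output_text.delta", "item_id": "msg_b", "content_index": 0, "delta": "Bring "},
    {"type": "response.output_text.delta", "item_id": "msg_a", "content_index": 0, "delta": "today."},
    {"type": "response.output_text.delta", "item_id": "msg_b", "content_index": 0, "delta": "a hat."},
]
texts = text_by_part(deltas)
```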

summary_index integer

The index of the item in the summary array of a reasoning output item. This field appears in the response.reasoning_summary_text.delta and response.reasoning_summary_text.done events.

FAQ

Q: How do I pass context for a multi-turn conversation?

A: When sending the next request in a conversation, pass the id of the model's previous successful response as the previous_response_id parameter.
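A minimal sketch of the request body for a follow-up turn is shown below. followup_body is a hypothetical helper, and the model name and response ID are placeholders; only the model, input, and previous_response_id fields are used.

```python
def followup_body(previous_response: dict, user_input: str, model: str) -> dict:
    """Build the JSON body for the next turn, chaining it to the previous response."""
    return {
        "model": model,
        "input": user_input,
        # The id of the previous successful response carries the context forward,
        # so no message history array has to be rebuilt by hand.
        "previous_response_id": previous_response["id"],
    }


# Placeholder values for illustration only.
first_response = {"id": "resp_xxx", "status": "completed"}
body = followup_body(first_response, "And what about tomorrow?", model="qwen-plus")
```

The resulting dictionary can be sent as the JSON body of a POST request to the responses endpoint, or its fields can be passed as keyword arguments to an OpenAI-compatible SDK call.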

Q: Why are some fields in the response example not described in this topic?

A: The official OpenAI SDK may output extra fields defined by the OpenAI protocol. This service does not support those fields, so they are typically null. Rely only on the fields described in this topic.